feat: speech-to-text NVIDIA/non-NVIDIA split — CPU-capable alternative for laptops #11

@LTSCommerce

Problem

play-speech-to-text.yml currently installs faster-whisper with CUDA support, which is only useful on machines with a discrete NVIDIA GPU. On laptops with only integrated Intel/AMD graphics, CUDA is unavailable, so installing the CUDA backend wastes install time and may produce confusing errors or a silent fallback to CPU.

Current Behaviour

Single installation path: faster-whisper + CUDA regardless of hardware.

Expected Behaviour

Detect whether an NVIDIA GPU is present and choose the appropriate backend:

  • NVIDIA GPU detected → faster-whisper with CUDA (current behaviour, keep as-is)
  • No NVIDIA GPU → CPU/ROCm-compatible alternative
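
A minimal sketch of the detection step, based on the `lspci | grep -i nvidia` check mentioned under Acceptance Criteria. The function names `stt_backend_from_lspci` and `detect_stt_backend` and the `cuda`/`cpu` labels are hypothetical, chosen for illustration only; the real check_hardware logic is not shown in this issue and may differ.

```shell
# Hypothetical helper (not part of the repo): classify lspci-style text
# read from stdin. Prints "cuda" if any line mentions NVIDIA, else "cpu".
stt_backend_from_lspci() {
    # Redirect instead of using grep -q so grep always reads all input.
    if grep -i nvidia >/dev/null; then
        echo cuda
    else
        echo cpu
    fi
}

# Hypothetical wrapper: run the check against the current machine.
# Falls back to "cpu" when lspci is not installed (an assumption; the
# playbook may prefer to fail loudly instead).
detect_stt_backend() {
    if command -v lspci >/dev/null 2>&1; then
        lspci | stt_backend_from_lspci
    else
        echo cpu
    fi
}
```

The playbook could then branch on the output of `detect_stt_backend`: keep the existing faster-whisper + CUDA tasks for `cuda`, and run whichever CPU alternative is chosen for `cpu`. Keeping the parsing separate from the `lspci` call makes the check testable with canned input.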

Investigation Needed

Research and evaluate speech-to-text systems that work well on CPU or integrated graphics:

  • faster-whisper with CPU backend — same tool, just skip CUDA deps; check if performance is acceptable on modern laptop CPUs
  • whisper.cpp — pure C++ implementation, no Python deps, runs well on CPU, supports Metal/OpenCL
  • vosk — lightweight offline STT, very low resource usage, runs on CPU
  • RealtimeSTT with CPU — check if the current wsi-stream wrapper can work without CUDA
  • sherpa-onnx — ONNX-based, good CPU performance, supports Whisper models

Acceptance Criteria

  • Playbook detects NVIDIA GPU (reuse logic from check_hardware / lspci | grep -i nvidia)
  • NVIDIA path: faster-whisper + CUDA (current)
  • Non-NVIDIA path: working alternative with acceptable latency on laptop CPU
  • Both paths use the same wsi-stream interface (or document differences)
  • play-speech-to-text.yml removed from auto_run_common in run.bash until this is resolved (currently auto-runs on all hardware)

Related

  • playbooks/imports/optional/common/play-speech-to-text.yml
  • files/home/.local/bin/wsi-stream
  • run.bash auto_run_common array
