UX-Research Video Transcriber

CLI tool for automated transcription and classification of voice comments from UX research sessions.

Downloads or reads user session recordings → extracts speech → recognizes text with timestamps → groups quotes by user journey stage using an LLM.

What it does

Reads a list of sources from a file (cloud.mail.ru URLs or local video paths)
Downloads videos via cloud.mail.ru public API (local files — no download needed)
Extracts audio track via ffmpeg (WAV 16kHz mono)
Detects voice segments using Silero VAD
Transcribes speech using Whisper (tiny/base, runs locally)
Groups quotes by UX journey stage using MiniMax LLM API (one request per video)
Saves output: one .txt file per video
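
The audio-extraction step can be sketched as a thin wrapper around ffmpeg. This is a minimal illustration, not the project's actual audio_extractor.py: the function names `build_ffmpeg_command` and `extract_audio` are hypothetical, and the flags are the standard ffmpeg options for dropping video (`-vn`), resampling to 16 kHz (`-ar 16000`), and downmixing to mono (`-ac 1`).

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path: Path, wav_path: Path) -> list[str]:
    """Build an ffmpeg command that writes a 16 kHz mono 16-bit PCM WAV file."""
    return [
        "ffmpeg",
        "-y",                    # overwrite the output file if it already exists
        "-i", str(video_path),   # input video
        "-vn",                   # drop the video stream
        "-acodec", "pcm_s16le",  # 16-bit PCM samples
        "-ar", "16000",          # 16 kHz sample rate
        "-ac", "1",              # mono
        str(wav_path),
    ]


def extract_audio(video_path: Path, wav_path: Path) -> Path:
    """Run ffmpeg; raises subprocess.CalledProcessError on a non-zero exit code."""
    subprocess.run(
        build_ffmpeg_command(video_path, wav_path),
        check=True,
        capture_output=True,
    )
    return wav_path
```

Keeping the command construction separate from the subprocess call makes the flags easy to unit-test without having ffmpeg installed.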

User journey stages:

Code	Stage	Includes
A	Search & Selection	Search bar, catalog, filters, sorting, listing
B	Product Exploration	Product card view
C	Checkout	Pickup point, date/time, payment method

Requirements

System

Python 3.10+
ffmpeg installed and available in PATH

# macOS
brew install ffmpeg

# Ubuntu / Debian
sudo apt install ffmpeg

# Verify
ffmpeg -version

Whisper CLI

The openai-whisper package registers a whisper console command on install. The application uses the Python API (import whisper), but the CLI is handy for manually checking transcription of individual files:

# Manually transcribe a file (for debugging)
whisper audio.wav --model base --language Russian

# Verify installation
whisper --help

The whisper command appears automatically after pip install openai-whisper — no separate installation needed.

Optional

CUDA-compatible GPU — speeds up Whisper; without GPU it runs on CPU (slower)

Installation

# 1. Clone the repository
git clone <repo_url>
cd uxvt

# 2. Create a virtual environment
python -m venv .venv
source .venv/bin/activate       # Linux / macOS
# .venv\Scripts\activate        # Windows

# 3. Install dependencies
pip install -e ".[dev]"

# 4. Verify tools
ffmpeg -version
whisper --help

Configuration

1. Create `.env` file

cp .env.example .env

Open .env and fill in:

# Required
MINIMAX_API_KEY=your_minimax_api_key_here

# Optional (defaults are fine for most cases)
MINIMAX_API_URL=https://api.minimax.chat/v1/text/chatcompletion_v2
MINIMAX_MODEL=MiniMax-Text-01
MINIMAX_MAX_RETRIES=5
MINIMAX_RETRY_BASE_DELAY_SEC=1.0
MINIMAX_RETRY_MAX_DELAY_SEC=60.0
MINIMAX_MAX_PROMPT_CHARS=8000
MINIMAX_BATCH_SIZE=50
WHISPER_MODEL=base
OUTPUT_DIR=./output
TMP_DIR=./tmp
LOG_LEVEL=INFO
VERBOSE=true

⚠️ .env is listed in .gitignore and will never be committed.

Parameters

Variable	Description	Default
`MINIMAX_API_KEY`	MiniMax API key — required	—
`MINIMAX_MAX_RETRIES`	Max retry attempts for MiniMax API	`5`
`MINIMAX_RETRY_BASE_DELAY_SEC`	Initial retry delay in seconds (Fibonacci backoff)	`1.0`
`MINIMAX_RETRY_MAX_DELAY_SEC`	Maximum retry delay in seconds	`60.0`
`MINIMAX_MAX_PROMPT_CHARS`	Prompt character limit (batching kicks in if exceeded)	`8000`
`WHISPER_MODEL`	Whisper model: `tiny` (faster) / `base` (more accurate)	`base`
`OUTPUT_DIR`	Output folder for `.txt` report files	`./output`
`TMP_DIR`	Temp folder for intermediate files (auto-deleted)	`./tmp`
`LOG_LEVEL`	Structured log level	`INFO`
`VERBOSE`	Show progress output in console	`true`
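
To illustrate how a prompt character limit can trigger batching, here is a rough sketch. It assumes the limit is applied to the summed length of the quote texts and that `MINIMAX_BATCH_SIZE` caps the number of quotes per request; both are assumptions, and the helper name `split_into_batches` is hypothetical rather than taken from classifier.py.

```python
def split_into_batches(
    quotes: list[str],
    max_chars: int = 8000,
    batch_size: int = 50,
) -> list[list[str]]:
    """Greedily pack quotes into batches that respect both limits."""
    batches: list[list[str]] = []
    current: list[str] = []
    current_chars = 0
    for quote in quotes:
        too_long = current_chars + len(quote) > max_chars
        too_many = len(current) >= batch_size
        # Start a new batch when adding this quote would break a limit.
        if current and (too_long or too_many):
            batches.append(current)
            current, current_chars = [], 0
        current.append(quote)
        current_chars += len(quote)
    if current:
        batches.append(current)
    return batches
```

A single quote longer than `max_chars` still ends up in a batch of its own in this sketch; a real implementation would need to decide whether to truncate or reject it.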

Usage

Input file format

Create a plain text file (e.g. links.txt) — one entry per line. Each entry is either a cloud.mail.ru URL or a local path to a video file:

# Lines starting with # are comments and are ignored

# cloud.mail.ru URLs
https://cloud.mail.ru/public/1DKe/UZ3SBsysM
https://cloud.mail.ru/public/Q4zk/Q1p7QNij1

# Local files
/home/user/videos/5_galina_zoozavr.MP4
./videos/session_03.mp4

# Empty lines are also ignored
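
The parsing rules above (skip comments and empty lines, then treat each entry as either a URL or a local path) could be implemented roughly as follows. The `Source` dataclass and `parse_sources` function are hypothetical names for illustration, and detecting URLs by their scheme prefix is an assumption about how the tool tells the two kinds apart.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    raw: str       # the entry exactly as written in the file (trimmed)
    is_url: bool   # True for http(s) links, False for local paths


def parse_sources(text: str) -> list[Source]:
    """Parse the contents of a links file into a list of sources."""
    sources: list[Source] = []
    for line in text.splitlines():
        entry = line.strip()
        # Skip blank lines and comment lines.
        if not entry or entry.startswith("#"):
            continue
        is_url = entry.startswith(("http://", "https://"))
        sources.append(Source(raw=entry, is_url=is_url))
    return sources
```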

Basic run

python -m uxvt links.txt

Custom output directory

python -m uxvt links.txt --output-dir ./reports

Faster Whisper model (less accurate, quicker)

WHISPER_MODEL=tiny python -m uxvt links.txt

Silent mode (no progress output)

VERBOSE=false python -m uxvt links.txt

Console output example

[1/2] 🔗 https://cloud.mail.ru/public/1DKe/UZ3SBsysM
[1/2] ⬇  Downloading... (562 MB)
[1/2] ✓  Downloaded → tmp/5-galina-wildberries.mp4
[1/2] 🎵 Extracting audio (ffmpeg)...
[1/2] ✓  Audio ready → tmp/5-galina-wildberries.wav
[1/2] 🔊 Voice detection (Silero VAD)...
[1/2] ✓  Voice segments found: 12
[1/2] 📝 Transcription (Whisper base)...
[1/2] ✓  Quotes recognized: 8
[1/2] 🤖 Classification by UJ stage (MiniMax)...
[1/2] ✓  Classification done (A:3, B:2, C:3)
[1/2] 💾 Saved → output/report_5-galina-wildberries.txt
[1/2] 🗑  Temp files deleted

[2/2] 📁 /home/user/videos/5_galina_zoozavr.MP4  ← local file
[2/2] 🎵 Extracting audio (ffmpeg)...
...

════════════════════════════════════════
✅ Processed: 2  ❌ Errors: 0  ⏭  Skipped: 0
Results saved to: ./output/
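
The file names in the example (tmp/5-galina-wildberries.mp4, output/report_5-galina-wildberries.txt) suggest that a slug is derived from the video title. The exact rule is not documented here, so the following is only one plausible sketch: `slugify` and `report_path` are hypothetical helpers that lowercase the title and collapse every run of non-alphanumeric ASCII characters into a single hyphen.

```python
import re
from pathlib import Path


def slugify(title: str) -> str:
    """Lowercase the title and replace runs of other characters with '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def report_path(title: str, output_dir: Path = Path("output")) -> Path:
    """Build the report file path for a given video title."""
    return output_dir / f"report_{slugify(title)}.txt"
```

Note that this ASCII-only rule would reduce a title written entirely in Cyrillic to an empty slug, so a real implementation would need transliteration or a fallback name.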

Output file format

For each video, an output/report_{slug}.txt file is created:

Video: 5 Galina - Wildberries
Source: https://cloud.mail.ru/public/1DKe/UZ3SBsysM
Processed: 2026-03-15

============================================================
STAGE A: Search & Selection
============================================================
[00:01:12] "Looking for cat food, let me hit search"
[00:02:45] "The filters don't really work well"

============================================================
STAGE B: Product Exploration
============================================================
no comments for this stage

============================================================
STAGE C: Checkout
============================================================
[00:08:03] "I'll pick it up at the pickup point"

============================================================
TOTAL QUOTES: 3
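
The stage sections of this report can be produced with a few lines of formatting code. The sketch below is an illustration rather than the project's reporter.py: `format_timestamp` and `format_stage` are hypothetical names, quotes are assumed to arrive as (start time in seconds, text) pairs, and the separator width is approximated.

```python
STAGES = {
    "A": "Search & Selection",
    "B": "Product Exploration",
    "C": "Checkout",
}
RULE = "=" * 60  # separator line; width approximated from the sample report


def format_timestamp(seconds: float) -> str:
    """Render a start time in seconds as [HH:MM:SS]."""
    total = int(seconds)
    hours, rest = divmod(total, 3600)
    minutes, secs = divmod(rest, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


def format_stage(code: str, quotes: list[tuple[float, str]]) -> str:
    """Render one stage section, with a placeholder line if it has no quotes."""
    lines = [RULE, f"STAGE {code}: {STAGES[code]}", RULE]
    if quotes:
        lines += [f'{format_timestamp(t)} "{text}"' for t, text in quotes]
    else:
        lines.append("no comments for this stage")
    return "\n".join(lines)
```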

If no speech is detected in the video:

Video: ...
...

no voice comments detected

If the MiniMax API is unavailable — quotes are saved to the report without stage classification, with a warning.
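
This fallback behaviour can be pictured as a small wrapper around the classification call. The sketch below is an assumption about the control flow, not the project's pipeline code: `classify_or_fallback` is a hypothetical helper, the `classify` callable stands in for the MiniMax client, and `LLMClassificationError` is redefined locally as a stand-in for the project's domain exception.

```python
import logging
from typing import Callable

logger = logging.getLogger("uxvt.sketch")


class LLMClassificationError(Exception):
    """Local stand-in for the project's domain exception of the same name."""


def classify_or_fallback(
    quotes: list[str],
    classify: Callable[[list[str]], dict[str, list[str]]],
) -> tuple[dict[str, list[str]], bool]:
    """Return (quotes grouped by stage, classified flag)."""
    try:
        return classify(quotes), True
    except LLMClassificationError as exc:
        # Keep the transcription: save every quote under one unclassified group.
        logger.warning("Classification failed, saving unclassified quotes: %s", exc)
        return {"unclassified": list(quotes)}, False
```

Returning a flag alongside the data lets the report writer decide whether to print stage headers or a warning line.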

Testing

Run all tests

pytest -q

Unit tests only (fast, no real videos needed)

pytest tests/unit/ -v

Integration tests only (require files in `videos/`)

pytest tests/integration/ -v

Specific module

pytest tests/unit/test_classifier.py -v
pytest tests/unit/test_downloader.py -v -k "slug"

Full check before commit

ruff format . && ruff check . && mypy . && pytest -q

Test videos

Integration tests use real video files from the videos/ folder:

videos/
├── 5_galina_wildberries.MP4
└── 5_galina_zoozavr.MP4

The videos/ folder is not committed (.gitignore). Place files there manually before running integration tests.

If a file is missing — the test is automatically skipped (pytest.mark.skip) and the rest of the suite continues.

Unit tests do not require real videos — all external dependencies (Whisper, Silero VAD, httpx, MiniMax API) are mocked.

Project structure

uxvt/
├── src/
│   └── uxvt/
│       ├── __main__.py        # entry point: python -m uxvt
│       ├── cli.py             # CLI argument parsing
│       ├── settings.py        # configuration via pydantic-settings
│       ├── pipeline.py        # processing orchestrator
│       ├── downloader.py      # cloud.mail.ru API downloader
│       ├── audio_extractor.py # ffmpeg wrapper
│       ├── vad.py             # Silero VAD wrapper
│       ├── transcriber.py     # Whisper wrapper
│       ├── classifier.py      # MiniMax API + Fibonacci retry
│       ├── reporter.py        # report_*.txt formatting and writing
│       ├── tmp_manager.py     # temp file cleanup
│       ├── progress.py        # verbose console progress output
│       ├── models.py          # dataclasses: VideoSourceResult, ClassifiedReport, etc.
│       └── exceptions.py      # domain exceptions
├── tests/
│   ├── conftest.py
│   ├── unit/
│   │   ├── test_downloader.py
│   │   ├── test_audio_extractor.py
│   │   ├── test_vad.py
│   │   ├── test_transcriber.py
│   │   ├── test_classifier.py
│   │   ├── test_reporter.py
│   │   ├── test_tmp_manager.py
│   │   ├── test_progress.py
│   │   └── test_models.py
│   └── integration/
│       ├── conftest.py        # video_file_path fixture → videos/
│       └── test_pipeline.py
├── videos/                    # test videos (not committed)
├── output/                    # results (not committed)
├── tmp/                       # temp files (not committed)
├── pyproject.toml
├── .env.example
├── .gitignore
└── README.md

Troubleshooting

ffmpeg: command not found → Install ffmpeg and make sure it's in PATH (see Requirements).

ValidationError: MINIMAX_API_KEY missing → Create .env from .env.example and set your API key.

Test skipped: video file not found in videos/ → Place video files in the videos/ folder (not committed to the repository).

Whisper is slow → Switch to WHISPER_MODEL=tiny or use a machine with a GPU. To manually check transcription of a single file, run: whisper audio.wav --model tiny --language Russian

whisper: command not found → Install the package: pip install openai-whisper. The whisper command is registered automatically alongside the Python library.

MiniMax API: LLMClassificationError after all retries → Quotes will be saved to the report without stage classification. Check your API key and service availability.
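
For context on what "after all retries" means, a Fibonacci backoff driven by the three retry settings might look like the sketch below. It is an illustration under stated assumptions, not the code in classifier.py: `fibonacci_delays` and `call_with_retries` are hypothetical names, the delay is assumed to be the Fibonacci number multiplied by the base delay and capped at the maximum, and `max_retries` is assumed to count retries after the initial attempt.

```python
import time
from typing import Callable, Iterator, TypeVar

T = TypeVar("T")


def fibonacci_delays(base: float = 1.0, max_delay: float = 60.0) -> Iterator[float]:
    """Yield base * 1, 1, 2, 3, 5, 8, ... seconds, capped at max_delay."""
    a, b = 1, 1
    while True:
        yield min(a * base, max_delay)
        a, b = b, a + b


def call_with_retries(
    func: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying on failure with Fibonacci-spaced pauses."""
    delays = fibonacci_delays(base, max_delay)
    for attempt in range(max_retries + 1):
        try:
            return func()
        except Exception:
            # A real client would catch only transient errors (timeouts, 5xx).
            if attempt == max_retries:
                raise
            sleep(next(delays))
    raise RuntimeError("unreachable")  # satisfies type checkers
```

With the defaults shown in the Parameters table, the pauses in this sketch would be 1, 1, 2, 3 and 5 seconds before the final failure is raised.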
