For most use cases, the --complete flag provides a comprehensive processing pipeline that merges timestamped videos, fills gaps, analyzes speakers, and generates a timeline.
# Process a directory of timestamped videos with the complete pipeline
uv run main.py process /path/to/videos --complete --output final_video.mp4

This command performs the following actions:
- Merges all timestamped videos in chronological order.
- Fills gaps between videos with a blank video (using the default blank_muted.MP4).
- Analyzes speakers to detect conversations.
- Generates a timeline (.yaml) with detected speech segments.
- Creates a processed video with conversations muted.
Note: Video merging is automatic for directories containing timestamped videos. The tool uses blank_muted.MP4 as the default blank video file for gap filling.
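The gap-filling step can be pictured with a minimal sketch. It assumes each clip's start time comes from its filename and its duration is already known (for example from ffprobe). The `Clip` type and `find_gaps` function are hypothetical illustrations, not the tool's own code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Clip:
    """Hypothetical record: when a clip starts and how long it runs."""
    start: datetime
    duration: timedelta

    @property
    def end(self) -> datetime:
        return self.start + self.duration


def find_gaps(clips):
    """Return (gap_start, gap_length) for every uncovered span between clips."""
    gaps = []
    ordered = sorted(clips, key=lambda c: c.start)
    if not ordered:
        return gaps
    covered_until = ordered[0].end
    for clip in ordered[1:]:
        if clip.start > covered_until:
            # Nothing was recording here, so blank video would be needed.
            gaps.append((covered_until, clip.start - covered_until))
        covered_until = max(covered_until, clip.end)
    return gaps


clips = [
    Clip(datetime(2023, 10, 26, 18, 30, 0), timedelta(minutes=10)),
    Clip(datetime(2023, 10, 26, 18, 45, 0), timedelta(minutes=5)),
]
gaps = find_gaps(clips)
for gap_start, gap_length in gaps:
    print(f"gap at {gap_start:%H:%M:%S} lasting {gap_length}")
```

Each reported gap is a span that would be filled with the blank video before the clips are concatenated.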
- FFmpeg: Required for all media processing tasks.
- Hugging Face Account: An access token is needed to download the speaker detection model.
For GPU acceleration, you need a compatible NVIDIA GPU with the appropriate CUDA Toolkit installed. It is critical that your PyTorch version matches your CUDA version.
- Check your CUDA version:
nvcc --version
- Install the correct PyTorch version:
Visit the PyTorch website to find the correct installation command for your specific CUDA version. For example, for CUDA 12.8, the command is:
uv add torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
- Clone the repository and install dependencies:
git clone <repository-url>
cd audio-processing
uv sync
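After installing, it can help to confirm which CUDA version the installed PyTorch build targets. The snippet below is a small sketch using standard PyTorch attributes (`torch.version.cuda`, `torch.cuda.is_available()`). The helper name `describe_torch_cuda` is made up for illustration, and the check is skipped if PyTorch is not installed.

```python
import importlib.util
from typing import Optional


def describe_torch_cuda() -> Optional[dict]:
    """Report the CUDA version PyTorch was built for, or None if torch is missing."""
    if importlib.util.find_spec("torch") is None:
        return None
    import torch

    return {
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None on CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


info = describe_torch_cuda()
print(info if info is not None else "PyTorch is not installed in this environment")
```

If `built_for_cuda` is `None` or `cuda_available` is `False` on a machine with an NVIDIA GPU, the installed wheel is probably a CPU-only build or targets a different CUDA version.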
Requirements:
- FFmpeg
- NVIDIA GPU with CUDA (optional, for acceleration)
- Hugging Face token
Setup:
echo "HUGGINGFACE_ACCESS_TOKEN=your_token_here" > .env

Run the complete pipeline:
uv run main.py process /path/to/videos --complete --output final_video.mp4

To review and edit the detected segments by hand:
- Generate a timeline:
uv run main.py process video.mp4 --generate-timeline
- Edit the generated timeline (*_timeline.yaml). Mark segments to remove by setting type: all. You may place your manual edits at the front of the file if you prefer; the tool applies them regardless of position.
- Apply the edits with the single-timeline edit command:
uv run main.py edit -i video.mp4 -t video_timeline.yaml -o video_edited.mp4

What this does:
- Extracts only the type: all ranges from your edited timeline
- Blanks those ranges while stream-copying the rest to keep processing fast
- Merges overlaps so manual all edits take precedence
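The overlap-merging step above can be sketched as a standard interval merge. The timeline file's schema is not shown here, so this example assumes the type: all ranges have already been read into plain (start, end) tuples in seconds. `merge_ranges` is a hypothetical helper, not the tool's implementation.

```python
def merge_ranges(ranges):
    """Collapse overlapping or touching (start, end) ranges into disjoint ones."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Assumed input: ranges marked type: all, in seconds, in any order.
all_ranges = [(30.0, 45.0), (10.0, 20.0), (40.0, 60.0)]
blank_ranges = merge_ranges(all_ranges)
print(blank_ranges)  # [(10.0, 20.0), (30.0, 60.0)]
```

Sorting first is what lets manual edits sit anywhere in the file: their position does not affect the merged result.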
Merge videos only:
uv run main.py process /path/to/videos --merge-only --output merged.mp4

Compress a video:
uv run main.py compress video.mp4 -o compressed.mp4

Rename files based on CSV data and set modified timestamp metadata:
uv run python rename_from_csv.py filenames.csv --dry-run
uv run python rename_from_csv.py filenames.csv # Apply changes

CSV format: timestamp,filepath
timestamp,filepath
20240811143022,data/video1.mp4
20240811144530,data/video2.mp4

This will:
- Rename files to YYYYMMDD_compressed format (e.g., 20240811_compressed.mp4)
- Set file modified timestamp to match the original timestamp
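The two actions above can be illustrated with a sketch. This is not the source of rename_from_csv.py. `plan_renames` and `apply_renames` are hypothetical names, and the timestamp is assumed to be local time.

```python
import csv
import io
import os
from datetime import datetime
from pathlib import Path


def plan_renames(csv_text):
    """Return (source, target, mtime) for each CSV row without touching any files."""
    plans = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        stamp = datetime.strptime(row["timestamp"], "%Y%m%d%H%M%S")
        source = Path(row["filepath"])
        target = source.with_name(f"{stamp:%Y%m%d}_compressed{source.suffix}")
        plans.append((source, target, stamp.timestamp()))
    return plans


def apply_renames(plans):
    """Set each file's modified time, then rename it (not called in this demo)."""
    for source, target, mtime in plans:
        os.utime(source, (mtime, mtime))  # (access time, modified time)
        source.rename(target)


sample = (
    "timestamp,filepath\n"
    "20240811143022,data/video1.mp4\n"
    "20240811144530,data/video2.mp4\n"
)
plans = plan_renames(sample)
for source, target, _ in plans:
    print(f"{source} -> {target}")  # dry run: only print the plan
```

In this sketch both sample rows share a date and therefore map to the same target name. How the real script handles that case is not stated here, so run it with --dry-run first.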
For automatic video merging, use timestamp format: YYYYMMDDHHMMSS_*.ext
Example: 20231026183000_camera1.mp4
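To show how a filename in this format maps to a point in time, here is a minimal sketch that parses the 14-digit prefix and sorts clips chronologically. The regular expression and the `clip_timestamp` helper are illustrative assumptions, not the tool's actual parser.

```python
import re
from datetime import datetime
from pathlib import Path
from typing import Optional

# Fourteen digits (YYYYMMDDHHMMSS) followed by an underscore.
TIMESTAMP_PREFIX = re.compile(r"^(\d{14})_")


def clip_timestamp(filename: str) -> Optional[datetime]:
    """Return the timestamp encoded in the filename, or None if it has none."""
    match = TIMESTAMP_PREFIX.match(Path(filename).name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    except ValueError:
        return None  # 14 digits that do not form a real date and time


names = ["20231026184500_camera2.mp4", "notes.txt", "20231026183000_camera1.mp4"]
ordered = sorted(
    (n for n in names if clip_timestamp(n) is not None),
    key=clip_timestamp,
)
print(ordered)
```

Files without a valid prefix, such as notes.txt here, are simply left out of the ordering.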
The tool generates output files with the following naming conventions:
- Merged videos: YYYYMMDD_merged.mp4 (based on first clip's date)
- Processed videos: YYYYMMDD_processed.mp4 (based on original timestamp)
- Edited videos: Defaults to <original_stem>_edited<ext> when using the edit command
- Compressed videos: YYYYMMDD_compressed.mp4 (based on original timestamp)
Examples:
- Input: 20240811143022_video.mp4 → Processed: 20240811_processed.mp4
- Input: 20240811143022_video.mp4 → Compressed: 20240811_compressed.mp4
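The naming conventions above can be summarized in a short sketch. Both helper names are hypothetical, and the code only mirrors the documented patterns rather than the tool's internals.

```python
from pathlib import Path


def dated_output_name(input_name: str, kind: str) -> str:
    """Build YYYYMMDD_<kind>.mp4 from a timestamped input filename."""
    date = Path(input_name).name[:8]
    if len(date) != 8 or not date.isdigit():
        raise ValueError(f"no YYYYMMDD prefix in {input_name!r}")
    return f"{date}_{kind}.mp4"


def edited_output_name(input_name: str) -> str:
    """Build <original_stem>_edited<ext>, the documented default for edit."""
    path = Path(input_name)
    return f"{path.stem}_edited{path.suffix}"


print(dated_output_name("20240811143022_video.mp4", "processed"))
print(dated_output_name("20240811143022_video.mp4", "compressed"))
print(edited_output_name("video.mp4"))
```

The dated names drop the time of day, so only the date part of the original timestamp survives in the output filename.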