Audio/Video Processing Tool

Quick Start

Complete Processing

For most use cases, the --complete flag provides a comprehensive processing pipeline that merges timestamped videos, fills gaps, analyzes speakers, and generates a timeline.

# Process a directory of timestamped videos with the complete pipeline
uv run main.py process /path/to/videos --complete --output final_video.mp4

This command performs the following actions:

Merges all timestamped videos in chronological order.
Fills gaps between videos with a blank video (using the default blank_muted.MP4).
Analyzes speakers to detect conversations.
Generates a timeline (.yaml) with detected speech segments.
Creates a processed video with conversations muted.

Note: Video merging is automatic for directories containing timestamped videos. The tool uses blank_muted.MP4 as the default blank video file for gap filling.
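
To picture the gap-filling step, here is a minimal sketch of how gaps between clips could be located. It is not the tool's actual implementation: `find_gaps` is a hypothetical helper, and it assumes each clip's start time comes from its filename timestamp and that its duration is already known (for example, from ffprobe).

```python
from datetime import datetime, timedelta


def find_gaps(clips):
    """Return (gap_start, gap_end) pairs between consecutive clips.

    `clips` is a list of (start: datetime, duration_seconds: float) tuples.
    """
    ordered = sorted(clips, key=lambda clip: clip[0])
    gaps = []
    covered_until = None  # latest end time seen so far
    for start, duration in ordered:
        if covered_until is not None and start > covered_until:
            gaps.append((covered_until, start))
        end = start + timedelta(seconds=duration)
        covered_until = end if covered_until is None else max(covered_until, end)
    return gaps


clips = [
    (datetime(2023, 10, 26, 18, 30, 0), 600),  # starts 18:30:00, runs 10 minutes
    (datetime(2023, 10, 26, 18, 45, 0), 300),  # starts 18:45:00, runs 5 minutes
]
for gap_start, gap_end in find_gaps(clips):
    seconds = (gap_end - gap_start).total_seconds()
    print(f"gap from {gap_start.time()} to {gap_end.time()} ({seconds:.0f} s)")
```

Each reported gap is a span that would need blank footage of that length.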

Installation

Prerequisites

FFmpeg: Required for all media processing tasks.
Hugging Face Account: An access token is needed to download the speaker detection model.

GPU and CUDA

For GPU acceleration, you need a compatible NVIDIA GPU with the appropriate CUDA Toolkit installed. It is critical that your PyTorch version matches your CUDA version.

Check your CUDA version:
```
nvcc --version
```
Install the correct PyTorch version: Visit the PyTorch website to find the correct installation command for your specific CUDA version. For example, for CUDA 12.8, the command is:
```
uv add torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
```
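
After installing, you may want to confirm that the PyTorch build can actually see the GPU. The following is a small sketch for that purpose; `cuda_report` is a hypothetical helper name and is not part of this tool.

```python
import importlib.util


def cuda_report():
    """Summarize the installed PyTorch build and whether it can see a GPU."""
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}
    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
    }


print(cuda_report())
```

If `built_for_cuda` is `None` or `cuda_available` is `False`, the installed wheel is probably a CPU-only build or does not match your driver and toolkit.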

Setup

Clone the repository and install dependencies:

git clone <repository-url>
cd audio-processing
uv sync

Requirements:

FFmpeg
NVIDIA GPU with CUDA (optional, for acceleration)
Hugging Face token

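Store your Hugging Face access token in a .env file in the project root (note that > overwrites any existing .env file):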

echo "HUGGINGFACE_ACCESS_TOKEN=your_token_here" > .env
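
To check that the token is readable before a long run, a sketch like the one below can help. It is an assumption about how such a file might be read, not the tool's own loading code: `read_env_file` and `get_token` are hypothetical helpers, and only simple KEY=VALUE lines are handled.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines, ignoring blank lines and # comments."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_token(path=".env"):
    # Prefer a real environment variable, then fall back to the .env file.
    name = "HUGGINGFACE_ACCESS_TOKEN"
    return os.environ.get(name) or read_env_file(path).get(name)


print("token found" if get_token() else "token missing")
```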

Usage

Process videos with complete pipeline

uv run main.py process /path/to/videos --complete --output final_video.mp4

Generate timeline for editing

uv run main.py process video.mp4 --generate-timeline

Apply timeline edits

Edit the generated timeline (*_timeline.yaml). To mark a segment for removal, set its type to all. Manual edits can go anywhere in the file, including at the top; the tool applies them regardless of position.
Apply the edits with the edit command, which takes a single timeline file:

uv run main.py edit -i video.mp4 -t video_timeline.yaml -o video_edited.mp4

What this does:

Extracts only the type: all ranges from your edited timeline
Blanks those ranges while stream‑copying the rest to keep processing fast
Merges overlapping ranges so that manual type: all edits take precedence
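
The extract-and-merge behaviour described above can be sketched as a standard interval merge. This is illustrative only: the segment layout (`start` and `end` in seconds plus a `type` key) is an assumed shape for the timeline entries, and `ranges_to_blank` is a hypothetical helper rather than the tool's code.

```python
def ranges_to_blank(segments):
    """Collect `type: all` segments and merge any that overlap or touch."""
    selected = sorted(
        (seg["start"], seg["end"]) for seg in segments if seg.get("type") == "all"
    )
    merged = []
    for start, end in selected:
        if merged and start <= merged[-1][1]:
            # Overlaps (or touches) the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


segments = [
    {"start": 12.0, "end": 20.0, "type": "all"},
    {"start": 18.5, "end": 25.0, "type": "all"},
    {"start": 40.0, "end": 45.0, "type": "speech"},  # hypothetical non-"all" type
    {"start": 60.0, "end": 61.0, "type": "all"},
]
print(ranges_to_blank(segments))  # [(12.0, 25.0), (60.0, 61.0)]
```

Merging first means each stretch of video is blanked once, and everything outside the merged ranges can be passed through untouched.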

Merge timestamped videos only

uv run main.py process /path/to/videos --merge-only --output merged.mp4

Compress videos

uv run main.py compress video.mp4 -o compressed.mp4

Rename files from CSV

Rename files based on CSV data and set modified timestamp metadata:

uv run python rename_from_csv.py filenames.csv --dry-run
uv run python rename_from_csv.py filenames.csv  # Apply changes

CSV format: a timestamp,filepath header row, followed by one row per file with a YYYYMMDDHHMMSS timestamp:

timestamp,filepath
20240811143022,data/video1.mp4
20240811144530,data/video2.mp4

This will:

Rename files to YYYYMMDD_compressed format (e.g., 20240811_compressed.mp4)
Set file modified timestamp to match the original timestamp
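
As a rough illustration of the two steps above, the sketch below builds a rename plan from CSV text and applies it only when asked. It is not the code in rename_from_csv.py: `plan_renames` and `apply_plan` are hypothetical names, timestamps are assumed to be local time, and name collisions are reported but not resolved.

```python
import csv
import io
import os
from datetime import datetime
from pathlib import Path


def plan_renames(csv_text):
    """Return a list of (source, target, mtime) tuples, one per CSV row."""
    plan = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        stamp = datetime.strptime(row["timestamp"], "%Y%m%d%H%M%S")
        source = Path(row["filepath"])
        target = source.with_name(f"{stamp:%Y%m%d}_compressed{source.suffix}")
        plan.append((source, target, stamp.timestamp()))
    return plan


def apply_plan(plan, dry_run=True):
    for source, target, mtime in plan:
        print(f"{source} -> {target}")
        if not dry_run:
            source.rename(target)
            os.utime(target, (mtime, mtime))  # (access time, modified time)


sample = """timestamp,filepath
20240811143022,data/video1.mp4
20240811144530,data/video2.mp4
"""
plan = plan_renames(sample)
apply_plan(plan, dry_run=True)

targets = [target for _, target, _ in plan]
duplicates = {str(t) for t in targets if targets.count(t) > 1}
print("colliding targets:", duplicates)
```

In this sketch the two sample rows share a date, so both map to the same target name. How rename_from_csv.py handles that case is not covered here, so run it with --dry-run first.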

File Naming

Input Files

For automatic video merging, name input files using the timestamp format YYYYMMDDHHMMSS_*.ext, where the timestamp is the clip's start time.

Example: 20231026183000_camera1.mp4
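
To check whether a set of filenames follows this pattern, a sketch like the following can be used. The regular expression and `parse_start_time` are assumptions based on the format stated above, not the tool's own parser.

```python
import re
from datetime import datetime
from pathlib import Path

# 14 digits, an underscore, any remaining name, then an extension.
TIMESTAMP_NAME = re.compile(r"^(\d{14})_.*\.[^.]+$")


def parse_start_time(filename):
    """Return the datetime encoded in the filename, or None if it does not match."""
    match = TIMESTAMP_NAME.match(Path(filename).name)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y%m%d%H%M%S")
    except ValueError:  # 14 digits that are not a valid date and time
        return None


names = ["20231026183000_camera1.mp4", "notes.txt", "20231026120000_camera2.mp4"]
timestamped = sorted(
    (name for name in names if parse_start_time(name) is not None),
    key=parse_start_time,
)
print(timestamped)  # ['20231026120000_camera2.mp4', '20231026183000_camera1.mp4']
```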

Output Files

The tool generates output files with the following naming conventions:

Merged videos: YYYYMMDD_merged.mp4 (based on first clip's date)
Processed videos: YYYYMMDD_processed.mp4 (based on original timestamp)
Edited videos: Defaults to <original_stem>_edited<ext> when using the edit command
Compressed videos: YYYYMMDD_compressed.mp4 (based on original timestamp)

Examples:

Input: 20240811143022_video.mp4 → Processed: 20240811_processed.mp4
Input: 20240811143022_video.mp4 → Compressed: 20240811_compressed.mp4
