I bought a video course in German — and then realized I don't speak German.
I needed subtitles for 146 videos averaging 30 minutes each. The obvious solution was the OpenAI Whisper API, but at $0.006/min that came out to about $33. Not terrible, but it also meant sending the content to an external API whose content restrictions could lead it to refuse certain material.
So I built this instead: spin up an AWS EC2 GPU instance, transcribe everything locally with faster-whisper, translate via Google Translate, and shut it all down automatically. Total cost: ~$1.60 for all 146 videos, no content restrictions, and the transcription quality with distil-large-v3 is excellent.
Transcribes videos to subtitles using AWS EC2 GPU (faster-whisper) and translates to any language via Google Translate. ~$1-3 for 146 videos vs $33 with OpenAI API.
- Extracts audio from videos locally (ffmpeg)
- Uploads audio files to S3
- Launches an EC2 GPU instance (g4dn.xlarge or g6.xlarge)
- Transcribes with faster-whisper on CUDA
- Downloads VTT files, translates via Google Translate
- Saves `<video>.pt.vtt` alongside each video
- Terminates the instance and cleans up S3
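
To make the subtitle-writing step concrete, here is a minimal sketch of turning timed transcription segments into WebVTT text. It is not the script's actual code: `format_timestamp` and `segments_to_vtt` are hypothetical names, and the segments are plain `(start, end, text)` tuples. faster-whisper yields segment objects with `start`, `end`, and `text` attributes, so they would need to be unpacked into this shape first.

```python
from typing import Iterable, Tuple

# Assumed input shape: (start seconds, end seconds, text)
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def segments_to_vtt(segments: Iterable[Segment]) -> str:
    """Build the contents of a .vtt file from timed text segments."""
    lines = ["WEBVTT", ""]  # required header, then a blank line
    for start, end, text in segments:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(text.strip())
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)
```

Keeping the cue timings separate from the text in this way means a translation step only has to replace the text lines and can leave the timestamps untouched.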
- AWS account with EC2 GPU quota ("Running On-Demand G and VT instances" ≥ 4 vCPUs)
- AWS credentials configured (`aws configure` or environment variables)

With Docker:

    cp .env.example .env
    # edit .env with your credentials
    docker build -t aws-whisper .
    docker run --rm \
      --env-file .env \
      -v /path/to/videos:/videos \
      aws-whisper /videos --no-spot

Python dependencies

    pip install -r requirements.txt

ffmpeg

Ubuntu/Debian:

    sudo apt install ffmpeg

macOS:

    brew install ffmpeg

Windows:

    winget install ffmpeg

AWS CLI

Ubuntu/Debian:

    curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o awscliv2.zip
    unzip awscliv2.zip && sudo ./aws/install
    rm -rf awscliv2.zip aws/

macOS:

    brew install awscli

Windows:

    winget install Amazon.AWSCLI

Configure credentials

Via `.env` file (recommended):

    cp .env.example .env
    # edit .env with your credentials
    set -a; source .env; set +a
    python3 aws_transcribe.py /path/to/videos --no-spot

`set -a` is required so the variables are exported to the environment (and therefore visible to the Python subprocess). A plain `source .env` only sets shell variables, which `python3` won't see.
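
One way to confirm the variables really reached Python is to check them from inside the interpreter. The sketch below is illustrative only: `missing_aws_vars` and `REQUIRED_VARS` are hypothetical names, and the check applies only to the `.env` route (credentials set through `aws configure` live in a file, not in the environment).

```python
from typing import List, Mapping

# The three variables this README sets in .env or exports directly.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_DEFAULT_REGION")


def missing_aws_vars(env: Mapping[str, str]) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_aws_vars(os.environ)` after a plain `source .env` would typically list all three names, while after `set -a; source .env; set +a` it should return an empty list.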

Or via `aws configure`:

    aws configure

Or via environment variables directly:

    export AWS_ACCESS_KEY_ID=...
    export AWS_SECRET_ACCESS_KEY=...
    export AWS_DEFAULT_REGION=us-east-1

Usage:

    # Basic (on-demand, default model distil-large-v3, translate to pt-BR)
    python3 aws_transcribe.py /path/to/videos --no-spot
    # With SSH access to monitor the instance
    python3 aws_transcribe.py /path/to/videos --no-spot --ssh
    # Custom model and language
    python3 aws_transcribe.py /path/to/videos --model large-v3 --lang es --no-spot
    # Faster instance (L4 GPU, ~3x faster than T4)
    python3 aws_transcribe.py /path/to/videos --instance g6.xlarge --no-spot
    # Secure: use IAM instance profile instead of embedding credentials in user-data
    python3 aws_transcribe.py /path/to/videos --instance-profile MyWhisperRole --no-spot
    # Resume interrupted job (reuses existing S3 bucket and skips already-transcribed files)
    python3 aws_transcribe.py /path/to/videos --bucket whisper-<job-id> --no-spot

| Option | Default | Description |
|---|---|---|
| `--model` | `distil-large-v3` | Whisper model. Options: `tiny`, `base`, `small`, `medium`, `large-v3`, `distil-large-v3` |
| `--lang` | `pt` | Translation language code (e.g. `es`, `fr`, `de`, `ja`) |
| `--instance` | `g4dn.xlarge` | EC2 instance type. Recommended: `g6.xlarge` (L4, ~3x faster) |
| `--region` | `us-east-1` | AWS region |
| `--no-spot` | off | Use on-demand instead of spot instances |
| `--bucket` | auto | Reuse existing S3 bucket (for resuming interrupted jobs) |
| `--timeout` | `180` | Minutes to wait for transcription to complete |
| `--ssh` | off | Create SSH key pair and security group to monitor the instance |
| `--instance-profile` | none | IAM instance profile name (avoids embedding credentials in user-data) |

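
For readers who want to see how these options fit together, the following is a sketch of an `argparse` parser that mirrors the table. It is an assumption about the shape of the command line, not the actual parser in `aws_transcribe.py`; `build_parser` and the positional name `videos` are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose flags and defaults follow the options table."""
    parser = argparse.ArgumentParser(prog="aws_transcribe.py")
    parser.add_argument("videos", help="directory containing the videos")
    parser.add_argument(
        "--model",
        default="distil-large-v3",
        choices=["tiny", "base", "small", "medium", "large-v3", "distil-large-v3"],
        help="Whisper model",
    )
    parser.add_argument("--lang", default="pt", help="translation language code")
    parser.add_argument("--instance", default="g4dn.xlarge", help="EC2 instance type")
    parser.add_argument("--region", default="us-east-1", help="AWS region")
    parser.add_argument("--no-spot", action="store_true",
                        help="use on-demand instead of spot instances")
    parser.add_argument("--bucket", default=None,
                        help="existing S3 bucket to reuse when resuming")
    parser.add_argument("--timeout", type=int, default=180,
                        help="minutes to wait for transcription")
    parser.add_argument("--ssh", action="store_true",
                        help="create SSH key pair and security group")
    parser.add_argument("--instance-profile", default=None,
                        help="IAM instance profile name")
    return parser
```

With this sketch, `build_parser().parse_args(["/path/to/videos", "--no-spot"])` yields the defaults from the table with `no_spot` set to `True`.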
| Instance | GPU | Speed | Cost/hr | ~146 videos (30 min avg) |
|---|---|---|---|---|
| g4dn.xlarge | T4 | ~6 min/video | $0.526 | ~$7 |
| g6.xlarge | L4 | ~40 sec/video | $0.805 | ~$1.60 |
S3 costs are negligible (<$0.10).
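
To estimate the cost for a different batch size, the table's arithmetic can be sketched as below. `estimate_gpu_cost` is a hypothetical helper, and it counts transcription time only: instance boot, model download, and uploads are not included.

```python
def estimate_gpu_cost(num_videos: int, seconds_per_video: float,
                      hourly_rate_usd: float) -> float:
    """Estimate on-demand GPU cost in USD for transcription time alone."""
    hours = num_videos * seconds_per_video / 3600
    return hours * hourly_rate_usd


# Figures taken from the table above (per-video times are approximate).
g6_cost = estimate_gpu_cost(146, 40, 0.805)       # g6.xlarge, ~40 sec/video
g4dn_cost = estimate_gpu_cost(146, 6 * 60, 0.526)  # g4dn.xlarge, ~6 min/video
```

Because the per-video times are approximate, expect the results to land near the table's totals rather than exactly on them.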
Without --instance-profile, AWS credentials are embedded in the EC2 user-data script
(readable via the instance metadata endpoint). For production or shared environments,
create an IAM role with S3 access and pass --instance-profile <name>.
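
As an illustration of what passing an instance profile changes at launch time, here is a sketch that builds the keyword arguments for boto3's EC2 `run_instances` call. Whether `aws_transcribe.py` uses boto3 this way is an assumption; `build_launch_params` is a hypothetical helper, and the AMI ID and user-data string are left for the caller to supply.

```python
from typing import Any, Dict, Optional


def build_launch_params(ami_id: str, instance_type: str, user_data: str,
                        instance_profile: Optional[str] = None,
                        use_spot: bool = True) -> Dict[str, Any]:
    """Build kwargs for boto3's ec2.run_instances (sketch, not the script's code)."""
    params: Dict[str, Any] = {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "UserData": user_data,
        # Shutting down from inside the instance terminates it.
        "InstanceInitiatedShutdownBehavior": "terminate",
    }
    if instance_profile:
        # The instance gets temporary credentials from the role,
        # so none need to be written into user_data.
        params["IamInstanceProfile"] = {"Name": instance_profile}
    if use_spot:
        params["InstanceMarketOptions"] = {"MarketType": "spot"}
    return params
```

The resulting dictionary could then be passed as `boto3.client("ec2", region_name="us-east-1").run_instances(**params)`. With a profile attached, the user-data script can rely on the role's temporary credentials instead of carrying access keys.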