ClipSE turns long videos into short, reviewable clips. Upload a video or add a video URL, transcribe it, ask an AI model to find promising short-form moments, review the suggestions in the browser, then render and download the clips you want to keep.
The default deployment is Docker Compose and includes the web app, worker, PostgreSQL, local S3-compatible storage, and a Whisper transcription service. You control the app, data, model choices, storage, and runtime.
- Local account sign-up with Better Auth.
- Video upload and URL intake through yt-dlp.
- Whisper transcription through faster-whisper or Hailo-10H.
- AI clip analysis through OpenAI, Gemini, OpenRouter, or Codex CLI.
- Browser review flow with transcript context, clip timing, and render controls.
- Vertical-short focus detection with local detectors or Hailo-10H vision/VLM backends.
- S3-compatible media storage using Garage by default.
- Published GHCR images plus local build overrides.
Run ClipSE with published images:
mkdir clipse
cd clipse
curl -fsSLO https://raw.githubusercontent.com/fixtse/ClipSE/main/docker-compose.yml
curl -fsSLO https://raw.githubusercontent.com/fixtse/ClipSE/main/docker-compose.cpu.yml
curl -fsSLO https://raw.githubusercontent.com/fixtse/ClipSE/main/docker-compose.intel.yml
curl -fsSLO https://raw.githubusercontent.com/fixtse/ClipSE/main/docker-compose.hailo.yml
curl -fsSLO https://raw.githubusercontent.com/fixtse/ClipSE/main/.env.example
mkdir -p services/garage
curl -fsSLo services/garage/garage.toml https://raw.githubusercontent.com/fixtse/ClipSE/main/services/garage/garage.toml
cp .env.example .env
mkdir -p models/whisper models/yolo models/hailo
docker compose up -d

Open http://localhost:3000, create a local account, then open the workspace settings to choose your AI provider, analysis model, transcription backend, and transcription model.
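Before starting the stack, it can help to confirm that the downloaded files are in place. The sketch below is a hypothetical helper (missing_files is not part of ClipSE) that reports any path from its argument list that does not exist as a regular file.

```shell
# Hypothetical pre-flight helper: print each missing file and
# return non-zero if at least one is absent.
missing_files() {
  status=0
  for f in "$@"; do
    if [ ! -f "$f" ]; then
      echo "missing: $f"
      status=1
    fi
  done
  return "$status"
}

# Example, run from the clipse directory created above:
# missing_files docker-compose.yml .env services/garage/garage.toml \
#   && docker compose up -d
```

The file list in the example mirrors the download commands above; adjust it if you also rely on one of the override files.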
Stop the stack:
docker compose down

Remove persistent database and object-storage data:
docker compose down -v

From a cloned checkout:
cp .env.example .env
docker compose up -d

For any non-local deployment, replace the default auth secret before starting:
openssl rand -base64 32

Set the generated value as BETTER_AUTH_SECRET in .env, and set BETTER_AUTH_BASE_URL to the public app URL. If the app is reachable from additional browser origins, add them to BETTER_AUTH_TRUSTED_ORIGINS as a comma-separated list.
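Editing .env by hand works fine; for scripted setups, a small helper can write the values instead. This is a minimal sketch, assuming .env uses plain KEY=VALUE lines. The function name set_env_var and the example URL are hypothetical.

```shell
# Hypothetical helper: set KEY=VALUE in an env file, replacing any
# existing assignment of the same key and keeping all other lines.
set_env_var() {
  file="$1"; key="$2"; value="$3"
  tmp="$(mktemp)"
  if [ -f "$file" ]; then
    # grep -v exits non-zero when nothing is left; that is not an error here.
    grep -v "^${key}=" "$file" > "$tmp" || true
  fi
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# Example (the URL is a placeholder):
# set_env_var .env BETTER_AUTH_SECRET "$(openssl rand -base64 32)"
# set_env_var .env BETTER_AUTH_BASE_URL "https://clips.example.com"
```

The helper appends the new assignment at the end of the file, so the key may move relative to its original position.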
- Sign in or create the first local account.
- Open settings and configure an AI provider.
- Choose the transcription provider and model.
- Create a channel.
- Add a video by uploading a file or pasting a video URL.
- Start transcription.
- Run AI analysis to generate clip suggestions.
- Review the suggested clips and adjust timing as needed.
- Render clips with the selected aspect and subtitle options.
- Download the finished clips.
Configure these inside the app settings after sign-in.
| Provider | Required settings | Notes |
|---|---|---|
| openai | OpenAI API key, model | Uses the official OpenAI-compatible API. Optional base URL can point at another OpenAI-compatible service. |
| gemini | Gemini API key, model | Loads available Gemini models from Google when an API key is present. |
| openrouter | OpenRouter API key, model | Loads models from OpenRouter. |
| codex | Codex model | Uses the Codex CLI mounted into the app and worker containers. Run codex login on the host first. |

| Provider | Models | Notes |
|---|---|---|
| faster-whisper | small, medium, large-v3-turbo | Default provider. The default AI Docker service uses CUDA. |
| hailo | whisper-tiny, whisper-base, whisper-small | Hailo-10H inference backend through docker-compose.hailo.yml. |

Hailo host setup requires the UGen300/Hailo PCIe driver before Compose can pass /dev/h1x-0 into the container:
curl -fsSL "https://raw.githubusercontent.com/fixtse/ClipSE/main/scripts/install-hailo-ugen300-driver.sh" \
-o scripts/install-hailo-ugen300-driver.sh
chmod +x scripts/install-hailo-ugen300-driver.sh
./scripts/install-hailo-ugen300-driver.sh ~/Downloads/UGen300_M2_5.3.0_driver_Linux_amd64.zip
sudo reboot

Transcription chunking can be enabled in settings. The chunk length accepts 1 to 120 minutes and defaults to 20 minutes when enabled.
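To get a feel for what a chunk length means for a given video, the sketch below estimates how many chunks a recording would produce. It assumes consecutive, non-overlapping chunks, which may not match how the service actually splits audio. chunk_count is a hypothetical helper.

```shell
# Hypothetical estimate: number of chunks for a video of TOTAL minutes,
# assuming back-to-back chunks with no overlap.
chunk_count() {
  total="$1"
  chunk="${2:-20}"   # 20 minutes is the documented default
  if [ "$chunk" -lt 1 ] || [ "$chunk" -gt 120 ]; then
    echo "chunk length must be between 1 and 120 minutes" >&2
    return 1
  fi
  # Integer ceiling division.
  echo $(( (total + chunk - 1) / chunk ))
}

# chunk_count 95 20   # a 95-minute video with 20-minute chunks
```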
Render controls are selected per clip in the review flow.
| Option | Use |
|---|---|
| Aspect mode | Choose the output framing for the rendered clip, including short-form vertical output. |
| Burn subtitles | Render transcript captions into the video. |
| Intro/outro bumper | Add configured bumper media before or after rendered clips when available. |
Copy .env.example to .env and change values for your environment. Docker Compose supplies internal service URLs for containers, so most local installs only need BETTER_AUTH_SECRET, BETTER_AUTH_BASE_URL, and provider credentials configured in the app.
| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | postgresql://postgres:postgres@localhost:5433/clipse | PostgreSQL connection string for local tooling. Compose overrides this inside containers. |
| BETTER_AUTH_SECRET | local development secret | Cookie/session signing secret. Replace for any shared or public deployment. |
| BETTER_AUTH_BASE_URL | http://localhost:3000 | Browser-facing app URL. |
| BETTER_AUTH_TRUSTED_ORIGINS | empty | Additional browser origins allowed to call Better Auth endpoints, separated by commas. BETTER_AUTH_BASE_URL is trusted automatically. |
| CLIPSE_DISABLE_AUTH | false | Set to true to bypass sign-in and allow local anonymous access. Use only in trusted local deployments. |

| Variable | Default | Description |
|---|---|---|
| CLIPSE_MAX_CLIPS_PER_VIDEO | 8 | Maximum regular clip suggestions per analysis. Valid range: 1 to 20. |
| CLIPSE_MAX_SHORTS_PER_VIDEO | 16 | Maximum short-form candidates per analysis. Valid range: 1 to 40. |

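How the app reacts to an out-of-range value is not described here, so checking the limits before starting Compose may be worthwhile. A minimal sketch, using a hypothetical in_range helper:

```shell
# Hypothetical check: succeed only if VALUE is an integer within MIN..MAX.
in_range() {
  value="$1"; min="$2"; max="$3"
  case "$value" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$value" -ge "$min" ] && [ "$value" -le "$max" ]
}

# Example, falling back to the documented defaults when unset:
# in_range "${CLIPSE_MAX_CLIPS_PER_VIDEO:-8}" 1 20 \
#   || echo "CLIPSE_MAX_CLIPS_PER_VIDEO must be 1 to 20" >&2
# in_range "${CLIPSE_MAX_SHORTS_PER_VIDEO:-16}" 1 40 \
#   || echo "CLIPSE_MAX_SHORTS_PER_VIDEO must be 1 to 40" >&2
```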
| Variable | Default | Description |
|---|---|---|
| CLIPSE_S3_ENDPOINT | http://localhost:3900 | Internal S3-compatible endpoint. Compose sets this to Garage inside containers. |
| CLIPSE_S3_PUBLIC_ENDPOINT | http://localhost:3900 | Browser-reachable S3-compatible endpoint for signed media URLs. |
| CLIPSE_S3_REGION | garage | S3 region value. |
| CLIPSE_S3_BUCKET | clipse | Bucket for uploads, thumbnails, transcripts, and renders. |
| CLIPSE_S3_ACCESS_KEY_ID | local Garage key | S3 access key. |
| CLIPSE_S3_SECRET_ACCESS_KEY | local Garage secret | S3 secret key. |
| CLIPSE_S3_FORCE_PATH_STYLE | true | Use path-style S3 URLs. Keep true for Garage and MinIO. |

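The difference between path-style and virtual-hosted-style addressing is easiest to see side by side. The sketch below builds both URL forms for an object. The helper s3_object_url and the object key are hypothetical, and real signed URLs also carry query parameters that are omitted here.

```shell
# Illustration only: build the unsigned URL of an object in either style.
s3_object_url() {
  endpoint="$1"; bucket="$2"; key="$3"; path_style="$4"
  if [ "$path_style" = "true" ]; then
    # Path style: the bucket is the first path segment.
    printf '%s/%s/%s\n' "$endpoint" "$bucket" "$key"
  else
    # Virtual-hosted style: the bucket becomes a host name prefix.
    scheme="${endpoint%%://*}"
    host="${endpoint#*://}"
    printf '%s://%s.%s/%s\n' "$scheme" "$bucket" "$host" "$key"
  fi
}

# s3_object_url http://localhost:3900 clipse uploads/demo.mp4 true
# s3_object_url http://localhost:3900 clipse uploads/demo.mp4 false
```

With the defaults above, path style keeps every request on localhost:3900, while the virtual-hosted form would require clipse.localhost:3900 to resolve, which is why path style suits a local Garage or MinIO endpoint.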
| Variable | Default | Description |
|---|---|---|
| WHISPER_SERVICE_URL | http://localhost:8000 | AI service transcription API URL for local tooling. Compose sets this to http://ai:8000 inside containers. |
| WHISPER_PROVIDER | faster-whisper | AI service default transcription provider. Use hailo with the Hailo override. |
| WHISPER_DEVICE | cuda | faster-whisper device. Use cpu only with a compatible compute type and enough patience. |
| WHISPER_COMPUTE_TYPE | float16 | faster-whisper compute type. |
| WHISPER_CPU_FALLBACK | false | AI container CPU fallback toggle for faster-whisper. |
| NVIDIA_VISIBLE_DEVICES | all | GPU devices exposed to CUDA containers. |
| NVIDIA_DRIVER_CAPABILITIES | compute,utility,video | NVIDIA container capabilities. |
| WHISPER_CACHE_DIR | ./models/whisper in dev compose | Host cache path for downloaded Whisper models in development. Production Docker stores model files under ./models/whisper. |

| Variable | Default | Description |
|---|---|---|
| CLIPSE_AI_HAILO_IMAGE | ghcr.io/fixtse/clipse-ai-hailo:latest | Hailo AI image. Override this for a private/local Hailo image. |
| HAILO_DEVICE | /dev/h1x-0 | Hailo accelerator device passed into the container. |
| HAILO_WHISPER_MODEL | whisper-base | Hailo transcription model. |
| HAILO_WHISPER_HEF_PATH | empty | Optional explicit Whisper HEF path. Usually not needed when the matching .hef is under ./models. |
| HAILO_WHISPER_TIMEOUT_MS | 60000 | Hailo Whisper generation timeout in milliseconds. Increase for long audio. |
| HAILO_VLM_MODEL | qwen2-vl-2b | Hailo VLM focus-detection model. Supported aliases include qwen2-vl-2b, qwen2.5-vl-3b, and qwen3-vl-2b-instruct. |
| HAILO_VLM_HEF_PATH | empty | Optional explicit VLM HEF path. Usually not needed when the matching .hef is under ./models. |
| HAILO_VLM_FOCUS_SAMPLE_INTERVAL_SECONDS | 1.0 | Frame sampling interval for Hailo VLM focus detection. |
| HAILO_VLM_FOCUS_MAX_SAMPLES | 8 | Maximum sampled frames per focus-detection request. |
| HAILO_VLM_OPTIMIZE_MEMORY_ON_DEVICE | true | Hailo VLM memory optimization toggle. |
| HAILO_VISION_MODEL | yolov8n | Hailo YOLO-family model for mode-aware shorts focus detection. |
| HAILO_VISION_HEF_PATH | empty | Optional explicit Hailo vision HEF path. Usually not needed when the matching .hef is under ./models. |
| HAILO_SCREEN_OCR_HEF_PATH | empty | Optional OCR/text-detection HEF path for screen-heavy shorts. Usually not needed when the matching .hef is under ./models. |
| HAILO_OBJECT_LABELS | empty | Optional comma-separated COCO class ids or names for general object focus mode. Examples: 67,73 or cell phone,book. |
| HAILO_VISION_SAMPLE_INTERVAL_SECONDS | 0.35 | Frame sampling interval for Hailo vision focus detection. |
| HAILO_VISION_MAX_SAMPLES | 0 | Maximum sampled frames per Hailo vision request. Use 0 to sample until the clip end. |
| HAILO_VISION_COMMAND | runner command | Override for the Hailo vision helper command. |
| HAILO_VISION_FRAME_COMMAND | empty | Optional per-frame command returning JSON detections when using a custom Hailo detector wrapper. |
| HAILO_FOCUS_DEBUG | WHISPER_DEBUG value in Docker | Enables Hailo focus command logs in the AI service. |
| CLIPSE_FOCUS_DEBUG | WHISPER_DEBUG value in Docker | Enables web/worker logs showing Hailo focus use and local detector fallback. |
| HAILO_COMMAND_TIMEOUT_SECONDS | 900 | Timeout for Hailo helper commands. |
| HAILO_APPS_REF | main | Hailo Apps git ref used when building the Hailo image. |
| HAILORT_WHEEL_DIR | ./services/whisper/hailo-packages | Local Hailo image build option: directory containing one hailort-*.whl or pyhailort-*.whl. If the directory also contains hailort_*.deb, it is installed into the local image for libhailort. |
| HAILO_HOST_LIB_DIR | /usr/lib/hailo | Host HailoRT library mount path. |
| HAILO_HOST_BIN_DIR | /usr/bin | Host binary mount path for hailortcli. |

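The sampling interval and maximum-sample settings together bound how many frames a focus request inspects. The sketch below estimates that count under a stated assumption: the first frame is taken at the clip start and one more every interval after it. The real sampler may differ, and sample_count is a hypothetical helper.

```shell
# Hypothetical estimate of sampled frames for a clip.
#   $1 clip duration in seconds
#   $2 sampling interval in seconds
#   $3 maximum samples (0 means no cap, sample until the clip end)
sample_count() {
  awk -v d="$1" -v i="$2" -v m="$3" 'BEGIN {
    n = int(d / i) + 1          # frame at t=0 plus one per full interval
    if (m > 0 && n > m) n = m   # apply the cap when one is set
    print n
  }'
}

# Vision defaults on a 30-second clip (interval 0.35, no cap):
# sample_count 30 0.35 0
# VLM defaults on the same clip (interval 1.0, cap 8):
# sample_count 30 1.0 8
```

Under this assumption the uncapped vision defaults inspect far more frames per clip than the capped VLM defaults, which is worth keeping in mind when tuning HAILO_COMMAND_TIMEOUT_SECONDS.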
| Variable | Default | Description |
|---|---|---|
| CLIPSE_FOCUS_PROVIDER | auto | auto, local, hailo-vlm, or hailo-vision. |
| CLIPSE_HAILO_SERVICE_URL | http://localhost:8000 | Hailo focus API URL. Compose sets this to http://ai:8000 inside containers. |
| CLIPSE_YOLO_MODEL | yolo11n.pt | Local person/face focus model used by the worker. |
| CLIPSE_LOCAL_DETECTOR_DEVICE | auto | Local YOLO/RT-DETR device preference. Use intel:gpu for OpenVINO on Intel GPU, cuda for PyTorch CUDA, or cpu. The Intel compose override sets intel:gpu. |

Focus provider modes:

| Provider | Use case |
|---|---|
| auto | Default local detector flow with automatic local fallbacks. |
| local | Force local YOLO/RT-DETR/OpenCV detection. |
| hailo-vision | Recommended Hailo Docker mode for people, product, screen, and object focus detection. |
| hailo-vlm | Legacy Hailo VLM prompt path for face/person focus detection. |

| Variable | Default | Description |
|---|---|---|
| CLIPSE_YTDLP_COOKIES_FILE | empty | Optional cookies file path for yt-dlp when a source requires browser cookies. Mount the file into the worker container. |
| CLIPSE_YTDLP_USER_AGENT | empty | Optional yt-dlp user agent override. |

| Variable | Default | Description |
|---|---|---|
| HOST_CODEX_HOME | ${HOME}/.codex | Host Codex config directory mounted into containers. |
| CLIPSE_CODEX_COMMAND | codex | Command used by the app and worker. |
| CLIPSE_CODEX_HOME | /root/.codex | Container Codex config directory. |
| CLIPSE_CODEX_CWD | /app | Working directory for Codex CLI calls. |
| CLIPSE_CODEX_TIMEOUT_MS | 300000 | Codex request timeout in milliseconds. |

Authenticate on the host before selecting the Codex provider:
codex login

For Windows PowerShell:
HOST_CODEX_HOME="C:/Users/<you>/.codex"

For WSL:
HOST_CODEX_HOME="/mnt/c/Users/<you>/.codex"

| Variable | Default |
|---|---|
| CLIPSE_APP_IMAGE | ghcr.io/fixtse/clipse-app:latest |
| CLIPSE_WORKER_IMAGE | ghcr.io/fixtse/clipse-worker:latest |
| CLIPSE_MIGRATE_IMAGE | ghcr.io/fixtse/clipse-migrate:latest |
| CLIPSE_AI_IMAGE | ghcr.io/fixtse/clipse-ai:latest |
| CLIPSE_AI_HAILO_IMAGE | ghcr.io/fixtse/clipse-ai-hailo:latest |
| CLIPSE_GARAGE_INIT_IMAGE | ghcr.io/fixtse/clipse-garage-init:latest |

The legacy CLIPSE_WHISPER_IMAGE and CLIPSE_WHISPER_HAILO_IMAGE variables are still accepted as fallbacks, but new deployments should use the CLIPSE_AI_* names.
Run with published images:
mkdir -p models/whisper models/yolo models/hailo
docker compose up -d

The default stack expects an NVIDIA GPU for CUDA Whisper and local focus detection. Verify the host runtime with:
docker run --rm --gpus all nvidia/cuda:12.6.0-base-ubuntu24.04 nvidia-smi

Run without an NVIDIA GPU:
mkdir -p models/whisper models/yolo models/hailo
docker compose -f docker-compose.yml -f docker-compose.cpu.yml up -d

Run with Intel GPU ffmpeg acceleration and CPU Whisper:
sudo apt install -y vainfo intel-media-va-driver libva-drm2 libva2
ls -l /dev/dri/renderD128
mkdir -p models/whisper models/yolo models/hailo
docker compose -f docker-compose.yml -f docker-compose.intel.yml up -d

This Intel example targets Ubuntu 26.04. The host needs the VAAPI/QSV userspace packages (vainfo, intel-media-va-driver, libva-drm2, and libva2) installed so the Intel media driver (iHD) is available. The compose override passes /dev/dri/renderD128 into the app and worker containers for Intel QSV encoding and requests OpenVINO Intel GPU inference for local YOLO/RT-DETR focus detection. The host must expose that render device, and the Docker user must be able to access it. If your host uses different device group IDs, set CLIPSE_RENDER_GID="$(getent group render | cut -d: -f3)" and CLIPSE_VIDEO_GID="$(getent group video | cut -d: -f3)" before starting Compose.
Check Intel driver access inside the worker with:
docker compose -f docker-compose.yml -f docker-compose.intel.yml exec worker sh -lc \
'vainfo --display drm --device /dev/dri/renderD128 && ffmpeg -hide_banner -v error -init_hw_device qsv=hw:/dev/dri/renderD128 -f lavfi -i nullsrc=s=16x16:d=0.1 -frames:v 1 -f null -'

Run with Intel GPU ffmpeg acceleration and Hailo-10H Whisper/focus:
sudo apt install -y vainfo intel-media-va-driver libva-drm2 libva2
ls -l /dev/dri/renderD128
WHISPER_PROVIDER=hailo \
CLIPSE_FOCUS_PROVIDER=hailo-vision \
HAILO_DEVICE=/dev/h1x-0 \
docker compose -f docker-compose.yml -f docker-compose.intel.yml -f docker-compose.hailo.yml up -d

Use this setup on Intel hosts where /dev/dri/renderD128 handles ffmpeg QSV rendering and local OpenVINO YOLO/RT-DETR fallback, while /dev/h1x-0 handles Hailo transcription or focus detection. Keep docker-compose.hailo.yml last so its Whisper provider settings override the CPU Whisper defaults from the Intel file.
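Because later -f files override earlier ones, the order of overrides matters. The sketch below assembles the file list for each hardware combination shown in this README and always places the Hailo override last. compose_args and its profile names are hypothetical conveniences, not part of ClipSE.

```shell
# Hypothetical helper: print the -f arguments for a host profile.
#   $1 profile: nvidia | cpu | intel
#   $2 hailo:   true | false
compose_args() {
  profile="$1"; hailo="$2"
  args="-f docker-compose.yml"
  case "$profile" in
    nvidia) ;;   # the base file already targets CUDA
    cpu)    args="$args -f docker-compose.cpu.yml" ;;
    intel)  args="$args -f docker-compose.intel.yml" ;;
    *)      echo "unknown profile: $profile" >&2; return 1 ;;
  esac
  # The Hailo override goes last so its settings win over earlier files.
  if [ "$hailo" = "true" ]; then
    args="$args -f docker-compose.hailo.yml"
  fi
  printf '%s\n' "$args"
}

# Example (left unquoted on purpose so the arguments split into words):
# docker compose $(compose_args intel true) up -d
```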
Build app images locally:
docker compose -f docker-compose.yml -f docker-compose.build.yml up --build

Model files live under ./models, which is mounted into the AI and worker containers as /models:
mkdir -p models/whisper models/yolo models/hailo
# faster-whisper downloads/cache: ./models/whisper
# local YOLO/RT-DETR files: ./models/yolo/yolo11n.pt or ./models/yolo/rtdetr-l.pt
# Hailo HEFs: ./models/hailo/whisper-base.hef, ./models/hailo/yolov8n.hef, etc.

The default Hailo compose override pulls ghcr.io/fixtse/clipse-ai-hailo:latest, which targets HailoRT 5.3. The host PCIe driver version must match the HailoRT runtime version in the image. If you need a newer HailoRT release, build a local Hailo image with matching hailort_*.deb and hailort-*.whl packages, then install the matching PCIe driver on the host. Put HEF models under ./models/hailo.
For a custom HailoRT version, build the Hailo image locally with a wheel directory outside the repo:
# Install the host PCIe driver package that matches the HailoRT version in your image.
curl -fsSL "https://raw.githubusercontent.com/fixtse/ClipSE/main/scripts/install-hailo-ugen300-driver.sh" \
-o scripts/install-hailo-ugen300-driver.sh
chmod +x scripts/install-hailo-ugen300-driver.sh
./scripts/install-hailo-ugen300-driver.sh ~/Downloads/UGen300_M2_5.3.0_driver_Linux_amd64.zip
sudo reboot
ls -l /dev/h1x-*
hailortcli scan
mkdir -p models/hailo
# Put licensed .hef files in ./models/hailo.
CLIPSE_AI_HAILO_IMAGE=clipse-ai-hailo:local \
HAILORT_WHEEL_DIR="$HOME/Downloads/hailort" \
docker compose -f docker-compose.yml -f docker-compose.hailo.yml -f docker-compose.hailo-build.yml build ai

When running that local image, keep CLIPSE_AI_HAILO_IMAGE=clipse-ai-hailo:local in the environment for the up command.
Run Hailo-10H without an NVIDIA GPU:
curl -fsSL "https://raw.githubusercontent.com/fixtse/ClipSE/main/scripts/install-hailo-ugen300-driver.sh" \
-o scripts/install-hailo-ugen300-driver.sh
chmod +x scripts/install-hailo-ugen300-driver.sh
./scripts/install-hailo-ugen300-driver.sh ~/Downloads/UGen300_M2_5.3.0_driver_Linux_amd64.zip
sudo reboot
ls -l /dev/h1x-*
hailortcli scan
WHISPER_PROVIDER=hailo \
CLIPSE_FOCUS_PROVIDER=hailo-vision \
docker compose -f docker-compose.yml -f docker-compose.cpu.yml -f docker-compose.hailo.yml up -d
curl http://localhost:8000/health

Run Hailo-10H with Intel GPU ffmpeg acceleration:
curl -fsSL "https://raw.githubusercontent.com/fixtse/ClipSE/main/scripts/install-hailo-ugen300-driver.sh" \
-o scripts/install-hailo-ugen300-driver.sh
chmod +x scripts/install-hailo-ugen300-driver.sh
./scripts/install-hailo-ugen300-driver.sh ~/Downloads/UGen300_M2_5.3.0_driver_Linux_amd64.zip
sudo reboot
ls -l /dev/h1x-*
hailortcli scan
WHISPER_PROVIDER=hailo \
CLIPSE_FOCUS_PROVIDER=hailo-vision \
HAILO_DEVICE=/dev/h1x-0 \
docker compose -f docker-compose.yml -f docker-compose.intel.yml -f docker-compose.hailo.yml up -d
curl http://localhost:8000/health

Run Hailo-10H on a host that also has an NVIDIA GPU:
curl -fsSL "https://raw.githubusercontent.com/fixtse/ClipSE/main/scripts/install-hailo-ugen300-driver.sh" \
-o scripts/install-hailo-ugen300-driver.sh
chmod +x scripts/install-hailo-ugen300-driver.sh
./scripts/install-hailo-ugen300-driver.sh ~/Downloads/UGen300_M2_5.3.0_driver_Linux_amd64.zip
sudo reboot
ls -l /dev/h1x-*
hailortcli scan
WHISPER_PROVIDER=hailo \
CLIPSE_FOCUS_PROVIDER=hailo-vision \
docker compose -f docker-compose.yml -f docker-compose.hailo.yml up -d

Hailo HEFs are auto-discovered under ./models by filename, so HAILO_WHISPER_MODEL=whisper-base can use a file such as ./models/hailo/whisper-base.hef without setting HAILO_WHISPER_HEF_PATH. See DOCKER.md for advanced Hailo licensing, private image builds, WSL notes, and custom HEF path overrides.
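To check on the host which file a model name would resolve to, a filename lookup along these lines can help. This is only an approximation of the idea and not ClipSE's discovery code; find_hef is a hypothetical helper, and the tie-break (first match in sorted order) is an assumption.

```shell
# Hypothetical lookup: print the first <model>.hef found under a directory.
# Prints nothing when no matching file exists.
find_hef() {
  models_dir="$1"; model="$2"
  find "$models_dir" -type f -name "${model}.hef" 2>/dev/null | sort | head -n 1
}

# Example, using the documented default model name:
# find_hef ./models "${HAILO_WHISPER_MODEL:-whisper-base}"
```

An empty result suggests that the HEF is missing or named differently, in which case setting the explicit *_HEF_PATH variable is the documented override.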
Check logs:
docker compose ps
docker compose logs -f app
docker compose logs -f worker
docker compose logs -f ai

See DOCKER.md for startup checks, troubleshooting, Hailo licensing notes, Garage reset steps, and private Hailo image builds.
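Model loading can take a while after the containers start, so a single health probe may fail even on a healthy stack. The sketch below retries a command until it succeeds. wait_for is a hypothetical helper, and the example assumes the AI service port 8000 is published on the host, as in the Hailo examples above.

```shell
# Hypothetical helper: retry a command until it succeeds.
#   $1 maximum attempts, $2 delay in seconds, remaining args: the command
wait_for() {
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example: poll the health endpoint for up to about a minute.
# wait_for 30 2 curl -fsS http://localhost:8000/health \
#   && echo "AI service is up" \
#   || docker compose logs --tail 50 ai
```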
Use pnpm for local commands:
pnpm install
cp .env.example .env
docker compose -f docker-compose.dev.yml up --build

Useful commands:
pnpm check
pnpm typecheck
pnpm test:unit
pnpm db:generate
pnpm db:migrate

Generate migrations after changing the Drizzle schema:
pnpm db:generate

- apps/web/src/app - Next.js App Router pages and route handlers.
- apps/web/src/components/clipse - workspace UI.
- apps/web/src/modules/content-videos - upload drafts, dashboard, and video state.
- apps/web/src/modules/content-transcriptions - transcript persistence.
- apps/web/src/modules/content-clips - clip suggestions and render state.
- apps/web/src/modules/content-jobs - background job queue state.
- apps/web/src/modules/content-settings - AI and transcription settings.
- apps/web/src/server/actions - mutation-oriented server actions.
- apps/web/src/server/api/routers - query-oriented tRPC endpoints.
- apps/worker/src/clipse-worker.ts - transcription, analysis, and render worker.
- services/whisper - AI API container source for faster-whisper transcription and Hailo helpers.
- services/postgres/migrations - Drizzle migrations.
- services/garage - Garage object storage config and init image.

- Next.js App Router, React, and TypeScript.
- Better Auth local email/password authentication.
- tRPC and TanStack Query.
- Drizzle ORM and PostgreSQL.
- S3-compatible object storage.
- FFmpeg (NVENC, Intel QSV, and CPU) and yt-dlp.
- Whisper, YOLO, OpenCV (CUDA, Hailo-10H, and CPU).
- OpenAI-compatible AI SDK providers, Gemini, OpenRouter, and Codex CLI.
- Tailwind CSS, shadcn/ui, and Framer Motion.
Read CONTRIBUTING.md before opening a pull request. Keep changes focused, update tests for behavior changes, and run the local checks before submitting.
Security issues should be reported privately. See SECURITY.md.
ClipSE is licensed under AGPL-3.0-only.
