Fine-tuning SAM 2 traditionally requires deep knowledge of PyTorch, distributed training, Hydra configs, and cloud GPU orchestration. This platform removes all of that friction.
Through a clean web interface, users can:
- Authenticate via GitHub or Google OAuth
- Configure training runs — choose LoRA rank, model checkpoint size, target dataset, and epoch count
- Launch GPU-accelerated training on Modal Labs with a single click
- Monitor progress through real-time streaming logs (Server-Sent Events)
- Download fine-tuned checkpoints directly from Cloudflare R2 storage
The result: production-quality SAM 2 fine-tuning, accessible to anyone with a browser.
| Layer | Technology | Why It Matters |
|---|---|---|
| Frontend | Next.js 16 + React 19 | Server Components and the App Router enable API routes, SSE streaming, and UI to coexist in one deployment — no separate backend server needed for the web layer |
| | TypeScript 5 | End-to-end type safety from database schema (Drizzle) through API routes to React components, catching bugs at compile time |
| | Tailwind CSS 4 + Material UI 7 | Rapid, consistent styling with MUI's component library for complex UI elements like training configuration forms |
| | Drizzle ORM | Type-safe SQL with zero overhead — generates migrations from TypeScript schema, keeping the DB in sync without heavy ORM abstractions |
| ML Backend | PyTorch 2.5+ / SAM 2 | Meta's state-of-the-art segmentation model, fine-tuned with LoRA adapters to minimize compute while preserving quality |
| | Hydra Configs | Declarative, composable training configurations — each combination of model size, dataset, and hyperparameters maps to a clean YAML override |
| | FastAPI | Lightweight Python API layer that bridges the Next.js frontend to the PyTorch training loop, handling job dispatch and webhook callbacks |
| Infrastructure | Modal Labs | Serverless GPU compute — pay only for training time with zero cold-start provisioning of A100/H100 hardware. No GPU clusters to manage |
| | Vercel | Zero-config deployment for the Next.js app with edge functions, automatic HTTPS, and preview deployments on every PR |
| | Neon PostgreSQL | Serverless Postgres that scales to zero — perfect for bursty workloads where training jobs may be hours apart |
| | Cloudflare R2 | S3-compatible object storage with zero egress fees — critical when users download multi-GB checkpoint files |
| AI Agent | LiveKit + OpenAI | Real-time voice agent integration powered by LiveKit's WebRTC infrastructure, enabling conversational interaction with the training platform |
| | Deepgram + Cartesia | Speech-to-text and text-to-speech pipeline for natural voice interactions — Deepgram for low-latency transcription, Cartesia for expressive speech synthesis |
| Auth & Ops | Better-Auth | Lightweight OAuth framework supporting GitHub and Google — admin approval gate ensures only authorized users can launch GPU training jobs |
| | Logfire | Observability for the training pipeline — trace job submissions, monitor Modal webhook callbacks, and debug SSE streaming issues |
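The real-time log streaming row above relies on the Server-Sent Events wire format. As a minimal sketch of how a client might split an incoming `text/event-stream` chunk into events, here is a small parser; the event names (`log`, `status`) and payload strings are illustrative assumptions, not the app's actual schema:

```typescript
interface LogEvent {
  event: string;
  data: string;
}

// Minimal SSE frame parser: events in a text/event-stream body are
// separated by a blank line; each event has optional "event:" and
// one or more "data:" fields.
function parseSseChunk(chunk: string): LogEvent[] {
  const events: LogEvent[] = [];
  for (const block of chunk.split("\n\n")) {
    let event = "message"; // SSE default event name
    const dataLines: string[] = [];
    for (const line of block.split("\n")) {
      if (line.startsWith("event:")) event = line.slice(6).trim();
      else if (line.startsWith("data:")) dataLines.push(line.slice(5).trim());
    }
    if (dataLines.length > 0) events.push({ event, data: dataLines.join("\n") });
  }
  return events;
}

// Example: two frames as a training job might emit them (hypothetical payloads).
const frames = "event: log\ndata: epoch 1/5 loss=0.42\n\nevent: status\ndata: running\n\n";
const parsed = parseSseChunk(frames);
console.log(parsed.length); // 2
console.log(parsed[0].event); // log
```

In the browser, `EventSource` performs this framing automatically; a hand-rolled parser like this is only needed when consuming the stream through `fetch` or in a non-browser environment.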
```
Authenticate        Configure              Train                Monitor             Download
┌───────────┐     ┌──────────────┐     ┌──────────────┐     ┌──────────────┐     ┌────────────┐
│ GitHub /  │────>│ LoRA Rank    │────>│ Modal Labs   │────>│ Real-time    │────>│ Checkpoint │
│ Google    │     │ Model Size   │     │ Serverless   │     │ SSE Logs     │     │ from R2    │
│ OAuth     │     │ Dataset      │     │ GPU (A100)   │     │              │     │            │
└───────────┘     │ Epochs       │     └──────────────┘     └──────────────┘     └────────────┘
                  └──────────────┘
```
- Sign in with GitHub or Google — an admin approves your account for training access
- Pick your parameters — LoRA rank (2/4/8/16/32), checkpoint size (tiny/small/base+/large), dataset, and epoch count
- Hit train — the app submits a job to Modal Labs, which spins up a GPU instance and begins fine-tuning
- Watch it run — logs stream back to your browser in real time via Server-Sent Events
- Grab your model — once training completes, Modal uploads the checkpoint to R2 and you download it instantly
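The parameters in step 2 form a small, closed configuration space. As a sketch of how the app might validate a submitted run before dispatching it to Modal, here is a hypothetical guard; the type and field names (`TrainConfig`, `loraRank`, etc.) are illustrative, not the app's actual schema:

```typescript
// Hypothetical types mirroring the options listed above.
type LoraRank = 2 | 4 | 8 | 16 | 32;
type CheckpointSize = "tiny" | "small" | "base+" | "large";

interface TrainConfig {
  loraRank: LoraRank;
  checkpoint: CheckpointSize;
  dataset: string;
  epochs: number;
}

// Runtime guard for values arriving from an untyped source (e.g. a form POST).
function validateTrainConfig(input: Record<string, unknown>): TrainConfig {
  const ranks = [2, 4, 8, 16, 32];
  const sizes = ["tiny", "small", "base+", "large"];
  const { loraRank, checkpoint, dataset, epochs } = input;
  if (typeof loraRank !== "number" || !ranks.includes(loraRank))
    throw new Error(`invalid LoRA rank: ${loraRank}`);
  if (typeof checkpoint !== "string" || !sizes.includes(checkpoint))
    throw new Error(`invalid checkpoint size: ${checkpoint}`);
  if (typeof dataset !== "string" || dataset.length === 0)
    throw new Error("dataset is required");
  if (typeof epochs !== "number" || !Number.isInteger(epochs) || epochs < 1)
    throw new Error(`invalid epoch count: ${epochs}`);
  return {
    loraRank: loraRank as LoraRank,
    checkpoint: checkpoint as CheckpointSize,
    dataset,
    epochs,
  };
}
```

Validating on the server, not just in the form UI, matters here because each accepted job spins up paid GPU time.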
This project is organized as a monorepo with three git submodules, each handling a distinct concern:
```
sam2finetuning/
├── sam2loranocodefinetuning/   # Next.js 16 web application
│   ├── app/api/                # API routes (train, jobs, download, auth)
│   ├── src/components/         # React UI (Config, Logs, Controls)
│   ├── src/db/                 # Drizzle ORM schema & migrations
│   └── src/lib/                # Auth, utils, constants
│
├── modalsam2/                  # Meta SAM 2 fork with training support
│   ├── training/               # train.py, trainer.py, loss functions
│   ├── sam2/configs/           # Hydra YAML training configs
│   └── training/dataset/       # Dataset loaders (VOS, SA-1B, SA-V, DAVIS)
│
├── sam2webappvoiceagent/       # LiveKit voice agent integration
│   └── ...                     # Real-time voice interaction with the platform
│
├── images/                     # Documentation assets
└── README.md
```
| Submodule | Purpose |
|---|---|
| `sam2loranocodefinetuning` | Full-stack web app — handles auth, UI, job management, SSE log streaming, and checkpoint downloads |
| `modalsam2` | Fork of Meta's SAM 2 repo extended with LoRA fine-tuning support, Hydra configs, and Modal Labs deployment |
| `sam2webappvoiceagent` | Voice-powered AI agent that enables conversational interaction with the training platform via LiveKit |
- Node.js 18+ and pnpm
- Python 3.10+
- Accounts on Modal, Neon, Cloudflare R2
```bash
git clone --recurse-submodules https://github.com/czhurdlespeed/sam2finetuning.git
cd sam2finetuning
```

**Web app:**

```bash
cd sam2loranocodefinetuning
pnpm install
cp .env.example .env.local   # fill in your service credentials
pnpm db:push                 # push schema to Neon
pnpm dev                     # start dev server on localhost:3000
```

**ML backend:**

```bash
cd modalsam2/sam2
pip install -e ".[dev]"
cd checkpoints && ./download_ckpts.sh && cd ..
```

The app requires credentials for several services. See the table below for the key variables:
| Variable | Service |
|---|---|
| `DATABASE_URL` | Neon PostgreSQL connection string |
| `MODAL_TRAIN_URL`, `MODAL_KEY`, `MODAL_SECRET` | Modal Labs API |
| `CF_R2_*`, `AWS_*` | Cloudflare R2 storage |
| `BETTER_AUTH_*` | OAuth configuration |
| `LIVEKIT_*` | Voice agent (optional) |
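As a rough sketch of what `.env.local` might look like, here is a fragment covering the fully named variables from the table; the values are placeholders, and the exact names behind the `CF_R2_*`, `AWS_*`, `BETTER_AUTH_*`, and `LIVEKIT_*` wildcards should be taken from the repo's `.env.example`:

```shell
# Hypothetical .env.local fragment — placeholder values only.
DATABASE_URL="postgresql://user:password@your-neon-host/dbname"
MODAL_TRAIN_URL="https://your-modal-endpoint.modal.run"
MODAL_KEY="your-modal-key"
MODAL_SECRET="your-modal-secret"
# CF_R2_* / AWS_* / BETTER_AUTH_* / LIVEKIT_* — see .env.example
```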
