Autonomous ML research agent skill. Interviews the user, scopes an ML project into a tractable plan, then designs experiments, deploys to Modal GPUs, tracks results, and iterates within a compute budget.
Also includes treadmill — a portable recurring-command skill for agent harnesses that don't have a built-in loop.
Built on the Agent Skills open standard. Practically, this skill only works well with Claude Code — it relies heavily on interactive tool use (subagent spawning, user interviews, recurring loops) that other harnesses like Codex, OpenCode, and Cursor don't yet support. Install instructions for other tools are included for forward-compatibility, but expect a degraded experience.
Copy the skill directories to your personal skills folder:

```bash
cp -r labrat/ ~/.claude/skills/labrat
cp -r treadmill/ ~/.claude/skills/treadmill
```

Or install as a plugin from GitHub:

```bash
# In Claude Code
/install-plugin gh:tnguyen21/labrat
```

Then invoke with `/labrat` or just describe a research goal.
Copy the skill into your project:

```bash
cp -r labrat/ .codex/skills/labrat
```

Or place the `AGENTS.md` in your project root for basic instructions without the full skill system.
Copy the skill to any supported location:

```bash
# Project-local
cp -r labrat/ .opencode/skills/labrat

# Or global
cp -r labrat/ ~/.config/opencode/skills/labrat
```

OpenCode also reads from `.claude/skills/` and `.agents/skills/`.
Copy the labrat/ directory to wherever your tool discovers skills. The format follows the Agent Skills spec.
- Python 3.12+
- Modal CLI (`uv tool install modal && modal setup`)
- Modal account with GPU access
Start a new research session:

```
Run a labrat session: test whether dropout rate affects convergence on CIFAR-10. Budget: $10.
```
For a new session, the agent first asks a few clarifying questions and writes a scoped project brief to `.research/scope.md` before writing any code.
Continue an existing session (if `.research/state.json` exists):

```
Continue the research session.
```
Check status without an agent:

```bash
python labrat/scripts/research-status
```

Advance state without an agent:

```bash
python labrat/scripts/research-advance
```

For Codex-style unattended progress, run the supervisor instead:

```bash
python labrat/scripts/research-supervise
```

- Interview — Clarifies goals, metrics, and constraints, then writes a scoped project brief
- Initialize — Creates `.research/` with `state.json`, `scope.md`, `plan.md`, and `log.md`
- Baseline — Always runs a baseline experiment first
- Iterate — Each experiment changes one variable from baseline
- Track — Logs results, tracks spend against the budget, and reconciles finished volume-backed artifacts back into state
- Conclude — Writes `summary.md` with a results table and findings
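The track step above can be sketched roughly as follows — a minimal, hypothetical illustration of summing per-experiment costs against a budget. The actual logic lives in `scripts/research-advance`; the directory layout (`exp-*/results.json`) and the `cost_usd` field are assumptions made for this sketch:

```python
import json
from pathlib import Path


def reconcile_spend(research_dir: str, budget_usd: float) -> dict:
    """Sum reported cost across finished experiments and compare to budget.

    Assumes each experiment directory holds a results.json containing a
    'cost_usd' field -- an illustrative schema, not the skill's actual one.
    """
    spent = 0.0
    finished = []
    for results_file in Path(research_dir).glob("exp-*/results.json"):
        results = json.loads(results_file.read_text())
        spent += results.get("cost_usd", 0.0)
        finished.append(results_file.parent.name)
    return {
        "finished": sorted(finished),
        "spent_usd": round(spent, 2),
        "remaining_usd": round(budget_usd - spent, 2),
        "over_budget": spent > budget_usd,
    }
```

A supervisor can call something like this after each wake-up and stop scheduling new experiments once `over_budget` flips to true.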
`research-advance` now prefers remote `results.json` artifacts on a Modal Volume and only falls back to app-state inspection for recovery. `research-supervise` adds the next handoff for Codex by invoking `codex exec` when the session is actionable again, which is the closest equivalent to Claude Code's recurring `/loop`.
Experiments run on Modal GPUs (defaulting to T4, the cheapest option). Each experiment produces:

- `config.json` — hyperparameters
- `train.py` — training script
- `modal_app.py` — Modal deployment wrapper
- `results.json` — collected after the run
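For instance, a `config.json` for one iteration might look like this (field names are illustrative, not a fixed schema):

```json
{
  "experiment": "exp-002-dropout-0.3",
  "baseline": false,
  "changed_variable": "dropout",
  "hyperparameters": {
    "dropout": 0.3,
    "lr": 0.001,
    "epochs": 10
  },
  "gpu": "T4",
  "budget_usd": 10.0
}
```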
```
labrat/                          # ML research skill
  SKILL.md                       # Agent instructions
  scripts/research-advance       # State reconciliation worker
  scripts/research-supervise     # Reconcile state, then wake Codex if needed
  scripts/research-status        # CLI status checker
  references/modal-patterns.md   # Modal deployment patterns
treadmill/                       # Recurring command skill
  SKILL.md                       # Agent instructions
  scripts/treadmill              # Background loop manager
AGENTS.md                        # Codex/fallback instructions
README.md
LICENSE
```
Apache-2.0