Cursor's rules of engagement for the Agent Battle project
Agent Battle is a live experiment where two self-evolving autonomous trading agents — ZERO and MAX — compete against each other on Base over a configurable battle duration. Every 4 hours, each agent reviews its own trading performance and rewrites its own strategy parameters using a self-improvement prompt powered by Claude via Bankr's LLM Gateway. The agents pay for their own inference from their trading wallets. Spectators watch the battle unfold in real time through a separate dashboard that surfaces each agent's PnL, trades, reasoning, troll box, and scoreboard.
Three repos, clean separation:
| Repo | Purpose | Does NOT |
|---|---|---|
agent-01 |
ZERO's trading agent — trades, logs to Supabase, self-improves every 4 hours | Display anything, know opponent's strategy |
agent-02 |
MAX's trading agent — identical structure, different identity | Display anything, know opponent's strategy |
agent-battle |
Battle dashboard — reads Supabase, displays everything spectators see | Trade, write to Supabase, contain agent logic |
Each repo is independent. Changes to one repo must never require changes to another.
- Language: TypeScript
- Runtime: Node.js + tsx
- Trading: Bankr CLI (
@bankr/cli) —submitPromptandpollJob - Self-improvement LLM: Claude via Bankr LLM Gateway (
https://llm.bankr.bot) - Database: Supabase — shared across both agents, separated by
agent_idandbattle_id - Dashboard: Next.js
- Cron: node-cron — fires every 4 hours per agent
- Environment: dotenvx
Always build in this exact order. Never skip steps. Never combine steps.
Step 0 — Repo setup (fork, clean, configure env files, confirm boots)
Step 1 — Supabase migration (add columns, create tables, add indexes)
Step 2 — Supabase query layer (functions that fetch each prompt data point)
Step 3 — Prompt assembly (src/prompts/self-improvement.ts with typed params)
Step 4 — LLM Gateway test (manual call to confirm Bankr responds before cron)
Step 5 — LLM call (send assembled prompt, log raw response, don't parse yet)
Step 6 — Response parsing + validation (parse JSON, clamp out-of-bounds values)
Step 7 — Write back (atomic .env.local update + agent restart)
Step 8 — Cron wiring (wrap steps 2-7 in node-cron, staggered start times)
Step 9 — Battle scripts (battle:start, battle:status, battle:end)
Step 10 — Dashboard (round UI, troll box feed, scoreboard, countdown)
To start a session, tell Cursor: "We are on step X" and reference this file.
- agent-01 (ZERO):
0x08824c2b8428c4e8f48ea3c522b31c349f799f29 - agent-02 (MAX):
0xdfc6919e7cfba037988af43b2e07c8eb0cb86d41 - RSI agent (separate, do not touch):
0x001ddfd18f6849e014224f41e464ccf18bf7f5e1
ZERO and MAX start with different parameters to guarantee immediate divergence:
ZERO (agent-01):
AGENT_TRADE_AMOUNT=0.50
AGENT_COOLDOWN_HOURS=4
AGENT_MAX_POSITIONS=1
AGENT_INTERVAL_MS=300000
AGENT_SLIPPAGE=3
MAX (agent-02):
AGENT_TRADE_AMOUNT=2.00
AGENT_COOLDOWN_HOURS=0.5
AGENT_MAX_POSITIONS=4
AGENT_INTERVAL_MS=180000
AGENT_SLIPPAGE=3
Hard rails (never mutated by agent):
AGENT_MAX_DAILY_LOSS_PCT=10
AGENT_MAX_TRADE_PCT=15
AGENT_DRY_RUN=false
Agent may only change parameters within these ranges:
| Variable | Min | Max |
|---|---|---|
AGENT_TRADE_AMOUNT |
0.50 | 5.00 |
AGENT_COOLDOWN_HOURS |
0.5 | 8 |
AGENT_MAX_POSITIONS |
1 | 4 |
AGENT_INTERVAL_MS |
180000 | 600000 |
Out-of-bounds values must be clamped, not rejected.
Run in each agent repo before building the cron:
bankr llm setup claudeThis sets ANTHROPIC_BASE_URL=https://llm.bankr.bot and wires auth to the Bankr API key. Do NOT use a personal Anthropic key for the self-improvement cron — agents must pay for their own inference.
Test before building:
curl https://llm.bankr.bot/v1/messages \
-H "Content-Type: application/json" \
-H "X-API-Key: <bankr-api-key>" \
-d '{"model": "claude-haiku-4-5-20251001", "max_tokens": 100, "messages": [{"role": "user", "content": "say hello"}]}'If this times out, stop and flag it. Do not proceed.
- agent-01 cron fires at :00 every 4 hours
- agent-02 cron fires at :15 every 4 hours (staggered to prevent API hammering)
- Each cron is wrapped in a top-level try/catch — never crashes, always schedules next cycle
Parameter write-back sequence:
- Check for in-flight trades — wait for resolution if pending
- Write new parameters to temp file
- Atomic rename temp file to
.env.local - Restart agent process
- Confirm restart (wallet address appears in logs)
Error handling — every scenario must be handled explicitly:
| Scenario | Action |
|---|---|
| Supabase query fails | Abort, hold, consecutive_holds +1 |
| Prompt assembly fails | Abort, hold, consecutive_holds +1 |
| LLM Gateway timeout | Retry once after 30s, then abort |
| Malformed JSON | Abort, log raw response |
| Parameter out of bounds | Clamp to nearest bound, log warning |
| Supabase write fails | Retry once, still update .env.local |
| Agent restart fails | Retry after 10s, log CRON CRITICAL |
| Unhandled error | Log CRON CRITICAL, hold |
Log format:
CRON ERROR: [description] for [agent_id] — [error]
CRON WARNING: [description] for [agent_id]
CRON CRITICAL: Unhandled error for [agent_id] — [error]
src/prompts/self-improvement.ts — typed template function, all placeholders as params
src/prompts/personalities.ts — ZERO_PERSONALITY and MAX_PERSONALITY constants
Full prompt templates in agent-battle-prompts.md.
npm run battle:start # validates env, creates battle record, sets BATTLE_ID, prints next steps
npm run battle:status # current battle state, balances, round number
npm run battle:end # marks complete, stops crons, prints final resultsScript location: scripts/battle-start.ts
Core requirements (v1):
- Scoreboard — both agent balances, PnL, round number
- Round UI — "ROUND X COMPLETE" state, countdown to next round ("ROUND X+1 STARTS IN 3:47:22")
- Agent corners — reasoning, troll box, parameter changes with directional indicators (↑↓)
- Troll box feed — unified chronological feed, both agents interleaved, avatar/name per message
- Placeholder section for announcer commentary (empty in v1, ready for v1.5)
Dashboard battle query:
SELECT * FROM battles WHERE status = 'active' ORDER BY started_at DESC LIMIT 1No active battle → waiting state. Battle ended → results view with full troll box replay.
- Bankr reliability — get one clean agent-01 cycle before building cron
- In-flight trades — cron must wait for pending trades before restarting
- Cron staggering — agent-01 at :00, agent-02 at :15
- RSI agent — must be stopped before running Supabase migration
- LLM Gateway cold start — test manual curl before building cron
- Atomic file write — temp file + rename, never write directly to .env.local
- No rollback — known v1 limitation, expected behavior
- One step at a time. Complete and verify before starting the next.
- No file outside current step's scope gets touched. Flag it if a fix requires touching unrelated files.
- No new architecture. Use what exists. Ask before adding libraries.
- No speculative features. Build exactly what is specified.
- Compile and test before declaring done. Not done until it runs without errors.
- No silent failures. Every error must be logged explicitly.
- Aggressive logging during development. Every major cron step logs inputs and outputs.
- Ask before refactoring. Flag messy code, don't silently rewrite it.
- Stop.
- State exactly what is unclear.
- Propose two options maximum.
- Wait for a decision.
Do not guess. Do not assume. Do not proceed with uncertainty.
A step is complete when:
- Code runs without errors
- Console output confirms expected behavior
- Supabase shows correct data (where applicable)
- No unrelated files were modified
- Next step's prerequisites are met
- Never modify both agent repos in the same step
- Never write trading logic in agent-battle
- Never read one agent's strategy from the other agent
- Never hardcode wallet addresses, API keys, or secrets
- Never use personal Anthropic API key for self-improvement cron
- Never swallow errors silently
- Never introduce a dependency without asking
- Never refactor working code without explicit instruction
- Never combine multiple build steps into one
- Never declare done without testing
- Never write directly to .env.local — always atomic temp file rename
- Never start cron build before confirming LLM Gateway works manually