Last updated: 2026-03-07
This document is the detailed runtime architecture companion to ../ARCHITECTURE.md. It describes how the application currently runs in practice, while the top-level architecture doc defines the layering rules and boundaries the repository should preserve.
Forecaster Arena is a Next.js 14 application that orchestrates a paper-trading benchmark over Polymarket markets.
At a high level, the system has four major layers:
- Public web UI
  - Read-only pages for the homepage, models, cohorts, markets, methodology, changelog, and about.
  - These pages consume internal API routes for live data and render empty states when the database has not been seeded yet.
- Admin surface
  - Cookie-authenticated operational pages under `/admin`.
  - Supports cohort/market maintenance, exports, log review, and cost review.
- Internal orchestration APIs
  - Cron-authenticated routes under `/api/cron/*`.
  - Admin-authenticated routes under `/api/admin/*`.
  - Public read routes under `/api/*`.
- SQLite-backed benchmark engine
  - Market synchronization from Polymarket.
  - Decision generation through OpenRouter.
  - Trade execution and settlement.
  - Snapshotting, scoring, logging, and backup.
The entire application runs from a single codebase and a single SQLite database by default.

    Polymarket Gamma API
            |
            v
    lib/polymarket/*
            |
            v
    lib/engine/market.ts ------+
                               |
    OpenRouter API             v
            |          SQLite / lib/db/*
            v                  ^
    lib/openrouter/*           |
            |                  |
            v                  |
    lib/engine/decision.ts ----+
            |
            v
    lib/engine/execution.ts
            |
            v
    trades / positions / agents
            |
            v
    lib/engine/resolution.ts
            |
            v
    brier_scores / snapshots / logs
            |
            v
    lib/application/*
            |
            v
    app/api/* routes
            |
            v
    app/* pages + features/* + components/*

Primary routes:
- `/`
- `/models`
- `/models/[id]` (`id` resolves canonical family slugs; legacy roster ids are compatibility aliases)
- `/cohorts`
- `/cohorts/[id]`
- `/cohorts/[id]/models/[familySlugOrLegacyId]` (`familySlugOrLegacyId` resolves canonical family slugs; legacy roster ids are compatibility aliases)
- `/markets`
- `/markets/[id]`
- `/decisions/[id]`
- `/methodology`
- `/about`
- `/changelog`
These pages are presentation-oriented and rely on internal API routes for benchmark data. The UI now distinguishes between:
- a live benchmark with real cohort/trade data,
- a synced preview where markets are available but cohorts have not started,
- an empty boot state with no synced markets or cohorts.
Current public read endpoints:
- `GET /api/health`
- `GET /api/leaderboard`
- `GET /api/models/[id]`
- `GET /api/cohorts/[id]`
- `GET /api/cohorts/[id]/models/[familySlugOrLegacyId]`
- `GET /api/markets`
- `GET /api/markets/[id]`
- `GET /api/decisions/recent`
- `GET /api/decisions/[id]`
- `GET /api/performance-data`
These endpoints read from SQLite only. They do not mutate benchmark state.
Authenticated by signed forecaster_admin cookie:
- `POST /api/admin/login`
- `DELETE /api/admin/login`
- `GET /api/admin/stats`
- `GET /api/admin/logs`
- `GET /api/admin/costs`
- `GET /api/admin/benchmark`
- `POST /api/admin/benchmark/releases`
- `POST /api/admin/benchmark/configs`
- `POST /api/admin/benchmark/default`
- `POST /api/admin/action`
- `POST /api/admin/export`
- `GET /api/admin/export`
Admin routes are also rate-limited by middleware.
Authenticated by Authorization: Bearer <CRON_SECRET>:
- `POST /api/cron/start-cohort`
- `POST /api/cron/run-decisions`
- `POST /api/cron/sync-markets`
- `POST /api/cron/check-resolutions`
- `POST /api/cron/take-snapshots`
- `POST /api/cron/backup`
The code does not embed a scheduler. Cron timing is an operational concern and is documented in docs/OPERATIONS.md.
Checked-in Playwright smoke coverage lives under playwright/.
That suite runs against a deterministic seeded SQLite database prepared specifically for browser testing. It validates the user-facing contract that unit tests do not fully cover:
- public route rendering
- mobile navigation behavior
- seeded dynamic detail routes
- admin login and authenticated admin pages
The browser layer currently has two seeded scenarios:
- rich-data smoke coverage via `npm run test:e2e`
- empty-state smoke coverage via `npm run test:e2e:empty`
Key files:
- `lib/db/index.ts`
- `lib/db/schema.ts`
- `lib/db/queries/*.ts`
Responsibilities:
- Initialize the SQLite database.
- Enforce foreign keys and WAL mode.
- Seed baseline models and methodology metadata.
- Expose CRUD and aggregate queries.
- Provide both normal and immediate transactions.
Important current behavior:
- `withTransaction()` is used for normal atomic write groups.
- `withImmediateTransaction()` is used for lock-first workflows that must serialize competing writers, such as decision claims and week-unique cohort creation.
Key files:
- `lib/polymarket/api.ts`
- `lib/polymarket/aggregates.ts`
- `lib/polymarket/transformers.ts`
- `lib/engine/market.ts`
Responsibilities:
- Fetch top markets from Polymarket.
- Re-check locally relevant markets for status changes.
- Upsert normalized market records into SQLite.
Current behavior:
- Top market ingestion is volume-based and capped by `TOP_MARKETS_COUNT`.
- Existing active/closed markets with open positions are revisited so status changes do not depend solely on the “top markets” window.
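The selection rule above can be sketched as a pure function. This is a minimal illustration only: the `RemoteMarket` shape, the `selectMarketsToSync` name, and the value given to `TOP_MARKETS_COUNT` are assumptions, not the repository's actual types or configuration.

```typescript
// Hypothetical shape; the real market types live in lib/polymarket/* and lib/db/*.
interface RemoteMarket {
  id: string;
  volume: number;
}

// Illustrative value only; the real constant is defined by the application.
const TOP_MARKETS_COUNT = 3;

/** Pick ids to sync: top-N by volume, plus every market that has open positions. */
function selectMarketsToSync(
  remote: RemoteMarket[],
  idsWithOpenPositions: string[],
  topCount: number = TOP_MARKETS_COUNT,
): string[] {
  const top = remote
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.volume - a.volume)
    .slice(0, topCount)
    .map((m) => m.id);
  // A Set removes duplicates while preserving insertion order.
  return Array.from(new Set(top.concat(idsWithOpenPositions)));
}
```

Markets that fall out of the volume window stay in the result as long as a position references them, which is the property the second bullet describes.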
Key files:
- `lib/openrouter/client.ts`
- `lib/openrouter/prompts.ts`
- `lib/openrouter/parser.ts`
- `lib/engine/decision.ts`
Responsibilities:
- Build the model prompt from portfolio state, open positions, and current market set.
- Call OpenRouter with deterministic settings.
- Validate structured JSON responses.
- Persist decision logs and hand off trades to the execution engine.
Current execution model:
- Decisions run sequentially per cohort.
- Each model call is capped by `LLM_TIMEOUT_MS = 40000`.
- Transport-level retries are disabled by default in `callOpenRouterWithRetry(..., retries = 0)`.
- Parse-level retries remain enabled through `LLM_MAX_RETRIES = 1`.
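To make the parse-level retry concrete, here is a minimal sketch of validating a structured response and retrying when it does not parse. The `Decision` shape, its field names, and both function names are hypothetical; the real schema lives in `lib/openrouter/parser.ts`. The sketch also assumes that `LLM_MAX_RETRIES = 1` means one extra attempt, and it uses a synchronous `callModel` for brevity where the real call is asynchronous.

```typescript
// Hypothetical decision shape, for illustration only.
type Decision =
  | { action: "BET"; marketId: string; side: string; amount: number }
  | { action: "SELL"; marketId: string; side: string }
  | { action: "HOLD" };

const LLM_MAX_RETRIES = 1;

/** Return a validated decision, or null if the raw text is not usable. */
function parseDecision(raw: string): Decision | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null; // not JSON at all
  }
  if (typeof value !== "object" || value === null) return null;
  const { action, marketId, side, amount } = value as Record<string, unknown>;
  if (action === "HOLD") return { action: "HOLD" };
  if (typeof marketId !== "string" || typeof side !== "string") return null;
  if (action === "SELL") return { action: "SELL", marketId, side };
  if (action === "BET" && typeof amount === "number" && Number.isFinite(amount) && amount > 0) {
    return { action: "BET", marketId, side, amount };
  }
  return null;
}

/** Call the model, re-asking up to LLM_MAX_RETRIES times when parsing fails. */
function decideWithParseRetry(callModel: () => string): Decision | null {
  for (let attempt = 0; attempt <= LLM_MAX_RETRIES; attempt++) {
    const parsed = parseDecision(callModel());
    if (parsed !== null) return parsed;
  }
  return null; // the caller would record a finalized error
}
```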
Why the decision path changed:
- The system now claims exactly one decision row per `(agent_id, cohort_id, decision_week)` before any model call starts.
- This prevents overlapping cron runs from inserting multiple decision rows and executing duplicate trades.
Key file:
- `lib/engine/execution.ts`
Responsibilities:
- Validate BET and SELL instructions.
- Resolve executable price from market state.
- Update positions, trades, and agent balances.
Current rules:
- Max single bet is `25%` of current cash balance.
- Binary markets store the traded side explicitly as `YES` or `NO`.
- Multi-outcome markets use the named outcome string as the stored side.
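These rules can be expressed as two small checks. The sketch below is illustrative: `MAX_BET_FRACTION`, `validateBet`, and `storedSide` are hypothetical names, and upper-casing the binary side before comparison is an assumption about input normalization.

```typescript
// 25% of current cash balance, per the rule above. The constant name is hypothetical.
const MAX_BET_FRACTION = 0.25;

type BetCheck = { ok: true } | { ok: false; reason: string };

/** Reject bets that are non-positive or larger than the per-bet cap. */
function validateBet(cashBalance: number, amount: number): BetCheck {
  if (!Number.isFinite(amount) || amount <= 0) {
    return { ok: false, reason: "amount must be a positive number" };
  }
  const cap = cashBalance * MAX_BET_FRACTION;
  if (amount > cap) {
    return { ok: false, reason: `amount exceeds cap of ${cap}` };
  }
  return { ok: true };
}

/** Side value to persist: YES/NO for binary markets, the outcome name otherwise. */
function storedSide(isBinary: boolean, outcome: string): string | null {
  if (!isBinary) return outcome;
  const upper = outcome.toUpperCase();
  return upper === "YES" || upper === "NO" ? upper : null;
}
```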
Key file:
- `lib/engine/resolution.ts`
Responsibilities:
- Poll unresolved markets that are locally marked `closed`.
- Detect resolution via Polymarket.
- Settle open positions and record Brier scores.
- Complete cohorts once no open positions remain.
Important current ordering:
- Detect external winner.
- Settle all currently open positions for that market.
- If any settlement fails, keep the market `closed` locally so the next run can retry.
- Only after all settlements succeed, mark the market `resolved`.
This ordering was introduced specifically to avoid stranding open positions on a market that can no longer be revisited.
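A minimal sketch of that ordering, with persistence replaced by an injected `settle` callback. The names and shapes here are hypothetical, and the sketch assumes the remaining positions are still attempted after one settlement fails, which the text above does not specify.

```typescript
type LocalStatus = "closed" | "resolved";

// Hypothetical minimal position shape.
interface OpenPosition {
  id: string;
}

/**
 * Settle every open position first; only report "resolved" when none failed.
 * `settle` is expected to throw on failure.
 */
function resolveMarket(
  positions: OpenPosition[],
  settle: (position: OpenPosition) => void,
): { status: LocalStatus; failed: string[] } {
  const failed: string[] = [];
  for (const position of positions) {
    try {
      settle(position);
    } catch {
      failed.push(position.id); // keep going so one bad row does not block the rest
    }
  }
  // Any failure leaves the market "closed" so the next run revisits it.
  return { status: failed.length === 0 ? "resolved" : "closed", failed };
}
```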
Key files:
- `app/api/cron/take-snapshots/route.ts`
- `lib/db/queries/snapshots.ts`
- `lib/scoring/pnl.ts`
- `lib/scoring/brier.ts`
Responsibilities:
- Mark positions to market.
- Persist timestamped portfolio snapshots.
- Maintain portfolio-level P/L series and Brier history.
Important current behavior:
- Snapshots are timestamped with `snapshot_timestamp`, not a daily `snapshot_date`.
- Snapshots are intended to be taken on a 10-minute cadence.
- Closed-but-unresolved positions retain prior value when live feeds would incorrectly collapse them to zero.
- Snapshot upserts are unique on `(agent_id, snapshot_timestamp)`.
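The retained-value rule for closed-but-unresolved positions could look roughly like the following. The `PositionMark` shape and the exact trigger (a missing or zero live price on a `closed` market) are assumptions made for illustration; the actual condition is defined in the scoring code.

```typescript
// Hypothetical inputs for marking one position to market.
interface PositionMark {
  shares: number;
  livePrice: number | null; // null when the feed has no quote
  priorValue: number; // value recorded in the previous snapshot
  marketStatus: "active" | "closed" | "resolved";
}

/** Value a position for a snapshot, keeping the prior value for closed markets with no usable price. */
function markPositionValue(mark: PositionMark): number {
  const liveValue = mark.livePrice === null ? 0 : mark.shares * mark.livePrice;
  if (mark.marketStatus === "closed" && liveValue === 0) {
    // Assumed rule: a closed, unresolved market reporting zero is a feed artifact.
    return mark.priorValue;
  }
  return liveValue;
}
```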
States:
- `active`
- `completed`
Rules:
- A cohort is identified by `cohort_number`, but week uniqueness is enforced by `started_at`.
- `started_at` is normalized to the current week start (Sunday 00:00 UTC).
- `createCohort()` is idempotent for the current week.
- `checkAndCompleteCohorts()` marks a cohort completed once it has at least one decision and zero open positions.
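The week normalization and the idempotency it enables can be sketched as follows. `weekStartUtc` and `createCohortSketch` are hypothetical names, and the in-memory `Map` merely stands in for the unique `started_at` constraint in SQLite.

```typescript
/** Normalize any instant to the start of its week: Sunday 00:00 UTC. */
function weekStartUtc(now: Date): Date {
  const start = new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate()));
  // getUTCDay() returns 0 for Sunday, so subtracting it lands on the week's Sunday.
  start.setUTCDate(start.getUTCDate() - start.getUTCDay());
  return start;
}

// In-memory stand-in for a table with a unique started_at column.
const cohortsByWeek = new Map<string, number>();

/** Return the cohort number for the current week, creating it only once. */
function createCohortSketch(now: Date): number {
  const key = weekStartUtc(now).toISOString();
  const existing = cohortsByWeek.get(key);
  if (existing !== undefined) return existing;
  const cohortNumber = cohortsByWeek.size + 1;
  cohortsByWeek.set(key, cohortNumber);
  return cohortNumber;
}
```

Because every call within the same week maps to the same key, repeated or overlapping invocations converge on one cohort.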
States are represented implicitly by row contents rather than a dedicated enum column:
- Claimed / in-progress
  - A decision row exists.
  - `action = 'ERROR'`
  - `error_message = '__IN_PROGRESS__'`
  - placeholder prompts are written to reserve the slot.
- Finalized success
  - `action` becomes `BET`, `SELL`, or `HOLD`.
  - prompt/response fields are replaced with real values.
- Finalized error
  - `action = 'ERROR'`
  - `error_message` contains the failure reason.
Retry behavior:
- If the prior row is a completed BET/SELL with zero recorded trades, the same row can be reclaimed and overwritten.
- If the prior row is a stale in-progress placeholder, it can also be reclaimed.
- This keeps one canonical decision row per agent/week while still allowing safe reruns.
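Since the state is implicit in row contents, a small classifier makes the rules easier to read. This is a sketch: the `DecisionRow` shape is reduced to the two relevant columns, `isStale` is a hypothetical input (the staleness threshold is not described here), and the sketch does not cover whether finalized error rows are retried.

```typescript
const IN_PROGRESS_MARKER = "__IN_PROGRESS__";

// Reduced row shape; the real table has more columns.
interface DecisionRow {
  action: "BET" | "SELL" | "HOLD" | "ERROR";
  error_message: string | null;
}

type DecisionState = "in_progress" | "success" | "error";

/** Derive the implicit state from row contents. */
function decisionState(row: DecisionRow): DecisionState {
  if (row.action !== "ERROR") return "success";
  return row.error_message === IN_PROGRESS_MARKER ? "in_progress" : "error";
}

/** Whether an existing row may be reclaimed and overwritten by a rerun. */
function canReclaim(row: DecisionRow, tradeCount: number, isStale: boolean): boolean {
  const state = decisionState(row);
  if (state === "in_progress") return isStale; // only stale placeholders
  if (state === "success") {
    // A BET/SELL that never produced trades is treated as incomplete.
    return (row.action === "BET" || row.action === "SELL") && tradeCount === 0;
  }
  return false; // finalized errors: not covered by this sketch
}
```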
States:
- `active`
- `closed`
- `resolved`
- `cancelled` is represented as `resolved` with `resolution_outcome = 'CANCELLED'`
Rules:
- Sync moves markets between `active` and `closed`.
- Resolution processing consumes only `closed` markets.
- Unknown/undeterminable winners are handled as `CANCELLED`.
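A compact sketch of these lifecycle rules follows. All three function names are hypothetical, and the guard that sync never moves a `resolved` market back is an assumption rather than something stated above.

```typescript
type MarketStatus = "active" | "closed" | "resolved";

/** Sync only toggles between active and closed. */
function nextStatusFromSync(current: MarketStatus, remoteClosed: boolean): MarketStatus {
  if (current === "resolved") return "resolved"; // assumption: sync never un-resolves
  return remoteClosed ? "closed" : "active";
}

/** Resolution processing only looks at locally closed markets. */
function isEligibleForResolution(status: MarketStatus): boolean {
  return status === "closed";
}

/** Map an unknown or empty winner to the CANCELLED outcome. */
function toResolutionOutcome(winner: string | null | undefined): string {
  const trimmed = winner ? winner.trim() : "";
  return trimmed.length > 0 ? trimmed : "CANCELLED";
}
```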
Implemented in lib/api/cron-auth.ts.
- Production fails closed if `CRON_SECRET` is missing.
- Requests must supply `Authorization: Bearer <CRON_SECRET>`.
- Comparison uses constant-time string comparison.
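One common way to implement such a check in Node.js is `crypto.timingSafeEqual`, shown below as a sketch. `isAuthorizedCron` is a hypothetical name, hashing both values first is a choice made here so the buffers always have equal length, and this version fails closed in every environment because behavior outside production is not described above.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

/** Check an Authorization header against the configured cron secret. */
function isAuthorizedCron(
  authorization: string | null,
  cronSecret: string | undefined,
): boolean {
  if (!cronSecret) return false; // fail closed when the secret is not configured
  const prefix = "Bearer ";
  if (!authorization || authorization.indexOf(prefix) !== 0) return false;
  const presented = authorization.slice(prefix.length);
  // timingSafeEqual requires equal-length inputs, so compare SHA-256 digests.
  const a = createHash("sha256").update(presented).digest();
  const b = createHash("sha256").update(cronSecret).digest();
  return timingSafeEqual(a, b);
}
```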
Implemented across:
- `app/api/admin/login/route.ts`
- `lib/auth.ts`
- `lib/api/admin-route.ts`
Current model:
- Password-only login.
- Signed session cookie named `forecaster_admin`.
- Cookie is `httpOnly`, `sameSite=lax`, `secure` in production, and valid for 7 days.
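For readers unfamiliar with signed cookies, the sketch below shows one way a signed, expiring session value can be produced and verified with an HMAC. The token layout (`<expiry>.<hex mac>`), the function names, and the secret handling are all assumptions for illustration; the actual scheme is defined in `lib/auth.ts`.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const SESSION_TTL_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

function signPayload(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("hex");
}

/** Build a cookie value of the form "<expiryMs>.<hmac>". */
function createSessionToken(secret: string, nowMs: number): string {
  const expires = String(nowMs + SESSION_TTL_MS);
  return `${expires}.${signPayload(expires, secret)}`;
}

/** Accept the token only if the signature matches and it has not expired. */
function verifySessionToken(token: string, secret: string, nowMs: number): boolean {
  const [expires, mac] = token.split(".");
  if (!expires || !mac) return false;
  const expected = Buffer.from(signPayload(expires, secret), "hex");
  const actual = Buffer.from(mac, "hex");
  if (actual.length !== expected.length) return false;
  if (!timingSafeEqual(actual, expected)) return false;
  return Number(expires) > nowMs;
}
```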
Implemented in middleware.ts.
Current limits:
- Admin login: 5 POST requests per minute per IP.
- Cron POST routes: 10 requests per minute per IP.
- Other admin API routes: 30 requests per minute per IP.
This limiter is in-memory and single-instance only. It is intentionally lightweight, but it is not a distributed rate limit.
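A minimal in-memory limiter along these lines is sketched below. It uses a fixed-window counter, which is an assumption; the actual algorithm in `middleware.ts` may differ, and `FixedWindowLimiter` is a hypothetical name.

```typescript
interface WindowState {
  windowStart: number;
  count: number;
}

/** Fixed-window counter keyed by an arbitrary string such as "login:<ip>". */
class FixedWindowLimiter {
  private readonly hits = new Map<string, WindowState>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Record one request and report whether it is within the limit. */
  allow(key: string, nowMs: number): boolean {
    const state = this.hits.get(key);
    if (!state || nowMs - state.windowStart >= this.windowMs) {
      this.hits.set(key, { windowStart: nowMs, count: 1 }); // start a new window
      return true;
    }
    if (state.count >= this.limit) return false;
    state.count += 1;
    return true;
  }
}

// Example: the admin login limit of 5 POST requests per minute per IP.
const loginLimiter = new FixedWindowLimiter(5, 60_000);
```

Because the counters live in process memory, each instance keeps its own view, which is why the limiter only holds for a single-instance deployment.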
GET /api/health is public.
It reports subsystem status for:
- database reachability,
- environment completeness,
- basic data integrity.
Current privacy behavior:
- It does not reveal exact missing secret names.
- It does not expose raw database exception strings.
- It returns generic subsystem messages such as `Required configuration is incomplete.`
This route is suitable for uptime monitoring, but not as a verbose debugging endpoint.
POST /api/admin/export creates a capped CSV export and packages it into a ZIP archive.
Current behavior:
- Admin-authenticated only.
- Max date window: 7 days.
- Max rows per table: 50,000.
- Export files are generated in a temporary directory.
- CSVs are written synchronously before archiving.
- Archive creation uses `spawnSync('zip', ['-j', ...])` with argv, not shell interpolation.
- Output filenames are sanitized to alphanumeric / `_` / `-`.
- Old export archives are cleaned after roughly 24 hours.
This matters architecturally because the export route is one of the few places that bridges application data into filesystem artifacts.
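The filename and argv handling can be illustrated with a short sketch. The helper names are hypothetical, dropping disallowed characters (rather than replacing them) is an assumption, and `createArchive` is shown only to illustrate passing arguments as an array so that no shell ever parses them.

```typescript
import { spawnSync } from "node:child_process";

const MAX_EXPORT_WINDOW_MS = 7 * 24 * 60 * 60 * 1000; // 7-day cap

/** Keep only alphanumerics, underscore, and hyphen. */
function sanitizeExportName(name: string): string {
  return name.replace(/[^A-Za-z0-9_-]/g, "");
}

/** True when the requested window is non-negative and within the cap. */
function isWindowAllowed(fromMs: number, toMs: number): boolean {
  return toMs >= fromMs && toMs - fromMs <= MAX_EXPORT_WINDOW_MS;
}

/** Build the argv for zip: -j stores files without their directory paths. */
function zipArgs(archivePath: string, csvPaths: string[]): string[] {
  return ["-j", archivePath].concat(csvPaths);
}

/** Run zip with an argv array; returns true on exit code 0. */
function createArchive(archivePath: string, csvPaths: string[]): boolean {
  const result = spawnSync("zip", zipArgs(archivePath, csvPaths));
  return result.status === 0;
}
```

With argv passing, a hostile name such as `a; rm -rf /` reaches `zip` as a single literal argument instead of being interpreted as a command.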
The code is schedule-agnostic, but the repository’s intended cadence is:
- Market sync: every 5 minutes
- Resolution checks: every hour
- Snapshots: every 10 minutes
- Start cohort: Sunday 00:00 UTC
- Run decisions: Sunday 00:05 UTC
- Backup: Saturday 23:00 UTC
These timings are documented operationally in docs/OPERATIONS.md.
These are the most important invariants the code currently relies on:
- One cohort start per week
  - Enforced by unique `started_at`.
- One frozen benchmark slot per cohort participant
  - Physically enforced by `UNIQUE(cohort_id, benchmark_config_model_id)`.
  - Semantically carried by `agents.family_id`, `agents.release_id`, and `agents.benchmark_config_model_id`.
- One open position per `(agent, market, side)`
  - Enforced by unique `(agent_id, market_id, side)`.
- One canonical decision row per agent/week
  - Enforced by unique `(agent_id, cohort_id, decision_week)`.
- One snapshot per agent/timestamp
  - Enforced by unique `(agent_id, snapshot_timestamp)`.
- A market is not locally marked resolved until settlement succeeds
  - Enforced by resolution ordering.
Primary architecture files:
- `app/api/*` for HTTP surfaces
- `app/*` for thin route/page wrappers
- `features/*` for page-level UI flows and feature composition
- `components/*` for shared presentational UI
- `lib/constants.ts` for runtime config
- `lib/application/*` for route-facing orchestration and read models
- `lib/db/*` for persistence
- `lib/engine/*` for benchmark orchestration
- `lib/openrouter/*` for model I/O
- `lib/polymarket/*` for market I/O
- `lib/scoring/*` for P/L and Brier logic
- `middleware.ts` for request protection
- `docs/API_REFERENCE.md`
- `docs/OPERATIONS.md`
- `docs/SECURITY.md`
- `docs/DATABASE_SCHEMA.md`
- `docs/DECISIONS.md`
- `docs/SCORING.md`