From d78052f3f54e5fa8ef3bf61520465c740a9b9b7c Mon Sep 17 00:00:00 2001
From: Meshal
Date: Sat, 6 Jun 2026 17:34:43 -0700
Subject: [PATCH 1/2] docs: adopt anti-rot debt ledger and ADR directory

---
 docs/DEBT.md              | 26 ++++++++++++++++++++++++++
 docs/adr/0000-template.md | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 59 insertions(+)
 create mode 100644 docs/DEBT.md
 create mode 100644 docs/adr/0000-template.md

diff --git a/docs/DEBT.md b/docs/DEBT.md
new file mode 100644
index 0000000..e185c7a
--- /dev/null
+++ b/docs/DEBT.md
@@ -0,0 +1,26 @@
+---
+type: canonical
+source: none
+sla: on-change
+last_updated: "2026-06-06"
+audience: [ai-agents, contributors]
+---
+
+# Technical Debt Ledger
+
+The accumulated cost of deliberate shortcuts. The goal is not zero debt, it is
+zero untracked debt. Anything recorded here was a conscious choice with a known
+fix. Add entries with `/debt-log`. Remove an entry when the debt is paid (note it
+in the PR).
+
+
diff --git a/docs/adr/0000-template.md b/docs/adr/0000-template.md
new file mode 100644
index 0000000..226eb85
--- /dev/null
+++ b/docs/adr/0000-template.md
@@ -0,0 +1,33 @@
+---
+type: canonical
+source: none
+sla: on-change
+last_updated: "2026-06-06"
+audience: [ai-agents, contributors]
+---
+
+# ADR-0000:
+
+> Template. Copy to `docs/adr/<NNNN>-<kebab-title>.md` (use `/new-adr`).
+> One decision per record. Records are append-only history: supersede, do not rewrite.
+
+- **Status:** Proposed | Accepted | Superseded by ADR-XXXX
+- **Date:** YYYY-MM-DD
+- **Deciders:** <names / from repo-framework ownership>
+
+## Context
+
+What is the situation? What forces, constraints, and requirements make a decision
+necessary? Keep it factual.
+
+## Decision
+
+What we are doing, stated in active voice: "We will ...". Be specific about scope
+and boundaries (module layout, dependency direction, technology choice).
+
+## Consequences
+
+- **Positive:** what gets easier or safer.
+- **Negative / tradeoffs:** what gets harder, and any debt this introduces
+  (cross-reference `docs/DEBT.md` if applicable).
+- **Follow-ups:** anything that must happen as a result.

From c78876694be3ca9075af23128678c4d015a7a625 Mon Sep 17 00:00:00 2001
From: Meshal <contact@meshal.ai>
Date: Sat, 6 Jun 2026 19:15:58 -0700
Subject: [PATCH 2/2] docs: replace stub placeholders with honest content

---
 docs/deployment.md      | 55 +++++++++++++++++++++++++++++++++++++----
 docs/troubleshooting.md |  2 --
 2 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/docs/deployment.md b/docs/deployment.md
index edf504f..fd79816 100644
--- a/docs/deployment.md
+++ b/docs/deployment.md
@@ -6,13 +6,58 @@ last-reviewed: 2026-03-31

 # Deployment and Release · fallax

-> TODO: Document deployment process, release strategy, and rollback procedures.
+Fallax is a research and benchmarking tool, not a deployed service. There is
+no server, no container, and no production environment to operate. All
+evaluation runs execute locally against the provider APIs you configure.

-## Deployment Process
+## Running Evaluations Locally

-## Release Strategy
+Follow the Quick Start in the root [README](../README.md):

-## Rollback Procedures
+```bash
+# Install deps (core + all extras)
+uv sync --all-extras

-## Environment Configuration
+# Run a model evaluation
+uv run python -m fallax run \
+  --models claude-sonnet-4-6 \
+  --judge claude-haiku-4-5-20251001 \
+  --output results.jsonl

+# Capture a baseline against the v1 benchmark
+uv run python -m fallax baseline capture \
+  --version v1 \
+  --model claude-sonnet-4-6 \
+  --judge claude-haiku-4-5-20251001
+
+# Compare against a captured baseline
+uv run python -m fallax baseline compare \
+  --version v1 \
+  --model claude-sonnet-4-6 \
+  --judge claude-haiku-4-5-20251001
+```
+
+Required environment variables:
+
+| Provider | Variable |
+|---|---|
+| Anthropic (default) | `ANTHROPIC_API_KEY` |
+| OpenAI / OpenRouter | `OPENAI_API_KEY` or `OPENROUTER_API_KEY` |
+| Google Gemini | `GOOGLE_API_KEY` |
+
+## Optional Dashboard
+
+The FastAPI results explorer is available locally only:
+
+```bash
+uv sync --extra dashboard
+uv run python -m fallax dashboard
+```
+
+It does not require any deployment infrastructure.
+
+## Versioning
+
+Fallax follows [Semantic Versioning](https://semver.org/). The current package
+version is in `pyproject.toml`. Releases are tagged in git; there is no
+release pipeline or package registry publication at this time.
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
index 2e0a7d4..02939ec 100644
--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -6,8 +6,6 @@ last-reviewed: 2026-03-31

 # Troubleshooting · fallax

-> TODO: Document known failure modes, diagnostic steps, and common fixes.
-
 ## Common Issues

 ### `git status` shows many files modified right after a clone or pull