Merged
26 changes: 26 additions & 0 deletions docs/DEBT.md
@@ -0,0 +1,26 @@
---
type: canonical
source: none
sla: on-change
last_updated: "2026-06-06"
audience: [ai-agents, contributors]
---

# Technical Debt Ledger

The accumulated cost of deliberate shortcuts. The goal is not zero debt; it is
zero untracked debt. Every entry recorded here was a conscious choice with a known
fix. Add entries with `/debt-log`. Remove an entry when the debt is paid, and note
the removal in the PR.

<!-- Add new entries below, newest first. Format:

### <short title>
- **Date:** YYYY-MM-DD
- **Where:** <file/module/path>
- **What:** the shortcut and why the proper fix was not done
- **Risk if left:** what degrades over time
- **Suggested fix:** the path to doing it right
- **Owner:** <from CODEOWNERS or repo-framework ownership>

-->
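
As a rough illustration of the entry format above, the sketch below renders one entry as Markdown. The `DebtEntry` class and every example value are hypothetical; nothing here describes what `/debt-log` actually does.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DebtEntry:
    """One ledger entry; fields mirror the bullets in the format comment."""

    title: str
    when: date
    where: str
    what: str
    risk_if_left: str
    suggested_fix: str
    owner: str

    def to_markdown(self) -> str:
        # Emit the heading and bullets in the same order as the template.
        return "\n".join([
            f"### {self.title}",
            f"- **Date:** {self.when.isoformat()}",
            f"- **Where:** {self.where}",
            f"- **What:** {self.what}",
            f"- **Risk if left:** {self.risk_if_left}",
            f"- **Suggested fix:** {self.suggested_fix}",
            f"- **Owner:** {self.owner}",
        ])


# Hypothetical values for illustration only, not a real entry in this repo.
example = DebtEntry(
    title="Retry logic duplicated per provider",
    when=date(2026, 6, 6),
    where="src/example/providers/",
    what="Copied the retry loop instead of extracting a shared helper",
    risk_if_left="Fixes land in one copy and the others drift",
    suggested_fix="Extract one retry helper and reuse it",
    owner="@example-team",
)
print(example.to_markdown())
```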
33 changes: 33 additions & 0 deletions docs/adr/0000-template.md
@@ -0,0 +1,33 @@
---
type: canonical
source: none
sla: on-change
last_updated: "2026-06-06"
audience: [ai-agents, contributors]
---

# ADR-0000: <Title of the decision>

> Template. Copy to `docs/adr/<NNNN>-<kebab-title>.md` (use `/new-adr`).
> One decision per record. Records are append-only history: supersede, do not rewrite.

- **Status:** Proposed | Accepted | Superseded by ADR-XXXX
- **Date:** YYYY-MM-DD
- **Deciders:** <names / from repo-framework ownership>

## Context

What is the situation? What forces, constraints, and requirements make a decision
necessary? Keep it factual.

## Decision

What we are doing, stated in active voice: "We will ...". Be specific about scope
and boundaries (module layout, dependency direction, technology choice).

## Consequences

- **Positive:** what gets easier or safer.
- **Negative / tradeoffs:** what gets harder, and any debt this introduces
(cross-reference `docs/DEBT.md` if applicable).
- **Follow-ups:** anything that must happen as a result.
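
The naming rule in the template note (`docs/adr/<NNNN>-<kebab-title>.md`) can be sketched as below. This is a minimal sketch, assuming numbers are zero-padded to four digits and assigned sequentially; `kebab` and `next_adr_filename` are hypothetical helpers, not the implementation of `/new-adr`.

```python
import re


def kebab(title: str) -> str:
    """Lowercase the title and join its alphanumeric runs with hyphens."""
    return "-".join(re.findall(r"[a-z0-9]+", title.lower()))


def next_adr_filename(existing: list[str], title: str) -> str:
    """Return '<NNNN>-<kebab-title>.md' using the next unused number."""
    numbers = [
        int(match.group(1))
        for name in existing
        if (match := re.match(r"(\d{4})-", name))
    ]
    # Assumption: numbering continues from the highest existing record.
    next_number = max(numbers, default=0) + 1
    return f"{next_number:04d}-{kebab(title)}.md"


# The title is a made-up example.
print(next_adr_filename(["0000-template.md"], "Use uv for Dependency Management"))
```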
55 changes: 50 additions & 5 deletions docs/deployment.md
@@ -6,13 +6,58 @@ last-reviewed: 2026-03-31

# Deployment and Release · fallax

> TODO: Document deployment process, release strategy, and rollback procedures.
Fallax is a research and benchmarking tool, not a deployed service. There is
no server, no container, and no production environment to operate. All
evaluation runs execute locally against the provider APIs you configure.

## Deployment Process
## Running Evaluations Locally

## Release Strategy
Follow the Quick Start in the root [README](../README.md):

## Rollback Procedures
```bash
# Install deps (core + all extras)
uv sync --all-extras

## Environment Configuration
# Run a model evaluation
uv run python -m fallax run \
--models claude-sonnet-4-6 \
--judge claude-haiku-4-5-20251001 \
--output results.jsonl

# Capture a baseline against the v1 benchmark
uv run python -m fallax baseline capture \
--version v1 \
--model claude-sonnet-4-6 \
--judge claude-haiku-4-5-20251001

# Compare against a captured baseline
uv run python -m fallax baseline compare \
--version v1 \
--model claude-sonnet-4-6 \
--judge claude-haiku-4-5-20251001
```
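
The `--output results.jsonl` flag suggests a JSON Lines file. The sketch below counts records under that assumption only: one JSON value per non-blank line. The record schema is not documented here, so the code reads no specific fields, and `parse_jsonl` is a hypothetical helper rather than part of fallax.

```python
import json
from pathlib import Path


def parse_jsonl(lines):
    """Parse an iterable of JSON Lines text, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


if __name__ == "__main__":
    # Assumed path: the file name passed to --output in the commands above.
    path = Path("results.jsonl")
    if path.exists():
        with path.open(encoding="utf-8") as handle:
            print(len(parse_jsonl(handle)), "records")
```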

Required environment variables:

| Provider | Variable |
|---|---|
| Anthropic (default) | `ANTHROPIC_API_KEY` |
| OpenAI / OpenRouter | `OPENAI_API_KEY` or `OPENROUTER_API_KEY` |
| Google Gemini | `GOOGLE_API_KEY` |
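
Before starting a run, it can help to check which keys are present. The sketch below copies the variable names from the table; treating either variable in the OpenAI / OpenRouter row as sufficient is an assumption based on the table's "or", and `configured_providers` is a hypothetical helper, not part of fallax.

```python
import os

# Provider-to-variable mapping copied from the table above.
# Assumption: any one variable in a tuple is enough for that provider.
PROVIDER_KEYS = {
    "Anthropic": ("ANTHROPIC_API_KEY",),
    "OpenAI / OpenRouter": ("OPENAI_API_KEY", "OPENROUTER_API_KEY"),
    "Google Gemini": ("GOOGLE_API_KEY",),
}


def configured_providers(env=None):
    """Return providers that have at least one non-empty key set."""
    env = os.environ if env is None else env
    return [
        provider
        for provider, names in PROVIDER_KEYS.items()
        if any(env.get(name) for name in names)
    ]


if __name__ == "__main__":
    ready = configured_providers()
    print("Configured providers:", ", ".join(ready) or "none")
```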

## Optional Dashboard

The FastAPI results explorer is available locally only:

```bash
uv sync --extra dashboard
uv run python -m fallax dashboard
```

It does not require any deployment infrastructure.

## Versioning

Fallax follows [Semantic Versioning](https://semver.org/). The current package
version is in `pyproject.toml`. Releases are tagged in git; there is no
release pipeline or package registry publication at this time.
2 changes: 0 additions & 2 deletions docs/troubleshooting.md
@@ -6,8 +6,6 @@ last-reviewed: 2026-03-31

# Troubleshooting · fallax

> TODO: Document known failure modes, diagnostic steps, and common fixes.

## Common Issues

### `git status` shows many files modified right after a clone or pull