Open-source infrastructure for supervised legal AI in England & Wales.
Open a matter. Ask the assistant. Install legal modules. Run them through capability and privilege gates. Keep the audit trail.
Legalise is open source. The hosted site is a limited evaluation environment. Real AI workflows require your own model key. Legalise does not provide model access and is not for live client matters.
The interesting question in legal AI is no longer only what AI can automate.
It is what a firm should choose not to automate, where human judgement must stay named, and how the system proves that boundary held.
Legalise is an open-source substrate for that problem. The bet is simple: legal AI should not just be a chatbot that answers questions. If it touches legal work, it should sit inside a matter file. It should know what documents it used, what permissions it had, what it produced, and which qualified human remained responsible for the important calls.
The audit trail matters because it makes the work inspectable. But audit is not the product. Audit is the receipt.
The full thesis lives in docs/MANIFESTO.md. The claim boundary lives in docs/SUPERVISED_AUTONOMY.md. The hosted site is only an evaluation environment.
Legalise currently ships a worked evaluation workspace around the Khan v Acme sample matter:
- Matter workspace: parties, documents, chronology, privilege posture, retention clock, audit trail.
- Assistant: matter-scoped chat with document and chronology citations.
- Workflows / modules: Pre-Motion, Contract Review, Letters, Tabular Review, Case Law, Anonymisation, document edit.
- Capability gates: manifests declare what a skill needs; the workspace grants it; runtime checks it.
- Privilege-aware gateway: Anthropic, OpenAI, and Ollama behind one dispatch layer.
- BYO keys: users store their own provider keys encrypted at rest.
- Audit trail: model calls, module actions, denials, mutations, provider failures, and storage failures leave rows.
- Hosted-access posture: public evaluation signup is open; real AI workflows still require your own model key.
The plugin layer, where most of the legal logic lives, is claude-for-uk-legal: 15 skills across UK employment law, civil litigation, and legal research.
A regulator, insurer, supervisor, or partner will eventually ask four questions about any AI tool used on a matter:
- What did it see?
- Under what protection?
- What did it produce?
- Who remained accountable?
Every matter has a spine: documents, chronology, parties, retention clock, privilege posture. The AI only sees what lives inside the matter. Cross-matter leakage is structurally impossible.
Disclosure-tainted chronology entries carry a CPR 31.22 implied-undertaking flag. The chronology gate withholds detail until acknowledgement. The acknowledgement is audited.
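The gate described above can be pictured with a minimal sketch. This is an illustration, not the Legalise implementation: the class names, field names, the audit event name, and the choice to track acknowledgement per actor and per entry are all assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ChronologyEntry:
    entry_id: str
    summary: str
    detail: str
    disclosure_tainted: bool = False  # stands in for the CPR 31.22 flag


@dataclass
class ChronologyGate:
    # Hypothetical state: (actor, entry_id) pairs that have acknowledged.
    acknowledged: set = field(default_factory=set)
    audit_rows: list = field(default_factory=list)

    def acknowledge(self, actor: str, entry: ChronologyEntry) -> None:
        """Record the acknowledgement and leave an audit row for it."""
        self.acknowledged.add((actor, entry.entry_id))
        self.audit_rows.append({
            "event": "cpr_31_22_acknowledged",  # assumed event name
            "actor": actor,
            "entry_id": entry.entry_id,
            "at": datetime.now(timezone.utc).isoformat(),
        })

    def view(self, actor: str, entry: ChronologyEntry) -> dict:
        """Return the entry, withholding detail until acknowledgement."""
        withheld = (
            entry.disclosure_tainted
            and (actor, entry.entry_id) not in self.acknowledged
        )
        return {
            "entry_id": entry.entry_id,
            "summary": entry.summary,
            "detail": None if withheld else entry.detail,
            "withheld": withheld,
        }
```

In this sketch an untainted entry is always returned in full, and a tainted one shows only its summary until the same actor has acknowledged it.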
Every model call, document mutation, chronology entry, and capability denial writes one row to an audit log that the application never updates or deletes. Timestamped, hashed, tied to the matter and the actor. The Audit tab is the regulator-facing record.
Append-only is enforced by convention today: the application never writes UPDATE or DELETE against audit_entries. Postgres-level WORM grants (REVOKE UPDATE/DELETE on the table for the app role) are a live-matter readiness gate; the current audit trail is therefore not forensically tamper-resistant against a DB superuser. See docs/TRUST.md.
No background calls. No invisible inference. If it touched the matter, it's logged.
Every matter carries one of three privilege flags.
- A_cleared: privileged material excluded or cleared. Cloud providers permitted.
- B_mixed: opt-in per provider. Default for most matters.
- C_paused: privileged material present or unresolved. Cloud calls refused at the gateway. Local model only (Ollama).
The gateway reads the posture before every model call. Privilege is a hard dispatch constraint, not a checkbox.
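A minimal sketch of what "hard dispatch constraint" could look like, assuming the three postures above. The function name, the exception, the `opted_in` parameter, and the rule that a local model is allowed under every posture are assumptions made for illustration; the real gateway may differ.

```python
from enum import Enum


class Posture(Enum):
    A_CLEARED = "A_cleared"
    B_MIXED = "B_mixed"
    C_PAUSED = "C_paused"


CLOUD_PROVIDERS = {"anthropic", "openai"}
LOCAL_PROVIDERS = {"ollama"}


class PrivilegeRefusal(Exception):
    """Raised when the posture forbids dispatch to the requested provider."""


def check_dispatch(posture: Posture, provider: str,
                   opted_in: frozenset = frozenset()) -> None:
    """Return silently if dispatch is allowed, raise otherwise."""
    if provider in LOCAL_PROVIDERS:
        return  # assumption: local models are allowed under every posture
    if provider not in CLOUD_PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    if posture is Posture.A_CLEARED:
        return  # cloud providers permitted
    if posture is Posture.B_MIXED and provider in opted_in:
        return  # cloud provider explicitly opted in for this matter
    # C_paused, or B_mixed without an opt-in: refuse before any call is made.
    raise PrivilegeRefusal(f"{provider} refused under posture {posture.value}")
```

Calling the check before every dispatch, rather than once at configuration time, is what keeps a posture change from being bypassed by an already-open session.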
Prompt and response are hashed and stored. So are the model, the tokens, the latency, the posture, and the module that made the call. Any AI interaction on the matter can be reconstructed from the audit row, subject to the tamper-resistance caveat above.
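As a rough illustration of the fields listed above, here is one way such a row might be assembled. SHA-256, the field names, and the split of tokens into input and output counts are assumptions; the sketch records only the hashes and leaves out where the prompt and response text themselves are stored.

```python
import hashlib
from datetime import datetime, timezone


def sha256_hex(text: str) -> str:
    """Hex SHA-256 digest of UTF-8 text (the hash function is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_model_call_row(*, matter_id: str, actor: str, module: str,
                         model: str, posture: str, prompt: str, response: str,
                         tokens_in: int, tokens_out: int,
                         latency_ms: int) -> dict:
    """Assemble one audit row for a model call. Field names are hypothetical."""
    return {
        "event": "model_call",
        "matter_id": matter_id,
        "actor": actor,
        "module": module,
        "model": model,
        "posture": posture,
        "prompt_sha256": sha256_hex(prompt),
        "response_sha256": sha256_hex(response),
        "tokens_in": tokens_in,
        "tokens_out": tokens_out,
        "latency_ms": latency_ms,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
```

A reviewer holding the original prompt can recompute its digest and compare it with `prompt_sha256` to confirm the row refers to that exact text.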
Capabilities for each module are declared in the manifest (read documents, call the model, write citations, etc.), granted on install, and checked at runtime before every privileged operation. A denial is a structured 403 plus an audit row.
The doctrine:
Manifest requests capabilities. Workspace grants capabilities. Runtime enforces capabilities.
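The request, grant, enforce sequence can be sketched in a few lines. Everything named here is hypothetical: the manifest shape, the capability strings such as `documents.read`, the `Workspace` class, and the response body are illustrations, not the project's actual API.

```python
from dataclasses import dataclass, field


class CapabilityDenied(Exception):
    """A denial that can be rendered as a structured 403."""

    def __init__(self, module: str, capability: str):
        super().__init__(f"{module} lacks capability {capability}")
        self.module = module
        self.capability = capability

    def to_response(self) -> dict:
        return {
            "status": 403,
            "error": "capability_denied",
            "module": self.module,
            "capability": self.capability,
        }


@dataclass
class Workspace:
    grants: dict = field(default_factory=dict)      # module -> granted set
    audit_rows: list = field(default_factory=list)

    def install(self, manifest: dict, approve: set) -> None:
        """Grant only what the manifest requested and the installer approved."""
        requested = set(manifest["capabilities"])
        self.grants[manifest["name"]] = requested & approve

    def require(self, module: str, capability: str) -> None:
        """Runtime check before a privileged operation; denials are audited."""
        if capability not in self.grants.get(module, set()):
            self.audit_rows.append({
                "event": "capability_denied",
                "module": module,
                "capability": capability,
            })
            raise CapabilityDenied(module, capability)
```

For example, a module whose manifest requests `documents.read` and `model.call` but is approved only for `documents.read` would pass `require("letters", "documents.read")` and be refused, with an audit row, on `require("letters", "model.call")`.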
The hosted evaluation environment at legalise.dev is open for evaluation accounts. You can browse the Khan v Acme demo on the hosted site, create an account to run the workspace, or run the full stack locally.
Stack: Postgres + pgvector + MinIO + Redis + Gotenberg + FastAPI + React.
1. Clone.

       git clone https://github.com/b1rdmania/legalise
       cd legalise

2. Copy env. Every variable has a working default for the Khan v Acme demo. The only decision is whether to set a provider key (Anthropic / OpenAI) or run the keyless stub-echo model.

       cp .env.example .env

3. Bring the stack up.

       docker compose -f infra/docker-compose.yml up --build -d

4. Check the stack with legalise doctor. Inspection-only; verifies the database is reachable, migrations are current, MinIO is responding, plugins are mounted, and the v2 manifests validate.

       docker compose -f infra/docker-compose.yml exec backend python -m app.tools.doctor

   Pre-signup, khan.demo_present will soft-note "no users yet — seed lands on first signup". That's expected.

5. Register an account. Open http://localhost:3000 and use the signup form. Dev-autoverify is on, so registration immediately verifies the account; no SMTP setup or email click is needed. You'll land signed in as a non-superuser; the Khan demo matter seeds on first signup.

6. Promote yourself to superuser via the bootstrap CLI. The CLI promotes an existing user, so run it after step 5, not before.

       docker compose -f infra/docker-compose.yml exec backend \
         python -m app.tools.bootstrap_admin --email you@example.com

7. Reload the browser so AuthProvider re-fetches /auth/users/me. Superuser context loads.

8. Re-run doctor: khan.demo_present should now be ok. The full Khan v Acme demo is wired; see docs/DEMO.md for the install → grant → run → audit walkthrough.
To prove the fork is healthy end-to-end without driving the UI by hand, run ./scripts/smoke.sh. It executes the same Playwright first-run spec the CI workflow runs. The script truncates the local database; see the script's prompt.
- If deploying your fork to Fly, change app = "legalise-backend" in backend/fly.toml before fly deploy.
- The backend image vendors claude-for-uk-legal at a pinned SHA. Forks can point at their own plugin catalogue with the Docker build args PLUGINS_REPO and PLUGINS_REPO_REF.
- Common setup errors and their fixes live in docs/TROUBLESHOOTING.md.
Evaluation release candidate. Honest about what's in and what isn't.
Shipped:
- Five surfaces wired end-to-end against the Khan v Acme matter
- Audit middleware on every model call and matter mutation
- Privilege-aware gateway across Anthropic, OpenAI, Ollama
- Runtime capability enforcement at five boundaries (plugin bridge, model gateway, tool invocation, document body read, citation writes)
- Tracked-changes editing with accept / reject and version timeline
- fastapi-users cookie sessions, email verification, per-user AES-256-GCM-encrypted provider keys
- Bootstrap audit rows on per-user seed so the Audit tab is non-empty on first paint
- Real-DB E2E test infrastructure; 155 passed, 53 skipped in backend CI
Live-matter readiness gates:
- Real R2/S3 object storage for uploaded and generated artefacts. Fly filesystem remains cache/materialisation only.
- Durable jobs (arq + Redis + jobs table). Long runs should not depend on a live request.
- Release-step migrations instead of app-boot schema mutation.
- Hosted evaluation limits: storage, workflow runs, active jobs, generated artefacts, and public submissions.
- Matter export / delete with retention-aware audit handling.
- WORM audit groundwork.
- Key-rotation runbook for encrypted provider keys.
v0.6 trust layer:
- Configurable prompt shroud before cloud model dispatch.
- Legal-quality evals for grounding, citation integrity, refusal behaviour, and module regressions.
Full roadmap: docs/ROADMAP.md.
- docs/MANIFESTO.md: commitments that don't move
- docs/TRUST.md: privilege architecture, sub-processor list, open gaps
- docs/SUPERVISED_AUTONOMY.md: launch definition and claim boundary
- ARCHITECTURE.md: stack rationale and decisions
- docs/ENGINEERING.md: bespoke vs boring; what's custom, what's stock
- docs/AUTH.md: auth and provider-key model
- docs/MODULE_DEVELOPMENT.md: write a new module
- docs/ATTRIBUTIONS.md: library credits and licence notes
This is software for evaluating legal-AI workflows. It is not legal advice, not a law firm, and not for live client matters. Real regulated use needs the firm’s own supervision, policies, model-key posture, and professional controls.
Apache 2.0. See LICENSE.
@b1rdmania. Open an issue. Or, if you're a UK solicitor wondering what your AI did with the client documents, get in touch.
Canonical upstream: github.com/b1rdmania/legalise. Forks are independent deployments and are not operated, reviewed, or endorsed by the maintainer unless explicitly stated.
