Legal Agent is an AI-assisted contract review platform aimed at high-stakes Master Service Agreements (MSAs) and similar vendor paper. Teams upload agreements, extract structured clauses, score legal and operational risk, and align suggestions with an enterprise playbook backed by pgvector retrieval. Human reviewers stay in the loop for approvals, redlines, and immutable audit history.
This repository is split across parallel engineering tracks:
| Track | Scope |
|---|---|
| Backend executor | backend/app/ — FastAPI entrypoint, routers, service layer wiring |
| Frontend executor | frontend/app/, frontend/components/, frontend/lib/ — reviewer UX |
| Infra & docs (this PR surface) | Root Makefile, Docker, Compose, docs/, CI, helper scripts |
Current scaffold note:
`backend/app/main.py` and Alembic migrations are expected from the API track. Tooling (Makefile, Compose, Dockerfiles) already targets `uvicorn app.main:app`, so the stack lights up as soon as those files land.
Legal teams drown in repetitive first-pass review — scanning the same vendor tricks (microscopic liability caps, sneaky auto-renewals, one-sided indemnity) across hundreds of pages per quarter. Junior counsel loses velocity; senior counsel loses patience.
Legal Agent compresses that workflow to:
- Ingest: DOCX/PDF upload with resilient text extraction (`backend/app/ai/document_text.py`).
- Understand: LLM-powered extraction to typed clauses (`backend/app/ai/chains/extraction.py`).
- Check: deterministic guardrails (`backend/app/ai/chains/rule_engine.py`).
- Score & narrate: risk storytelling that cites playbook language (`backend/app/ai/chains/risk.py`).
- Recommend: RAG-ranked redlines tied to `PlaybookEntry` rows (`backend/app/models/playbook.py`).
- Decide: human approvals + audit trail (`backend/app/models/approval.py`, `audit.py`).
Non-goals for the MVP: autonomous signing, definitive legal advice without counsel, or uncited model claims.
Drop real UI captures here once the frontend track publishes pages:
docs/assets/screenshots/
01-dashboard.png # Contract queue + risk heatmap
02-clause-detail.png # Extracted clause + playbook comparison
03-redline-diff.png # Suggested replacement + reviewer actions
04-approvals.png # Approval routing + SLA timestamps
Until then, use make frontend.dev locally and capture after branding pass.
- Docker Compose v2 (`docker compose version`)
- Python 3.12 (for local venv workflows)
- Node 22 + npm (for local Next.js)
- `make`
Requires Docker Desktop running (for Compose Postgres). Then either:
cp .env.example .env
# Edit: JWT_SECRET_KEY, OPENROUTER_* or OPENAI_* (chat + embedding model ids), DATABASE_URL if not using Compose defaults
make backend.install # local Python venv (needed for Alembic + seeds + embeddings from the host)
make setup # same as: bash scripts/full_setup.sh → dev_bootstrap.sh (db up, alembic upgrade head, seed_playbook, embed_playbook_vectors)
Or run the shell script directly:
./scripts/dev_bootstrap.sh
Use **make vector.index** (or `bash scripts/vector_reindex.sh`) to force a re-embed of all playbook rows after you change wording or models.
Destructive DB reset (drop database, migrate, seed, embed): bash scripts/reset_db.sh
make up
Services (see `docker-compose.yml`):
| Service | Port | Purpose |
|---|---|---|
| `db` | 5432 | Postgres 16 + pgvector |
| `backend` | 8000 | FastAPI (hot reload in Compose) |
| `frontend` | 3000 | Next.js production image |
make backend.install # backend/.venv
make frontend.install
docker compose up -d db
make backend.stop # optional: if port 8000 is already in use (Errno 48)
make backend.run # needs app.main for uvicorn import
make frontend.dev
OpenAPI (once routers exist): http://localhost:8000/docs
┌──────────────┐ HTTPS/WSS ┌──────────────────────┐
│ Browser │◀──────────────▶│ Next.js 15 (React) │
│ (reviewers) │ │ frontend/* │
└──────────────┘ └─────────┬────────────┘
│ REST + WS
▼
┌──────────────────────────────────────────────────────┐
│ FastAPI (async) — backend/app │
│ • JWT auth / rate limiting │
│ • Upload + ingestion │
│ • LangChain extraction + risk chains │
│ • WebSocket hub (`app/ws/hub.py`) │
└─────────┬───────────────────────────────┬────────────┘
│ SQLAlchemy async │
▼ ▼
┌──────────────────────┐ ┌─────────────────┐
│ Postgres + pgvector │ │ OpenAI APIs │
│ • contracts/clauses │ │ chat + embed │
│ • playbook embeddings│ └─────────────────┘
│ • audit / approval │
└──────────────────────┘
Mermaid counterparts live in docs/ARCHITECTURE.md for zoomable diagrams.
| Layer | Choice | Why |
|---|---|---|
| API | FastAPI | Native async, OpenAPI, excellent typing |
| ORM | SQLAlchemy 2 + asyncpg | Mature migrations story, aligns with FastAPI DI |
| Vectors | pgvector in Postgres | Fewer moving parts than split vector DB for MVP |
| AI | LangChain + LangSmith | Traceability + composable LCEL-style chains |
| UI | Next.js 15 + Tailwind + shadcn | App Router, rapid iteration, accessible primitives |
| Tooling | Make + Compose | Lowest-common-denominator DX for mixed teams |
Authoritative template: .env.example (every variable commented). Highlights:
| Variable | Role |
|---|---|
| `DATABASE_URL` | Async SQLAlchemy DSN (`postgresql+asyncpg://...`) |
| `POSTGRES_*` | Compose `db` service credentials |
| `JWT_SECRET_KEY` | Signing material for access/refresh tokens |
| `OPENAI_API_KEY` / `OPENAI_MODEL` | Chat + default model tier |
| `OPENAI_EMBEDDING_MODEL` / `VECTOR_DIM` | Embeddings + pgvector width |
| `LANGSMITH_*` + `LANGCHAIN_TRACING_V2` | Observability switches |
| `CORS_ORIGINS` | Browser origins allowed to call API with credentials |
| `NEXT_PUBLIC_API_URL` / `NEXT_PUBLIC_WS_URL` | Frontend discovery |
Naming nuance: backend/app/core/config.py validates ACCESS_TOKEN_EXPIRE_MINUTES / REFRESH_TOKEN_EXPIRE_DAYS — the example file mirrors both JWT_* and canonical names to reduce onboarding friction.
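One way to read the mirrored names is as a canonical variable with a prefixed fallback. The sketch below assumes the alias is simply the canonical name with a `JWT_` prefix and that the defaults are 30 minutes and 7 days; both are assumptions for illustration, and `read_int_setting` is a hypothetical helper, not code from `backend/app/core/config.py`.

```python
from typing import Mapping


def read_int_setting(env: Mapping[str, str], canonical: str, default: int) -> int:
    """Return the canonical variable, else its JWT_-prefixed alias, else default."""
    for name in (canonical, f"JWT_{canonical}"):  # alias naming is an assumption
        raw = env.get(name)
        if raw is not None and raw.strip():
            value = int(raw)
            if value <= 0:
                raise ValueError(f"{name} must be positive, got {value}")
            return value
    return default


# Only the prefixed alias is set here, so it is used for the access token lifetime.
example_env = {"JWT_ACCESS_TOKEN_EXPIRE_MINUTES": "15"}
access_minutes = read_int_setting(example_env, "ACCESS_TOKEN_EXPIRE_MINUTES", default=30)
refresh_days = read_int_setting(example_env, "REFRESH_TOKEN_EXPIRE_DAYS", default=7)
```

In real use the mapping passed in would be `os.environ`; a plain dict keeps the example easy to test.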
| Task | Command |
|---|---|
| Full stack up/down | make up / make down |
| Tail logs | make logs |
| Backend shell tests | make backend.test |
| Frontend lint/typecheck | make frontend.lint, make frontend.typecheck |
| Repo-wide gate | make check |
| DB console | make db.shell |
| Reset database | make db.reset (destructive) |
| Playbook re-embed guide | make vector.index |
| LangSmith UI | make langsmith.open |
./scripts/dev_bootstrap.sh and make backend.seed load rows via backend/scripts/seed_playbook.py (idempotent).
Targets are self-documented via make help. Notable implementations:
- `backend.lint` currently enforces Ruff on `backend/scripts` + `backend/tests` because `backend/app/` is mid-flight in a parallel executor (see `docs/CONTRIBUTING.md`).
- `frontend.test` skips gracefully until a `test` script lands in `frontend/package.json`.
- Backend image: multi-stage wheel build, `python:3.12-slim`, `tini` as PID 1, non-root `app` user. See `backend/Dockerfile`.
- Frontend image: Node 22, multi-stage; entrypoint tries `.next/standalone/server.js` then `npm run start`. Enable `output: "standalone"` in `next.config.ts` when the frontend track opts in (a Dockerfile comment documents this; do not edit `next.config.ts` from infra PRs unless you own that track).
- Compose layers optional `env_file` entries (`.env.example` then `.env`) so `docker compose config -q` works on fresh clones.
| Doc | Contents |
|---|---|
| `docs/ARCHITECTURE.md` | Context diagrams, ER-style model chart, lifecycle |
| `docs/AI_PIPELINE.md` | Chains, retrieval, LangSmith walkthrough |
| `docs/DEBUGGING_AI.md` | Logging, traces, eval harness sketch |
| `docs/COST_OPTIMIZATION.md` | Model tiering, caching, batching |
| `docs/SECURITY.md` | Encryption, PII, OWASP mapping |
| `docs/DEPLOYMENT.md` | Managed Postgres, Fly/Render/ECS thoughts |
| `docs/SCALING.md` | RLS vs schema isolation, queues |
| `docs/API.md` | Endpoint intent + `/docs` pointer |
| `docs/PROMPTS.md` | Example prompt templates per clause |
| `docs/PLAYBOOK_SAMPLES.md` | Worked legal examples + mapping |
| `docs/AI_ENGINEERING_ROADMAP.md` | Senior → staff AI engineering curriculum |
| `docs/EVOLUTION.md` | Product phases + integration roadmap |
| `docs/CONTRIBUTING.md` | Branching, commits, PR checklist |
GitHub Actions (.github/workflows/ci.yml) runs:
- backend: Ruff on `scripts/tests`, pytest with Postgres + pgvector service
- frontend: ESLint, `tsc --noEmit`, `next build`
- docker-build: verifies backend + frontend Dockerfiles
The API track has not merged backend/app/main.py yet — expected. Run domain modules (pytest, seeds) or stub main.py locally if you need Compose health temporarily.
Ensure `DATABASE_URL` in `.env` uses hostname `db` and matches `POSTGRES_*`. After changing env vars, run `docker compose down && make up`.
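A quick way to spot a mismatch is to parse the DSN and compare its parts with the Compose variables. This is a sketch under assumptions: `dsn_mismatches` is a hypothetical helper, and it assumes the credentials live in `POSTGRES_USER`, `POSTGRES_PASSWORD`, and `POSTGRES_DB` (the usual names for the official Postgres image; confirm against `.env.example`).

```python
from urllib.parse import urlsplit


def dsn_mismatches(database_url: str, env: dict[str, str],
                   expected_host: str = "db") -> list[str]:
    """List the ways a DSN disagrees with the Compose Postgres settings.

    Note: percent-encoded characters in the DSN password are not decoded here.
    """
    parts = urlsplit(database_url)
    problems = []
    if parts.hostname != expected_host:
        problems.append(f"host is {parts.hostname!r}, expected {expected_host!r}")
    if parts.username != env.get("POSTGRES_USER"):
        problems.append("username differs from POSTGRES_USER")
    if parts.password != env.get("POSTGRES_PASSWORD"):
        problems.append("password differs from POSTGRES_PASSWORD")
    if parts.path.lstrip("/") != env.get("POSTGRES_DB"):
        problems.append("database name differs from POSTGRES_DB")
    return problems


# Placeholder credentials for illustration only.
compose_env = {
    "POSTGRES_USER": "legal",
    "POSTGRES_PASSWORD": "secret",
    "POSTGRES_DB": "legal_agent",
}
ok = dsn_mismatches("postgresql+asyncpg://legal:secret@db:5432/legal_agent", compose_env)
bad = dsn_mismatches("postgresql+asyncpg://legal:secret@localhost:5432/legal_agent", compose_env)
```

An empty list means the DSN and the Compose credentials agree; a `localhost` host is the typical symptom when a host-side DSN is reused inside a container.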
Check NEXT_PUBLIC_API_URL matches published backend port (8000 by default) and that CORS_ORIGINS includes your UI origin.
Confirm `./docker/postgres/init-pgvector.sql` is mounted (see `docker-compose.yml`). For existing volumes created before the init script, run `CREATE EXTENSION vector;` manually via `make db.shell`.
Raise RATE_LIMIT_PER_MINUTE temporarily in .env — never commit secrets.
Heavy Python / npm installs can exhaust disk space; remove old Docker images (`docker system prune`) or relocate the Docker data root.
Enable tracing: LANGCHAIN_TRACING_V2=true + valid LANGSMITH_API_KEY. Re-run a chain; traces may take ~30s to appear.
- Rotate `JWT_SECRET_KEY` for every environment; store in a secret manager in production.
- Treat uploads as untrusted: scan/examine binaries before server-side parsing in production.
- Never log raw contract text at INFO in multi-tenant deployments; see `docs/SECURITY.md`.
- Enable TLS everywhere outside local dev; align WebSocket URL scheme (`wss://`).
- Choose managed Postgres with pgvector + backups (Supabase, Neon, RDS).
- Push images via CI to ECR/GCR/Artifact Registry.
- Autoscale the API and the (future) worker on request queue depth + CPU.
- Export OpenTelemetry to Honeycomb/Datadog; keep LangSmith for LLM-specific spans.
- Promote migrations separately from app deploys; see `docs/DEPLOYMENT.md`.
The `frontend/` directory currently carries its own Git history (legacy CNA bootstrap). A root Git repository now exists, but Git will not automatically track the contents of a nested repository.
You should consolidate by either:
- Removing `frontend/.git` and recommitting the tree into the monorepo, or
- Using a `git subtree add` / submodule strategy intentionally.
This infra PR does not delete frontend/.git per project constraints.
Ruff runs on scripts/tests only until backend/app/ import order + unused import debt is cleared by the owning track (see Makefile comment).
.
├── backend/
│   ├── Dockerfile
│   ├── app/                    # domain code, parallel executor
│   ├── scripts/                # seed + ops helpers (infra-owned)
│   ├── tests/
│   ├── requirements.txt
│   ├── requirements-dev.txt
│   └── pyproject.toml          # Ruff config
├── frontend/
│   ├── Dockerfile
│   ├── docker-entrypoint.sh
│   └── ...
├── docker/
│   └── postgres/
│       └── init-pgvector.sql
├── docs/
├── scripts/
│   ├── dev_bootstrap.sh
│   ├── reset_db.sh
│   └── vector_reindex.sh
├── .github/workflows/
├── docker-compose.yml
├── Makefile
├── .env.example
└── README.md                   # you are here
See docs/CONTRIBUTING.md for branching + review norms. License file not included in this scaffolding — add one before open-sourcing.
| Capability | Module |
|---|---|
| Clause extraction | backend/app/ai/chains/extraction.py |
| Deterministic checks | backend/app/ai/chains/rule_engine.py |
| Risk narration | backend/app/ai/chains/risk.py |
| Prompt inventory | backend/app/ai/prompts.py |
| Playbook ORM | backend/app/models/playbook.py |
| Settings | backend/app/core/config.py |
These paths are safe to cite in docs and onboarding decks — they won't change casually.
| Cadence | Activity |
|---|---|
| Daily | Watch LangSmith error runs + API 5xx |
| Weekly | Playbook drift review with counsel |
| Monthly | Cost retro (docs/COST_OPTIMIZATION.md metrics) |
| Quarterly | DR drill on Postgres restore |
Initial targets (tune with production telemetry):
- Ingest + text extraction < 5s for typical DOCX ≤ 20 MB on dev-grade CPUs
- End-to-end first-pass review JSON < 60s for ~40-page MSA using the `gpt-4o-mini` extraction path
- API availability SLO begins at 99.5% pre-enterprise hardening; tighten as on-call matures
Document deviations with trace + query artifacts in postmortems.
- Map each contract to `organization_id` when migrations arrive; never query without a tenant filter.
- Record reviewer decisions in `Audit` rows; disallow destructive updates via DB roles.
- Pair deletion policies in object storage with contract retention legal holds.
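The tenant-filter rule is easiest to keep when the data-access helper refuses to run without a tenant. The sketch below uses the standard library's `sqlite3` as a stand-in for Postgres and SQLAlchemy so it runs anywhere; the `contracts` table, its columns other than `organization_id`, and the `contracts_for_tenant` helper are hypothetical.

```python
import sqlite3


def contracts_for_tenant(conn: sqlite3.Connection, organization_id: str) -> list[tuple]:
    """Return (id, title) rows for one tenant; refuse to query without a tenant."""
    if not organization_id:
        raise ValueError("organization_id is required for every contract query")
    cur = conn.execute(
        # Parameterized filter: the tenant id is never interpolated into the SQL.
        "SELECT id, title FROM contracts WHERE organization_id = ? ORDER BY id",
        (organization_id,),
    )
    return cur.fetchall()


# In-memory demo data (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contracts ("
    "id INTEGER PRIMARY KEY, organization_id TEXT NOT NULL, title TEXT)"
)
conn.executemany(
    "INSERT INTO contracts (organization_id, title) VALUES (?, ?)",
    [("org-a", "MSA Acme"), ("org-b", "MSA Globex"), ("org-a", "DPA Acme")],
)
rows = contracts_for_tenant(conn, "org-a")
```

Routing every read through a helper like this makes an unscoped query a code-review smell rather than a silent default.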
- Insert rows with admin tooling (future) or temporary SQL.
- Re-embed vectors: follow `make vector.index` guidance until automation ships.
- Update `docs/PLAYBOOK_SAMPLES.md` with rationale so future reviewers understand non-obvious positions.
| Term | Definition |
|---|---|
| Playbook | Curated standard positions + acceptable fallback language |
| Redline | Suggested textual change with playbook linkage |
| Clause | Atomic legal segment extracted for analysis |
| RAG | Retrieval-augmented generation (pgvector + prompt hints) |
Until a formal support channel exists:
- Start from `docs/DEBUGGING_AI.md` for AI regressions.
- Check Compose logs (`make logs`) for connectivity issues.
- Capture LangSmith traces for model regressions.
Elevator pitch: "Legal Agent gives every reviewer a playbook-aware co-pilot — extraction, risk scoring, and cited redlines in one workflow, grounded in your enterprise standards instead of generic GPT advice."
Buyer: VP Legal Ops + Head of Procurement partnering with IT for AI governance.
Proof points: Traceable suggestions, audit trail, editable playbook corpus.
Read docs/EVOLUTION.md for multi-agent supervision, DocuSign/Slack integrations, and fine-tuned domain models — architecture intents complement MVP scope.
Populate when docs/DEBUGGING_AI.md harness exists:
| Dataset | Metric | Baseline | Current |
|---|---|---|---|
| Clause F1 (sample) | Macro F1 | — | — |
| Playbook recall@5 | Recall | — | — |
| Cost / contract | USD | — | — |
- Use `make help` frequently; targets stay descriptive.
- Prefer `docker compose` over ad-hoc manual container sprawl.
- Keep `.env` out of Git; prefer secret store integration for shared staging clusters.
This README describes software architecture only. It does not provide legal counsel. All contractual decisions require qualified attorneys.
Design inspirations: modern legal ops tooling, LangChain observability best practices, and enterprise security guidance from SOC2 programs.
| Environment | Database | Tracing | Uploads |
|---|---|---|---|
| Local dev | Compose Postgres | Optional LangSmith | Local volume path |
| Staging | Managed Postgres | On | Private bucket |
| Production | HA Postgres | Required | SSE-KMS bucket |
| Code | Meaning |
|---|---|
| 200 | OK / resource returned |
| 201 | Created (upload / resource spawn) |
| 202 | Accepted async job |
| 400 | Malformed client request |
| 401/403 | Authentication (401) / authorization (403) failures |
| 409 | Conflict (approval state) |
| 422 | Validation errors (schema) |
| 429 | Rate limit (SlowAPI) |
| Event | Payload hints |
|---|---|
| `ingest.status` | `{ contract_id, stage, pct }` |
| `review.update` | `{ contract_id, summary_version }` |
| `approval.requested` | `{ approval_id, assignee }` |
Actual schemas arrive with router implementation.
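Until those schemas exist, the payload hints can be treated as required-key sets. This is a minimal sketch, assuming a JSON envelope of the form `{"event": ..., "data": ...}`; that envelope and the `encode_event` helper are hypothetical, and only the event names and key names come from the table above.

```python
import json

# Required keys per event, taken from the payload hints table.
REQUIRED_KEYS = {
    "ingest.status": {"contract_id", "stage", "pct"},
    "review.update": {"contract_id", "summary_version"},
    "approval.requested": {"approval_id", "assignee"},
}


def encode_event(event: str, payload: dict) -> str:
    """Validate key presence, then serialize to a JSON text frame."""
    if event not in REQUIRED_KEYS:
        raise ValueError(f"unknown event: {event}")
    missing = REQUIRED_KEYS[event] - set(payload)
    if missing:
        raise ValueError(f"{event} is missing keys: {sorted(missing)}")
    # The {"event", "data"} envelope is an assumption, not a documented contract.
    return json.dumps({"event": event, "data": payload}, sort_keys=True)


message = encode_event(
    "ingest.status",
    {"contract_id": "c-1", "stage": "extract", "pct": 40},
)
```

Value types are deliberately left unchecked here because the source only hints at key names.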
| Prefix | Meaning |
|---|---|
| `backend.*` | Python service |
| `frontend.*` | Next.js app |
| `db.*` | Postgres via Compose |
| `*` (top-level) | Meta / orchestration |
Make is dependency-light, works offline, and reads well in READMEs for polyglot teams. If you prefer `just`, wrap the same scripts.
- Which contract types beyond MSA in v1?
- Mandatory human approval on which clause families?
- Data residency commitments by customer segment?
- Maximum permissible automation before ethics review?
Track answers in your private issue tracker — not in this public template if sensitive.
| Version | Date | Highlights |
|---|---|---|
| 0.1.0-scaffold | TBD | Monorepo infra + docs |
Ship small, measure, annotate traces, and keep humans foremost in the loop. Legal Agent is infrastructure for judgment — not a replacement for it.
Happy reviewing.
Before opening a PR or sharing your laptop screen:
- `JWT_SECRET_KEY` is random and not the example string
- `OPENAI_API_KEY` belongs to a project with spend alerts
- `DATABASE_URL` matches Compose credentials when using Docker
- `LANGCHAIN_TRACING_V2` is disabled in prod unless SOC2 controls allow it
- `CORS_ORIGINS` lists only trusted UI hosts (no wildcards in prod)
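Parts of this checklist can be automated. The sketch below covers three of the items and makes several assumptions: `preflight_problems` is a hypothetical helper, `CORS_ORIGINS` is assumed to be comma-separated, and the example secret is passed in by the caller because its actual value in `.env.example` is not stated here.

```python
def preflight_problems(env: dict[str, str], example_secret: str,
                       production: bool = False) -> list[str]:
    """Return human-readable problems found in an environment mapping."""
    problems = []

    secret = env.get("JWT_SECRET_KEY", "")
    if not secret or secret == example_secret:
        problems.append("JWT_SECRET_KEY is unset or still the example value")

    # Assumes a comma-separated list of origins.
    origins = [o.strip() for o in env.get("CORS_ORIGINS", "").split(",") if o.strip()]
    if production and any("*" in origin for origin in origins):
        problems.append("CORS_ORIGINS contains a wildcard")

    if production and env.get("LANGCHAIN_TRACING_V2", "").lower() == "true":
        problems.append("LANGCHAIN_TRACING_V2 is enabled; confirm controls allow it")

    return problems


# "change-me" is a placeholder, not the real example string.
dev_env = {"JWT_SECRET_KEY": "change-me", "CORS_ORIGINS": "http://localhost:3000"}
problems = preflight_problems(dev_env, example_secret="change-me")
```

The spend-alert and credential-match items still need a human or a provider API, so they are left out of the sketch.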
Ensure backend/ is writable so .venv and pytest caches can be created. On macOS with SIP, avoid storing the repo inside cloud-synced folders that strip +x bits from helper scripts — re-run chmod +x scripts/*.sh if Git checkouts drop execute permission.
Container names (legal-agent-db, etc.) are explicit for support threads. If you operate multiple clones, override COMPOSE_PROJECT_NAME or adjust name: in docker-compose.yml to avoid collisions.
Uncomment the worker service stub after you add arq/celery dependencies and a worker entrypoint. Until then, rely on FastAPI BackgroundTasks for short jobs only.
Beyond vector, you may eventually require pg_trgm for fuzzy text matches or btree_gin for composite patterns — document each extension in Alembic revisions with rollback notes.
- Reviewer: "System flagged uncapped consequential damages waiver."
- Engineer: "Rule engine hit keyword triple + playbook similarity 0.91; trace `abc123` in LangSmith."
- Reviewer: "Accepting redline v2, please log approval."
Stories like this become training data for better UX copy — capture them anonymized.
Frontend executor owns shortcuts — list here once the command palette ships (/ search, ⌘K, etc.).
Risk heatmaps should not rely solely on red/green hues — pair color with icons, numeric severity, and text labels for accessibility compliance.
If MSAs arrive bilingual, store language metadata on Document rows and branch OCR + prompts accordingly. Do not assume English-only regex in rule_engine.py long-term.
For immutable integrity proofs, consider hashing normalized text + prior audit hash into each new Audit row (Merkle-lite). Legal teams occasionally ask for tamper evidence beyond database ACLs.
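The hash-chain idea can be sketched with the standard library. The normalization rule (NFKC plus whitespace collapsing), the all-zero genesis value, and the function names are assumptions for illustration; the real `Audit` model may store and normalize text differently.

```python
import hashlib
import unicodedata

GENESIS = "0" * 64  # assumed starting value for the first row


def normalize(text: str) -> str:
    """NFKC-normalize and collapse runs of whitespace to single spaces."""
    return " ".join(unicodedata.normalize("NFKC", text).split())


def chain_hash(prev_hash: str, text: str) -> str:
    """SHA-256 over the previous hash plus the normalized entry text."""
    payload = f"{prev_hash}\n{normalize(text)}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_chain(entries: list[str]) -> list[str]:
    """Return one hash per entry, each depending on all earlier entries."""
    hashes, prev = [], GENESIS
    for text in entries:
        prev = chain_hash(prev, text)
        hashes.append(prev)
    return hashes


def verify_chain(entries: list[str], hashes: list[str]) -> bool:
    """Recompute the chain and compare it with the stored hashes."""
    return len(entries) == len(hashes) and build_chain(entries) == hashes


entries = ["Approved redline v2", "Rejected clause 7"]
hashes = build_chain(entries)
intact = verify_chain(entries, hashes)
tampered = verify_chain(["Approved redline v3", "Rejected clause 7"], hashes)
```

Editing any earlier entry changes every later hash, so a single stored head hash is enough to detect tampering across the whole history.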
Set monthly OpenAI budget alerts in the provider console and track internal tokens_out metrics. Tie breaker: prefer reducing max tokens before swapping to lower-quality models for critical clauses.
- Incidents: (none / links)
- AI eval metrics: schema validity %, cost/contract
- Playbook changes merged:
- Customer demos scheduled:
- Blockers:
Paste into your tracker of choice.
- `git status` at repo root should be the default; nested repos confuse beginners.
- Consider `git config status.showUntrackedFiles normal` if `frontend/.git` remains transiently.
Use Legal Agent in prose, `legal-agent` for package/Compose/Docker image slugs, and `legal_agent` for SQL identifiers matching `POSTGRES_DB`.
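If scripts ever need to derive these forms from the display name, a small helper keeps them consistent. This is only a sketch; `name_variants` is a hypothetical function and nothing in the repository is known to generate names this way.

```python
import re


def name_variants(display_name: str) -> dict[str, str]:
    """Derive the slug and SQL-identifier forms from the prose name."""
    words = re.findall(r"[a-z0-9]+", display_name.lower())
    return {
        "prose": display_name,       # README and UI copy
        "slug": "-".join(words),     # package, Compose, image names
        "sql": "_".join(words),      # SQL identifiers
    }


variants = name_variants("Legal Agent")
```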