Legal Agent

@argahv

Legal Agent is an AI-assisted contract review platform aimed at high-stakes Master Service Agreements (MSAs) and similar vendor paper. Teams upload agreements, extract structured clauses, score legal and operational risk, and align suggestions with an enterprise playbook backed by pgvector retrieval. Human reviewers stay in the loop for approvals, redlines, and immutable audit history.

This repository is split across parallel engineering tracks:

Track	Scope
Backend executor	`backend/app/` — FastAPI entrypoint, routers, service layer wiring
Frontend executor	`frontend/app/`, `frontend/components/`, `frontend/lib/` — reviewer UX
Infra & docs (this PR surface)	Root `Makefile`, Docker, Compose, `docs/`, CI, helper scripts

Current scaffold note: backend/app/main.py and Alembic migrations are expected from the API track. Tooling (Makefile, Compose, Dockerfiles) already targets uvicorn app.main:app so the stack lights up as soon as those files land.

Product vision

Legal teams drown in repetitive first-pass review — scanning the same vendor tricks (microscopic liability caps, sneaky auto-renewals, one-sided indemnity) across hundreds of pages per quarter. Junior counsel loses velocity; senior counsel loses patience.

Legal Agent compresses that workflow to:

Ingest — DOCX/PDF upload with resilient text extraction (backend/app/ai/document_text.py).
Understand — LLM-powered extraction to typed clauses (backend/app/ai/chains/extraction.py).
Check — deterministic guardrails (backend/app/ai/chains/rule_engine.py).
Score & narrate — risk storytelling that cites playbook language (backend/app/ai/chains/risk.py).
Recommend — RAG-ranked redlines tied to PlaybookEntry rows (backend/app/models/playbook.py).
Decide — human approvals + audit trail (backend/app/models/approval.py, audit.py).
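The Check step can be pictured as a handful of deterministic patterns that run before any model call. The sketch below is illustrative only: the `Finding` type, the rule ids, and both regexes are hypothetical and are not taken from backend/app/ai/chains/rule_engine.py.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    rule_id: str   # hypothetical rule identifier
    excerpt: str   # the text span that triggered the rule


# Hypothetical patterns for illustration; a real rule set would be broader.
RULES = {
    # Flags auto-renewal language such as "shall automatically renew".
    "auto_renewal": re.compile(
        r"\bautomatically\s+renew(?:s|ed|al)?\b", re.IGNORECASE
    ),
    # Flags the presence of a liability cap within one sentence
    # (it does not judge whether the cap amount is acceptable).
    "liability_cap": re.compile(
        r"\bliability\b[^.]{0,120}\bshall\s+not\s+exceed\b", re.IGNORECASE
    ),
}


def check_clause(text: str) -> list[Finding]:
    """Return one finding per rule whose pattern occurs in the clause text."""
    findings = []
    for rule_id, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            findings.append(Finding(rule_id, match.group(0)))
    return findings
```

Because the rules are plain regexes, the same input always produces the same findings, which is what makes this stage a useful guardrail next to the LLM-driven steps.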

Non-goals for the MVP: autonomous signing, definitive legal advice without counsel, or uncited model claims.

Screenshots (placeholders)

Drop real UI captures here once the frontend track publishes pages:

docs/assets/screenshots/
  01-dashboard.png     # Contract queue + risk heatmap
  02-clause-detail.png # Extracted clause + playbook comparison
  03-redline-diff.png  # Suggested replacement + reviewer actions
  04-approvals.png     # Approval routing + SLA timestamps

Until then, run make frontend.dev locally and capture screenshots after the branding pass.

Quickstart

Prerequisites

Docker Compose v2 (docker compose version)
Python 3.12 (for local venv workflows)
Node 22 + npm (for local Next.js)
make

One-time setup (DB + migrations + playbook seed + vector embeddings)

Requires a running Docker daemon (Docker Desktop or equivalent) for the Compose Postgres service. Then either use the Make targets:

cp .env.example .env
# Edit: JWT_SECRET_KEY, OPENROUTER_* or OPENAI_* (chat + embedding model ids), DATABASE_URL if not using Compose defaults

make backend.install   # local Python venv (needed for Alembic + seeds + embeddings from the host)
make setup             # same as: bash scripts/full_setup.sh → dev_bootstrap.sh (db up, alembic upgrade head, seed_playbook, embed_playbook_vectors)

Or run the shell script directly:

./scripts/dev_bootstrap.sh

Use make vector.index (or bash scripts/vector_reindex.sh) to force a re-embed of all playbook rows after you change playbook wording or the embedding model.

Destructive DB reset (drop database, migrate, seed, embed): bash scripts/reset_db.sh

Run everything (Docker)

make up

Services (see docker-compose.yml):

Service	Port	Purpose
`db`	5432	Postgres 16 + pgvector
`backend`	8000	FastAPI (hot reload in Compose)
`frontend`	3000	Next.js production image

Run locally (hybrid)

make backend.install   # backend/.venv
make frontend.install
docker compose up -d db
make backend.stop        # optional: if port 8000 is already in use (Errno 48 on macOS, Errno 98 on Linux)
make backend.run         # needs app.main for uvicorn import
make frontend.dev

OpenAPI (once routers exist): http://localhost:8000/docs

Architecture (ASCII)

┌──────────────┐   HTTPS/WSS    ┌───────────────────────┐
│   Browser    │◀──────────────▶│  Next.js 15 (React)   │
│  (reviewers) │                │  frontend/*           │
└──────────────┘                └─────────┬─────────────┘
                                          │ REST + WS
                                          ▼
┌──────────────────────────────────────────────────────┐
│ FastAPI (async): backend/app                         │
│  • JWT auth / rate limiting                          │
│  • Upload + ingestion                                │
│  • LangChain extraction + risk chains                │
│  • WebSocket hub (`app/ws/hub.py`)                   │
└─────────┬───────────────────────────────┬────────────┘
          │ SQLAlchemy async              │
          ▼                               ▼
┌───────────────────────┐        ┌─────────────────┐
│ Postgres + pgvector   │        │ OpenAI APIs     │
│  • contracts/clauses  │        │ chat + embed    │
│  • playbook embeddings│        └─────────────────┘
│  • audit / approval   │
└───────────────────────┘

Mermaid counterparts live in docs/ARCHITECTURE.md for zoomable diagrams.

Tech stack rationale

Layer	Choice	Why
API	FastAPI	Native async, OpenAPI, excellent typing
ORM	SQLAlchemy 2 + asyncpg	Mature migrations story, aligns with FastAPI DI
Vectors	pgvector in Postgres	Fewer moving parts than split vector DB for MVP
AI	LangChain + LangSmith	Traceability + composable LCEL-style chains
UI	Next.js 15 + Tailwind + shadcn	App Router, rapid iteration, accessible primitives
Tooling	Make + Compose	Lowest-common-denominator DX for mixed teams

Environment variables

Authoritative template: .env.example (every variable commented). Highlights:

Variable	Role
`DATABASE_URL`	Async SQLAlchemy DSN (`postgresql+asyncpg://...`)
`POSTGRES_*`	Compose `db` service credentials
`JWT_SECRET_KEY`	Signing material for access/refresh tokens
`OPENAI_API_KEY` / `OPENAI_MODEL`	Chat + default model tier
`OPENAI_EMBEDDING_MODEL` / `VECTOR_DIM`	Embeddings + pgvector width
`LANGSMITH_*` + `LANGCHAIN_TRACING_V2`	Observability switches
`CORS_ORIGINS`	Browser origins allowed to call API with credentials
`NEXT_PUBLIC_API_URL` / `NEXT_PUBLIC_WS_URL`	Frontend discovery

Naming nuance: backend/app/core/config.py validates ACCESS_TOKEN_EXPIRE_MINUTES / REFRESH_TOKEN_EXPIRE_DAYS — the example file mirrors both JWT_* and canonical names to reduce onboarding friction.

Common workflows

Task	Command
Full stack up/down	`make up` / `make down`
Tail logs	`make logs`
Backend tests	`make backend.test`
Frontend lint/typecheck	`make frontend.lint`, `make frontend.typecheck`
Repo-wide gate	`make check`
DB console	`make db.shell`
Reset database	`make db.reset` (destructive)
Playbook re-embed guide	`make vector.index`
LangSmith UI	`make langsmith.open`

Bootstrap playbook seed

./scripts/dev_bootstrap.sh and make backend.seed load rows via backend/scripts/seed_playbook.py (idempotent).

Makefile reference (compact)

Targets are self-documented via make help. Notable implementations:

backend.lint currently enforces Ruff on backend/scripts + backend/tests because backend/app/ is mid-flight in a parallel executor (see docs/CONTRIBUTING.md).
frontend.test skips gracefully until a test script lands in frontend/package.json.

Docker notes

Backend image: multi-stage wheel build, python:3.12-slim, tini PID1, non-root appuser. See backend/Dockerfile.
Frontend image: Node 22, multi-stage; entrypoint tries .next/standalone/server.js then npm run start. Enable output: "standalone" in next.config.ts when the frontend track opts in (Dockerfile comment documents this — do not edit next.config.ts from infra PRs unless you own that track).
Compose layers optional env_file entries (.env.example then .env) so docker compose config -q works on fresh clones.

Documentation suite

Doc	Contents
`docs/ARCHITECTURE.md`	Context diagrams, ER-style model chart, lifecycle
`docs/AI_PIPELINE.md`	Chains, retrieval, LangSmith walkthrough
`docs/DEBUGGING_AI.md`	Logging, traces, eval harness sketch
`docs/COST_OPTIMIZATION.md`	Model tiering, caching, batching
`docs/SECURITY.md`	Encryption, PII, OWASP mapping
`docs/DEPLOYMENT.md`	Managed Postgres, Fly/Render/ECS thoughts
`docs/SCALING.md`	RLS vs schema isolation, queues
`docs/API.md`	Endpoint intent + `/docs` pointer
`docs/PROMPTS.md`	Example prompt templates per clause
`docs/PLAYBOOK_SAMPLES.md`	Worked legal examples + mapping
`docs/AI_ENGINEERING_ROADMAP.md`	Senior → staff AI engineering curriculum
`docs/EVOLUTION.md`	Product phases + integration roadmap
`docs/CONTRIBUTING.md`	Branching, commits, PR checklist

Testing & CI

GitHub Actions (.github/workflows/ci.yml) runs:

backend — Ruff on scripts/tests, pytest with Postgres + pgvector service
frontend — ESLint, tsc --noEmit, next build
docker-build — verifies backend + frontend Dockerfiles

Troubleshooting

`uvicorn` cannot import `app.main`

The API track has not merged backend/app/main.py yet, so this error is expected. In the meantime, exercise domain modules directly (pytest, seed scripts), or stub main.py locally if you temporarily need the Compose health check to pass.

Database connection errors inside Compose

Ensure DATABASE_URL in .env uses the hostname db and matches the POSTGRES_* values. After changing env vars, run docker compose down && make up so the containers pick up the new values.
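A quick way to spot a mismatched DSN is to parse it and compare each part with the Compose credentials. This is a minimal sketch, assuming the credentials are named POSTGRES_USER, POSTGRES_PASSWORD, and POSTGRES_DB (the official Postgres image's variable names; check .env.example for the names this repo actually uses). The helper `compose_dsn_problems` is hypothetical.

```python
from urllib.parse import unquote, urlsplit


def compose_dsn_problems(dsn: str, env: dict[str, str]) -> list[str]:
    """List mismatches between DATABASE_URL and Compose-style settings."""
    parts = urlsplit(dsn)
    problems = []
    if parts.scheme != "postgresql+asyncpg":
        problems.append(f"scheme is {parts.scheme!r}, expected 'postgresql+asyncpg'")
    # Inside the Compose network the database is reachable as "db".
    if parts.hostname != "db":
        problems.append(f"host is {parts.hostname!r}, expected 'db' inside Compose")
    # Assumed variable names; values in the DSN may be percent-encoded.
    from_dsn = {
        "POSTGRES_USER": unquote(parts.username or ""),
        "POSTGRES_PASSWORD": unquote(parts.password or ""),
        "POSTGRES_DB": parts.path.lstrip("/"),
    }
    for key, value in from_dsn.items():
        if key in env and env[key] != value:
            problems.append(f"{key} does not match DATABASE_URL")
    return problems
```

An empty list means the DSN and the credentials agree; any other result names the part to fix before restarting the stack.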

Frontend cannot reach API

Check NEXT_PUBLIC_API_URL matches published backend port (8000 by default) and that CORS_ORIGINS includes your UI origin.

pgvector extension missing

Confirm that ./docker/postgres/init-pgvector.sql is mounted (see docker-compose.yml). Init scripts only run when the data volume is first created, so for volumes that predate the script, run CREATE EXTENSION IF NOT EXISTS vector; manually via make db.shell.

Rate limiting during dev

Raise RATE_LIMIT_PER_MINUTE temporarily in your local .env, and do not commit that file.

Disk pressure on local machines

Heavy Python and npm installs can exhaust disk space. Remove old Docker images (docker system prune) or relocate the Docker data root.

LangSmith empty project

Enable tracing: LANGCHAIN_TRACING_V2=true + valid LANGSMITH_API_KEY. Re-run a chain; traces may take ~30s to appear.

Security notes (abbreviated)

Rotate JWT_SECRET_KEY for every environment; store in a secret manager in production.
Treat uploads as untrusted — scan/examine binaries before server-side parsing in production.
Never log raw contract text at INFO in multi-tenant deployments — see docs/SECURITY.md.
Enable TLS everywhere outside local dev; align WebSocket URL scheme (wss://).

Deployment recommendations (high level)

Choose managed Postgres with pgvector + backups (Supabase, Neon, RDS).
Push images via CI to ECR/GCR/Artifact Registry.
Autoscale the API and the (future) worker on request queue depth and CPU.
Export OpenTelemetry to Honeycomb/Datadog; keep LangSmith for LLM-specific spans.
Promote migrations separately from app deploys — see docs/DEPLOYMENT.md.

Known caveats

Nested `frontend/.git`

The frontend/ directory currently carries its own Git history (left over from the create-next-app bootstrap). The root repository has now been initialized, but Git does not automatically track the contents of a nested repository.

You should consolidate by either:

Removing frontend/.git and recommitting the tree into the monorepo, or
Adopting a git subtree or submodule strategy intentionally.

This infra PR does not delete frontend/.git per project constraints.

Lint scope during parallel execution

Ruff runs on scripts/tests only until backend/app/ import order + unused import debt is cleared by the owning track (see Makefile comment).

Repository layout (selected)

.
├── backend/
│   ├── Dockerfile
│   ├── app/                 # domain code — parallel executor
│   ├── scripts/             # seed + ops helpers (infra-owned)
│   ├── tests/
│   ├── requirements.txt
│   ├── requirements-dev.txt
│   └── pyproject.toml       # Ruff config
├── frontend/
│   ├── Dockerfile
│   ├── docker-entrypoint.sh
│   └── ...
├── docker/
│   └── postgres/
│       └── init-pgvector.sql
├── docs/
├── scripts/
│   ├── dev_bootstrap.sh
│   ├── reset_db.sh
│   └── vector_reindex.sh
├── .github/workflows/
├── docker-compose.yml
├── Makefile
├── .env.example
└── README.md                # you are here

Licensing & contributions

See docs/CONTRIBUTING.md for branching and review norms. No license file is included in this scaffold; add one before open-sourcing.

AI feature map (code pointers)

Capability	Module
Clause extraction	`backend/app/ai/chains/extraction.py`
Deterministic checks	`backend/app/ai/chains/rule_engine.py`
Risk narration	`backend/app/ai/chains/risk.py`
Prompt inventory	`backend/app/ai/prompts.py`
Playbook ORM	`backend/app/models/playbook.py`
Settings	`backend/app/core/config.py`

These paths are safe to cite in docs and onboarding decks — they won't change casually.

Operational cadence (suggested)

Cadence	Activity
Daily	Watch LangSmith error runs + API 5xx
Weekly	Playbook drift review with counsel
Monthly	Cost retro (`docs/COST_OPTIMIZATION.md` metrics)
Quarterly	DR drill on Postgres restore

Performance expectations

Initial targets (tune with production telemetry):

Ingest + text extraction < 5s for typical DOCX ≤ 20 MB on dev-grade CPUs
End-to-end first-pass review JSON < 60s for a ~40-page MSA using the gpt-4o-mini extraction path
API availability SLO begins at 99.5% pre-enterprise hardening — tighten as on-call matures

Document deviations with trace + query artifacts in postmortems.

Data governance hooks

Map each contract to organization_id when migrations arrive — never query without tenant filter.
Record reviewer decisions in Audit rows; disallow destructive updates via DB roles.
Pair deletion policies in object storage with contract retention legal holds.

Extending the playbook

Insert rows with admin tooling (future) or temporary SQL.
Re-embed vectors — follow make vector.index guidance until automation ships.
Update docs/PLAYBOOK_SAMPLES.md with rationale so future reviewers understand non-obvious positions.

Glossary

Term	Definition
Playbook	Curated standard positions + acceptable fallback language
Redline	Suggested textual change with playbook linkage
Clause	Atomic legal segment extracted for analysis
RAG	Retrieval-augmented generation (pgvector + prompt hints)
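To make the retrieval half of RAG concrete, the sketch below ranks playbook entries by cosine similarity to a query vector in plain Python. It is a stand-in for illustration: in the real stack this ranking would be done by pgvector inside Postgres, and the function names and tiny two-dimensional vectors here are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: list[float], entries: dict[str, list[float]], k: int = 5
) -> list[tuple[str, float]]:
    """Return the k (entry_id, similarity) pairs most similar to the query."""
    scored = [(entry_id, cosine_similarity(query, vec)) for entry_id, vec in entries.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]
```

The top-ranked entries are what would be passed to the prompt as playbook context, which is why stale embeddings directly degrade suggestions.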

Community & support

Until a formal support channel exists:

Start from docs/DEBUGGING_AI.md for AI regressions.
Check Compose logs (make logs) for connectivity issues.
Capture LangSmith traces for model regressions.

Marketing snapshot (internal)

Elevator pitch: "Legal Agent gives every reviewer a playbook-aware co-pilot — extraction, risk scoring, and cited redlines in one workflow, grounded in your enterprise standards instead of generic GPT advice."

Buyer: VP Legal Ops + Head of Procurement partnering with IT for AI governance.

Proof points: Traceable suggestions, audit trail, editable playbook corpus.

Roadmap teaser

Read docs/EVOLUTION.md for multi-agent supervision, DocuSign/Slack integrations, and fine-tuned domain models. These are architectural intentions that build on the MVP scope, not commitments within it.

Benchmarks (TODO once eval harness lands)

Populate when docs/DEBUGGING_AI.md harness exists:

Dataset	Metric	Baseline	Current
Clause F1 (sample)	Macro F1	—	—
Playbook recall@5	Recall	—	—
Cost / contract	USD	—	—
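Until the harness exists, the two quality metrics in the table can be prototyped in a few lines. The sketch below assumes single-label clause classification for macro F1 and a flat list of retrieved playbook ids for recall@k; both function names are hypothetical and the real harness may define these metrics differently.

```python
def recall_at_k(ranked_ids: list[str], relevant_ids: set[str], k: int = 5) -> float:
    """Fraction of relevant playbook ids that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant_ids)
    return hits / len(relevant_ids)


def macro_f1(gold: list[str], predicted: list[str]) -> float:
    """Unweighted mean of per-label F1 over every label seen in either list."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must be the same length")
    labels = sorted(set(gold) | set(predicted))
    scores = []
    for label in labels:
        pairs = list(zip(gold, predicted))
        tp = sum(g == label and p == label for g, p in pairs)
        fp = sum(g != label and p == label for g, p in pairs)
        fn = sum(g == label and p != label for g, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

Macro averaging weights rare clause types the same as common ones, which suits a review tool where an uncommon clause family can carry most of the risk.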

Developer happiness tips

Use make help frequently — targets stay descriptive.
Prefer docker compose over ad-hoc manual container sprawl.
Keep .env out of Git; prefer secret store integration for shared staging clusters.

Legal disclaimer (not legal advice)

This README describes software architecture only. It does not provide legal counsel. All contractual decisions require qualified attorneys.

Acknowledgments

Design inspirations: modern legal ops tooling, LangChain observability best practices, and enterprise security guidance from SOC2 programs.

Appendix — environment matrix

Environment	Database	Tracing	Uploads
Local dev	Compose Postgres	Optional LangSmith	Local volume path
Staging	Managed Postgres	On	Private bucket
Production	HA Postgres	Required	SSE-KMS bucket

Appendix — HTTP status expectations

Code	Meaning
200	OK / resource returned
201	Created (upload / resource spawn)
202	Accepted async job
400	Malformed client request
401/403	Authentication (401) or authorization (403) failure
409	Conflict (approval state)
422	Validation errors (schema)
429	Rate limit (SlowAPI)
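A client can turn the table above into a small dispatch map. The sketch below is one possible reading of those codes from the caller's side; the `ClientAction` names are hypothetical and nothing here reflects implemented router behavior.

```python
from enum import Enum


class ClientAction(Enum):
    USE_RESPONSE = "use_response"      # 200 / 201: body is ready
    POLL_JOB = "poll_job"              # 202: async job accepted
    FIX_REQUEST = "fix_request"        # 400 / 422: client must change the request
    REAUTHENTICATE = "reauthenticate"  # 401: credentials missing or expired
    STOP_FORBIDDEN = "stop_forbidden"  # 403: caller lacks permission
    REFRESH_STATE = "refresh_state"    # 409: approval state changed underneath
    BACK_OFF = "back_off"              # 429: rate limited, retry later
    UNEXPECTED = "unexpected"          # anything not in the table


_ACTIONS = {
    200: ClientAction.USE_RESPONSE,
    201: ClientAction.USE_RESPONSE,
    202: ClientAction.POLL_JOB,
    400: ClientAction.FIX_REQUEST,
    422: ClientAction.FIX_REQUEST,
    401: ClientAction.REAUTHENTICATE,
    403: ClientAction.STOP_FORBIDDEN,
    409: ClientAction.REFRESH_STATE,
    429: ClientAction.BACK_OFF,
}


def client_action(status: int) -> ClientAction:
    """Map an HTTP status code to the suggested client-side reaction."""
    return _ACTIONS.get(status, ClientAction.UNEXPECTED)
```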

Appendix — WebSocket events (planned)

Event	Payload hints
`ingest.status`	`{ contract_id, stage, pct }`
`review.update`	`{ contract_id, summary_version }`
`approval.requested`	`{ approval_id, assignee }`

Actual schemas arrive with router implementation.
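In the meantime, a consumer can validate incoming messages against the payload hints in the table. This is a sketch under stated assumptions: the `{"event": ..., "payload": ...}` envelope and the `parse_event` helper are hypothetical, and the required fields are copied from the planned-events table rather than from any implemented schema.

```python
import json
from dataclasses import dataclass

# Field names taken from the planned-events table; real schemas may differ.
EVENT_FIELDS = {
    "ingest.status": {"contract_id", "stage", "pct"},
    "review.update": {"contract_id", "summary_version"},
    "approval.requested": {"approval_id", "assignee"},
}


@dataclass(frozen=True)
class Event:
    name: str
    payload: dict


def parse_event(raw: str) -> Event:
    """Decode a JSON message and check it carries the hinted payload fields."""
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("message must be a JSON object")
    name = message.get("event")          # assumed envelope key
    payload = message.get("payload")     # assumed envelope key
    if name not in EVENT_FIELDS:
        raise ValueError(f"unknown event: {name!r}")
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    missing = EVENT_FIELDS[name] - set(payload)
    if missing:
        raise ValueError(f"{name} is missing fields: {sorted(missing)}")
    return Event(name, payload)
```

Rejecting unknown events early keeps a reviewer UI from silently ignoring messages once the real schemas diverge from these hints.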

Appendix — Makefile taxonomy

Prefix	Meaning
`backend.*`	Python service
`frontend.*`	Next.js app
`db.*`	Postgres via Compose
`*` (top-level)	Meta / orchestration

Appendix — why Make?

Make is dependency-light, works offline, and reads well in READMEs for polyglot teams. If you prefer just, wrap the same scripts.

Appendix — Open questions for PM/LEGAL

Which contract types beyond MSA in v1?
Mandatory human approval on which clause families?
Data residency commitments by customer segment?
Maximum permissible automation before ethics review?

Track answers in your private issue tracker — not in this public template if sensitive.

Appendix — changelog placeholder

Version	Date	Highlights
0.1.0-scaffold	TBD	Monorepo infra + docs

Final notes

Ship small, measure, annotate traces, and keep humans foremost in the loop. Legal Agent is infrastructure for judgment — not a replacement for it.

Happy reviewing.

Appendix — sample `.env` sanity checklist

Before opening a PR or sharing your laptop screen:

JWT_SECRET_KEY is random and not the example string
OPENAI_API_KEY belongs to a project with spend alerts
DATABASE_URL matches Compose credentials when using Docker
LANGCHAIN_TRACING_V2 is disabled in prod unless SOC2 controls allow it
CORS_ORIGINS lists only trusted UI hosts (no wildcards in prod)
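Parts of this checklist can be automated. The sketch below covers three of the items, assuming CORS_ORIGINS is a comma-separated string and treating 32 characters as an arbitrary minimum secret length; both are assumptions, and `env_warnings` is a hypothetical helper rather than something in backend/app/core/config.py.

```python
def env_warnings(
    env: dict[str, str], example_secret: str, production: bool = False
) -> list[str]:
    """Return human-readable warnings for risky settings in an env mapping."""
    warnings = []

    secret = env.get("JWT_SECRET_KEY", "")
    # 32 is an illustrative threshold, not a project requirement.
    if not secret or secret == example_secret or len(secret) < 32:
        warnings.append("JWT_SECRET_KEY is missing, too short, or still the example value")

    # Assumes a comma-separated list of origins.
    origins = [o.strip() for o in env.get("CORS_ORIGINS", "").split(",") if o.strip()]
    if production and any("*" in origin for origin in origins):
        warnings.append("CORS_ORIGINS contains a wildcard")

    if production and env.get("LANGCHAIN_TRACING_V2", "").lower() == "true":
        warnings.append("LANGCHAIN_TRACING_V2 is enabled; confirm SOC2 controls allow it")

    return warnings
```

Passing the example secret in as a parameter keeps the placeholder value out of the checker itself.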

Appendix — local directory permissions

Ensure backend/ is writable so .venv and pytest caches can be created. On macOS with SIP, avoid storing the repo inside cloud-synced folders that strip +x bits from helper scripts — re-run chmod +x scripts/*.sh if Git checkouts drop execute permission.

Appendix — Compose service naming

Container names (legal-agent-db, etc.) are explicit for support threads. If you operate multiple clones, override COMPOSE_PROJECT_NAME or adjust name: in docker-compose.yml to avoid collisions.

Appendix — future worker enablement

Uncomment the worker service stub after you add arq/celery dependencies and a worker entrypoint. Until then, rely on FastAPI BackgroundTasks for short jobs only.

Appendix — Postgres extensions

Beyond vector, you may eventually require pg_trgm for fuzzy text matches or btree_gin for composite patterns — document each extension in Alembic revisions with rollback notes.

Appendix — example risk review conversation (fictional)

Reviewer: "System flagged uncapped consequential damages waiver."
Engineer: "Rule engine hit keyword triple + playbook similarity 0.91 — trace abc123 in LangSmith."
Reviewer: "Accepting redline v2 — please log approval."

Stories like this become training data for better UX copy — capture them anonymized.

Appendix — keyboard shortcuts (TBD)

Frontend executor owns shortcuts — list here once the command palette ships (/ search, ⌘K, etc.).

Appendix — color-blind safe UI guidance

Risk heatmaps should not rely solely on red/green hues — pair color with icons, numeric severity, and text labels for accessibility compliance.

Appendix — internationalization (future)

If MSAs arrive bilingual, store language metadata on Document rows and branch OCR + prompts accordingly. Do not assume English-only regex in rule_engine.py long-term.

Appendix — contract hash chain (advanced)

For immutable integrity proofs, consider hashing normalized text + prior audit hash into each new Audit row (Merkle-lite). Legal teams occasionally ask for tamper evidence beyond database ACLs.
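The chaining idea can be sketched with SHA-256 from the standard library. Everything below is hypothetical: the all-zero genesis value, the whitespace-only normalization, and the function names are illustrative choices, and a real Audit row would likely hash more fields than the contract text.

```python
import hashlib

GENESIS = "0" * 64  # assumed prior hash for the very first audit row


def normalize(text: str) -> str:
    """Collapse runs of whitespace so formatting noise does not change the hash."""
    return " ".join(text.split())


def audit_hash(contract_text: str, prior_hash: str) -> str:
    """Hash the prior row's hash together with the normalized text."""
    digest = hashlib.sha256()
    digest.update(prior_hash.encode("ascii"))
    digest.update(b"\n")
    digest.update(normalize(contract_text).encode("utf-8"))
    return digest.hexdigest()


def verify_chain(rows: list[tuple[str, str]]) -> bool:
    """Check (text, stored_hash) rows in insertion order against the chain."""
    prior = GENESIS
    for text, stored in rows:
        if audit_hash(text, prior) != stored:
            return False
        prior = stored
    return True
```

Because each hash depends on the one before it, editing any earlier row invalidates every later hash, which is the tamper evidence legal teams tend to ask for.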

Appendix — budget guardrails

Set monthly OpenAI budget alerts in the provider console and track internal tokens_out metrics. Tie breaker: prefer reducing max tokens before swapping to lower-quality models for critical clauses.

Appendix — sample weekly standup template

Incidents: (none / links)
AI eval metrics: schema validity %, cost/contract
Playbook changes merged:
Customer demos scheduled:
Blockers:

Paste into your tracker of choice.

Appendix — git hygiene for monorepos

git status at repo root should be the default — nested repos confuse beginners.
Consider git config status.showUntrackedFiles normal if frontend/.git remains transiently.

Appendix — naming: Legal Agent vs legal-agent

Use Legal Agent in prose, legal-agent for package/Compose/docker image slugs, and legal_agent for SQL identifiers matching POSTGRES_DB.
