Lexius

Legislation-agnostic AI compliance platform with provenance-tracked, deterministic regulatory analysis. Verbatim regulation text from EUR-Lex and offshore PDF sources, deterministic fact extraction, parallel hivemind assessment, and honest provenance labelling on every claim.

20 regulations across 2 jurisdictions — 10 EU (via EUR-Lex CELLAR) and 10 Cayman Islands CIMA (via PDF). Adding a new regulation is two commands — the fetcher, extractor, and swarm handle the rest.

Lexius provides general regulatory guidance and does not constitute legal advice. For implementation support, consult qualified legal counsel.

Install

# MCP server — connect Claude Desktop to the compliance database
npx @robotixai/lexius-mcp

# CLI — query from your terminal
npx @robotixai/lexius-cli

# Interactive agent — Claude-powered compliance consultant
npx @robotixai/lexius-agent

# Database — schema-ready Postgres with pgvector
docker pull robotixai/lexius-db

From Source

git clone https://github.com/rob-otix-ai/lexius.git
cd lexius
cp .env.example .env         # set OPENAI_API_KEY + ANTHROPIC_API_KEY
pnpm setup                   # install → build → DB → migrate → seed → fetch → extract

Architecture

┌──────────────────────────────────────────────────────────────┐
│  Consumers                                                    │
│  ┌─────┐ ┌─────┐ ┌─────┐ ┌─────────┐ ┌────────┐ ┌───────┐ │
│  │ API │ │ MCP │ │ CLI │ │  Agent  │ │ Skills │ │ Swarm │ │
│  └──┬──┘ └──┬──┘ └──┬──┘ └────┬────┘ └───┬────┘ └───┬───┘ │
│     └───────┴───────┴─────────┴───────────┴──────────┘     │
│                      │                                       │
│              ┌───────┴───────┐                               │
│              │     Core      │ use cases + domain             │
│              │  + Plugins    │ legislation-agnostic            │
│              └───────┬───────┘                               │
│                      │                                       │
│     ┌────────────────┼────────────────┐                      │
│     │                │                │                      │
│ ┌───┴───┐    ┌───────┴───────┐  ┌────┴─────┐               │
│ │Fetcher│    │      DB       │  │Extractor │               │
│ │CELLAR │    │Postgres+pgvec │  │regex/det │               │
│ └───────┘    └───────────────┘  └──────────┘               │
└──────────────────────────────────────────────────────────────┘

Published Packages

npm

| Package | Description |
|---|---|
| @robotixai/lexius-api | REST API + SSE MCP server (15 endpoints) |
| @robotixai/lexius-mcp | MCP server for Claude Desktop / Claude Code (13 tools, stdio + SSE) |
| @robotixai/lexius-cli | Command-line interface (9 commands) |
| @robotixai/lexius-agent | Interactive Claude compliance consultant + hivemind swarm |
| @robotixai/lexius-fetcher | EUR-Lex CELLAR fetcher + deterministic extractor |

Docker

| Image | Tag | Description |
|---|---|---|
| robotixai/lexius-db | 0.1.0 | PostgreSQL 16 + pgvector, schema auto-applied (13 tables, 5 migrations) |
| robotixai/lexius-api | 0.3.0 | Express REST API + SSE MCP + swarm (full platform server) |
| robotixai/lexius-mcp | 0.1.3 | MCP server (node:20-slim, 202MB, self-contained bundle) |

Workspace Packages (from source)

| Package | Description |
|---|---|
| @lexius/core | Domain entities, ports, 14 use cases, legislation plugin system |
| @lexius/db | Drizzle schema, 5 migrations (0000-0004), seeds for 10 legislations |
| @lexius/infra | Drizzle repositories + OpenAI embedding service |
| @lexius/logger | Pino logger factory |

Quick Start

Zero to running — all via npm + Docker

# 1. Start the database (schema auto-applied, no migration needed)
docker run -d -p 5432:5432 \
  -e POSTGRES_PASSWORD=$POSTGRES_PASSWORD \
  -e POSTGRES_DB=$POSTGRES_DB \
  -e POSTGRES_USER=$POSTGRES_USER \
  robotixai/lexius-db
export DATABASE_URL=postgresql://$POSTGRES_USER:$POSTGRES_PASSWORD@localhost:5432/$POSTGRES_DB

# 2. Fetch verbatim regulation text from EUR-Lex + run extractor
npx @robotixai/lexius-fetcher ingest --celex 32024R1689 --legislation eu-ai-act
npx @robotixai/lexius-fetcher ingest --celex 32022R2554 --legislation dora

# 3. Query from the CLI
npx @robotixai/lexius-cli legislations
npx @robotixai/lexius-cli article 99 --legislation eu-ai-act

# 4. Or run the interactive agent
export ANTHROPIC_API_KEY=sk-ant-...
npx @robotixai/lexius-agent

No git clone needed. The Docker image provides the schema; the fetcher populates it from EUR-Lex; the CLI/agent/MCP server query it.
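
The DATABASE_URL in step 1 is assembled by plain shell interpolation, which produces an invalid URL if the password contains characters such as `@` or `/`. The sketch below shows one way to build the string with percent-encoded credentials. `buildDatabaseUrl` and `DbSettings` are hypothetical names used for illustration; they are not part of any Lexius package.

```typescript
// Hypothetical helper (not shipped with Lexius): builds a PostgreSQL
// connection string and percent-encodes each credential component.
interface DbSettings {
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

function buildDatabaseUrl(s: DbSettings): string {
  const user = encodeURIComponent(s.user);
  const password = encodeURIComponent(s.password);
  const database = encodeURIComponent(s.database);
  return `postgresql://${user}:${password}@${s.host}:${s.port}/${database}`;
}

// Example values only; substitute your own settings.
const exampleDatabaseUrl = buildDatabaseUrl({
  user: "lexius",
  password: "p@ss/word",
  host: "localhost",
  port: 5432,
  database: "lexius",
});
console.log(exampleDatabaseUrl);
```

The resulting string can be exported as DATABASE_URL before running the fetcher, CLI, or agent.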

Docker Compose (full stack)

# Set your passwords
export POSTGRES_PASSWORD=secret
export OPENAI_API_KEY=sk-...

# Start DB + API
docker compose -f docker-compose.production.yml up -d

# The API is now at http://localhost:3000
# MCP proxy mode can point at it:
LEXIUS_API_URL=http://localhost:3000 LEXIUS_API_KEY=lx_... npx @robotixai/lexius-mcp

Claude Desktop Integration

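Add to claude_desktop_config.json, replacing the $POSTGRES_* placeholders with literal values (the JSON file is not expanded by a shell):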

{
  "mcpServers": {
    "lexius": {
      "command": "npx",
      "args": ["@robotixai/lexius-mcp"],
      "env": {
        "DATABASE_URL": "postgresql://$POSTGRES_USER:$POSTGRES_PASSWORD@localhost:5432/$POSTGRES_DB"
      }
    }
  }
}

Or use proxy mode (no local DB needed):

{
  "mcpServers": {
    "lexius": {
      "command": "npx",
      "args": ["@robotixai/lexius-mcp"],
      "env": {
        "LEXIUS_API_URL": "https://your-lexius-instance.example.com",
        "LEXIUS_API_KEY": "lx_your_key_here"
      }
    }
  }
}

Docker MCP

# Proxy mode
docker run -e LEXIUS_API_URL=https://your-lexius-instance.example.com \
           -e LEXIUS_API_KEY=lx_... \
           robotixai/lexius-mcp

# Direct mode
docker run -e DATABASE_URL=postgresql://$POSTGRES_USER:$POSTGRES_PASSWORD@host:5432/$POSTGRES_DB \
           robotixai/lexius-mcp

Provenance

Every fact Lexius returns is labelled with its trust level:

| Tier | Meaning | Enforced by |
|---|---|---|
| AUTHORITATIVE | Verbatim from official source (EUR-Lex CELLAR). SHA-256 hash-verified. | DB CHECK constraint + Specflow PROV-001 |
| CURATED | Written or reviewed by a domain expert. curated_by + reviewed_at recorded. | DB CHECK constraint + PROV-003 |
| AI_GENERATED | Model output, not expert-reviewed. generated_by_model recorded. | DB CHECK constraint |
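
As a rough illustration of the tier rules above, the following sketch models the three tiers as a TypeScript discriminated union and checks the fields each tier is said to record. The type and function names are assumptions, and the camelCase fields only mirror the `curated_by`, `reviewed_at`, and `generated_by_model` columns named in the table; the real enforcement lives in DB CHECK constraints and Specflow rules, not in this code.

```typescript
// Illustrative model of the three provenance tiers (names are assumptions).
type Provenance =
  | { tier: "AUTHORITATIVE"; sourceHash: string }
  | { tier: "CURATED"; curatedBy: string; reviewedAt: string }
  | { tier: "AI_GENERATED"; generatedByModel: string };

// Returns a list of problems; an empty list means the record is acceptable.
function validateProvenance(p: Provenance): string[] {
  const errors: string[] = [];
  switch (p.tier) {
    case "AUTHORITATIVE":
      // A SHA-256 digest is 64 lowercase hex characters.
      if (!/^[0-9a-f]{64}$/.test(p.sourceHash)) {
        errors.push("sourceHash must be a 64-character hex SHA-256 digest");
      }
      break;
    case "CURATED":
      if (p.curatedBy.trim() === "") errors.push("curatedBy is required");
      if (Number.isNaN(Date.parse(p.reviewedAt))) {
        errors.push("reviewedAt must be a parseable date");
      }
      break;
    case "AI_GENERATED":
      if (p.generatedByModel.trim() === "") {
        errors.push("generatedByModel is required");
      }
      break;
  }
  return errors;
}

// A CURATED record with no curator fails with exactly one error.
console.log(validateProvenance({ tier: "CURATED", curatedBy: "", reviewedAt: "2025-01-01" }));
```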

Data Pipeline

EUR-Lex CELLAR (XHTML)
    ↓ fetcher (lexius-fetch ingest --celex 32024R1689 or --source cima)
Articles table — 1,456 AUTHORITATIVE sections across 20 regulations, hash-verified
    ↓ extractor (lexius-fetch extract --legislation eu-ai-act)
Article Extracts — 8,123+ typed facts (fines EUR/KYD, %, dates, cross-refs, shall-clauses, imprisonment terms)
    ↓ cross-check (pnpm crosscheck)
CI fails if curated penalty amounts ≠ extracted values from verbatim law
    ↓ swarm (POST /api/v1/swarm/run)
Compliance workspace — 1,882 findings in 2.6s with 4 parallel agents
    ↓ synthesis
ComplianceReport with relianceByTier breakdown
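
The extract and cross-check steps can be pictured with a small sketch. It assumes that fines appear in the EUR-Lex style `EUR 35 000 000` (space-grouped digits) and that the cross-check simply looks for each curated amount among the extracted ones. Both functions are hypothetical simplifications, not the Lexius extractor or the `pnpm crosscheck` script, and the sample sentence is illustrative rather than verbatim law.

```typescript
// Deterministic extraction: a regex over the text, no model involved.
function extractEurAmounts(text: string): number[] {
  const pattern = /EUR\s+(\d{1,3}(?:[ \u00a0]\d{3})+)/g;
  const amounts: number[] = [];
  let m: RegExpExecArray | null;
  while ((m = pattern.exec(text)) !== null) {
    // Strip the grouping spaces (regular or non-breaking) before parsing.
    amounts.push(Number(m[1].replace(/[ \u00a0]/g, "")));
  }
  return amounts;
}

// Cross-check: every curated amount must appear among the extracted ones.
// Returns the curated amounts that have no match.
function crossCheck(curated: number[], extracted: number[]): number[] {
  return curated.filter((amount) => extracted.indexOf(amount) === -1);
}

// Illustrative sentence in the EUR-Lex number style.
const sampleText =
  "fines of up to EUR 35 000 000 or 7 % of turnover; other breaches up to EUR 15 000 000.";
const extractedAmounts = extractEurAmounts(sampleText);
const mismatches = crossCheck([35000000, 30000000], extractedAmounts);
console.log(extractedAmounts, mismatches);
```

A CI job built on this idea would exit non-zero whenever the list of mismatches is non-empty.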

Legislations

EU Regulations (via EUR-Lex CELLAR — XHTML)

| Legislation | CELEX | Articles | Extracts |
|---|---|---|---|
| GDPR | 32016R0679 | 99 | 637 |
| EU AI Act | 32024R1689 | 126 | 1,181 |
| DORA | 32022R2554 | 64 | 554 |
| Digital Services Act | 32022R2065 | 93 | 699 |
| Digital Markets Act | 32022R1925 | 54 | 475 |
| Data Act | 32023R2854 | 50 | 397 |
| Data Governance Act | 32022R0868 | 38 | 273 |
| Cyber Resilience Act | 32024R2847 | 79 | 700 |
| MiCA | 32023R1114 | 155 | 1,672 |
| eIDAS 2.0 | 32024R1183 | 48 | 335 |

Cayman Islands CIMA (via PDF)

| Legislation | Sections | Extracts |
|---|---|---|
| Monetary Authority Act (2020 Rev.) | 63 | ~180 |
| Banks and Trust Companies Act (2025 Rev.) | 29 | ~60 |
| Mutual Funds Act (2025 Rev.) | 52 | ~140 |
| Private Funds Act (2025 Rev.) | 33 | ~80 |
| Securities Investment Business Act (2020 Rev.) | 44 | ~100 |
| Insurance Act (2010) | 41 | ~80 |
| Anti-Money Laundering Regulations (2025 Rev.) | 107 | ~250 |
| Virtual Asset (Service Providers) Act (2024 Rev.) | 41 | ~110 |
| Proceeds of Crime Act (2024 Rev.) | 205 | ~350 |
| Beneficial Ownership Transparency Act (2023) | 35 | ~50 |

Totals

| Jurisdiction | Legislations | Sections/Articles | Extracts |
|---|---|---|---|
| EU | 10 | 806 | 6,923 |
| Cayman Islands | 10 | 650 | ~1,200 |
| Total | 20 | 1,456 | ~8,123 |

Adding a New Regulation

# EU regulation (via EUR-Lex CELLAR)
lexius-fetch ingest --celex <CELEX> --legislation <id>

# Offshore regulation (via PDF)
lexius-fetch ingest --source pdf --url <pdf-url> --legislation <id>

# All CIMA acts (via registry)
lexius-fetch ingest --source cima

# Auto-populate derivedFrom on curated rows from cross-references
lexius-fetch backfill-derivation --legislation <id> --apply

Hivemind Swarm

Parallel compliance analysis via autonomous agents sharing a Postgres workspace (stigmergic communication):

  • Deterministic — no LLM in the agent loop; same data = same findings
  • Fast — 1,882 findings in ~2.6s (4 agents, 126 EU AI Act articles)
  • Gap detection — discovers obligations in the law that the curated set misses (701 candidates)
  • Provenance — every finding inherits AUTHORITATIVE / CURATED / AI_GENERATED from its source

# Via API
curl -X POST localhost:3000/api/v1/swarm/run \
  -H "Authorization: Bearer lx_..." \
  -H "Content-Type: application/json" \
  -d '{"legislationId":"eu-ai-act","concurrency":4}'

# Via MCP
# Tool: legalai_run_swarm_assessment({ legislationId: "eu-ai-act" })
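
To make the "shared workspace, atomic claims, deterministic output" idea concrete, here is a minimal in-memory sketch. An in-process `Set` stands in for the Postgres workspace, and every name (`Workspace`, `runAgent`, `runSwarm`) is hypothetical; the real swarm claims work through the database, which this sketch does not attempt to reproduce.

```typescript
interface Finding {
  articleId: string;
  text: string;
}

// Stand-in for the shared workspace. In Postgres the claim would be a single
// atomic statement; single-threaded JavaScript gives the same guarantee here.
class Workspace {
  private claimed = new Set<string>();
  readonly findings: Finding[] = [];

  // Returns true only for the first agent that claims a given article.
  claim(articleId: string): boolean {
    if (this.claimed.has(articleId)) return false;
    this.claimed.add(articleId);
    return true;
  }
}

// Each agent walks the full article list and works only on what it claims.
async function runAgent(articles: string[], ws: Workspace): Promise<void> {
  for (const articleId of articles) {
    if (!ws.claim(articleId)) continue;
    await Promise.resolve(); // yield so that agents interleave
    // The finding depends only on the article, never on which agent ran it.
    ws.findings.push({ articleId, text: `checked ${articleId}` });
  }
}

// Runs `concurrency` agents and returns findings in a stable, sorted order.
async function runSwarm(articles: string[], concurrency: number): Promise<string[]> {
  const ws = new Workspace();
  const agents: Promise<void>[] = [];
  for (let i = 0; i < concurrency; i++) agents.push(runAgent(articles, ws));
  await Promise.all(agents);
  return ws.findings.map((f) => f.text).sort();
}

runSwarm(["art-1", "art-2", "art-3", "art-4", "art-5"], 4).then((result) => {
  console.log(result);
});
```

Because findings never depend on which agent claimed an article and the output is sorted, repeated runs over the same input give the same result regardless of interleaving.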

API Endpoints

POST /api/v1/classify                     Classify an AI system
GET  /api/v1/obligations                  Obligations by role/risk level
POST /api/v1/penalties/calculate          Calculate penalty exposure
GET  /api/v1/articles/:number             Verbatim article text
GET  /api/v1/articles/:id/history         Article revision history
GET  /api/v1/articles/:id/extracts        Extracted facts (fines, dates, cross-refs)
GET  /api/v1/deadlines                    Compliance deadlines
GET  /api/v1/obligations/:id/derivation   Trace obligation to source articles
POST /api/v1/knowledge/search             Semantic search
POST /api/v1/faq/search                   FAQ search
GET  /api/v1/legislations                 List legislations
POST /api/v1/audit                        Full compliance report
POST /api/v1/swarm/run                    Start hivemind swarm
GET  /api/v1/swarm/:sessionId/findings    Query swarm workspace
POST /api/v1/swarm/:sessionId/synthesise  Report from swarm findings
GET  /health                              DB stats + uptime
GET  /mcp/sse                             Remote MCP via SSE
GET  /integration-manifest.json           Claude.ai integration manifest

Auth: Authorization: Bearer lx_... on all /api/v1/* routes. Health + manifest are unauthenticated.
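
A client call against these routes might be assembled as in the sketch below. It only builds the request for POST /api/v1/swarm/run, using the body fields shown in the curl example in this README; `buildSwarmRunRequest` is a hypothetical helper, and the response shape is deliberately not modelled because it is not documented here.

```typescript
interface SwarmRunBody {
  legislationId: string;
  concurrency: number;
}

// Hypothetical helper: returns the URL and fetch options for a swarm run.
function buildSwarmRunRequest(baseUrl: string, apiKey: string, body: SwarmRunBody) {
  return {
    // Strip trailing slashes so the path is joined cleanly.
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/swarm/run`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Example values only; the key below is a placeholder, not a real credential.
const swarmRequest = buildSwarmRunRequest("http://localhost:3000/", "lx_example_key", {
  legislationId: "eu-ai-act",
  concurrency: 4,
});
console.log(swarmRequest.url);
// To send it (Node 18+ has a global fetch):
//   const res = await fetch(swarmRequest.url, swarmRequest.init);
```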

MCP Tools (13)

| Tool | Description |
|---|---|
| legalai_classify_system | Risk classification (signals + keywords + semantic) |
| legalai_get_obligations | Obligations filtered by role and risk level |
| legalai_calculate_penalty | Penalty calculation with SME rules |
| legalai_get_article | Verbatim article text (AUTHORITATIVE) |
| legalai_get_deadlines | Compliance deadlines with days remaining |
| legalai_search_knowledge | Semantic search across all content |
| legalai_answer_question | FAQ lookup |
| legalai_run_assessment | Structured assessments (Art. 6(3), GPAI) |
| legalai_list_legislations | Available legislations |
| legalai_get_article_history | Article revision history |
| legalai_get_derivation_chain | Obligation → source article trace |
| legalai_get_article_extracts | Deterministically extracted facts |
| legalai_run_swarm_assessment | Parallel hivemind assessment |

CLI Commands

npx @robotixai/lexius-cli legislations
npx @robotixai/lexius-cli classify --legislation eu-ai-act --description "recruitment AI" --role provider
npx @robotixai/lexius-cli obligations --legislation eu-ai-act --role provider --risk-level high-risk
npx @robotixai/lexius-cli penalty --legislation eu-ai-act --violation high-risk-non-compliance --turnover 500000000
npx @robotixai/lexius-cli article 99 --legislation eu-ai-act
npx @robotixai/lexius-cli deadlines --legislation eu-ai-act
npx @robotixai/lexius-cli search "risk management" --legislation eu-ai-act --type obligation
npx @robotixai/lexius-cli audit --legislation eu-ai-act --description "recruitment AI" --role provider
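
For orientation, the arithmetic behind a penalty query like the one above can be sketched from Article 99 of the EU AI Act: each tier has a fixed cap and a percentage of worldwide annual turnover, the higher of the two applies to undertakings, and the lower applies to SMEs. The sketch below is not the Lexius implementation. Mapping the CLI's `high-risk-non-compliance` value to the EUR 15 000 000 / 3 % tier is an assumption, and the other two keys are hypothetical labels.

```typescript
interface FineTier {
  fixedCapEur: number;
  turnoverPercent: number;
}

// Caps from Art. 99(3)-(5) of Regulation (EU) 2024/1689.
// Only "high-risk-non-compliance" appears in this README; the other keys are made up.
const FINE_TIERS: Record<string, FineTier> = {
  "prohibited-practice": { fixedCapEur: 35000000, turnoverPercent: 7 },
  "high-risk-non-compliance": { fixedCapEur: 15000000, turnoverPercent: 3 },
  "incorrect-information": { fixedCapEur: 7500000, turnoverPercent: 1 },
};

// Upper bound of the fine: the higher of the two caps for undertakings,
// the lower of the two for SMEs (Art. 99(6)).
function maxFineEur(violation: string, turnoverEur: number, isSme: boolean): number {
  const tier = FINE_TIERS[violation];
  if (!tier) throw new Error(`unknown violation type: ${violation}`);
  const turnoverCap = (turnoverEur * tier.turnoverPercent) / 100;
  return isSme
    ? Math.min(tier.fixedCapEur, turnoverCap)
    : Math.max(tier.fixedCapEur, turnoverCap);
}

// With a turnover of EUR 500,000,000, 3 % equals the fixed cap of EUR 15,000,000.
console.log(maxFineEur("high-risk-non-compliance", 500000000, false));
```

These figures are upper bounds only; the actual fine in a given case is set by the competent authority.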

Contract Enforcement

22 contracts, 64 rules enforced by Specflow:

npx @robotixai/specflow-cli enforce .

| Category | Contracts | Key Rules |
|---|---|---|
| Architecture | arch_clean_layers, arch_package_boundaries, arch_legislation_plugins | Domain has no infra imports; consumers don't cross-import |
| Provenance | provenance_tiers | Every entity has provenance_tier; fetcher = AUTHORITATIVE only; seeds use helpers |
| Extractor | extractor_determinism | Pure/sync modules; no LLM; cross-check exits non-zero on mismatch |
| Integration | integration_security | No key hashes in responses; SSE uses auth |
| Swarm | hivemind_swarm | No LLM in agent loop; atomic claims; cleanup complete |
| Offshore | offshore_adapters | No LLM in PDF parsing; source_format=pdf; section merge; dynamic header detection |
| Model Harness | model_harness | No direct SDK imports in agent code; providers don't import domain |
| Fetcher | fetcher_verbatim | Records sourceHash + fetchedAt |
| Curator | curator_audit, curator_auth, curator_integrity | Audit atomicity, role-gated routes, AUTHORITATIVE immutable, tier transitions, If-Match concurrency, derivedFrom anchoring (C-INT-007) |
| Audit | audit_report_integrity, audit_enhancement_layer, audit_agent_layer | GenerateAuditReport is deterministic; enhancement via port |
| Security | security_secrets, security_sql_safety, security_input_validation, security_no_eval | No hardcoded creds; parameterised queries; Zod validation |
| Quality | qa_domain_types | No any in domain layer |

Testing

pnpm test                              # All tests
pnpm --filter @lexius/core test        # 183 unit tests
pnpm --filter @lexius/api test         # 36 functional tests
pnpm --filter @lexius/fetcher test     # 78 extractor + parser tests
pnpm crosscheck                        # Penalty cross-check vs verbatim law
npx @robotixai/specflow-cli enforce .  # Specflow contract enforcement

Documentation

Full spec documents in docs/:

| Type | Count | Index |
|---|---|---|
| PRD (Product Requirements) | 12 | docs/prd/INDEX.md |
| ARD (Architecture Decisions) | 16 | docs/ard/INDEX.md |
| DDD (Domain Design) | 15 | docs/ddd/INDEX.md |

Environment Variables

| Variable | Required | Description |
|---|---|---|
| DATABASE_URL | Yes | PostgreSQL connection string |
| OPENAI_API_KEY | Embeddings | OpenAI API key for semantic search |
| ANTHROPIC_API_KEY | Agent only | Anthropic API key for Claude agent |
| DB_PASSWORD | Docker | PostgreSQL password for docker-compose |
| LEXIUS_API_URL | MCP proxy | Hosted API URL (proxy mode) |
| LEXIUS_API_KEY | MCP proxy | API key for hosted API |
| LEXIUS_PROFILE | Curator | Profile name to load from credentials file (default: default) |
| LEXIUS_CREDENTIALS_FILE | Curator | Override path to the credentials file (default: ~/.config/lexius/credentials) |
| LEXIUS_CURATOR_ID | Curator | Override the curator identity stamped on edits |
| LEXIUS_ROLE | MCP | Force the MCP server role (reader or curator), bypassing credentials file |

Curator Workflow

Named domain experts can edit CURATED-tier facts (obligations in v1) via CLI without a code deploy. Every edit is audited, concurrency-safe, and re-embedded.

# 1. Admin generates a curator key
pnpm create-api-key --role curator --owner expert@example.com

# 2. Expert logs in (paste-a-key flow)
npx @robotixai/lexius-curate login --key lx_curator_... --url https://lexius.example.com
npx @robotixai/lexius-curate whoami

# 3. Expert works: dry-run by default, --apply to commit
npx @robotixai/lexius-curate obligations list --stale
npx @robotixai/lexius-curate obligations edit eu-ai-act-art-9-provider \
  --row-version 3 \
  --changes '{"obligation":"Establish and maintain a risk management system"}' \
  --reason "clarifying per Art. 9(2)" \
  --apply

# 4. View history + revert
npx @robotixai/lexius-curate obligations history eu-ai-act-art-9-provider
npx @robotixai/lexius-curate revert <edit_id> --reason "too aggressive" --apply

Guarantees (PRD-013 / ARD-017):

  • Transactional: row update, audit insert, re-embed all land in one DB transaction.
  • Concurrency-safe: If-Match: <row_version> required; mismatch returns 409.
  • Auditable: every edit writes an append-only curator_edits row with editor, source (cli/api/mcp/etc.), reason, old/new values, and row_version before/after.
  • Anchored: every CURATED row carries non-empty derivedFrom resolving to AUTHORITATIVE articles. No orphan interpretations.
  • Honest: curators cannot override mined facts. If an article_extracts value is wrong, fix the extractor — don't paper over it with a curator edit.
  • Staleness-aware: when the fetcher re-ingests an article with a changed source_hash, every CURATED obligation citing that article is flagged needs_review. Curators triage via lexius-curate obligations list --stale.
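
The concurrency and audit guarantees above follow the usual optimistic-locking pattern, which the sketch below illustrates with an in-memory store. The class, field names, and result shape are assumptions for illustration; the real system performs the update, audit insert, and re-embed inside one Postgres transaction, which is not modelled here.

```typescript
interface Obligation {
  id: string;
  obligation: string;
  rowVersion: number;
}

interface CuratorEdit {
  obligationId: string;
  editor: string;
  reason: string;
  oldValue: string;
  newValue: string;
  versionBefore: number;
  versionAfter: number;
}

type EditResult =
  | { status: 200; row: Obligation }
  | { status: 409; currentVersion: number };

class ObligationStore {
  private rows = new Map<string, Obligation>();
  readonly audit: CuratorEdit[] = []; // append-only: entries are never changed

  constructor(initial: Obligation[]) {
    for (const r of initial) this.rows.set(r.id, { ...r });
  }

  // ifMatch plays the role of the If-Match: <row_version> header.
  edit(id: string, ifMatch: number, newText: string, editor: string, reason: string): EditResult {
    const row = this.rows.get(id);
    if (!row) throw new Error(`unknown obligation: ${id}`);
    if (row.rowVersion !== ifMatch) {
      return { status: 409, currentVersion: row.rowVersion };
    }
    const updated: Obligation = { ...row, obligation: newText, rowVersion: row.rowVersion + 1 };
    this.rows.set(id, updated);
    this.audit.push({
      obligationId: id,
      editor,
      reason,
      oldValue: row.obligation,
      newValue: updated.obligation,
      versionBefore: row.rowVersion,
      versionAfter: updated.rowVersion,
    });
    return { status: 200, row: updated };
  }
}

// Two curators both read version 3; only the first write succeeds.
const store = new ObligationStore([
  { id: "eu-ai-act-art-9-provider", obligation: "Maintain a risk management system", rowVersion: 3 },
]);
const firstEdit = store.edit("eu-ai-act-art-9-provider", 3, "Establish and maintain a risk management system", "expert-a", "clarifying per Art. 9(2)");
const secondEdit = store.edit("eu-ai-act-art-9-provider", 3, "Another wording", "expert-b", "conflicting edit");
console.log(firstEdit.status, secondEdit.status, store.audit.length);
```

The second curator has to re-read the row, pick up version 4, and resubmit, so no edit is silently overwritten.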

See PRD-013, ARD-017, and DDD-016 for design detail.

Tech Stack

  • Runtime: Node.js 20+ (ESM)
  • Language: TypeScript (strict)
  • Database: PostgreSQL 16 + pgvector
  • ORM: Drizzle
  • Embeddings: OpenAI text-embedding-3-small (1536 dimensions)
  • API: Express 5
  • MCP: @modelcontextprotocol/sdk (stdio + SSE)
  • CLI: Commander
  • Agent: @anthropic-ai/sdk (Claude, temperature 0)
  • Bundler: esbuild
  • Monorepo: Turborepo + pnpm workspaces
  • PDF Parsing: pdfjs-dist (offshore legislation)
  • Contracts: Specflow (20 contracts, 45 rules)
  • Testing: Vitest + Supertest (297 tests)

License

MIT

About

AI regulatory compliance tool with provenance-tracked verbatim law. 20 regulations, 2 jurisdictions (EU + Cayman Islands CIMA). 1,456 articles, 8,123 deterministic extracts. No hallucinated numbers. Works with Claude, GPT-4, Ollama, or any model via OpenRouter.
