DeerFlow

Static analysis and policy framework for AI coding agents — 21-gate compliance engine, fabrication detector, and tamper-proof lock

Honest Disclosure — Read First

This is a static-analysis / guardrail framework for AI coding agents, not a runtime library for production applications, and not a workflow engine that replaces Temporal, Inngest, or AWS Step Functions.

What this repo is

Approximately 18K lines of TypeScript organized as:

A 21-gate compliance engine (deerflow/enforcement/) running at dev / CI time to keep AI coding agents honest.
A fabrication detector with 8 detection branches.
A tamper-proof lock using SHA-256 hashes.
An agent guard and agent contract layer.
A reference Fastify server (src/) and CLI (bin/deerflow.ts) showing how to wire the gates into an HTTP layer.
A Hera subsystem (deerflow/hera/) for TF-IDF conversation memory and agent specialization.
An 8-phase workflow pipeline enforcer (deerflow/workflow.ts).

What this repo is not

Not a durable workflow engine. deerflow/workflow.ts is a pipeline enforcer for a single-machine dev session with 8 phases (ANALYZE → PLAN → SCAFFOLD → IMPLEMENT → VALIDATE → TEST → SECURITY → QUALITY_GATE). The @see https://temporal.io/workflows comment in that file points to the general workflow concept; it is not a claim of feature parity. There is no database, no durable execution, no retry logic, and no distributed orchestration.
Not a fabrication oracle. The fabrication detector is heuristic static analysis. It produces signals, not verdicts.
Not a distributed memory layer. Hera runs in-process only. State is not persisted to a database and is not shared across processes or machines.
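
To make the "pipeline enforcer, not workflow engine" distinction concrete, here is a minimal sketch of an in-memory phase-order check. It is not the contents of deerflow/workflow.ts: the class name `PhaseEnforcer` and its members are hypothetical, and only the eight phase names are taken from this README.

```typescript
// Hypothetical sketch: phase names come from this README; the class and its
// methods are illustrative assumptions, not the API of deerflow/workflow.ts.
const PHASES = [
  "ANALYZE", "PLAN", "SCAFFOLD", "IMPLEMENT",
  "VALIDATE", "TEST", "SECURITY", "QUALITY_GATE",
] as const;

type Phase = (typeof PHASES)[number];

class PhaseEnforcer {
  // Index of the next phase that is allowed to complete.
  private nextIndex = 0;

  /** Marks a phase complete; throws if it is attempted out of order. */
  complete(phase: Phase): void {
    const expected = PHASES[this.nextIndex];
    if (expected === undefined) {
      throw new Error("All phases are already complete");
    }
    if (phase !== expected) {
      throw new Error(`Expected phase ${expected}, got ${phase}`);
    }
    this.nextIndex += 1;
  }

  /** True once every phase has completed in order. */
  get done(): boolean {
    return this.nextIndex === PHASES.length;
  }
}

const enforcer = new PhaseEnforcer();
enforcer.complete("ANALYZE");
enforcer.complete("PLAN");

let skippedPhaseRejected = false;
try {
  enforcer.complete("IMPLEMENT"); // SCAFFOLD has not run yet
} catch {
  skippedPhaseRejected = true;
}
console.log(skippedPhaseRejected); // prints true
```

All state in this sketch lives in one object in one process, so it is lost when the process exits. That is the gap a durable engine fills.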

Limitations

Fabrication detector is heuristic. The 8 detection branches (INVENTED_API, FAKE_IMPORT, HALLUCINATED_TYPE, IMPOSSIBLE_RETURN, PHANTOM_METHOD, SPECULATIVE_CODE, COPY_PASTE_SMELL, UNVERIFIABLE_LOGIC) rely on pattern matching plus cross-reference against package.json and the TypeScript type checker. They produce false positives (flagging legitimate code) and false negatives (missing subtle fabrication). Read the gate output as a signal, not as an absolute verdict.
Hera is in-process. TF-IDF conversation memory and agent specialization are demo-grade. They do not persist to a database and are not distributed.
No performance or load testing. 92 tests (unit + integration + e2e) pass at the time of this commit; tsc builds clean. The e2e tests call createApp() and hit /health, /ready, /live with no external network calls. There is no benchmark, no load test, and no fuzz test for the fabrication detector.
Not a replacement for human review or a full type checker. The gates complement, but do not replace, careful code review and a properly configured TypeScript compiler.
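
The package.json cross-reference mentioned above can be pictured with a small sketch of a FAKE_IMPORT-style check. This is an illustration under stated assumptions, not DeerFlow's detector: `findUndeclaredImports`, `packageNameOf`, the regex, and the sample manifest are all hypothetical.

```typescript
// Hypothetical sketch of a FAKE_IMPORT-style heuristic. Names, regex, and the
// sample data are illustrative assumptions, not DeerFlow's implementation.

/** Minimal shape of the package.json fields this sketch reads. */
interface PackageManifest {
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

/** Reduces an import specifier to its package name, e.g. "zod/lib" -> "zod". */
function packageNameOf(specifier: string): string {
  const parts = specifier.split("/");
  if (specifier.startsWith("@")) {
    return parts.slice(0, 2).join("/"); // scoped package: keep "@scope/name"
  }
  return parts[0] ?? specifier;
}

/** Returns package names that are imported but not declared in the manifest. */
function findUndeclaredImports(source: string, manifest: PackageManifest): string[] {
  const declared = new Set<string>([
    ...Object.keys(manifest.dependencies ?? {}),
    ...Object.keys(manifest.devDependencies ?? {}),
  ]);
  // Matches `from "x"` and side-effect imports such as `import "x"`.
  const importPattern = /(?:from|import)\s+["']([^"']+)["']/g;
  const undeclared = new Set<string>();
  let match: RegExpExecArray | null = importPattern.exec(source);
  while (match !== null) {
    const specifier = match[1];
    const isLocalOrBuiltin =
      specifier === undefined ||
      specifier.startsWith(".") ||
      specifier.startsWith("/") ||
      specifier.startsWith("node:");
    if (specifier !== undefined && !isLocalOrBuiltin) {
      const packageName = packageNameOf(specifier);
      if (!declared.has(packageName)) undeclared.add(packageName);
    }
    match = importPattern.exec(source);
  }
  return Array.from(undeclared);
}

// Made-up manifest and source text, for illustration only.
const manifest: PackageManifest = { dependencies: { fastify: "^4.0.0", zod: "^3.0.0" } };
const sampleSource = [
  'import Fastify from "fastify";',
  'import { z } from "zod";',
  'import { readFile } from "node:fs/promises";',
  'import { helper } from "./helper";',
  'import { magic } from "@acme/does-not-exist/utils";',
].join("\n");

console.log(findUndeclaredImports(sampleSource, manifest)); // prints [ '@acme/does-not-exist' ]
```

A regex pass like this also shows where false positives come from: a tsconfig path alias or a bare builtin such as `fs` would be flagged unless it is allow-listed, and an import string inside a comment would match too.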

Alternatives

Need	Use instead	Stars
Durable workflow orchestration	Temporal	~13k★
Durable workflow orchestration	Inngest	~5k★
Durable workflow orchestration	DBOS	~1k★
Durable workflow orchestration	Restate	~7k★
Long-running memory for agents	mem0	~25k★
Long-running memory for agents	Letta (formerly MemGPT)	~13k★
Long-running memory for agents	Zep	~3k★

For production use, you still need

Rate limiting, authentication and authorization, persistent storage for tamper-proof-lock and agent sessions, distributed coordination, observability (OpenTelemetry), and a measured false-positive / false-negative evaluation of each gate on a real codebase before enabling --max-warnings=0.
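
One way to approach the false-positive / false-negative evaluation is to hand-label a sample of gate findings and compute precision and recall per gate. The sketch below assumes a hypothetical `LabelledSample` shape and uses made-up labels; it is not part of DeerFlow.

```typescript
// Hypothetical sketch: scores one gate against hand-labelled samples.
// The types, function name, and sample data are illustrative assumptions.
interface LabelledSample {
  flagged: boolean;    // did the gate flag this sample?
  fabricated: boolean; // human label: is the code actually fabricated?
}

interface GateEvaluation {
  truePositives: number;
  falsePositives: number;
  falseNegatives: number;
  trueNegatives: number;
  precision: number; // share of flagged samples that were really fabricated
  recall: number;    // share of fabricated samples that the gate flagged
}

function evaluateGate(samples: readonly LabelledSample[]): GateEvaluation {
  let truePositives = 0;
  let falsePositives = 0;
  let falseNegatives = 0;
  let trueNegatives = 0;
  for (const sample of samples) {
    if (sample.flagged && sample.fabricated) truePositives += 1;
    else if (sample.flagged) falsePositives += 1;
    else if (sample.fabricated) falseNegatives += 1;
    else trueNegatives += 1;
  }
  // NaN marks an undefined ratio instead of reporting a misleading 0.
  const flaggedTotal = truePositives + falsePositives;
  const fabricatedTotal = truePositives + falseNegatives;
  return {
    truePositives,
    falsePositives,
    falseNegatives,
    trueNegatives,
    precision: flaggedTotal === 0 ? NaN : truePositives / flaggedTotal,
    recall: fabricatedTotal === 0 ? NaN : truePositives / fabricatedTotal,
  };
}

// Made-up labels, for illustration only.
const samples: LabelledSample[] = [
  { flagged: true, fabricated: true },
  { flagged: true, fabricated: true },
  { flagged: true, fabricated: true },
  { flagged: true, fabricated: false },  // false positive
  { flagged: false, fabricated: true },  // false negative
  { flagged: false, fabricated: false },
  { flagged: false, fabricated: false },
];

const evaluation = evaluateGate(samples);
console.log(evaluation.precision, evaluation.recall); // prints 0.75 0.75
```

With --max-warnings=0, every false positive blocks the pipeline, so precision is the number to watch before turning that flag on.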

Features

21-Gate Compliance Engine — Multi-gate enforcement with a zero-mock-data policy.
Fabrication Detector — 8 detection branches for invented APIs, fake imports, hallucinated types, impossible returns, phantom methods, speculative code, copy-paste smell, and unverifiable logic.
Tamper-Proof Lock — SHA-256 hash verification of artifacts and contracts.
Agent Guard — Real-time monitoring of agent behavior.
Workflow Enforcer — 8-phase pipeline enforcer for single-machine dev sessions (not a durable workflow engine — see disclosure above).
Skill Modules — Pluggable skills for code-review, security, test, UI, and search.
Hera Subsystem — TF-IDF conversation memory and agent specialization (in-process, not distributed).
CLI Tool — Command-line interface for quality checks and compliance.

Tech Stack

Category	Technology
Language	TypeScript 5
Validation	Zod 3
Testing	Vitest 2
Linting	ESLint 9
Formatting	Prettier 3
Runtime	Node.js 20+

Getting Started

Prerequisites

Node.js 20+ and npm 10+

Installation

# Clone the repository
git clone https://github.com/ntd25022006q/deerflow.git
cd deerflow

# Install dependencies
npm install

# Run quality gate checks
npm run quality-gate

Available Scripts

Script	Description
`npm run dev`	Start development server with hot reload
`npm run build`	Compile TypeScript to dist/
`npm run start`	Run compiled production server
`npm run lint`	Check code with ESLint
`npm run type-check`	TypeScript compilation check
`npm run test`	Run Vitest test suite
`npm run test:coverage`	Run tests with coverage
`npm run format:check`	Verify Prettier formatting
`npm run quality-gate`	Run full quality pipeline
`npm run agent-guard`	Run agent guard checks
`npm run compliance`	Run 21-gate compliance checks

21-Gate Quality Pipeline

Every gate must pass or the agent is locked.

Gate	Name	Enforces
1	ACTION_GUARD	Pre-action validation — blocks before damage
2	FILE_GUARD	Filesystem integrity — no deleted critical files
3	ANTI_PATTERN	Code quality — no `any`, mock data, secrets
4	DEPENDENCY_GUARD	Library conflicts and banned packages
5	SECURITY_DEEP	Deep security scan (OWASP, CWE, injection)
6	STRUCTURE_ANALYZE	Dead code, circular deps, nesting, coupling
7	TEST_QUALITY	Assertion density, error paths, no skips
8	BUILD_INTEGRITY	Build output completeness and size
9	SOURCE_CITATION	Evidence-based code (anti-fabrication policy)
10	ENFORCEMENT_REGISTRY	Rule registry validation with evidence
11	TYPE_CHECK	TypeScript compilation clean
12	LINT	ESLint zero errors and warnings
13	TEST	Test suite and coverage >= 80%
14	SECURITY	npm audit (moderate+ vulnerabilities)
15	AGENT_COORDINATOR	Session management and lock state
16	EVIDENCE_VALIDATOR	Active evidence enforcement (papers/repos/docs)
17	QUALITY_OVER_SPEED	Quality gates — no rushing, no shortcuts
18	TAMPER_PROOF_LOCK	Cryptographic lock verification (SHA-256)
19	FABRICATION_DETECTOR	Detect fabricated or invented code
20	AGENT_CONTRACT	Contract compliance verification
21	COMPLETION_GUARANTEE	Mandatory task completion
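
The "every gate must pass or the agent is locked" rule can be sketched as a small runner over gate objects. The `Gate` interface, `runGates`, and the choice to run all gates rather than stop at the first failure are assumptions made for illustration; the gate names and the 80% coverage threshold come from the table above.

```typescript
// Hypothetical sketch of an "all gates must pass" runner. The interfaces and
// the run-everything behaviour are assumptions, not DeerFlow's engine.
interface GateResult {
  passed: boolean;
  reason?: string;
}

interface Gate {
  id: number;
  name: string;
  run(): GateResult;
}

interface PipelineResult {
  locked: boolean;    // true if at least one gate failed
  failures: string[]; // one human-readable entry per failed gate
}

function runGates(gates: readonly Gate[]): PipelineResult {
  const failures: string[] = [];
  for (const gate of gates) {
    const result = gate.run();
    if (!result.passed) {
      failures.push(`${gate.id} ${gate.name}: ${result.reason ?? "no reason given"}`);
    }
  }
  return { locked: failures.length > 0, failures };
}

// Two stand-in gates with hard-coded results, for illustration only.
const demoGates: Gate[] = [
  { id: 11, name: "TYPE_CHECK", run: () => ({ passed: true }) },
  { id: 13, name: "TEST", run: () => ({ passed: false, reason: "coverage 72% is below 80%" }) },
];

const outcome = runGates(demoGates);
console.log(outcome.locked);   // prints true
console.log(outcome.failures); // prints [ '13 TEST: coverage 72% is below 80%' ]
```

Collecting every failure instead of stopping at the first one gives the agent a complete list to fix in one pass, at the cost of running gates whose result can no longer change the outcome.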

Architecture

flowchart TB
    subgraph CLI["CLI (bin/deerflow.ts)"]
        A1[deerflow check]
        A2[deerflow enforce]
        A3[deerflow lock]
    end

    subgraph Server["Reference Server (src/)"]
        B1[Fastify HTTP]
        B2["/health /ready /live"]
    end

    subgraph Core["DeerFlow Engine (deerflow/)"]
        C1[workflow.ts<br/>8-phase enforcer]
        C2[agent-guard.ts<br/>behavior monitor]
    end

    subgraph Enforcement["21-Gate Compliance (deerflow/enforcement/)"]
        D1[Fabrication Detector<br/>8 branches]
        D2[Tamper-Proof Lock<br/>SHA-256]
        D3[Compliance Engine<br/>orchestrator]
        D4[Evidence Validator]
        D5[Source Citation]
        D6[Quality Over Speed]
        D7[16 more gates...]
    end

    subgraph Hera["Hera Self-Evolution (deerflow/hera/)"]
        E1[TF-IDF Index<br/>pure TypeScript]
        E2[Conversation Memory]
        E3[Agent Specialization]
        E4[Evolution Engine]
        E5[Adaptive Coordinator]
    end

    subgraph Skills["Skill Modules (deerflow/skills/)"]
        F1[code-review]
        F2[security]
        F3[test]
        F4[ui]
        F5[search]
    end

    CLI --> Core
    Server --> Core
    Core --> Enforcement
    Core --> Hera
    Core --> Skills
    Enforcement --> D3
    D3 --> D1 & D2 & D4 & D5 & D6 & D7

Usage Examples

Programmatic: Run the fabrication detector

import { FabricationDetector } from '@deerflow/enforcement/fabrication-detector';

const detector = new FabricationDetector({
  workingDir: process.cwd(),
  scanDirs: ['src'],
  minFabricationScore: 80,
  maxCriticalFindings: 0,
});

const result = detector.scan();

if (!result.passed) {
  console.error(result.lockReason);
  for (const finding of result.findings) {
    console.error(`  [${finding.severity}] ${finding.file}:${finding.line} — ${finding.message}`);
  }
  process.exit(1);
}

Programmatic: Use the Hera TF-IDF memory

import { TFIDFIndex } from '@deerflow/hera/tfidf-index';

const index = new TFIDFIndex({ dataDir: '.agent/hera' });
index.initialize();

// Add conversations to the corpus — IDF weights evolve
index.addDocument('User asked about authentication patterns in Fastify');
index.addDocument('Discussed JWT vs session cookies for SPA authentication');

// Embed a new query and compare against stored vectors
const query = index.embed('How should I handle auth in my API?');
const stored = index.embed('JWT vs session cookies for SPA authentication');
const similarity = index.cosineSimilarity(query.vector, stored.vector);

console.log(`Similarity: ${similarity.toFixed(3)}`); // e.g. 0.42; the actual value depends on the corpus
index.save(); // Persist vocabulary for next session

CLI: Run the full 21-gate pipeline

# Run all gates, exit non-zero on any failure
npx deerflow enforce --strict

# Run only specific gates
npx deerflow enforce --only FABRICATION_DETECTOR,TAMPER_PROOF_LOCK

# Generate a tamper-proof lock file
npx deerflow lock --out .agent/lock.json

Project Structure

deerflow/
├── src/                        # Reference Fastify server showing how to wire gates
│   ├── index.ts                # Main entry — createApp()
│   ├── server.ts               # HTTP server
│   ├── routes/                 # /health, /ready, /live
│   └── services/               # Business logic
├── bin/                        # CLI entry point
│   └── deerflow.ts
├── deerflow/                   # The framework itself
│   ├── enforcement/            # 21-gate compliance engine (20 modules)
│   ├── hera/                   # TF-IDF memory + agent specialization (6 modules)
│   ├── skills/                 # Pluggable skills (5 modules)
│   ├── agent-guard.ts          # Real-time agent behavior monitor
│   ├── workflow.ts             # 8-phase pipeline enforcer
│   └── index.ts                # Public API
├── eslint-plugins/             # Custom ESLint rules for fabrication detection
├── tests/
│   ├── unit/                   # 85 unit tests
│   ├── integration/            # 3 integration tests
│   └── e2e/                    # 4 e2e tests
├── docker/                     # Dockerfile + docker-compose
└── templates/                  # Project templates

Testing

# Run all tests
npm run test

# Run specific test types
npm run test:unit
npm run test:integration
npm run test:e2e

# Run with coverage
npm run test:coverage

92 tests pass: 85 unit (config, hera TF-IDF math, index API), 3 integration, 4 e2e (createApp() hitting /health, /ready, /live with no external network calls).

The Hera test suite explicitly validates mathematical properties:

TF-IDF vectors are deterministic (same input → identical weights to 10 decimal places)
Cosine similarity is bounded in [0, 1] for non-negative vectors
Rare terms receive higher IDF than common terms
Agent specialization weights converge toward 1.0 with repeated success
Softmax produces a valid probability distribution
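
Several of these properties can be reproduced on a toy corpus. The sketch below is not Hera's implementation: the function names, the tokenizer, and the smoothed IDF formula ln((1 + N) / (1 + df)) + 1 are assumptions chosen for illustration.

```typescript
// Hypothetical sketch; not Hera's code. Tokenizer and IDF smoothing are assumptions.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((token) => token.length > 0);
}

/** Smoothed inverse document frequency for every term in the corpus. */
function computeIdf(corpus: readonly string[]): Map<string, number> {
  const documentFrequency = new Map<string, number>();
  for (const doc of corpus) {
    new Set(tokenize(doc)).forEach((term) => {
      documentFrequency.set(term, (documentFrequency.get(term) ?? 0) + 1);
    });
  }
  const idf = new Map<string, number>();
  documentFrequency.forEach((df, term) => {
    idf.set(term, Math.log((1 + corpus.length) / (1 + df)) + 1);
  });
  return idf;
}

/** TF-IDF weights for one text; terms never seen in the corpus are dropped. */
function embed(text: string, idf: Map<string, number>): Map<string, number> {
  const tokens = tokenize(text);
  const vector = new Map<string, number>();
  for (const term of tokens) {
    const weight = idf.get(term);
    if (weight === undefined) continue;
    // Adding idf / length once per occurrence yields (count / length) * idf.
    vector.set(term, (vector.get(term) ?? 0) + weight / tokens.length);
  }
  return vector;
}

function cosineSimilarity(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((weight, term) => {
    dot += weight * (b.get(term) ?? 0);
    normA += weight * weight;
  });
  b.forEach((weight) => {
    normB += weight * weight;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

/** Numerically stable softmax: subtracts the maximum before exponentiating. */
function softmax(scores: readonly number[]): number[] {
  const maxScore = Math.max(...scores);
  const exps = scores.map((score) => Math.exp(score - maxScore));
  const total = exps.reduce((sum, value) => sum + value, 0);
  return exps.map((value) => value / total);
}

const corpus = [
  "fastify authentication with jwt",
  "fastify session cookies",
  "fastify route validation with zod",
];
const idf = computeIdf(corpus);
const query = embed("jwt authentication", idf);
const stored = embed(corpus[0] ?? "", idf);
const similarity = cosineSimilarity(query, stored);

// "jwt" appears in one document, "fastify" in all three.
console.log((idf.get("jwt") ?? 0) > (idf.get("fastify") ?? 0)); // prints true
console.log(similarity >= 0 && similarity <= 1);                // prints true
```

Because every TF-IDF weight here is non-negative, the dot product cannot go below zero, which is why the similarity stays in [0, 1] rather than [-1, 1].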

Performance Considerations

TF-IDF index: O(n) embedding where n = unique terms in input. Vocabulary pruned at 50,000 terms by default. State persists to a vocabulary.json file (typically < 1 MB).
Fabrication detector: O(files × patterns) per scan. The 12 hallucination patterns are compiled regexes; cross-reference against package.json is O(1) via a Set.
Tamper-proof lock: Single SHA-256 hash per artifact. Negligible overhead.
No external services required. The framework runs entirely in-process. No database, no message queue, no LLM API calls.
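
For a sense of what "a single SHA-256 hash per artifact" involves, here is a minimal sketch built on Node's `node:crypto` module. The `LockEntry` shape, the function names, and the sample artifacts are hypothetical and do not describe DeerFlow's lock-file format.

```typescript
import { createHash } from "node:crypto";

// Hypothetical sketch of hash-based tamper detection. The LockEntry shape and
// function names are assumptions, not DeerFlow's lock-file format.
interface LockEntry {
  path: string;
  sha256: string; // hex-encoded digest of the artifact content
}

function sha256Hex(content: string): string {
  return createHash("sha256").update(content, "utf8").digest("hex");
}

/** Records one digest per artifact, sorted by path for stable output. */
function createLock(artifacts: Record<string, string>): LockEntry[] {
  return Object.keys(artifacts)
    .sort()
    .map((path) => ({ path, sha256: sha256Hex(artifacts[path] ?? "") }));
}

/** Returns the paths whose current content no longer matches the lock. */
function findTampered(lock: readonly LockEntry[], artifacts: Record<string, string>): string[] {
  return lock
    .filter((entry) => {
      const current = artifacts[entry.path];
      // A missing artifact counts as tampering in this sketch.
      return current === undefined || sha256Hex(current) !== entry.sha256;
    })
    .map((entry) => entry.path);
}

// Made-up artifact contents, for illustration only.
const original = {
  "AGENT_RULES.md": "rule 1: no mock data\n",
  "contract.json": '{"version":1}',
};
const lock = createLock(original);
const edited = { ...original, "AGENT_RULES.md": "rule 1: mock data is fine\n" };

console.log(findTampered(lock, original)); // prints []
console.log(findTampered(lock, edited));   // prints [ 'AGENT_RULES.md' ]
```

A bare hash only detects changes relative to a lock that is itself trusted: anyone who can rewrite the lock file can also recompute the digests, so where the lock is stored matters as much as the hash function.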
