org-as-code

The missing governance layer for AI agent teams.

Turn organizational decisions into Git-native, auditable, version-controlled artifacts. Replace Slack threads and meeting notes with YAML files and immutable audit trails.

Quick Start · Documentation · MCP Server · Theory · Examples

pip3 install -r requirements.txt        # pyyaml + mcp — that's it
python3 org_cli.py dashboard            # see it live in 30 seconds
python3 org_cli.py create FEAT-003 feature "My first process" "Testing org-as-code" --agent coder
python3 org_cli.py verify               # check hash-chain integrity

Why This Exists

AI agents can code, review, deploy, and monitor. But they can't govern themselves — and neither can humans govern them through Slack threads and meeting notes.

org-as-code gives AI agents and humans a shared protocol for working together. One agent proposes, another validates. A human steps in when stakes are high. Every handoff is logged, every decision is traceable, every rejection drives a better next iteration. The P↔V protocol doesn't care whether a step is performed by Claude, GPT, a junior developer, or a CTO — it enforces the same rhythm of propose, validate, converge.

This means you can build teams where AI agents do the heavy lifting (proposals, implementations, reviews) while humans retain authority over critical decisions — without bottlenecking every step. The feature_v2 template demonstrates this: AI handles routine work autonomously, but processes with H(s) ≥ 0.8 automatically require human sign-off.

What you get

	Feature	What it does
P↔V	Protocol	Proposals oscillate with Validations — expand options, then contract to decisions
H(s)	Priority Score	Calculated, not felt: `w₁·urgency + w₂·commitment + w₃·demand + w₄·blocking`
E(x)	Convergence Score	Quadratic energy — one critical gap outweighs three minor ones
FDM	Process Dependencies	Cycle detection, parallel groups, impact scoring, living dependency register
JSONL	Audit Trail	SHA-256 hash-chained, append-only, tamper-evident
MCP	AI Agent Interface	21 native tools over stdio transport
CLI	Human Interface	23 commands for operators
YAML	Per-Process State	One state file per process — no merge conflicts

Quick Start

1. Clone and configure

git clone https://github.com/syntriad/org-as-code.git
cd org-as-code
pip3 install -r requirements.txt

2. Register your agents

Edit registry/agents.yaml:

agents:
  - id: alice
    name: "Alice (Team Lead)"
    type: human
    interface: terminal
    skills: [strategic-review, final-validation, decision-making]
    status: active
    capacity: 5

  - id: coder
    name: "Coder (AI Agent)"
    type: ai
    interface: claude-code
    skills: [implementation, code-generation, testing]
    status: active
    capacity: 10

  - id: reviewer
    name: "Reviewer (AI Agent)"
    type: ai
    interface: ide
    skills: [technical-review, refactoring, testing]
    status: active
    capacity: 5

3. Start your first process

python3 org_cli.py create FEAT-001 feature \
  "Add user authentication" \
  "Implement JWT-based auth with refresh tokens" \
  --agent coder --priority 0.8

4. Let an AI agent validate

The assigned agent creates V.0_review.yaml in processes/FEAT-001/:

verdict: approved
confidence: 0.9
conditions:
  - "Add rate limiting before merge"
reviewed_by: reviewer

5. Commit and verify

python3 org_cli.py update FEAT-001 COMMITTED \
  --notes "Auth module complete. All tests pass."

python3 org_cli.py verify
# Chain integrity: VALID — no tampering detected.

Every step is logged in registry/artifacts.jsonl with hash-chain integrity.

The P↔V Protocol

Like a heartbeat (systole ↔ diastole), organizations need rhythm:

┌─────────────────────────────────────────┐
│              P ↔ V Protocol             │
├─────────────────────────────────────────┤
│                                         │
│   ┌─────────┐         ┌─────────┐      │
│   │    P    │────────▶│    V    │      │
│   │ Propose │         │Validate │      │
│   └─────────┘         └────┬────┘      │
│        ▲                   │           │
│        │     iterate       │           │
│        └───────────────────┘           │
│                                         │
│   P_READY → P_COMPLETE → V_COMPLETE    │
│                    → COMMITTED          │
│                    → ABANDONED          │
│                                         │
└─────────────────────────────────────────┘

P-steps (Production): Expand options, generate proposals V-steps (Validation): Contract, validate, select

This prevents two failure modes:

Unbounded divergence — chaos, no decisions
Frozen convergence — premature fixation, groupthink

Repository Structure

org-as-code/
├── registry/                ← Organization state (YAML)
│   ├── agents.yaml            Registered agents
│   ├── state.yaml             Process states (auto-generated index in per_process mode)
│   ├── tensions.yaml          Open problems/opportunities
│   ├── attractors.yaml        Strategic goals
│   ├── artifacts.jsonl        Immutable action log (hash-chained)
│   └── fdm.json               Dependency register (auto-generated)
│
├── processes/               ← All work (P,V artifacts + per-process state)
│   └── {ID}/
│       ├── state.yaml           Per-process state (in per_process mode)
│       ├── P.0_proposal.md      Proposals (expand)
│       └── V.0_review.yaml      Reviews (contract)
│
├── examples/                ← Real cases from independent evaluations
│   ├── SEC-001/               Security hardening (V.0 rejected, 6 artifacts)
│   └── PERF-001/              Performance optimization (V.0 rejected, 4 artifacts)
│
├── protocol/                ← Rules of the game
│   ├── config.yaml            Priority weights, thresholds, security
│   └── process_templates/     Reusable flows
│       ├── feature.yaml
│       ├── feature_v2.yaml    (with conditional human gate)
│       └── bugfix.yaml
│
├── tools/                   ← Maintenance utilities
│   ├── fix_hash_chain.py      Rebuild hash chain if corrupted
│   └── per_process_state.py   Migrate to per-process state files
│
├── org_mcp_server.py        ← MCP Server (21 tools, stdio transport)
├── org_cli.py               ← CLI interface (23 commands)
├── fdm.py                   ← Dependency graph engine (Tarjan, Kahn, stdlib only)
├── pyproject.toml           ← Package metadata (pip install org-as-code)
└── requirements.txt

CLI Reference

# Read commands
python3 org_cli.py status              # All processes + state
python3 org_cli.py tensions            # Open tensions
python3 org_cli.py attractors          # Strategic goals
python3 org_cli.py agents              # Registered agents
python3 org_cli.py health              # Health metrics
python3 org_cli.py log [--limit N]     # Recent artifact log
python3 org_cli.py show FEAT-001       # Process detail + artifacts
python3 org_cli.py verify              # Verify hash-chain integrity
python3 org_cli.py dashboard           # Combined overview

# Write commands
python3 org_cli.py create FEAT-002 feature "Title" "Description" --agent coder --priority 0.8
python3 org_cli.py update FEAT-002 COMMITTED --notes "Done"
python3 org_cli.py artifact coder V.0_review "Approved" --process FEAT-002
python3 org_cli.py tension-add "Title" "Description" --priority 0.7
python3 org_cli.py tension-resolve T-2026-001 "Resolved via process assignment"
python3 org_cli.py priority --urgency 0.9 --demand 0.8 --blocking 0.7
python3 org_cli.py energy --gaps 0.8 --inconsistencies 0.2 --evidence 0.6
python3 org_cli.py convergence FEAT-001  # Show E(x) trajectory over V-steps

# Dependency commands
python3 org_cli.py deps-add FEAT-002 FEAT-001      # FEAT-002 depends on FEAT-001
python3 org_cli.py deps-remove FEAT-002 FEAT-001    # Remove dependency
python3 org_cli.py deps FEAT-002                     # Show upstream + downstream deps
python3 org_cli.py deps-analyze                      # Full FDM analysis (parallel groups, cycles)

# Git sync
python3 org_cli.py sync                          # Pull only
python3 org_cli.py sync "commit message" --agent coder  # Commit + push

MCP Server

org_mcp_server.py gives AI agents native tools to interact with the organization:

Read tools

Tool	Description
`org_read_state`	Read all process states
`org_read_tensions`	Read open problems/opportunities
`org_read_attractors`	Read strategic goals
`org_read_agents`	Read registered agents
`org_read_health`	Read system health metrics
`org_read_process`	Read a specific process and its artifacts
`org_read_artifacts`	Read recent entries from audit log
`org_read_convergence`	Read convergence history (E(x) over V-steps)

Write tools

Tool	Description
`org_update_state`	Update process state
`org_create_process`	Create a new process (P.0), optional `depends_on`
`org_log_artifact`	Append to immutable audit log
`org_create_tension`	Register a new tension
`org_resolve_tension`	Mark a tension as resolved
`org_add_dependency`	Add a dependency between two processes
`org_remove_dependency`	Remove a dependency between two processes

Governance tools

Tool	Description
`org_calculate_priority`	Calculate priority score H(s)
`org_calculate_energy`	Calculate convergence score E(x)
`org_verify_chain`	Verify hash-chain integrity of audit log
`org_analyze_dependencies`	Full dependency graph analysis (cycles, parallel groups, bottleneck)
`org_read_dependencies`	Read the current FDM dependency register
`org_git_sync`	Pull, commit, push changes

Setup

Add to your Claude Code MCP configuration (~/.claude.json):

{
  "mcpServers": {
    "org-as-code": {
      "command": "python3",
      "args": ["org_mcp_server.py"],
      "env": {
        "ORG_REPO_PATH": "/path/to/org-as-code"
      }
    }
  }
}

Priority Score

Priority is not a feeling. It is a calculation:

H(s) = w₁·urgency + w₂·commitment + w₃·demand + w₄·blocking

Default weights (configurable in protocol/config.yaml):

Weight	Value	Component
w₁	0.30	Urgency — how time-sensitive
w₂	0.20	Commitment — how invested we are
w₃	0.30	Demand — external need
w₄	0.20	Blocking — how much this blocks other work

Thresholds:

H(s) ≥ 0.8 → Escalate to human
H(s) ≥ 0.5 → Action required
H(s) < 0.5 → Low priority

(Formally: the Semantic Hamiltonian — see THEORY.md)

Convergence Score

The priority score tells you what to work on. The convergence score tells you how far from done:

E(x) = w_g·gaps² + w_i·inconsistencies² + w_u·uncertainty² − w_e·evidence²

The quadratic penalty means one critical gap (0.9² = 0.81) outweighs three minor ones (3 × 0.3² = 0.27).

Threshold	Meaning
E(x) < 0.10	Ready to commit
E(x) < 0.30	Minor revision needed
E(x) ≥ 0.30	Major revision needed

V-step reviews include convergence scores (gaps, inconsistencies, uncertainty, evidence). The system auto-calculates E(x) and tracks it per process, showing whether iterations are converging, stagnating, or diverging.

(Formally: Semantic Energy — see THEORY.md)

Process Dependencies (FDM)

Processes can declare dependencies on each other. The system analyzes the dependency graph to detect cycles, compute parallel execution groups, and identify bottlenecks.

# FEAT-002 depends on FEAT-001
python3 org_cli.py deps-add FEAT-002 FEAT-001

# Full analysis: parallel groups, cycles, critical path
python3 org_cli.py deps-analyze

The dependency graph uses Tarjan's SCC algorithm for cycle detection and Kahn's algorithm for topological sort. When circular dependencies are found, the system proposes which edge to break.

Parallel groups show which processes can execute simultaneously:

Group 1: FEAT-001, BUG-001     (no dependencies — start immediately)
Group 2: FEAT-002              (depends on FEAT-001)
Group 3: FEAT-003              (depends on FEAT-002)

Impact scoring identifies bottleneck processes — the ones whose delay cascades through the most downstream dependents.

The dependency register (registry/fdm.json) is auto-generated after every dependency change, providing a persistent, git-versioned view of the dependency graph.

Hash-Chain Audit Trail

Every action is logged in registry/artifacts.jsonl with cryptographic linking:

{
  "type": "p_step",
  "agent": "coder",
  "process_id": "FEAT-001",
  "action": "P.0_proposal",
  "description": "Add user authentication",
  "timestamp": "2026-03-16T14:30:00Z",
  "prev_hash": "a3f8c1...",
  "entry_hash": "7b2e4d..."
}

Each entry's entry_hash = SHA-256(prev_hash + canonical JSON of entry). Chain integrity is verifiable via CLI (python3 org_cli.py verify) or MCP (org_verify_chain).

Tamper-evident. Append-only. Git-versioned.

Process Templates

Feature flow

name: feature
steps:
  - name: "P.0 — Proposal"
    artifact: "P.0_proposal.md"
    agent_types: [ai, human]
    next_state: P_COMPLETE

  - name: "V.0 — Review"
    artifact: "V.0_review.yaml"
    agent_types: [ai, human]
    next_state: V_COMPLETE
    requires: P_COMPLETE

  - name: "P.1 — Implementation"
    artifact: "P.1_implementation.md"
    agent_types: [ai]
    next_state: P_COMPLETE

  - name: "V.1 — Verification"
    artifact: "V.1_verification.yaml"
    agent_types: [ai, human]
    next_state: COMMITTED
    requires: P_COMPLETE

Feature v2 flow (with conditional human gate)

feature_v2.yaml adds priority-conditional human approval: processes with H(s) ≥ 0.8 require human sign-off at V.1, while low-risk processes (H(s) < 0.5) permit AI-only verification.

Theoretical Foundation

org-as-code is an implementation of the SYNTRIAD metapattern:

T : (S, I, C) → S'    — every process is a transformation
Ω = P ↔ V             — governed by expand-contract rhythm
V++                    — only validated work persists

The priority score H(s) and convergence score E(x) used throughout this tool are domain-specific instances of the Semantic Hamiltonian and Semantic Energy described in Semantic Thermodynamics.

For the design rationale — why quadratic scoring, how the audit trail works, and where the pattern comes from — see THEORY.md.

Use Cases

Multi-agent collaboration — Agent A writes code, Agent B reviews it, a human approves if the priority score warrants it. Each handoff is a P→V transition with a convergence score.
Human-AI teams — Humans set direction (attractors, tensions), AI agents execute (P-steps), either party validates (V-steps). The protocol scales from solo developer + one AI to full teams.
Async-first organizations — Replace meetings with validated artifacts. Every decision has a traceable audit trail, not a Slack thread.
Compliance-sensitive environments — Immutable decision logs for AI Act, SOC2, NIS2. The hash chain provides tamper-evident records of who decided what, when, and why.

Real-World Examples

The examples/ directory contains two cases from independent evaluations where the P↔V protocol caught errors before they shipped:

SEC-001 — A security proposal with two architectural errors (wrong injection model, wrong threat model). The V-step rejected it. The revised proposal shipped the correct fix. 6 artifacts over 5 days.
PERF-001 — A performance optimization that would have shipped at the accuracy constraint floor. The V-step caught the cliff-edge risk. The revised plan used ONNX quantization instead, maintaining a safety margin. 4 artifacts over 4 days.

See examples/README.md for the full story.

Running Tests

python3 -m pytest tests/ -v

156 tests cover security validation, hash-chain integrity, agent registration, input sanitization, energy calculation, convergence tracking, per-process state, auto-migration, dependency validation, graph analysis, FDM integration, and CLI commands.

Integration

org-as-code provides two integration paths — no bridge layer needed.

MCP-compatible agents (Claude Code, Windsurf, Gemini)

Use the MCP server. See Setup above. The agent gets 21 native tools over stdio — no code required.

Python agent frameworks (LangChain, CrewAI, custom)

Import directly from the server module:

from org_mcp_server import (
    org_create_process,
    org_log_artifact,
    org_update_state,
    org_add_dependency,
    org_analyze_dependencies,
    org_calculate_energy,
    org_read_convergence,
    org_verify_chain,
)

# Create processes with dependencies
org_create_process("FEAT-010", "feature", "Add caching", "Redis-based response cache", "my-agent")
org_create_process("FEAT-011", "feature", "Add cache invalidation", "TTL + event-based", "my-agent",
    depends_on="FEAT-010")

# Add dependency after creation
org_add_dependency("FEAT-012", "FEAT-010")

# Analyze dependency graph
print(org_analyze_dependencies())  # parallel groups, cycles, critical path, bottleneck

# Log a V-step with convergence scores
org_log_artifact("reviewer", "V.0_review", "Approved with conditions",
    process_id="FEAT-010",
    extra='{"convergence": {"gaps": 0.3, "inconsistencies": 0.1, "uncertainty": 0.2, "evidence": 0.7}}')

Set ORG_REPO_PATH before importing, or the module defaults to its own directory.

Non-Python frameworks

Call org_cli.py as a subprocess. All 23 CLI commands return structured, parseable output.

Known Limitations

We believe in transparency about what this system does and does not do well today:

State storage modes. Default mode is per_process: each process stores its state in processes/{ID}/state.yaml, eliminating concurrent write conflicts. registry/state.yaml is auto-generated as a read-only index. Legacy monolithic mode is available via state_storage.mode: monolithic in protocol/config.yaml. Existing monolithic deployments auto-migrate on first write when switching to per_process mode. Advisory file locking (fcntl.flock) protects against write corruption within a single host.
Security mode. All write tools validate agent IDs against registry/agents.yaml and process IDs against format [A-Z]+-[0-9]+. Default mode is permissive (logs warnings for unknown agents). Set security.mode: strict in protocol/config.yaml for production use, which rejects unknown agents from all write tools except org_log_artifact (bootstrap exemption). Commit messages are sanitized to prevent git trailer injection.
Local tamper-evidence only. The hash chain provides tamper-evidence but not tamper-resistance. A repository admin with force-push access can rewrite history. For stronger guarantees, consider anchoring chain tips to an external timestamping authority.

Roadmap

v2.3 provides auditable execution: every decision is logged, hash-chained, and verifiable. The next major capability is auditable authority — formalizing who may decide, not just what was decided.

Planned for a future release:

Formal role model — producer, validator, approver, senior approver, auditor as first-class concepts with explicit authorities
Decision artifacts — D.0_decision.yaml as auditable governance events, not just V-step metadata
Policy-based commit control — process type and risk class determine required approvals, role coverage, and human gates
Conflict resolution — when validators disagree: consensus, senior override, or mandatory escalation
Separation of duties — a producer may not self-approve; security processes require senior sign-off

This extends the state machine with a DECISION_PENDING state between V_COMPLETE and COMMITTED, and formalizes the relationship between the existing H(s) human gate and the governance layer.

The governance layer will ship when real multi-validator conflict use cases drive the design — not before.

License

MIT — See LICENSE. The governance protocol is open. Use it freely.

Contributing

See CONTRIBUTING.md. Pull requests that demonstrate real usage are preferred over theoretical discussion.

org-as-code is part of the SYNTRIAD ecosystem.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
examples		examples
processes		processes
protocol		protocol
registry		registry
tests		tests
tools		tools
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
THEORY.md		THEORY.md
fdm.py		fdm.py
org_cli.py		org_cli.py
org_mcp_server.py		org_mcp_server.py
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

org-as-code

Why This Exists

What you get

Quick Start

1. Clone and configure

2. Register your agents

3. Start your first process

4. Let an AI agent validate

5. Commit and verify

The P↔V Protocol

Repository Structure

CLI Reference

MCP Server

Read tools

Write tools

Governance tools

Setup

Priority Score

Convergence Score

Process Dependencies (FDM)

Hash-Chain Audit Trail

Process Templates

Feature flow

Feature v2 flow (with conditional human gate)

Theoretical Foundation

Use Cases

Real-World Examples

Running Tests

Integration

MCP-compatible agents (Claude Code, Windsurf, Gemini)

Python agent frameworks (LangChain, CrewAI, custom)

Non-Python frameworks

Known Limitations

Roadmap

License

Contributing

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages