76 changes: 66 additions & 10 deletions CHANGELOG.md
@@ -8,9 +8,42 @@ Pre-1.0 minor versions may contain breaking changes; the project remains
in beta until the v1.0 stability criteria documented in
[README → Roadmap](README.md#roadmap) are met.

## [Unreleased]
## [0.7.0] - 2026-06-08

### Added
- **Decision provenance: signed `DecisionRecord` at every decision
boundary.** New `agentegrity.core.decision` module with
`DecisionRecord`, `CaptureTier`, `DecisionInput`, and
`RejectedAlternative` types. The `_BaseAdapter` (and
`IntegrityMonitor`) gains a `record_decision(...)` method and an
optional `signing_key=` constructor argument. The three decision
boundaries (`pre_tool_use`, `stop`, `subagent_start`) now append a
signed, hash-chained decision record to the same `AttestationChain`
that holds attestations, captured **before** the action executes so
a downstream verifier can prove the rationale was bound at decision
time and not retrofitted. Each subsequent `AttestationRecord`
carries `Evidence(evidence_type="decision", ...)` entries pointing
at the decisions that preceded it; `AttestationChain.verify_decision_links()`
validates the round-trip. **Capture tier today is C (Minimal) on every
shipped adapter** — the schema supports Tier B (Partial: reasoning
chain) and Tier A (Full: rejected alternatives), but no adapter
populates those fields in production yet. Honest framing: capture
fails open; on exception we log + emit a structured
`capture_failure` `FrameworkEvent` so monitoring can see the gap.
Spec at `spec/properties/decision-provenance.md`.
- **`AttestationChain` is now heterogeneous.** Holds both
`AttestationRecord` and `DecisionRecord` via a new structural
`ChainedRecord` Protocol. New `to_json()` / `from_json()`
convenience methods. New `verify_chain_detailed() -> (bool,
broken_idx, broken_kind)` for callers that want the broken
record's position. `verify_chain() -> bool` is unchanged.
- **`python -m agentegrity verify-decisions <chain.json>` CLI verb.**
Loads a serialized chain, runs `verify_chain()` +
`verify_decision_links()`, prints a per-record table (kind /
boundary / tier / signed / verified), exits non-zero on any
failure.
- **Glossary entries:** Decision Record, Capture Tier, Decision
Boundary.
- **AWS Bedrock Agents adapter (Python).** `pip install
agentegrity[bedrock-agents]`. One adapter, two surfaces:

@@ -92,6 +125,18 @@ in beta until the v1.0 stability criteria documented in
contract is loud rather than silent.

### Changed
- **`AttestationRecord` canonical payload now includes `record_kind`.**
Required so the heterogeneous chain can distinguish attestation
records from decision records under signature (otherwise a tamperer
could flip a decision into an attestation post-signing). **Backward-
incompatible:** chains serialized before v0.7 fail `verify_chain()`
after upgrade — signed or not — because the in-memory recomputed
`content_hash` (now over the new canonical bytes) doesn't match the
stored `chain_previous` references in subsequent records. Loading
still works; verification doesn't. No rescue migration script:
operators must either re-build the chain from a fresh root with
the new code or pin to v0.6 for legacy verification. Same break
applies to the Evidence-hash fix below; both land in this release.
- **`AgentegrityClient` adapter factory consolidated.** The five
per-framework methods (`create_claude_adapter`,
`create_langchain_adapter`, `create_openai_agents_adapter`,
@@ -130,6 +175,16 @@ in beta until the v1.0 stability criteria documented in
`[all]` automatically.
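
The `record_kind` change in this section can be illustrated with a minimal sketch of a heterogeneous hash chain. This is not the library's implementation: the dict-based record layout and the helper names `canonical_bytes`, `content_hash`, `append`, and `verify_detailed` are hypothetical, chosen only for illustration. Only `record_kind`, `chain_previous`, and the `(bool, broken_idx, broken_kind)` return shape are taken from the changelog text. The point it shows is that once `record_kind` is part of the hashed payload, flipping a decision into an attestation after the fact no longer verifies.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    # Hypothetical canonical form: sorted keys, fixed separators,
    # with the stored content_hash itself excluded from the payload.
    payload = {k: v for k, v in record.items() if k != "content_hash"}
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def content_hash(record: dict) -> str:
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


def append(chain: list, record_kind: str, body: dict) -> dict:
    # Each record points at the hash of the record before it.
    record = {
        "record_kind": record_kind,
        "body": body,
        "chain_previous": chain[-1]["content_hash"] if chain else None,
    }
    record["content_hash"] = content_hash(record)
    chain.append(record)
    return record


def verify_detailed(chain: list) -> tuple:
    # Returns (ok, broken_idx, broken_kind), mirroring the shape the
    # changelog gives for verify_chain_detailed().
    previous = None
    for idx, record in enumerate(chain):
        link_ok = record["chain_previous"] == previous
        hash_ok = content_hash(record) == record["content_hash"]
        if not (link_ok and hash_ok):
            return False, idx, record["record_kind"]
        previous = record["content_hash"]
    return True, None, None


chain = []
append(chain, "decision", {"boundary": "pre_tool_use"})
append(chain, "attestation", {"score": 0.9})
print(verify_detailed(chain))  # (True, None, None)

# Flip the decision into an attestation after it was hashed.
chain[0]["record_kind"] = "attestation"
print(verify_detailed(chain))  # (False, 0, 'attestation')
```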

### Fixed
- **`Evidence.content_hash` is now a real, deterministic SHA-256** of
the canonical JSON of the layer-result dict. Was previously
`str(hash(str(r.to_dict())))` using Python's process-salted string
hash — non-deterministic across processes and non-portable, which
silently broke any attempt at tamper-evident verification across
process boundaries. The three triplicated record-build paths
(adapter base, monitor, SDK client) now share one
`build_attestation_record(...)` helper. **Backward-incompatible**:
re-builds the canonical payload of every newly-created attestation,
so old chains fail verification post-upgrade (see Changed above).
- **CrewAI adapter works on crewai ≥ 1.0.** crewai 1.0 relocated the
event classes from `crewai.utilities.events` to `crewai.events`
(canonical sources under `crewai.events.types.*`). The adapter still
@@ -383,12 +438,13 @@ in beta until the v1.0 stability criteria documented in
- Three working examples (`basic_evaluation.py`,
`runtime_monitoring.py`, `custom_validator.py`).

[Unreleased]: https://github.com/cogensec/agentegrity-framework/compare/v0.6.0...HEAD
[0.6.0]: https://github.com/cogensec/agentegrity-framework/releases/tag/v0.6.0
[0.5.3]: https://github.com/cogensec/agentegrity-framework/releases/tag/v0.5.3
[0.5.0]: https://github.com/cogensec/agentegrity-framework/releases/tag/v0.5.0
[0.4.0]: https://github.com/cogensec/agentegrity-framework/releases/tag/v0.4.0
[0.3.0]: https://github.com/cogensec/agentegrity-framework/releases/tag/v0.3.0
[0.2.1]: https://github.com/cogensec/agentegrity-framework/releases/tag/v0.2.1
[0.2.0]: https://github.com/cogensec/agentegrity-framework/releases/tag/v0.2.0
[0.1.0]: https://github.com/cogensec/agentegrity-framework/releases/tag/v0.1.0
[Unreleased]: https://github.com/cogensec/agentegrity/compare/v0.7.0...HEAD
[0.7.0]: https://github.com/cogensec/agentegrity/releases/tag/v0.7.0
[0.6.0]: https://github.com/cogensec/agentegrity/releases/tag/v0.6.0
[0.5.3]: https://github.com/cogensec/agentegrity/releases/tag/v0.5.3
[0.5.0]: https://github.com/cogensec/agentegrity/releases/tag/v0.5.0
[0.4.0]: https://github.com/cogensec/agentegrity/releases/tag/v0.4.0
[0.3.0]: https://github.com/cogensec/agentegrity/releases/tag/v0.3.0
[0.2.1]: https://github.com/cogensec/agentegrity/releases/tag/v0.2.1
[0.2.0]: https://github.com/cogensec/agentegrity/releases/tag/v0.2.0
[0.1.0]: https://github.com/cogensec/agentegrity/releases/tag/v0.1.0
4 changes: 2 additions & 2 deletions CONTRIBUTING.md
@@ -31,8 +31,8 @@ Thank you for your interest in contributing to the Agentegrity Framework. This p

```bash
# Clone the repo
git clone https://github.com/cogensec/agentegrity-framework.git
cd agentegrity-framework
git clone https://github.com/cogensec/agentegrity.git
cd agentegrity

# Create a virtual environment
python -m venv .venv
20 changes: 11 additions & 9 deletions README.md
@@ -6,13 +6,13 @@
<a href="https://npmjs.com/package/@agentegrity/client"><img src="https://img.shields.io/npm/v/@agentegrity/client" alt="npm"></a>
<a href="https://npmjs.com/package/@agentegrity/client"><img src="https://img.shields.io/npm/dm/@agentegrity/client" alt="npm"></a>
<a href="https://pepy.tech/projects/agentegrity"><img src="https://static.pepy.tech/personalized-badge/agentegrity?period=total&units=INTERNATIONAL_SYSTEM&left_color=BLACK&right_color=GREEN&left_text=downloads" alt="PyPI Downloads"></a>
<a href="https://deepwiki.com/Cogensec/agentegrity-framework/1-agentegrity-framework-overview"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a>
<a href="https://deepwiki.com/Cogensec/agentegrity"><img src="https://deepwiki.com/badge.svg" alt="Ask DeepWiki"></a>
<a href="https://opensource.org/licenses/Apache-2.0">
<img src="https://img.shields.io/badge/License-Apache_2.0-blue.svg" alt="License: Apache 2.0"></a>
<a href="https://www.python.org/downloads/"><img src="https://img.shields.io/badge/python-3.10+-blue.svg" alt="Python 3.10+"></a>
<a href="pyproject.toml"><img src="https://img.shields.io/badge/library-v0.6.0-green.svg" alt="Library Version"></a>
<a href="pyproject.toml"><img src="https://img.shields.io/badge/library-v0.7.0-green.svg" alt="Library Version"></a>
<a href="spec/SPECIFICATION.md"><img src="https://img.shields.io/badge/spec-v1.0--draft-blue.svg" alt="Spec Version"></a>
<a href="https://github.com/cogensec/agentegrity-framework"><img src="https://komarev.com/ghpvc/?username=cogensec&color=1E3A5F&style=flat-square&label=Repo+Views" alt="Repo Views" />
<a href="https://github.com/cogensec/agentegrity"><img src="https://komarev.com/ghpvc/?username=cogensec&color=1E3A5F&style=flat-square&label=Profile+Views" alt="Profile Views" />
</a>

</p>
@@ -47,7 +47,7 @@ A self-securing agent maintains three properties simultaneously. Each property i
| **Self-Stability** | Monitors its own behavioral drift against an established baseline and detects internal state corruption | Slow-drift attacks, memory poisoning, gradual goal redirection, identity erosion |
| **Self-Recovery** | Detects when its integrity has been compromised and restores itself to a known-good state | Persistent compromise, undetected lateral movement, state pollution across sessions |

v0.6.0 ships verification for all three capabilities (self-defense via the adversarial layer, self-stability via the cortical layer with optional LLM-backed semantic checks, self-recovery via the recovery layer with persistable checkpoint round-trip) across **eleven zero-config framework adapters** — five in Python (**Claude Agent SDK**, **LangChain / LangGraph**, **OpenAI Agents SDK**, **CrewAI**, **Google ADK**) and six in TypeScript (the same five, plus **Vercel AI SDK** which has no Python equivalent). All eleven share the `SessionExporter` extension point that lets any subscriber (including the commercial `agentegrity-pro` dashboard) receive live session data without touching the agent, and the same evaluator pipeline and attestation chain — a 2-3 line instrumentation on any of these frameworks produces the same signed audit trail.
v0.7.0 ships verification for all three capabilities (self-defense via the adversarial layer, self-stability via the cortical layer with optional LLM-backed semantic checks, self-recovery via the recovery layer with persistable checkpoint round-trip) plus **decision provenance** (signed, hash-chained `DecisionRecord`s captured at every decision boundary so the rationale is provably bound at decision time, not retrofitted) across **fourteen zero-config framework adapters** — eight in Python (**Claude Agent SDK**, **LangChain / LangGraph**, **OpenAI Agents SDK**, **CrewAI**, **Google ADK**, **AutoGen**, **Agno**, **AWS Bedrock Agents**) and six in TypeScript (the original five plus **Vercel AI SDK** which has no Python equivalent). All fourteen share the `SessionExporter` extension point that lets any subscriber (including the commercial `agentegrity-pro` dashboard) receive live session data without touching the agent, and the same evaluator pipeline and attestation chain — a 2-3 line instrumentation on any of these frameworks produces the same signed audit trail.

---

@@ -85,7 +85,7 @@ We believe in being explicit about what the library is and is not, because a sec

**What it does.** It provides a Python implementation of the four-layer verification architecture defined in the [Agentegrity Specification](spec/SPECIFICATION.md). It computes integrity scores from real evaluation runs, generates cryptographically signed attestation records, builds tamper-evident attestation chains, and produces structured audit logs for governance workflows. It runs locally with zero required dependencies and never makes network calls to Cogensec or any other service. It ships with extension points for custom threat detectors, custom policy rules, and custom validators.

**What it does not do.** The adversarial layer ships a regex pattern taxonomy across six attack families (prompt_injection, jailbreak, role_confusion, system_prompt_extraction, data_exfiltration, prompt_obfuscation) — calibrated 1.000 TPR / 0.000 FPR on the in-repo synthetic suite, but **0.000 TPR on the InjecAgent benchmark** (N=2,108) because action-oriented injections embedded in tool responses don't match the regex patterns. Closing that gap requires either an embedding-similarity check or an LLM-backed semantic classifier — both planned for the next release. The cortical layer uses Jensen-Shannon distance with Laplace smoothing for drift detection (replaces the older asymmetric KL approximation) and structural memory-provenance inspection. v0.2.0 introduced optional LLM-backed cortical checks (`pip install agentegrity[llm]`) that use Claude for semantic reasoning-chain validation, memory-provenance analysis, and drift classification; these run alongside the pattern-based checks and fail open on API errors. Production deployments should also register custom detectors with domain-specific logic. As of v0.6.0 the library ships eleven framework adapters — five in Python (Claude Agent SDK, LangChain / LangGraph, OpenAI Agents SDK, CrewAI, Google ADK) and six in TypeScript (the same five plus Vercel AI SDK). Adapters for Semantic Kernel, AutoGen, and AWS Bedrock Agents are on the post-0.6 roadmap.
**What it does not do.** The adversarial layer ships a regex pattern taxonomy across six attack families (prompt_injection, jailbreak, role_confusion, system_prompt_extraction, data_exfiltration, prompt_obfuscation) — calibrated 1.000 TPR / 0.000 FPR on the in-repo synthetic suite, but **0.000 TPR on the InjecAgent benchmark** (N=2,108) because action-oriented injections embedded in tool responses don't match the regex patterns. Closing that gap requires either an embedding-similarity check or an LLM-backed semantic classifier — both planned for the next release. The cortical layer uses Jensen-Shannon distance with Laplace smoothing for drift detection (replaces the older asymmetric KL approximation) and structural memory-provenance inspection. v0.2.0 introduced optional LLM-backed cortical checks (`pip install agentegrity[llm]`) that use Claude for semantic reasoning-chain validation, memory-provenance analysis, and drift classification; these run alongside the pattern-based checks and fail open on API errors. Production deployments should also register custom detectors with domain-specific logic. As of v0.7.0 the library ships fourteen framework adapters — eight in Python (Claude Agent SDK, LangChain / LangGraph, OpenAI Agents SDK, CrewAI, Google ADK, AutoGen, Agno, AWS Bedrock Agents) and six in TypeScript (the original five plus Vercel AI SDK). The Semantic Kernel adapter is deferred pending Microsoft Agent Framework GA (Q2 2026); one MAF adapter will cover both.

**What it deliberately is not.** It is not a guardrail. It does not block agent actions on its own — when an action is blocked, that is the result of explicit governance policy, not inferred risk. It is not a runtime enforcement layer trying to compete with WAF-style products. It is not a hosted service. It is a measurement and verification library, and everything it does is in service of producing evidence that an agent has (or lacks) the structural properties of a self-securing system.

@@ -276,7 +276,7 @@ See [`examples/`](examples/) for walkthroughs including custom threat detectors,
## Repository Structure

```
agentegrity-framework/
agentegrity/
├── MANIFESTO.md # The Agentegrity Manifesto
├── README.md # You are here
├── LICENSE # Apache 2.0
@@ -369,9 +369,11 @@ agentegrity-framework/

**v0.5.3 — Release & build polish.** Concrete version pins on TypeScript workspace deps (replacing `workspace:*`) so published packages install cleanly off‑registry, GitHub Actions bumped to checkout@v5 / setup-python@v6 / setup-node@v5, scoped push triggers + concurrency cancellation in CI, repo moved to the `cogensec` org, and an `AGENTEGRITY_OFFLINE` env var so test runs work without a reporter. Adds a Python `scripts/check_versions.py` mirroring the TypeScript one to keep `pyproject.toml`, `src/agentegrity/__init__.py`, and the README badge / claim lines from drifting apart again.

**v0.6.0 — Detection depth + recovery round-trip + conformance + benchmark (current).** The adversarial layer's substring match becomes a 21-pattern regex taxonomy across six attack families. The cortical layer's drift metric becomes Jensen-Shannon distance with Laplace smoothing and a `min_drift_samples` guard. `RecoveryLayer` gains a real `Checkpoint` Protocol with `InMemory` / `File` / `Sqlite` reference backends and a tested `snapshot()` ↔ `restore_to()` round-trip. The cortical layer gains a parallel `BaselineStore` Protocol so behavioural baselines survive process restarts. A cross-adapter conformance suite pins 9 invariants × 5 adapters. A detection benchmark harness (`pytest -m benchmark`) runs the synthetic suite plus loaders for InjecAgent / PINT / AgentDojo; numbers published in `STATUS.md`. Branch coverage gates land on Python (≥85%) and TypeScript (≥80% lines / 70% functions). The recovery layer is promoted to a first-class fourth default layer; `PropertyWeights` defaults rebalanced so RI gets 0.15 of the composite. Full migration notes in `CHANGELOG.md`.
**v0.6.0 — Detection depth + recovery round-trip + conformance + benchmark.** The adversarial layer's substring match becomes a 21-pattern regex taxonomy across six attack families. The cortical layer's drift metric becomes Jensen-Shannon distance with Laplace smoothing and a `min_drift_samples` guard. `RecoveryLayer` gains a real `Checkpoint` Protocol with `InMemory` / `File` / `Sqlite` reference backends and a tested `snapshot()` ↔ `restore_to()` round-trip. The cortical layer gains a parallel `BaselineStore` Protocol so behavioural baselines survive process restarts. A cross-adapter conformance suite pins 9 invariants × 5 adapters. A detection benchmark harness (`pytest -m benchmark`) runs the synthetic suite plus loaders for InjecAgent / PINT / AgentDojo; numbers published in `STATUS.md`. Branch coverage gates land on Python (≥85%) and TypeScript (≥80% lines / 70% functions). The recovery layer is promoted to a first-class fourth default layer; `PropertyWeights` defaults rebalanced so RI gets 0.15 of the composite.

**v0.6.0 — More adapters and compliance output (next).** Adapters for Semantic Kernel, AutoGen, AWS Bedrock Agents. Compliance report generation for EU AI Act, NIST AI RMF, and ISO 42001. Observability exporters (OpenTelemetry, Datadog).
**v0.7.0 — Three new Python adapters + decision provenance (current).** AWS Bedrock Agents (Strands hooks with real `event.cancel_tool` enforcement + boto3 trace-stream observation surface), Agno (Agent + Team via `pre_hooks` / `post_hooks` / `tool_hooks`, real enforcement via `StopAgentRun`), and AutoGen (OpenTelemetry SpanProcessor consuming GenAI semconv spans) join the Python adapter family — now eight strong. CrewAI compat fix for the 1.x event-bus relocation. A new synchronous `_evaluate_sync` dispatch core unlocks real enforcement on sync hook surfaces. The big core addition is **decision provenance**: `DecisionRecord` lives in the same `AttestationChain` as `AttestationRecord`, captured at the three decision boundaries (`pre_tool_use` / `stop` / `subagent_start`) before the action executes, Evidence-linked back from each subsequent attestation, verifiable via `python -m agentegrity verify-decisions <chain.json>`. The `Evidence.content_hash` defect (process-salted Python `hash()`) is fixed; chains serialized pre-v0.7 fail `verify_chain()` after upgrade — re-build from a fresh root.
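
The `Evidence.content_hash` fix mentioned above comes down to replacing a process-salted hash with a digest over canonical bytes. The following is a minimal sketch under stated assumptions: a plain dict stands in for the layer result, `stable_content_hash` and `salted_content_hash` are hypothetical names, and the library's actual canonicalization may differ.

```python
import hashlib
import json


def stable_content_hash(layer_result: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) gives the same bytes
    # in every process, so the SHA-256 digest is reproducible.
    canonical = json.dumps(layer_result, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def salted_content_hash(layer_result: dict) -> str:
    # The problematic pattern: Python salts str hashes per process
    # (unless PYTHONHASHSEED is fixed), so this value is not portable.
    return str(hash(str(layer_result)))


a = {"layer": "adversarial", "score": 1.0}
b = {"score": 1.0, "layer": "adversarial"}  # same content, different key order

# Key order does not affect the canonical digest.
print(stable_content_hash(a) == stable_content_hash(b))  # True
print(len(stable_content_hash(a)))  # 64
```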

**v0.8.0 — Compliance + observability (next).** Compliance report generation for EU AI Act, NIST AI RMF, and ISO 42001. OpenTelemetry instrumentation. Prometheus metrics.

**v1.0.0 — Stable API (when ready).** Declared stable when the public API has been unchanged for a full minor release cycle, when the library has production deployments at three or more external organizations, and when the framework has been cited in at least one peer-reviewed publication. v1.0.0 is not a date — it's a signal that adoption has happened beyond our direct influence.

@@ -431,7 +433,7 @@ If you use the Agentegrity Framework in research or production, please cite:
title={The Agentegrity Framework: Building and Verifying Self-Securing Autonomous AI Agents},
author={Cogensec Research},
year={2026},
url={https://github.com/cogensec/agentegrity-framework}
url={https://github.com/cogensec/agentegrity}
}
```
