Audit Trail (hash-chain journal)

Engrava can record every change to your thought-graph in an append-only, hash-linked journal — a tamper-evident audit trail. Each entry captures one mutation (insert / update / delete of a thought or edge) as a before/after delta, and is cryptographically chained to the previous entry with SHA-256.

Read Security model & guarantees before relying on this for compliance. The chain detects accidental corruption and naive edits, but it is a keyless chain stored in the same database file it protects, so it cannot stop an actor who is able to rewrite that file.

Enabling the journal

Journaling is off by default (zero overhead when disabled — the journal_entry table exists but is never written to). Turn it on either via configuration or the constructor.

In engrava.yaml:

database:
  path: "./engrava.db"

journal:
  enabled: true

Then load the store from that file:

from engrava import SqliteEngravaCore

async with await SqliteEngravaCore.from_config("engrava.yaml") as store:
    assert store.journal is not None  # journaling is active

Or when constructing the store directly:

import aiosqlite
from engrava import SqliteEngravaCore

async with aiosqlite.connect("engrava.db") as conn:
    conn.row_factory = aiosqlite.Row
    store = SqliteEngravaCore(conn, journal_enabled=True)
    await store.ensure_schema()

store.journal returns the JournalWriter when journaling is enabled, or None when it is off — so a quick if store.journal is not None: guards any journal-specific code.

What gets recorded

When journaling is enabled, the store records a journal entry automatically on every mutation of a thought or an edge — you do not call the journal yourself. The recorded mutation_type values (the MutationType enum) are:

| `MutationType` | When |
| --- | --- |
| `INSERT_THOUGHT` | `create_thought()` |
| `UPDATE_THOUGHT` | `update_thought()` |
| `DELETE_THOUGHT` | `delete_thought()` (only when a row was actually deleted) |
| `INSERT_EDGE` | `create_edge()` |
| `UPDATE_EDGE` | `update_edge()` |
| `DELETE_EDGE` | `delete_edge()` (only when a row was actually deleted) |

Each entry's delta is a {"before": ..., "after": ...} dictionary: inserts have before: null, deletes have after: null, and updates carry both sides.
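To make the three delta shapes concrete, here is a small standalone sketch. The helper names `classify_delta` and `changed_fields` and the sample field values are hypothetical and not part of Engrava; only the before/after convention comes from the description above. Which fields a real delta carries is not specified here, so treat the sample dictionaries as illustrations.

```python
from typing import Any, Optional

Delta = dict[str, Optional[dict[str, Any]]]


def classify_delta(delta: Delta) -> str:
    """Return 'insert', 'delete', or 'update' based on which side is None."""
    before, after = delta.get("before"), delta.get("after")
    if before is None and after is not None:
        return "insert"
    if before is not None and after is None:
        return "delete"
    if before is not None and after is not None:
        return "update"
    raise ValueError("delta has neither a before nor an after side")


def changed_fields(delta: Delta) -> list[str]:
    """List the keys whose values differ between the two sides."""
    before = delta.get("before") or {}
    after = delta.get("after") or {}
    keys = before.keys() | after.keys()
    return sorted(k for k in keys if before.get(k) != after.get(k))


# Illustrative values only (hypothetical field subset).
insert = {"before": None, "after": {"essence": "User prefers email over phone"}}
update = {
    "before": {"essence": "User prefers email over phone"},
    "after": {"essence": "User strongly prefers email"},
}
delete = {"before": {"essence": "User strongly prefers email"}, "after": None}
```

A reviewer could apply `classify_delta(entry.delta)` to entries returned by the journal to cross-check that each delta shape agrees with its `mutation_type`.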

Not recorded: embeddings (store_embedding) and action records (create_action) are not written to the journal. The audit trail covers the thought-and-edge graph, not the embedding or action tables. Coverage differs for backups too; see Backup & retention note.

TTL expiry is recorded. cleanup_expired() (and the auto-cleanup it triggers) goes through the same journaled paths, so expiry of a thought is captured according to the configured TTL strategy:

archive strategy → an UPDATE_THOUGHT entry (the thought's lifecycle_status flips to ARCHIVED and expires_at is cleared; the delta carries the before/after).
delete strategy → a DELETE_THOUGHT entry (after: null).

(The separate engrava gc CLI command, which physically purges already-archived rows, operates at the storage layer and is not journaled.)

The `JournalEntry` schema

Each entry is an immutable JournalEntry:

| Field | Type | Meaning |
| --- | --- | --- |
| `entry_id` | `str` | Stable UUID for this entry |
| `sequence_number` | `int` | Monotonic, gapless position in the chain (starts at 1) |
| `mutation_type` | `str` | One of the `MutationType` values above |
| `target_id` | `str \| None` | The affected `thought_id` / `edge_id` |
| `delta` | `dict` | `{"before": {...}, "after": {...}}` diff |
| `parent_hash` | `str \| None` | SHA-256 of the previous entry (`None` for the first entry) |
| `entry_hash` | `str` | SHA-256 of this entry's canonical content |
| `created_at` | `str` | ISO-8601 UTC timestamp |

The hash is computed over the canonical string "{sequence_number}|{mutation_type}|{target_id}|{json(delta, sort_keys)}|{parent_hash}" via JournalWriter.compute_hash(...) (a static method, exposed for callers who want to recompute a hash independently).
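The chaining idea can be sketched with the standard library alone. This is not Engrava's implementation: the function name `sketch_hash` is hypothetical, and two details are assumptions that the text above does not settle, namely how `None` is rendered in the canonical string and which JSON separators are used. For real values, prefer `JournalWriter.compute_hash(...)`; its exact signature is not shown here, so it is not called below.

```python
import hashlib
import json
from typing import Any, Optional


def sketch_hash(
    sequence_number: int,
    mutation_type: str,
    target_id: Optional[str],
    delta: dict[str, Any],
    parent_hash: Optional[str],
) -> str:
    """SHA-256 over 'seq|type|target|json(delta)|parent', hex-encoded."""
    # ASSUMPTIONS: None becomes an empty string, and the delta is serialized
    # with json.dumps(sort_keys=True) and default separators. Engrava's own
    # canonical form may differ, so these digests may not match stored ones.
    canonical = "|".join(
        [
            str(sequence_number),
            mutation_type,
            target_id or "",
            json.dumps(delta, sort_keys=True),
            parent_hash or "",
        ]
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two chained entries: the second entry's parent_hash is the first entry's hash.
first = sketch_hash(
    1, "INSERT_THOUGHT", "thought-001",
    {"before": None, "after": {"essence": "a"}}, None,
)
second = sketch_hash(
    2, "UPDATE_THOUGHT", "thought-001",
    {"before": {"essence": "a"}, "after": {"essence": "b"}}, first,
)

# Editing the first delta changes its hash, so `second` no longer links to it.
tampered_first = sketch_hash(
    1, "INSERT_THOUGHT", "thought-001",
    {"before": None, "after": {"essence": "x"}}, None,
)
```

Sorting the JSON keys is what makes the digest independent of dictionary ordering, and feeding each hash into the next entry is what turns individual checksums into a chain.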

Querying history

Use store.journal.get_entries(...) to read the trail. All filters are optional; results are ordered by sequence_number ascending.

# Everything that ever happened to one thought:
history = await store.journal.get_entries(target_id="thought-001")
for entry in history:
    print(entry.sequence_number, entry.mutation_type, entry.created_at)

# Only deletions, since a timestamp, capped:
deletions = await store.journal.get_entries(
    mutation_type="DELETE_THOUGHT",
    since="2026-01-01T00:00:00+00:00",
    limit=500,
)

| Parameter | Default | Meaning |
| --- | --- | --- |
| `target_id` | `None` | Filter by the affected entity ID |
| `mutation_type` | `None` | Filter by mutation type string |
| `since` | `None` | ISO-8601 lower bound on `created_at` (inclusive) |
| `limit` | `100` | Maximum entries returned |
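Because `limit` caps each call, reading a long history means paging. The sketch below shows one way to do that using only the documented `get_entries` parameters. The helper name `iter_entries` and its `page_size` argument are hypothetical. It assumes that `created_at` never decreases as `sequence_number` grows, and it de-duplicates on `sequence_number` because `since` is inclusive.

```python
from typing import Any, AsyncIterator, Optional


async def iter_entries(
    journal: Any,
    *,
    since: Optional[str] = None,
    page_size: int = 100,
    **filters: Any,
) -> AsyncIterator[Any]:
    """Yield matching journal entries page by page, in sequence order."""
    last_seq = 0      # highest sequence_number already yielded
    cursor = since    # inclusive lower bound for the next page
    while True:
        page = await journal.get_entries(since=cursor, limit=page_size, **filters)
        # `since` is inclusive, so a page can repeat entries already seen.
        fresh = [e for e in page if e.sequence_number > last_seq]
        for entry in fresh:
            yield entry
        if len(page) < page_size:
            return  # a short page means the end of the trail was reached
        if not fresh:
            # A full page of already-seen entries: the cursor cannot advance.
            raise RuntimeError(
                "too many entries share one created_at; increase page_size"
            )
        last_seq = fresh[-1].sequence_number
        cursor = fresh[-1].created_at
```

Usage would look like `async for entry in iter_entries(store.journal, target_id="thought-001"): ...`, with any of the filters from the table passed as keyword arguments.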

Verifying integrity

store.journal.verify_integrity() walks the whole chain in order, recomputes every hash, and checks the parent-hash linkage. It returns a JournalIntegrityResult:

result = await store.journal.verify_integrity()
if result.valid:
    print(f"Chain OK — {result.entries_checked} entries verified.")
else:
    print(
        f"Tampering or corruption detected at sequence "
        f"{result.first_invalid_sequence}: {result.error_message}"
    )

| Field | Type | Meaning |
| --- | --- | --- |
| `valid` | `bool` | `True` if every hash and link checks out |
| `entries_checked` | `int` | Number of entries verified |
| `first_invalid_sequence` | `int \| None` | Sequence of the first broken entry, or `None` |
| `error_message` | `str \| None` | Description of the first error, or `None` |

An empty journal verifies as valid=True with entries_checked=0.

Run verification on a schedule (e.g. before each backup, during incident response, or as a periodic monitoring check) rather than only ad hoc — that is what turns the chain from a passive structure into an active control.
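A scheduled check can be as small as a loop around `verify_integrity()`. The following is a minimal sketch, not a recommended production monitor: the function names, the logger name, and the one-hour default interval are all hypothetical, and a real deployment would more likely use cron, a systemd timer, or an existing job scheduler.

```python
import asyncio
import logging
from typing import Any

logger = logging.getLogger("journal_monitor")  # hypothetical logger name


async def check_once(journal: Any) -> bool:
    """Run one verification pass, log the outcome, and return result.valid."""
    result = await journal.verify_integrity()
    if result.valid:
        logger.info("journal OK: %d entries verified", result.entries_checked)
    else:
        logger.error(
            "journal integrity failure at sequence %s: %s",
            result.first_invalid_sequence,
            result.error_message,
        )
    return result.valid


async def monitor(journal: Any, interval_seconds: float = 3600.0) -> None:
    """Verify repeatedly; raise on the first failure so a supervisor notices."""
    while await check_once(journal):
        await asyncio.sleep(interval_seconds)
    raise RuntimeError("journal integrity check failed")
```

Raising instead of only logging is a deliberate choice in this sketch: a crashed monitor is harder to overlook than an error line in a log file.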

Worked example

import asyncio
import uuid

import aiosqlite
from engrava import (
    SqliteEngravaCore,
    ThoughtRecord,
    ThoughtType,
    Priority,
    LifecycleStatus,
)


async def main() -> None:
    # An in-memory database keeps the example self-contained.
    async with aiosqlite.connect(":memory:") as conn:
        conn.row_factory = aiosqlite.Row
        store = SqliteEngravaCore(conn, journal_enabled=True)
        await store.ensure_schema()

        note = ThoughtRecord(
            thought_id=str(uuid.uuid4()),
            thought_type=ThoughtType.OBSERVATION,
            essence="User prefers email over phone",
            content="Stated during onboarding call.",
            priority=Priority.P2,
            lifecycle_status=LifecycleStatus.ACTIVE,
            created_cycle=0,
            updated_cycle=0,
            source="human",
        )
        await store.create_thought(note)
        await store.update_thought(note.thought_id, essence="User strongly prefers email")

        # Two entries were recorded automatically (INSERT_THOUGHT, UPDATE_THOUGHT).
        entries = await store.journal.get_entries(target_id=note.thought_id)
        assert [e.mutation_type for e in entries] == ["INSERT_THOUGHT", "UPDATE_THOUGHT"]

        # The chain verifies.
        result = await store.journal.verify_integrity()
        assert result.valid and result.entries_checked == 2


asyncio.run(main())

Security model & guarantees

The journal is a keyless SHA-256 integrity chain stored in the same SQLite file it protects. verify_integrity() recomputes each entry's hash from that entry's own stored data — there is no secret key, HMAC, signature, or external anchor.

What it protects against (in scope):

Accidental corruption — bit-rot, a truncated file, a half-written row: the recomputed hash or the parent linkage will not match, and verification fails.
Naive tampering — someone who edits, deletes, or reorders a journal row (or an audited record) without recomputing the rest of the chain: the break is detected at the first inconsistent entry.

What it does NOT protect against (out of scope):

A chain-aware actor with write access to the database file. Because the chain is keyless and self-contained, anyone who can write to the .db can edit an entry and recompute every subsequent hash, producing a fully self-consistent chain that passes verify_integrity() with valid=True. The journal is not forgery-proof against an adversary (including the agent process itself) who controls the file.

If you need genuine, multi-party tamper-evidence, treat the in-file chain as one layer and add at least one of:

Restrict write access — store the .db on a volume only the trusted writer process can modify (OS file permissions / ownership).
Anchor the chain externally — periodically export the latest entry_hash (the chain tail) to an append-only / WORM store, a signed log, or another system out of the writer's control. A later verify_integrity() plus a match against the externally-anchored tail hash detects a full-file rewrite.
Verify on a schedule — run verify_integrity() from a separate monitored process so a detected mismatch raises an alert.

State this boundary plainly to stakeholders: Engrava's journal gives you integrity detection for accidental damage and unsophisticated edits, not cryptographic non-repudiation against a file-level adversary.
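The external-anchoring option can be sketched as follows. Everything named here is hypothetical: `append_anchor`, `check_anchors`, and the JSON-lines file format are illustrations, and a local file only stands in for the append-only or WORM store that real anchoring needs. The functions take `entries` as the full chain in ascending order, for example collected from `get_entries` with a large enough `limit`.

```python
import json
from pathlib import Path
from typing import Any, Sequence


def append_anchor(entries: Sequence[Any], anchor_file: Path) -> dict[str, Any]:
    """Append the current chain tail (the last entry) to the anchor file."""
    if not entries:
        raise ValueError("cannot anchor an empty journal")
    tail = entries[-1]
    record = {
        "sequence_number": tail.sequence_number,
        "entry_hash": tail.entry_hash,
    }
    with anchor_file.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return record


def check_anchors(entries: Sequence[Any], anchor_file: Path) -> list[int]:
    """Return anchored sequence numbers whose hash is missing or changed."""
    by_seq = {e.sequence_number: e.entry_hash for e in entries}
    mismatched: list[int] = []
    with anchor_file.open(encoding="utf-8") as fh:
        for line in fh:
            if not line.strip():
                continue
            record = json.loads(line)
            if by_seq.get(record["sequence_number"]) != record["entry_hash"]:
                mismatched.append(record["sequence_number"])
    return mismatched
```

A full-file rewrite that recomputes every hash still passes `verify_integrity()`, but it changes the hash at an anchored sequence number, so `check_anchors` reports it as long as the anchor file itself is outside the writer's control.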

Backup & retention note

The logical snapshot/restore path (engrava snapshot / engrava restore) covers the thought, edge, embedding, and action tables; it does not include the journal_entry table. A snapshot is therefore not a backup of the audit trail, and restoring from one starts a fresh chain. To preserve the journal, back up the database file itself (see the upgrade/backup guidance). Note also that hard-deleting an audited thought still leaves its content in the journal's before/after delta, which is relevant when handling erasure requests.
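One generic way to copy the whole database file, journal included, is the backup API in Python's standard `sqlite3` module. This is a sketch under assumptions, not Engrava's documented backup procedure: the function name `backup_database` is hypothetical, and the project's own upgrade/backup guidance takes precedence where it differs. Only the table name `journal_entry` comes from this page.

```python
import sqlite3
from pathlib import Path


def backup_database(source: Path, destination: Path) -> int:
    """Copy the SQLite file and return the journal_entry row count in the copy."""
    src = sqlite3.connect(source)
    dst = sqlite3.connect(destination)
    try:
        # Page-level copy of the entire database, so every table is included.
        src.backup(dst)
        (count,) = dst.execute("SELECT COUNT(*) FROM journal_entry").fetchone()
        return count
    finally:
        dst.close()
        src.close()
```

Comparing the returned count with the `entries_checked` value from a verification run on the source is a cheap sanity check that the copy carries the full chain.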
