Add provenance & review-status metadata convention with read filtering#63

Merged: imonroe merged 1 commit into main from claude/ob1-provenance-metadata on Jun 7, 2026
@imonroe imonroe commented Jun 7, 2026

Summary

Adds a lightweight provenance & review-status metadata convention for governed agent memory, and lets the REST read endpoints filter on it (backlog issue #52, the last active OB1-review item). The convention is stored as ordinary metadata in the existing Qdrant payload: no tables, no schema change, no migration.

Verified against mem0 2.0.2 source: both get_all and search pass arbitrary metadata keys straight through to the (already-validated) vector_store.list payload-filter path, and return custom fields under each result's metadata. So filtering is real, not aspirational.

The convention

| Key | Recommended values | Meaning |
| --- | --- | --- |
| source | user, agent, import:chatgpt, capture:telegram, tool:n8n | Origin (importers + capture bot already set this) |
| confidence | high, medium, low, unknown | Trust in the content |
| review_status | unreviewed, approved, rejected, stale | Whether it's been vetted |
| reviewed_by / reviewed_at | free-form / ISO 8601 | Who vetted it, when |
| expires_at | ISO 8601 | When the fact should stop being trusted |
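
As an illustration of the convention, a caller could assemble the metadata dict before passing it through the existing metadata field. The sketch below is not part of this PR: `build_provenance` is a hypothetical helper, and it rejects values outside the recommended sets, which is stricter than the convention itself (the values are only recommended).

```python
from datetime import datetime

# Recommended values copied from the convention table above.
CONFIDENCE = {"high", "medium", "low", "unknown"}
REVIEW_STATUS = {"unreviewed", "approved", "rejected", "stale"}


def build_provenance(source, confidence="unknown", review_status="unreviewed",
                     reviewed_by=None, reviewed_at=None, expires_at=None):
    """Return a flat metadata dict following the convention (hypothetical helper)."""
    if confidence not in CONFIDENCE:
        raise ValueError(f"unexpected confidence: {confidence!r}")
    if review_status not in REVIEW_STATUS:
        raise ValueError(f"unexpected review_status: {review_status!r}")

    # source stays a flat string, e.g. "import:chatgpt" or "capture:telegram".
    meta = {"source": source, "confidence": confidence, "review_status": review_status}

    optional = (("reviewed_by", reviewed_by),
                ("reviewed_at", reviewed_at),
                ("expires_at", expires_at))
    for key, value in optional:
        if value is None:
            continue
        if key != "reviewed_by":
            # Raises ValueError when the timestamp is not ISO 8601 as Python parses it.
            datetime.fromisoformat(value)
        meta[key] = value
    return meta


meta = build_provenance(
    "import:chatgpt",
    confidence="low",
    expires_at="2030-01-01T00:00:00+00:00",
)
```

Because confidence and review_status are independent, a memory can legitimately be `confidence="low"` and `review_status="approved"` at the same time; the helper does not couple them.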

On the issue comment (re-checked, as asked)

An external commenter (norika1207-lab) suggested keeping confidence and review_status orthogonal and adding expiry/staleness. It's sound and I adopted it:

  • confidence and review_status are documented as independent states.
  • expires_at + an exclude_expired read filter directly address their key point — "old memory stayed trusted after the world changed."

I declined the nested source.kind/source.ref/source.agent_id shape: memserv already has a top-level agent_id writer tag, and the importers/capture bot already write a flat metadata.source string — nesting would duplicate/conflict. So source stays a flat string. (Rationale captured in the docs.)

Changes

  • app/rest.py: _provenance_filters(); GET /api/v1/memories gains source / confidence / review_status (exact-match) + exclude_expired query params; POST /memories/search gains the same fields (works in both semantic and keyword modes). MCP reads stay unfiltered per the shared-store invariant.
  • app/memory.py: drop_expired() post-filter (handles expires_at both top-level and nested under metadata); keyword_search() gains an extra_filters passthrough.
  • Storing the keys already works via the existing metadata field (REST) / metadata arg (MCP) — no write-path change needed.
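
For readers who have not opened the diff, the exact-match part of the read filtering can be pictured roughly as follows. This is a sketch under assumptions, not the actual `_provenance_filters()` from app/rest.py: the function name, signature, and the rule that unset parameters are simply omitted are all illustrative.

```python
def provenance_filters_sketch(source=None, confidence=None, review_status=None):
    """Collect only the exact-match filters the caller actually supplied.

    Hypothetical stand-in for the helper in app/rest.py; the real signature
    and behaviour may differ.
    """
    candidates = {
        "source": source,
        "confidence": confidence,
        "review_status": review_status,
    }
    # Unset parameters are left out so they do not constrain the query.
    return {key: value for key, value in candidates.items() if value is not None}


filters = provenance_filters_sketch(source="capture:telegram", review_status="approved")
```

The resulting dict would then be handed to the list or search call as payload filters. `exclude_expired` is deliberately absent here because, per the change list, expiry is handled by a post-filter rather than an exact-match filter.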

Tests

  • tests/test_memory.py: drop_expired (past removed / future + missing kept / top-level + nested / non-results passthrough), keyword extra_filters passthrough.
  • tests/test_rest.py: list filters by source/confidence/review_status, list exclude_expired, search provenance filter, search exclude_expired.
  • Full suite: 167 passed, ruff clean.
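
The drop_expired cases listed above can be read as a small specification. The sketch below is one way to satisfy them, written from the test names alone; it is not the code in app/memory.py. Assumed details: responses carry a `results` list, anything without one passes through unchanged, unparseable timestamps are kept, and timezone-naive timestamps are treated as UTC.

```python
from datetime import datetime, timezone


def _find_expiry(item):
    """Look for expires_at at the top level, then nested under metadata."""
    value = item.get("expires_at")
    if value is None and isinstance(item.get("metadata"), dict):
        value = item["metadata"].get("expires_at")
    return value


def _is_expired(item, now):
    value = _find_expiry(item)
    if not isinstance(value, str):
        return False  # missing expires_at: keep the memory
    try:
        expires = datetime.fromisoformat(value)
    except ValueError:
        return False  # assumption: unparseable timestamps are kept
    if expires.tzinfo is None:
        expires = expires.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    return expires <= now


def drop_expired_sketch(response, now=None):
    """Remove expired memories from a {"results": [...]} response (illustrative)."""
    now = now or datetime.now(timezone.utc)
    if not isinstance(response, dict) or not isinstance(response.get("results"), list):
        return response  # non-results passthrough
    kept = [
        item for item in response["results"]
        if not (isinstance(item, dict) and _is_expired(item, now))
    ]
    return {**response, "results": kept}
```

Passing `now` explicitly keeps the function deterministic under test, which is one common way to avoid patching the clock.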

Docs

  • User Guide: a "Provenance and review metadata" convention section (table, orthogonality + expiry rationale, a write example) and the new read-filter query params in the REST reference.
  • Developer Guide: rest.py/memory.py notes.

Optional post-deploy check

GET /api/v1/memories?review_status=approved&exclude_expired=true should return only approved, non-expired memories.
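
A scripted version of this check might look like the sketch below. The base URL, the helper names, and the assumption that each returned memory exposes its custom fields under `metadata` (as the Summary says mem0 does) are illustrative; authentication and the HTTP call itself are left out.

```python
from urllib.parse import urlencode


def memories_url(base_url, **filters):
    """Build the list URL with provenance filters (hypothetical helper)."""
    params = {}
    for key, value in filters.items():
        if value is None:
            continue
        # Booleans are sent lowercase, matching exclude_expired=true above.
        params[key] = str(value).lower() if isinstance(value, bool) else value
    url = f"{base_url.rstrip('/')}/api/v1/memories"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def all_approved(results):
    """True when every returned memory carries review_status == "approved"."""
    return all(
        (item.get("metadata") or {}).get("review_status") == "approved"
        for item in results
    )


# http://localhost:8000 is a placeholder, not a documented deployment address.
url = memories_url("http://localhost:8000", review_status="approved", exclude_expired=True)
```

After fetching `url` with any HTTP client, `all_approved(...)` on the returned memories gives a quick pass/fail signal for the approved half of the check; expiry would need a separate comparison against the current time.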

Closes #52.

https://claude.ai/code/session_017835DVrvURaYnbQiPQwzue


Generated by Claude Code

Formalize a lightweight metadata convention for governed agent memory and let
REST reads filter on it. Stores as ordinary metadata in the existing Qdrant
payload — no tables, no schema change (verified: mem0 get_all/search pass
arbitrary metadata keys through to the payload filter).

Reserved keys: source, confidence, review_status, reviewed_by/reviewed_at,
expires_at. confidence and review_status are kept independent (orthogonal
states), and expires_at addresses the "stale memory stays trusted" failure mode
raised in the issue discussion. Adopts that feedback; keeps `source` a flat
string (consistent with the importers/capture bot and the existing top-level
agent_id writer tag) rather than a nested object.

- app/rest.py: _provenance_filters(); GET /api/v1/memories gains source/
  confidence/review_status/exclude_expired query params; search gains the same
  fields (semantic + keyword). MCP reads stay unfiltered (architecture invariant).
- app/memory.py: drop_expired() post-filter (handles top-level and nested
  metadata expires_at); keyword_search() gains extra_filters passthrough.
- tests: drop_expired, keyword extra_filters, and REST list/search provenance
  filtering + exclude_expired.
- docs: USER_GUIDE convention table + filter examples; DEVELOPER_GUIDE notes.

Closes #52.

https://claude.ai/code/session_017835DVrvURaYnbQiPQwzue

Copilot AI left a comment

Pull request overview

Adds a provenance + review-status metadata convention for governed memories and exposes it via REST read/search filtering (including optional expiry exclusion), without changing storage schema (still ordinary Qdrant payload metadata).

Changes:

  • Add REST query/body filters for source, confidence, review_status plus exclude_expired on list/search.
  • Extend keyword search to accept additional exact-match payload filters, and add a shared drop_expired() post-filter for REST reads.
  • Add tests and update user/developer documentation to describe the convention and new filter parameters.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| app/rest.py | Adds provenance filter construction for list/search and optional expiry exclusion via drop_expired(). |
| app/memory.py | Adds extra_filters support to keyword search and implements drop_expired() with expires_at parsing. |
| tests/test_rest.py | Verifies REST list/search provenance filters and exclude_expired behavior. |
| tests/test_memory.py | Adds unit tests for keyword extra_filters passthrough and drop_expired() behavior. |
| docs/USER_GUIDE.md | Documents the metadata convention and the new REST read/search filter parameters. |
| docs/DEVELOPER_GUIDE.md | Notes the new helpers (_provenance_filters, drop_expired) in the codebase overview. |

@imonroe imonroe merged commit 7de79a0 into main Jun 7, 2026
2 checks passed
Development

Successfully merging this pull request may close these issues.

Provenance & review-status metadata convention (governed "agent memory")

3 participants