diff --git a/skills/ai-security/agentic-top-10/SKILL.md b/skills/ai-security/agentic-top-10/SKILL.md
index 2b42fa56..ac408c7f 100644
--- a/skills/ai-security/agentic-top-10/SKILL.md
+++ b/skills/ai-security/agentic-top-10/SKILL.md
@@ -196,6 +196,21 @@ In 2024, researchers demonstrated a persistent memory poisoning attack against a
 4. Implement memory decay and review cycles. Periodically audit long-term memory for anomalous entries. Apply TTLs to user-sourced memories.
 5. In multi-agent systems, isolate memory per agent. Shared memory must be mediated by a trusted memory broker that validates writes.
 
+**Memory Integrity Evidence Gates:**
+
+When AG04 is in scope, verify the full lifecycle of persistent memory, not just whether a memory store exists.
+
+- **Write authorization:** Identify every path that can create or update memory, including user uploads, tool outputs, agent summaries, operator notes, and background jobs. Confirm untrusted inputs cannot be promoted to long-term memory without validation or approval.
+- **Provenance metadata:** Saved memories must retain source type, source identity, timestamp, creating agent, approval state, trust tier, TTL, and the original content or immutable reference used to derive the memory.
+- **Trust-tiered retrieval:** Retrieval must filter or rank by trust tier so user-sourced or externally sourced memories cannot override system/developer instructions or trusted operational knowledge.
+- **Integrity and replay:** Memory records should be append-only or versioned with hash/integrity metadata, allowing reviewers to reconstruct what changed, when, and by whom.
+- **Poisoning detection:** Review whether the system tests memory writes with prompt-injection payloads, malicious summaries, conflicting facts, and tool-output instructions before release.
+- **Containment and removal:** The design must include quarantine, tombstone/delete, downstream cache invalidation, re-embedding, and audit replay procedures for confirmed poisoned memory.
+- **Residual derived data:** Deleting a memory entry is insufficient if derived summaries, embeddings, vector replicas, prompt caches, or analytics exports can still reintroduce the poisoned content.
+- **Scope boundaries:** Distinguish ephemeral session memory, per-user memory, per-agent memory, and globally shared memory. Higher trust and wider reuse require stronger approval and audit evidence.
+
+**False positive to avoid:** Do not mark AG04 as mitigated because the system uses a managed vector database, has a delete endpoint, or says memory is reviewed periodically. The review must prove that untrusted content cannot be silently written, retrieved as trusted context, or persist through derived caches after removal.
+
 **Framework Mapping:**
 
 - OWASP LLM Top 10 2025: LLM01 — Prompt Injection, LLM02 — Sensitive Information Disclosure
@@ -495,6 +510,12 @@ Structure the final report as follows:
 - Human approval gates: [present/absent, description]
 - Multi-agent communication: [method]
 
+## Memory Integrity Review
+
+| Store | Write Sources | Trust Labels | Approval Gate | Retrieval Filter | Integrity Metadata | Removal/Quarantine | Residual Cache Risk | AG04 Status |
+|---|---|---|---|---|---|---|---|---|
+| [store name] | [user/tool/agent/operator] | [present/absent] | [present/absent] | [trust-tier/filter details] | [hash/version/audit] | [procedure] | [low/medium/high] | [Pass/Partial/Fail] |
+
 ## Findings by Threat Category
 
 ### AG01 — Excessive Agency and Permissions
diff --git a/skills/ai-security/agentic-top-10/tests/memory-integrity-edge-cases.md b/skills/ai-security/agentic-top-10/tests/memory-integrity-edge-cases.md
new file mode 100644
index 00000000..fe852670
--- /dev/null
+++ b/skills/ai-security/agentic-top-10/tests/memory-integrity-edge-cases.md
@@ -0,0 +1,97 @@
+# Memory Integrity Edge Cases
+
+These fixtures validate AG04 review behavior for persistent agent memory systems.
+
+## Case 1: Untrusted Document Summary Promoted to Long-Term Memory
+
+```yaml
+memory:
+  store: pgvector
+  write_sources:
+    - user_uploaded_documents
+    - agent_summaries
+  promotion_policy: automatic
+  metadata:
+    source: optional
+    trust_tier: none
+```
+
+**Expected result:** Fail for AG04.
+
+**Reason:** User-controlled content can become cross-session memory without approval, source identity, trust tier, TTL, or immutable provenance.
+
+## Case 2: Retrieval Mixes Trust Tiers by Similarity Only
+
+```yaml
+retrieval:
+  index: shared_agent_memory
+  top_k: 8
+  rank_by: cosine_similarity
+  filters: []
+prompt_assembly:
+  memory_position: before_system_policy_summary
+```
+
+**Expected result:** Fail or High severity Partial.
+
+**Reason:** User-sourced and agent-generated memories can outrank trusted operational context. The prompt assembly order increases the chance that poisoned memory influences future instructions.
+
+## Case 3: Delete Endpoint Leaves Derived Embeddings and Caches
+
+```yaml
+memory_delete:
+  endpoint: DELETE /memory/{id}
+  removes:
+    - primary_record
+  does_not_remove:
+    - embedding_vector
+    - summarized_profile
+    - retrieval_cache
+    - analytics_export
+```
+
+**Expected result:** Partial.
+
+**Reason:** The primary record can be deleted, but derived data can still reintroduce poisoned content into future prompts or audits.
+
+## Case 4: Quarantine With Audit Replay and Trust Labels
+
+```yaml
+memory:
+  store: signed_append_log
+  write_policy:
+    user_context: requires_validation
+    tool_output: requires_sanitization
+    system_notes: operator_approved
+  metadata:
+    required:
+      - source_type
+      - source_identity
+      - creating_agent
+      - approval_state
+      - trust_tier
+      - ttl
+      - content_hash
+  retrieval:
+    required_filters:
+      - trust_tier
+      - user_scope
+      - agent_scope
+  incident_response:
+    quarantine: true
+    tombstone: true
+    reembed_after_removal: true
+    invalidate_prompt_cache: true
+    audit_replay: true
+```
+
+**Expected result:** Pass for the AG04 memory integrity lifecycle if implementation evidence exists.
+
+**Reason:** The design covers controlled writes, provenance, trust-tiered retrieval, containment, derived data cleanup, and replayable audit history.
+
+## Review Assertions
+
+- Do not treat a managed vector database as proof of memory integrity.
+- Require source attribution and trust labels before long-term reuse.
+- Check retrieval filters and prompt assembly order, not only write controls.
+- Confirm poisoning cleanup covers embeddings, summaries, caches, replicas, and exports.
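
Reviewer note (not part of the patch): the write-path expectations in Case 1 and Case 4 could be checked mechanically. The sketch below is one possible way to do that, assuming the fixtures' `memory` blocks have already been parsed into plain dictionaries. The names `REQUIRED_METADATA`, `missing_provenance`, and `write_gate_status` and the Pass/Fail rule are hypothetical and do not come from the skill. The check covers only promotion policy and provenance metadata, not retrieval filtering or derived-data cleanup.

```python
from typing import Any

# Provenance fields copied from the Case 4 fixture's `metadata.required` list.
REQUIRED_METADATA = {
    "source_type",
    "source_identity",
    "creating_agent",
    "approval_state",
    "trust_tier",
    "ttl",
    "content_hash",
}


def missing_provenance(memory: dict[str, Any]) -> set[str]:
    """Return the provenance fields that the config does not mark as required."""
    required = memory.get("metadata", {}).get("required", [])
    return REQUIRED_METADATA - set(required)


def write_gate_status(memory: dict[str, Any]) -> str:
    """Hypothetical write-path check: 'Fail' if untrusted content can be
    promoted automatically or provenance is not enforced, else 'Pass'."""
    if memory.get("promotion_policy") == "automatic":
        return "Fail"
    if missing_provenance(memory):
        return "Fail"
    return "Pass"


# The `memory` block of Case 1, written as a dict (assumed parse result).
case_1 = {
    "store": "pgvector",
    "write_sources": ["user_uploaded_documents", "agent_summaries"],
    "promotion_policy": "automatic",
    "metadata": {"source": "optional", "trust_tier": "none"},
}

# The write-related parts of the Case 4 `memory` block.
case_4 = {
    "store": "signed_append_log",
    "write_policy": {
        "user_context": "requires_validation",
        "tool_output": "requires_sanitization",
        "system_notes": "operator_approved",
    },
    "metadata": {"required": sorted(REQUIRED_METADATA)},
}

print(write_gate_status(case_1))  # Fail
print(write_gate_status(case_4))  # Pass
```

A "Pass" here would only mean the configuration declares the right controls. As Case 4 states, the AG04 result still depends on implementation evidence.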