21 changes: 21 additions & 0 deletions skills/ai-security/agentic-top-10/SKILL.md
@@ -196,6 +196,21 @@ In 2024, researchers demonstrated a persistent memory poisoning attack against a
4. Implement memory decay and review cycles. Periodically audit long-term memory for anomalous entries. Apply TTLs to user-sourced memories.
5. In multi-agent systems, isolate memory per agent. Shared memory must be mediated by a trusted memory broker that validates writes.

**Memory Integrity Evidence Gates:**

When AG04 is in scope, verify the full lifecycle of persistent memory, not just whether a memory store exists.

- **Write authorization:** Identify every path that can create or update memory, including user uploads, tool outputs, agent summaries, operator notes, and background jobs. Confirm untrusted inputs cannot be promoted to long-term memory without validation or approval.
- **Provenance metadata:** Saved memories must retain source type, source identity, timestamp, creating agent, approval state, trust tier, TTL, and the original content or immutable reference used to derive the memory.
- **Trust-tiered retrieval:** Retrieval must filter or rank by trust tier so user-sourced or externally sourced memories cannot override system/developer instructions or trusted operational knowledge.
- **Integrity and replay:** Memory records should be append-only or versioned with hash/integrity metadata, allowing reviewers to reconstruct what changed, when, and by whom.
- **Poisoning detection:** Review whether the system tests memory writes with prompt-injection payloads, malicious summaries, conflicting facts, and tool-output instructions before release.
- **Containment and removal:** The design must include quarantine, tombstone/delete, downstream cache invalidation, re-embedding, and audit replay procedures for confirmed poisoned memory.
- **Residual derived data:** Deleting a memory entry is insufficient if derived summaries, embeddings, vector replicas, prompt caches, or analytics exports can still reintroduce the poisoned content.
- **Scope boundaries:** Distinguish ephemeral session memory, per-user memory, per-agent memory, and globally shared memory. Higher trust and wider reuse require stronger approval and audit evidence.

**False positive to avoid:** Do not mark AG04 as mitigated because the system uses a managed vector database, has a delete endpoint, or says memory is reviewed periodically. The review must prove that untrusted content cannot be silently written, retrieved as trusted context, or persist through derived caches after removal.
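
As an illustration of the write-authorization and provenance gates above, here is a minimal Python sketch. The names `MemoryRecord`, `build_record`, `can_promote_to_long_term`, and the `TRUSTED_SOURCE_TYPES` mapping are hypothetical and do not come from the skill. It assumes SHA-256 as the integrity hash and an explicit approval flag as the promotion gate.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical mapping from source type to trust tier (an assumption, not from the skill).
TRUSTED_SOURCE_TYPES = {"operator", "system"}


@dataclass(frozen=True)
class MemoryRecord:
    """Immutable memory entry carrying the provenance fields listed above."""
    content: str
    source_type: str        # e.g. "user", "tool", "agent", "operator"
    source_identity: str
    creating_agent: str
    approval_state: str     # "pending" or "approved"
    trust_tier: str         # "trusted" or "untrusted"
    created_at: datetime
    ttl: timedelta
    content_hash: str


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_record(content, source_type, source_identity, creating_agent,
                 approved=False, ttl_days=30):
    """Attach provenance and integrity metadata at write time."""
    return MemoryRecord(
        content=content,
        source_type=source_type,
        source_identity=source_identity,
        creating_agent=creating_agent,
        approval_state="approved" if approved else "pending",
        trust_tier="trusted" if source_type in TRUSTED_SOURCE_TYPES else "untrusted",
        created_at=datetime.now(timezone.utc),
        ttl=timedelta(days=ttl_days),
        content_hash=sha256_hex(content),
    )


def can_promote_to_long_term(record: MemoryRecord) -> bool:
    """Promote only approved records whose stored hash still matches the content."""
    hash_ok = sha256_hex(record.content) == record.content_hash
    return record.approval_state == "approved" and hash_ok
```

In this sketch a user-sourced summary stays `pending` and `untrusted` until something outside the write path approves it. A reviewer would look for evidence of that same property in the real system.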

**Framework Mapping:**

- OWASP LLM Top 10 2025: LLM01 — Prompt Injection, LLM02 — Sensitive Information Disclosure
@@ -495,6 +510,12 @@ Structure the final report as follows:
- Human approval gates: [present/absent, description]
- Multi-agent communication: [method]

## Memory Integrity Review

| Store | Write Sources | Trust Labels | Approval Gate | Retrieval Filter | Integrity Metadata | Removal/Quarantine | Residual Cache Risk | AG04 Status |
|---|---|---|---|---|---|---|---|---|
| [store name] | [user/tool/agent/operator] | [present/absent] | [present/absent] | [trust-tier/filter details] | [hash/version/audit] | [procedure] | [low/medium/high] | [Pass/Partial/Fail] |

## Findings by Threat Category

### AG01 — Excessive Agency and Permissions
@@ -0,0 +1,97 @@
# Memory Integrity Edge Cases

These fixtures validate AG04 review behavior for persistent agent memory systems.

## Case 1: Untrusted Document Summary Promoted to Long-Term Memory

```yaml
memory:
  store: pgvector
  write_sources:
    - user_uploaded_documents
    - agent_summaries
  promotion_policy: automatic
  metadata:
    source: optional
    trust_tier: none
```

**Expected result:** Fail for AG04.

**Reason:** User-controlled content can become cross-session memory without approval, source identity, trust tier, TTL, or immutable provenance.

## Case 2: Retrieval Mixes Trust Tiers by Similarity Only

```yaml
retrieval:
  index: shared_agent_memory
  top_k: 8
  rank_by: cosine_similarity
  filters: []
prompt_assembly:
  memory_position: before_system_policy_summary
```

**Expected result:** Fail, or Partial with high severity.

**Reason:** User-sourced and agent-generated memories can outrank trusted operational context. The prompt assembly order increases the chance that poisoned memory influences future instructions.
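
For contrast with the similarity-only ranking in this fixture, the following Python sketch ranks by trust tier first and uses similarity only to break ties within a tier. The `Hit` type, the `TIER_RANK` ordering, and the `min_tier` floor are hypothetical names chosen for illustration. A real system may weight tiers differently.

```python
from dataclasses import dataclass

# Hypothetical tier ordering; higher numbers are more trusted.
TIER_RANK = {"user": 0, "agent": 1, "operator": 2, "system": 3}


@dataclass
class Hit:
    text: str
    trust_tier: str
    similarity: float


def retrieve(hits, min_tier="user", top_k=8):
    """Drop hits below the trust floor, then rank by (tier, similarity)."""
    floor = TIER_RANK[min_tier]
    # Hits with an unknown tier get rank -1 and are always filtered out.
    allowed = [h for h in hits if TIER_RANK.get(h.trust_tier, -1) >= floor]
    allowed.sort(key=lambda h: (TIER_RANK[h.trust_tier], h.similarity), reverse=True)
    return allowed[:top_k]
```

With this ordering, a user-sourced memory with very high cosine similarity still ranks below operator-approved context, which is the property the fixture above lacks.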

## Case 3: Delete Endpoint Leaves Derived Embeddings and Caches

```yaml
memory_delete:
  endpoint: DELETE /memory/{id}
  removes:
    - primary_record
  does_not_remove:
    - embedding_vector
    - summarized_profile
    - retrieval_cache
    - analytics_export
```

**Expected result:** Partial.

**Reason:** The primary record can be deleted, but derived data can still reintroduce poisoned content into future prompts or audits.
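
A removal procedure that avoids this gap has to cascade to every derived store and then report anything it could not reach. The Python sketch below uses in-memory dictionaries as stand-ins for real stores. The `purge_memory` function and the `DERIVED_STORES` tuple are hypothetical, with store names borrowed from the fixture above.

```python
# Store names taken from the Case 3 fixture; the dict-of-dicts layout is an assumption.
DERIVED_STORES = (
    "embedding_vector",
    "summarized_profile",
    "retrieval_cache",
    "analytics_export",
)


def purge_memory(memory_id, stores):
    """Remove a memory from the primary and known derived stores.

    `stores` maps a store name to a dict of memory_id -> payload.
    Returns (removed, residual): the stores the entry was deleted from, and
    any stores that still hold it because they are not on the known list.
    """
    removed = []
    for name in ("primary_record",) + DERIVED_STORES:
        if stores.get(name, {}).pop(memory_id, None) is not None:
            removed.append(name)
    residual = [name for name, entries in stores.items() if memory_id in entries]
    return removed, residual
```

A non-empty `residual` list is the condition this case is meant to catch: the delete succeeded, yet a copy of the poisoned content survives somewhere the procedure does not know about.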

## Case 4: Quarantine With Audit Replay and Trust Labels

```yaml
memory:
  store: signed_append_log
  write_policy:
    user_context: requires_validation
    tool_output: requires_sanitization
    system_notes: operator_approved
  metadata:
    required:
      - source_type
      - source_identity
      - creating_agent
      - approval_state
      - trust_tier
      - ttl
      - content_hash
retrieval:
  required_filters:
    - trust_tier
    - user_scope
    - agent_scope
incident_response:
  quarantine: true
  tombstone: true
  reembed_after_removal: true
  invalidate_prompt_cache: true
  audit_replay: true
```

**Expected result:** Pass for the AG04 memory integrity lifecycle if implementation evidence exists.

**Reason:** The design covers controlled writes, provenance, trust-tiered retrieval, containment, derived data cleanup, and replayable audit history.
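
A reviewer could encode these checks as a small evaluator over a parsed design config. The Python sketch below is one possible reading, not the skill's own scoring logic. The `ag04_status` function and its severity mapping (uncontrolled writes give Fail; controlled writes with gaps in retrieval or cleanup give Partial) are assumptions. It inspects declared design only, so a Pass still depends on implementation evidence.

```python
REQUIRED_METADATA = {
    "source_type", "source_identity", "creating_agent",
    "approval_state", "trust_tier", "ttl", "content_hash",
}
REQUIRED_RESPONSE_STEPS = (
    "quarantine", "tombstone", "reembed_after_removal",
    "invalidate_prompt_cache", "audit_replay",
)


def ag04_status(cfg):
    """Return "Pass", "Partial", or "Fail" for a complete design config (as a dict)."""
    memory = cfg.get("memory", {})
    declared = set(memory.get("metadata", {}).get("required", []))
    filters = cfg.get("retrieval", {}).get("required_filters", [])
    response = cfg.get("incident_response", {})

    writes_ok = bool(memory.get("write_policy")) and REQUIRED_METADATA <= declared
    retrieval_ok = "trust_tier" in filters
    cleanup_ok = all(response.get(step) is True for step in REQUIRED_RESPONSE_STEPS)

    if not writes_ok:
        return "Fail"      # assumption: uncontrolled writes are disqualifying
    if retrieval_ok and cleanup_ok:
        return "Pass"      # design-level pass only; evidence is still required
    return "Partial"
```

The function expects a full design such as the Case 4 config. Fragments that omit whole sections, like the retrieval-only and delete-only fixtures above, are scored on what is missing and need human judgment.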

## Review Assertions

- Do not treat a managed vector database as proof of memory integrity.
- Require source attribution and trust labels before long-term reuse.
- Check retrieval filters and prompt assembly order, not only write controls.
- Confirm poisoning cleanup covers embeddings, summaries, caches, replicas, and exports.