48 changes: 45 additions & 3 deletions skills/incident-response/post-incident-review/SKILL.md
@@ -280,6 +280,38 @@ Convert analysis findings into specific, measurable, assignable, and time-bound
| P2 | Moderate gap that represents a defense-in-depth weakness | 90 days |
| P3 | Minor improvement or best-practice enhancement | Next quarter |

### Step 7: Remediation Verification and Recurrence Monitoring

Do not treat a remediation action as closed merely because its tracking ticket is closed. Each action must have objective acceptance criteria, verification evidence, and a named verifier; where independence is required, the verifier must not be the action owner.

**Verification gates:**

| Gate | Requirement | Evidence |
|---|---|---|
| Acceptance Criteria | Define what must be true before the action can close | Testable criteria linked to the PIR finding and control failure |
| Implementation Evidence | Prove the change was deployed or process update was adopted | Configuration export, merged PR, policy version, change ticket, training record |
| Independent Validation | Verify the fix works and addresses the root cause | Retest result, control test, tabletop result, audit sample, screenshots or logs |
| Detection Validation | Confirm new or updated detections fire and route correctly | Test event, alert screenshot/log, routing destination, runbook link |
| Recurrence Monitoring | Watch for the same failure mode after closure | Monitoring query, dashboard, owner, watch period, success criteria |
| Closure Approval | Document who accepted closure and residual risk | Approver, date, evidence links, risk acceptance if gaps remain |
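The gates above lend themselves to a mechanical pre-closure check. The following is a minimal sketch, assuming each action carries a dictionary of evidence links keyed by gate; the `RemediationAction` class, its field names, and the `closure_blockers` helper are hypothetical and not part of the skill itself. It also assumes every gate applies to every action, which a team may want to relax for actions that have no detection component.

```python
from dataclasses import dataclass, field

# Gate names mirror the "Verification gates" table; the identifiers are
# illustrative assumptions, not names defined by the skill.
GATES = (
    "acceptance_criteria",
    "implementation_evidence",
    "independent_validation",
    "detection_validation",
    "recurrence_monitoring",
    "closure_approval",
)


@dataclass
class RemediationAction:
    action_id: str
    owner: str
    verifier: str = ""
    independence_required: bool = True
    # Maps gate name -> list of evidence links or references.
    evidence: dict[str, list[str]] = field(default_factory=dict)


def closure_blockers(action: RemediationAction) -> list[str]:
    """Return the reasons the action cannot close; an empty list means it can."""
    blockers = [
        f"missing evidence for gate: {gate}"
        for gate in GATES
        if not action.evidence.get(gate)
    ]
    if not action.verifier:
        blockers.append("no named verifier")
    elif action.independence_required and action.verifier == action.owner:
        blockers.append("verifier must differ from action owner")
    return blockers


if __name__ == "__main__":
    # Placeholder values for illustration only.
    action = RemediationAction(
        action_id="REM-001",
        owner="identity-team",
        verifier="identity-team",
        evidence={"implementation_evidence": ["<link to merged PR>"]},
    )
    for reason in closure_blockers(action):
        print(reason)
```

With only implementation evidence attached and the owner acting as verifier, the check reports the five remaining gates plus the verifier conflict, so ticket status alone never drives closure.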

**Verification by control type:**

| Remediation Type | Minimum Closure Evidence | Example of a False-Positive Closure |
|---|---|---|
| Preventive control | Retest showing the original attack path is blocked or mitigated | Ticket says "MFA enabled" but no login test or policy export exists |
| Detective control | Test signal generates expected alert, severity, owner, and routing | Rule was created but never tested with a representative event |
| Corrective control | Recovery or rollback exercise proves the process works | Backup job exists but restore was not tested |
| Process control | Updated playbook, trained owners, and tabletop or walkthrough result | Document was edited but responders were not trained |
| Governance action | Risk owner approval, deadline, review cadence, and residual risk | Action is deferred without time-bounded risk acceptance |
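One way to make the table above checkable is a lookup from remediation type to its minimum evidence set. This is a sketch under stated assumptions: the evidence labels are hypothetical shorthand for the "Minimum Closure Evidence" column, and `missing_evidence` is an illustrative helper rather than an existing tool.

```python
# Minimum closure evidence per remediation type, paraphrased from the
# "Verification by control type" table. The labels are assumptions chosen
# for this sketch.
MINIMUM_EVIDENCE = {
    "preventive": {"attack_path_retest"},
    "detective": {"test_signal", "alert_severity", "alert_owner", "alert_routing"},
    "corrective": {"recovery_or_rollback_exercise"},
    "process": {"updated_playbook", "owner_training", "tabletop_or_walkthrough"},
    "governance": {"risk_owner_approval", "deadline", "review_cadence", "residual_risk"},
}


def missing_evidence(remediation_type: str, provided: set[str]) -> set[str]:
    """Return the minimum evidence items not yet provided for this type."""
    try:
        required = MINIMUM_EVIDENCE[remediation_type]
    except KeyError:
        raise ValueError(f"unknown remediation type: {remediation_type!r}") from None
    # Extra evidence is harmless; only the shortfall blocks closure.
    return required - provided


if __name__ == "__main__":
    # A backup job without a restore test leaves the corrective control unverified.
    print(missing_evidence("corrective", {"backup_job"}))
```

Keeping the requirements in data rather than in branching logic means a new control type only needs a new dictionary entry.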

**Recurrence monitoring rules:**
- Define a watch period for each P0/P1 remediation, typically 30 to 90 days depending on incident severity and business cycle.
- Monitor for the original precursor, indicator, control failure, or detection gap that contributed to the incident.
- Reopen the PIR action if the same failure mode recurs, if detection tests fail, or if validation evidence cannot be produced.
- Escalate overdue P0 actions to executive leadership once the deadline passes; P1/P2 actions require documented exception approval before any deadline extension.
- Record residual risk when a remediation is only partially effective, including the compensating controls and next review date.
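The rules above can be sketched as three small functions. This is a minimal illustration, not a prescribed implementation: the function names and returned status strings are hypothetical, and the P3 branch is an assumption because the rules do not say how overdue P3 actions are handled.

```python
from datetime import date, timedelta


def watch_period_end(closed_on: date, watch_days: int) -> date:
    """Date on which the recurrence watch period ends."""
    return closed_on + timedelta(days=watch_days)


def should_reopen(recurred: bool, detection_test_passed: bool,
                  evidence_available: bool) -> bool:
    """Reopen on recurrence, a failed detection test, or missing evidence."""
    return recurred or not detection_test_passed or not evidence_available


def overdue_handling(priority: str, deadline: date, today: date,
                     exception_approved: bool = False) -> str:
    """Map an action's priority and deadline to the handling the rules call for."""
    if today <= deadline:
        return "on_track"
    if priority == "P0":
        return "escalate_to_executives"
    if priority in ("P1", "P2"):
        # An extension is only valid with documented exception approval.
        return "extension_allowed" if exception_approved else "exception_approval_required"
    # Assumption: overdue P3 actions go back to the owner for follow-up.
    return "owner_follow_up"


if __name__ == "__main__":
    print(watch_period_end(date(2024, 1, 1), 60))
    print(should_reopen(recurred=False, detection_test_passed=False, evidence_available=True))
    print(overdue_handling("P1", deadline=date(2024, 1, 1), today=date(2024, 1, 2)))
```

Any single trigger is enough to reopen an action, which matches the "or" wording of the reopen rule.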

---

## 4. Findings Classification
@@ -354,9 +386,19 @@ root cause, and the number/priority of remediation actions identified.]
- [Gap or failure identified during retrospective]

### Remediation Plan
| ID | Finding | Action | Owner | Priority | Deadline | Ticket | Acceptance Criteria |
|---|---|---|---|---|---|---|---|
| REM-001 | [Finding] | [Action] | [Owner] | [P0-P3] | [Date] | [ID] | [Evidence required before closure] |

### Remediation Verification
| ID | Verification Owner | Implementation Evidence | Validation Method | Result | Residual Risk |
|---|---|---|---|---|---|
| REM-001 | [Name/team] | [Config/PR/change/policy evidence] | [Retest/control test/tabletop/detection test] | [Pass/Fail/Partial] | [None/description] |

### Recurrence Monitoring
| ID | Watch Period | Signal or Query | Owner | Success Criteria | Escalation Condition |
|---|---|---|---|---|---|
| REM-001 | [30/60/90 days] | [Detection, metric, log query, audit sample] | [Name/team] | [No recurrence / alert fires / control remains effective] | [When to reopen or escalate] |

### Follow-Up Schedule
- **Remediation Review Date:** [YYYY-MM-DD -- typically 30 days after PIR]
@@ -0,0 +1,107 @@
# Remediation Verification Edge Cases

Use these cases to validate that `post-incident-review` does not accept ticket closure as proof that incident root causes have been remediated.

## Case 1: Closed ticket with no control evidence

**Input**

```yaml
incident: exposed admin panel without MFA
root_cause: privileged access policy did not require MFA for admin application
remediation:
  id: REM-001
  ticket: SEC-1842
  status: closed
  owner: identity-team
  action: enable MFA for admin application
verification:
  acceptance_criteria: missing
  config_export: missing
  retest_result: missing
  verifier: missing
  recurrence_monitoring: missing
```

**Expected result**

Fail closure. The PIR must keep the action open or mark it partial until an MFA configuration export and a login retest prove that the original access path is blocked.

## Case 2: Detection rule added but never tested

**Input**

```yaml
incident: data exfiltration over unusual user agent
root_cause: no alert for high-volume download with rare user agent
remediation:
  id: REM-002
  ticket: DET-778
  status: closed
  action: add SIEM detection
verification:
  rule_id: captured
  test_event: missing
  alert_routing: missing
  on_call_owner: missing
  runbook_link: missing
```

**Expected result**

Fail detection validation. A rule definition alone does not prove that a representative event creates the expected alert, severity, owner assignment, and routing.

## Case 3: Backup remediation without restore validation

**Input**

```yaml
incident: destructive malware wiped file server
root_cause: backups were online and deleted by attacker
remediation:
  id: REM-003
  ticket: DR-220
  status: closed
  action: create immutable backups
verification:
  backup_job: captured
  immutability_policy: captured
  restore_test: missing
  recovery_time_result: missing
  residual_risk: undocumented
```

**Expected result**

Mark closure as partial. The implementation evidence is useful, but corrective control verification requires a restore test and documented recovery result.

## Case 4: Complete closure evidence with recurrence watch

**Input**

```yaml
incident: repeated suspicious admin logins from unmanaged device
root_cause: conditional access policy excluded admin role
remediation:
  id: REM-004
  ticket: IAM-402
  status: ready_for_closure
  action: enforce conditional access for admin role
verification:
  acceptance_criteria: admin login from unmanaged device must fail
  config_export: captured
  retest_result: pass
  verifier: security-engineering
  detection_test: alert routes to soc-primary queue
  closure_approver: ciso-delegate
  recurrence_monitoring:
    watch_period: 60 days
    query: failed and blocked admin logins from unmanaged devices
    owner: soc-detection
    success_criteria: blocked attempts alert and no successful bypasses
    escalation: reopen REM-004 if bypass succeeds or alert fails
```

**Expected result**

Pass closure. The action has clear acceptance criteria, implementation evidence, independent validation, detection validation, approval, and recurrence monitoring.
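The four cases above share one invariant: closure passes only when no verification field is `missing` or `undocumented`. A minimal sketch of that check follows, assuming each case's `verification` block has been loaded into a Python dictionary; `evidence_gaps` and `closure_allowed` are hypothetical helper names. The sketch deliberately does not decide between "fail" and "partial", which the cases leave to reviewer judgment.

```python
# Placeholder values that the edge cases use to signal absent evidence.
UNVERIFIED = {"missing", "undocumented"}


def evidence_gaps(verification: dict, prefix: str = "") -> list[str]:
    """List dotted paths of verification fields that carry no usable evidence."""
    gaps = []
    for key, value in verification.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Nested blocks such as recurrence_monitoring are checked recursively.
            gaps.extend(evidence_gaps(value, prefix=f"{path}."))
        elif value is None or str(value).strip().lower() in UNVERIFIED:
            gaps.append(path)
    return gaps


def closure_allowed(verification: dict) -> bool:
    """Closure needs a non-empty verification block with no gaps."""
    return bool(verification) and not evidence_gaps(verification)


# Verification blocks transcribed from Case 2 and Case 4.
case_2 = {
    "rule_id": "captured",
    "test_event": "missing",
    "alert_routing": "missing",
    "on_call_owner": "missing",
    "runbook_link": "missing",
}
case_4 = {
    "acceptance_criteria": "admin login from unmanaged device must fail",
    "config_export": "captured",
    "retest_result": "pass",
    "verifier": "security-engineering",
    "detection_test": "alert routes to soc-primary queue",
    "closure_approver": "ciso-delegate",
    "recurrence_monitoring": {
        "watch_period": "60 days",
        "query": "failed and blocked admin logins from unmanaged devices",
        "owner": "soc-detection",
        "success_criteria": "blocked attempts alert and no successful bypasses",
        "escalation": "reopen REM-004 if bypass succeeds or alert fails",
    },
}

if __name__ == "__main__":
    print(closure_allowed(case_2), evidence_gaps(case_2))
    print(closure_allowed(case_4), evidence_gaps(case_4))
```

Note that the ticket `status` field plays no part in the decision, which is the behavior these cases are meant to enforce.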