Skill Being Reviewed
Skill name: post-incident-review
Skill path: skills/incident-response/post-incident-review/
False Positive Analysis
Benign PIR that can be scored higher than warranted by shallow root cause analysis:
root_cause_analysis:
  method: "5 Whys"
  chain:
    - "Why: Attacker exploited unpatched Apache Struts CVE-2023-XXXXX"
    - "Because: Vulnerability was published 30 days before exploitation"
    - "Because: Patch was not applied within the 7-day SLA"
    - "Because: The system was excluded from the automated patch cycle"
    - "Because: The system owner classified it as 'low priority' in the CMDB"
  root_cause_statement: "Apache Struts vulnerability not patched within SLA due to incorrect CMDB classification."
remediation:
  - Ensure all Struts systems are correctly classified in CMDB
  - Reduce patching SLA from 7 to 3 days for internet-facing systems
Why this is a false positive:
The 5 Whys analysis stops at "incorrect CMDB classification" without asking the next-level questions: Why was the classification incorrect? Was the CMDB inaccurate because there is no automated discovery? Was there a process gap in system onboarding? Did the patch management team have visibility into the correct classification but no enforcement authority? The root cause statement is a proximate cause (misclassification) rather than a systemic root cause (missing automated discovery and classification governance). The skill's RCA guidance says "Stop when you reach a cause that is within the organization's control to change," but this stopping rule is ambiguous — an organization can change a CMDB entry, but that does not prevent the misclassification pattern from recurring for other systems. The skill should require that the 5 Whys chain demonstrate that the identified root cause prevents recurrence of the class of incident, not just this specific instance.
Benign PIR with comprehensive remediation but no blast radius quantification:
remediation_plan:
  - Finding: Missing EDR on Linux servers
    Action: Deploy CrowdStrike Falcon to all Linux servers
    Owner: Platform Team
    Priority: P1
    Deadline: 30 days
  - Finding: Insufficient log retention
    Action: Extend CloudTrail retention to 365 days
    Owner: Security Engineering
    Priority: P2
    Deadline: 90 days
metrics:
  dwell_time: "28 days"
  mttd: "28 days"
  mttc: "4 hours"
  mttr: "48 hours"
Why this is a false positive:
The skill's metrics section (Step 4) covers MTTD, MTTC, MTTR, Dwell Time, and related metrics, but it does not include blast radius quantification. Without a blast radius metric (number of systems affected, data records exposed, business processes disrupted, revenue impact), the PIR cannot distinguish a low-severity incident with good detection from a high-severity incident with good detection. The remediation priority is context-dependent: a P1 for a critical-system incident is more urgent than a P1 for a sandbox incident, but the output format does not capture the blast radius context that justifies priority assignments.
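The gap can be made concrete with a small completeness check on the metrics mapping. This is a minimal sketch, not part of the skill: the field names in `BLAST_RADIUS_FIELDS` and the helper `missing_blast_radius_fields` are hypothetical, chosen to mirror the four dimensions listed above.

```python
# Hypothetical field names; the skill does not define a blast radius schema.
BLAST_RADIUS_FIELDS = (
    "systems_affected",
    "data_records_exposed",
    "business_processes_disrupted",
    "revenue_impact",
)


def missing_blast_radius_fields(metrics: dict) -> list[str]:
    """Return the blast radius fields that are absent or empty in a PIR metrics mapping."""
    # A value of 0 counts as reported; only None or an empty string counts as missing.
    return [name for name in BLAST_RADIUS_FIELDS if metrics.get(name) in (None, "")]


# Metrics from the example PIR above: timing metrics only.
example_metrics = {
    "dwell_time": "28 days",
    "mttd": "28 days",
    "mttc": "4 hours",
    "mttr": "48 hours",
}

print(missing_blast_radius_fields(example_metrics))
# → ['systems_affected', 'data_records_exposed', 'business_processes_disrupted', 'revenue_impact']
```

A reviewer could treat a non-empty result as a signal that the P1/P2 assignments in the remediation plan lack the severity context needed to justify them.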
Coverage Gaps
Missed variant 1: Root cause depth score — 5 Whys without recurrence prevention evidence.
root_cause_analysis:
  method: "5 Whys"
  chain:
    - "Why: Attacker exploited vulnerable library in production"
    - "Because: Library version was 2 years old"
    - "Because: Dependabot PR was created but never merged"
    - "Because: PR required manual approval and was deprioritized"
    - "Because: No SLA enforcement for dependency update PRs"
  root_cause: "Missing automated dependency update policy with SLA enforcement"
  recurrence_risk: |
    The remediations address Dependabot auto-merge for this specific repo,
    but 15 other repos have similar Dependabot PRs waiting for manual approval.
    No org-wide dependency update policy has been created.
Why it should be caught:
The skill's root cause analysis guidance and output format do not require a recurrence_risk assessment or scope_of_root_cause field that identifies whether the root cause is specific to one system/team/process or is an organizational pattern. Without this field, remediation actions may be scoped too narrowly (fixing one CMDB entry or one repo's merge process) while the same pattern exists in many other places. The skill should require that every root cause analysis output includes a scope field (single-instance / team-pattern / org-wide) and a recurrence_likelihood assessment.
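The proposed requirement could be enforced with a simple check on the RCA output. The sketch below is illustrative only: `check_rca_scope` is a hypothetical helper, the field names `scope` and `recurrence_likelihood` follow the wording above, and the value `"high"` is an assumed rating scale that the skill does not define.

```python
ALLOWED_SCOPES = {"single-instance", "team-pattern", "org-wide"}
REQUIRED_FIELDS = ("scope", "recurrence_likelihood")  # names taken from the text above


def check_rca_scope(rca: dict) -> list[str]:
    """Return a list of problems with the scope fields of an RCA mapping."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if not rca.get(name)]
    scope = rca.get("scope")
    if scope and scope not in ALLOWED_SCOPES:
        problems.append(f"unknown scope: {scope}")
    return problems


# The RCA from the example above, which carries no scope information.
rca = {
    "method": "5 Whys",
    "root_cause": "Missing automated dependency update policy with SLA enforcement",
}
print(check_rca_scope(rca))
# → ['missing field: scope', 'missing field: recurrence_likelihood']

# "high" is an assumed value; any rating scale could be substituted.
rca.update(scope="org-wide", recurrence_likelihood="high")
print(check_rca_scope(rca))
# → []
```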
Missed variant 2: Detection engineering feedback loop not documented in PIR output.
detection_improvement:
  new_rules_created:
    - rule_name: "Apache Struts CVE-2023-XXXXX exploitation attempt"
      status: deployed (SIEM + WAF)
  existing_rules_tuned:
    - rule: "Outbound SMB connection detection"
      action: threshold lowered from 100MB to 10MB
  rules_not_updated:
    - reason: "Rule for this attack technique already exists but was evaded; no update identified"
  detection_coverage_map:
    mitre_ttps_covered: ["T1190", "T1505", "T1078", "T1021"]
    mitre_ttps_missed: []
Why it should be caught:
NIST SP 800-61 Rev 2 Section 3.4.2 ("Using Collected Incident Data") recommends using post-incident data to improve detection capability. The current PIR output format includes a "Detection Rule Updates Required" checkbox in the Follow-Up Schedule, but it does not have a structured section for the detection engineering feedback loop — what specific rules were created/tuned as a result of this incident, and what ATT&CK techniques were covered or remain uncovered. Without this structured section, the PIR may identify detection gaps but not translate them into verifiable detection engineering actions.
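A structured section also makes the coverage map checkable, because the missed techniques can be derived instead of entered by hand. This is a minimal sketch under stated assumptions: `uncovered_techniques` is a hypothetical helper, the observed set reuses the technique IDs from the example above, and the covered set is invented to show a gap.

```python
def uncovered_techniques(observed: set[str], covered: set[str]) -> list[str]:
    """ATT&CK technique IDs seen in the incident that no detection rule covers."""
    return sorted(observed - covered)


# Technique IDs from the example above, treated here as the observed set (assumption).
observed = {"T1190", "T1505", "T1078", "T1021"}
# Hypothetical coverage: T1021 is left without a working rule.
covered = {"T1190", "T1505", "T1078"}

print(uncovered_techniques(observed, covered))  # → ['T1021']
```

Deriving the list this way would surface cases where a rule exists but was evaded, provided the evaded technique is excluded from the covered set.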
Missed variant 3: Cross-team communication and escalation path not evaluated.
communication_failures:
  - event: SOC identified suspicious activity at T+2h
    delay: T+8h before contacting system owner
    cause: SOC did not have on-call contact for the affected system
    escalation_matrix: outdated (system ownership changed 3 months ago)
  - event: Legal notification required due to data breach notification law
    delay: T+24h after confirmation
    cause: No pre-established notification template or legal contact workflow
Why it should be caught:
The PIR process includes communication logs and escalation times in the timeline, and common pitfalls mention "Communication failure — stakeholders were not notified, or notification was delayed." However, the structured output format does not require a dedicated communication and coordination section that evaluates notification timeliness, escalation matrix accuracy, external notification SLA compliance, and coordination quality across teams. Without this section, the PIR may document communication delays but not capture the systemic pattern (e.g., out-of-date escalation matrix) or the compliance implications (e.g., GDPR 72-hour notification breach).
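Notification timeliness becomes evaluable once each delay is compared against an explicit SLA. The sketch below is an illustration, not the skill's method: `sla_status` is a hypothetical helper, the 24-hour delay and the 72-hour window come from the text above, and the 1-hour internal escalation SLA and the reading of the SOC delay as six hours (T+2h to T+8h) are assumptions.

```python
from datetime import timedelta


def sla_status(elapsed: timedelta, sla: timedelta) -> dict:
    """Compare an elapsed notification time against its SLA."""
    return {
        "elapsed_hours": elapsed.total_seconds() / 3600,
        "sla_hours": sla.total_seconds() / 3600,
        "within_sla": elapsed <= sla,
        # A negative value means the notification was late by that many hours.
        "remaining_hours": (sla - elapsed).total_seconds() / 3600,
    }


# Legal notification: 24 hours after confirmation, against a 72-hour window.
legal = sla_status(timedelta(hours=24), timedelta(hours=72))
print(legal["within_sla"], legal["remaining_hours"])  # → True 48.0

# SOC to system owner: assumed 6-hour delay against an assumed 1-hour internal SLA.
owner = sla_status(timedelta(hours=6), timedelta(hours=1))
print(owner["within_sla"], owner["remaining_hours"])  # → False -5.0
```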
Edge Cases
- Incidents involving third-party or managed security service providers (MSSP) where handoff between internal and external teams creates detection/response gaps not captured in a single-organization PIR format.
- Incidents that span multiple cloud providers or jurisdictions where data localization, privacy law, and law enforcement access create coordination complexity not reflected in the communication assessment.
- PIR for a "near miss" (incident prevented by defense-in-depth) where no actual compromise occurred — the current PIR format assumes a confirmed incident with measurable dwell time and blast radius.
Remediation Quality
Recommended additions:
- Add a root cause scope field (single-instance / team-pattern / org-wide) and recurrence_prevention_evidence requirement to ensure RCA goes beyond proximate cause.
- Add blast radius metrics: affected system count, data records exposed, business process impact, regulatory notification requirement.
- Add a dedicated "Detection Engineering Feedback Loop" section with new rules created, existing rules tuned, and ATT&CK coverage map.
- Add a "Communication and Coordination Assessment" section with escalation matrix accuracy, notification SLA compliance, and cross-team coordination quality evaluation.
Comparison to Other Tools
| Tool | Catches this? | Notes |
|---|---|---|
| NIST SP 800-61 Rev 2 | Partial | Recommends using incident data for detection improvement but does not define a structured output format |
| Google SRE postmortem culture | Partial | Emphasizes blamelessness but does not require blast radius metrics or recurrence scope scoring |
| Jeli / FireHydrant (incident analysis platforms) | Partial | Commercial tools offer timeline and action tracking but leave root cause depth and detection feedback to reviewer judgment |
| PagerDuty Incident Response | Partial | Focuses on response coordination; post-incident analysis depth depends on reviewer |
Overall Assessment
Strengths:
- Strong adherence to NIST SP 800-61 Rev 2 methodology with blameless retrospective, timeline reconstruction, RCA, and metrics.
- Good control failure mapping with common pattern reference table.
- Clear remediation prioritization (P0-P3) with SLA deadlines.
Needs improvement:
- Root cause analysis lacks a depth/scope scoring mechanism. The 5 Whys stopping rule is ambiguous and can produce proximate-cause-level RCA.
- Blast radius quantification is absent from the metrics and remediation sections.
- Detection engineering feedback loop is limited to a checkbox rather than a structured output section.
- Communication and coordination assessment is not a distinct section in the PIR output format.
Priority recommendations:
- Add root cause scope and recurrence prevention evidence requirements to the RCA output format.
- Add blast radius metrics to the incident metrics section.
- Add a dedicated detection engineering feedback loop output section with rule creation/tuning and ATT&CK coverage mapping.
- Add a communication and coordination assessment section with escalation matrix accuracy and notification SLAs.
Sources Checked
- skills/incident-response/post-incident-review/SKILL.md
Bounty Info