Improve HIPAA deidentification scope evidence #1367

Open
wowsofine wants to merge 1 commit into UnitOneAI:main from wowsofine:improve/hipaa-deidentification-scope-evidence

Conversation

@wowsofine

Skill Improvement ($50-150 Bounty)

Skill Modified

Skill name: hipaa-review
Skill path: skills/compliance/hipaa-review/

What Was Wrong

Issue #1326 identifies a scoping false positive: analytics, AI, warehouse, feature-store, dashboard, and logging systems can be excluded from ePHI scope too quickly when a dataset is merely labeled deidentified.

That label alone is not sufficient evidence. HHS guidance recognizes two de-identification methods under 45 CFR 164.514(b): Expert Determination and Safe Harbor. Even when one of them has been applied, downstream exports, derived identifiers, small cohorts, prompts, and logs can preserve or recreate linkable ePHI context.
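To illustrate why a derived identifier can keep a dataset linkable, here is a minimal sketch. It assumes a field like `device_id_hash` is an unsalted SHA-256 digest of the raw device ID; the PR does not say how the hash is actually computed, and the row contents and the `link` helper are hypothetical.

```python
import hashlib


def device_id_hash(device_id: str) -> str:
    # Assumption: unsalted SHA-256, so equal inputs always give equal digests.
    return hashlib.sha256(device_id.encode("utf-8")).hexdigest()


# Hypothetical "deidentified" analytics row and a separate identified record.
analytics_rows = [
    {"device_id_hash": device_id_hash("device-123"), "diagnosis_group": "cardiac"},
]
crm_rows = [
    {"device_id_hash": device_id_hash("device-123"), "email": "patient@example.org"},
]


def link(left, right, key="device_id_hash"):
    """Join two row lists on a shared key, as any downstream system could."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


linked = link(analytics_rows, crm_rows)
```

Under that assumption, `linked` contains a single row carrying both the diagnosis group and the email address, which is the kind of linkage a scope review should look for before accepting the label.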

What This PR Fixes

This PR adds de-identification scope evidence gates to hipaa-review:

  • HIPAA-DEID-01 through HIPAA-DEID-08
  • method evidence for Expert Determination, Safe Harbor, limited data set, or another documented basis
  • expert determination support fields
  • Safe Harbor checklist evidence
  • quasi-identifier inventory
  • derived identifier and linkage handling
  • downstream lineage for dashboards, feature stores, model data, prompts, logs, vendors, and BA/subcontractor systems
  • re-identification risk review
  • not_evaluable_treat_as_ephi fallback when evidence is incomplete
  • report output sections for de-identification scope and re-identification gaps
  • HHS de-identification guidance and 45 CFR 164.514 references
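The fallback behaviour in the list above could be sketched roughly as follows. This is not the skill's actual implementation: `DeidEvidence`, `evidence_gaps`, `scope_status`, and the `deidentification_evidence_complete` status are hypothetical names, and for brevity the sketch accepts only the two HHS methods rather than every basis the PR lists. Only `not_evaluable_treat_as_ephi` comes from the PR.

```python
from dataclasses import dataclass
from typing import List, Optional

# Simplification: only the two methods named in the HHS guidance.
RECOGNIZED_METHODS = {"expert_determination", "safe_harbor"}


@dataclass
class DeidEvidence:
    # Hypothetical evidence record; field names are illustrative.
    method: Optional[str] = None
    method_document: Optional[str] = None
    quasi_identifiers_inventoried: bool = False
    downstream_lineage_reviewed: bool = False
    reidentification_risk_reviewed: bool = False


def evidence_gaps(evidence: DeidEvidence) -> List[str]:
    """List which pieces of de-identification evidence are missing."""
    gaps = []
    if evidence.method not in RECOGNIZED_METHODS:
        gaps.append("method")
    if not evidence.method_document:
        gaps.append("method_document")
    if not evidence.quasi_identifiers_inventoried:
        gaps.append("quasi_identifier_inventory")
    if not evidence.downstream_lineage_reviewed:
        gaps.append("downstream_lineage")
    if not evidence.reidentification_risk_reviewed:
        gaps.append("reidentification_risk_review")
    return gaps


def scope_status(evidence: DeidEvidence) -> str:
    """Treat the dataset as ePHI unless every evidence gate is satisfied."""
    if evidence_gaps(evidence):
        return "not_evaluable_treat_as_ephi"
    return "deidentification_evidence_complete"


label_only = DeidEvidence()  # a dataset that is merely labeled deidentified
documented = DeidEvidence("expert_determination", "DEID-2026-14", True, True, True)
```

With these inputs, `scope_status(label_only)` falls back to `not_evaluable_treat_as_ephi`, while `scope_status(documented)` reports complete evidence. Defaulting every field to "missing" means an empty record fails closed.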

Evidence

Before (scope can be over-excluded):

warehouse_dataset: patient_outcomes_analytics
claimed_status: deidentified
deidentification_method: missing
fields:
  - birth_year
  - zip3
  - diagnosis_group
  - visit_month
  - device_id_hash
exports:
  - destination: ml_feature_store
  - destination: marketing_attribution_dashboard
logs:
  prompt_debug: full_query_and_results
scope_decision:
  excluded_from_ephi_inventory: true
  rationale: dataset label says deidentified

After (now requires method evidence and downstream risk review):

deidentification_method: expert_determination
expert_determination:
  expert_name: privacy-statistician@example.org
  determination_date: 2026-06-01
  methods_and_results_document: DEID-2026-14
  residual_risk_conclusion: very_small
quasi_identifier_controls:
  geography: state_only
  dates: year_only
  rare_diagnosis_groups: suppressed_when_cohort_under_20
downstream_lineage:
  approved_destinations:
    - population_health_dashboard
logging_controls:
  prompt_debug: disabled
scope_decision:
  excluded_from_ephi_inventory: true
  rationale: expert determination documented and downstream linkage controlled
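The `rare_diagnosis_groups: suppressed_when_cohort_under_20` control in the example above can be read as small-cohort suppression. A minimal sketch of that idea, assuming rows are plain dictionaries with a `diagnosis_group` field; the function name and the `suppressed` placeholder value are hypothetical, and the threshold of 20 is taken from the example.

```python
from collections import Counter

MIN_COHORT = 20  # threshold from suppressed_when_cohort_under_20 in the example


def suppress_small_cohorts(rows, key="diagnosis_group", min_cohort=MIN_COHORT,
                           placeholder="suppressed"):
    """Replace the value of `key` when fewer than `min_cohort` rows share it."""
    counts = Counter(row[key] for row in rows)
    return [
        {**row, key: row[key] if counts[row[key]] >= min_cohort else placeholder}
        for row in rows
    ]


# 20 rows in a common group, 3 rows in a rare one.
rows = [{"diagnosis_group": "common"} for _ in range(20)]
rows += [{"diagnosis_group": "rare"} for _ in range(3)]
released = suppress_small_cohorts(rows)
```

Here the 20 `common` rows are released unchanged and the 3 `rare` rows have their group replaced by the placeholder. The sketch checks a single field only; combinations of quasi-identifiers would need the same count applied to the combined key.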

Test Cases Added/Updated

  • Added vulnerable test cases (tests/vulnerable/)
  • Added benign test cases (tests/benign/)
  • Existing tests still pass

Added fixtures:

  • skills/compliance/hipaa-review/tests/vulnerable/deidentified-analytics-linkage.md
  • skills/compliance/hipaa-review/tests/benign/expert-determined-low-risk-export.md

Validation

  • git diff --check
  • git diff --cached --check
  • frontmatter required-field check across skills/**/SKILL.md and roles/**/SKILL.md
  • Markdown fence balance check for touched HIPAA files
  • prompt-injection pattern scan across skills/ and roles/
  • ASCII scan for new fixtures
  • content marker checks for HIPAA-DEID-*, de-identification scope output, not_evaluable_treat_as_ephi, official references, and fixtures
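The PR does not include the scripts behind these checks. As a rough sketch of what the Markdown fence balance check and the ASCII scan might look like, assuming fences are lines that start with three backticks; both function names are hypothetical.

```python
FENCE = "`" * 3  # built at runtime so this example stays fence-safe


def fences_balanced(text: str) -> bool:
    """True when every opening code fence has a matching closing fence."""
    inside = False
    for line in text.splitlines():
        if line.lstrip().startswith(FENCE):
            inside = not inside
    return not inside


def non_ascii_lines(text: str) -> list:
    """Return 1-based line numbers that contain non-ASCII characters."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if not line.isascii()
    ]


balanced = "intro\n" + FENCE + "yaml\nkey: value\n" + FENCE + "\n"
unbalanced = "intro\n" + FENCE + "yaml\nkey: value\n"
```

Here `fences_balanced(balanced)` is true and `fences_balanced(unbalanced)` is false. The toggle approach ignores tilde fences and fences longer than three backticks, so it is a coarse check rather than a full Markdown parse.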

Bounty Tier

  • Minor ($50) -- Doc update, small logic tweak, typo fix
  • Moderate ($100) -- New edge case coverage, FP reduction with evidence
  • Substantial ($150) -- Rewritten detection logic, major coverage expansion

Bounty Info

  • I have read and agree to the CONTRIBUTING.md bounty terms
  • Preferred payment method: GitHub Sponsors or PayPal; payment details can be provided privately after maintainer acceptance.

Closes #1326.

Development

Successfully merging this pull request may close these issues.

[REVIEW] hipaa-review: add de-identification and re-identification evidence gates
