feat: MetaAttack model by shamykyzer · Pull Request #441 · AI-SDC/SACRO-ML

shamykyzer · 2026-04-12T09:53:47Z

Closes #428

Added a MetaAttack class that runs multiple privacy attacks on the same target and combines their per-record results into one DataFrame.

- Runs LiRA, QMIA, and/or Structural attacks (with optional repeated runs)
- Extracts each record's vulnerability score from every attack
- Aggregates scores at two levels:
  - Within-attack: mean, std, and consistency across repeated runs
  - Cross-attack: MIA ensemble mean (arithmetic + geometric), structural flag, and a count of how many attacks flagged each record
- Outputs a vulnerability_matrix.csv and a standard JSON report

meta = MetaAttack(
    attacks=[("lira", {"n_shadow_models": 100}, 5), ("qmia", {}), ("structural", {})],
)
meta.attack(target)
df = meta.vulnerability_df  # one row per record, one column group per attack

Also available via the factory: factory.attack(target, "meta", attacks=[...])

Add MetaAttack(Attack) with validated constructor, _parse_attacks(), and abstract method stubs. Register as "meta" in the attack factory. Supports (name, params, n_reps) tuples with validation against supported attacks (lira, qmia, structural). Loads k-anonymity threshold from ACRO config when not explicitly provided. Includes design spec and staged implementation plan.
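A minimal sketch of the tuple validation this commit describes. The function name `parse_attacks`, the `SUPPORTED` set, and the default of one repetition are illustrative assumptions, not the merged code; only the three attack names and the (name, params, n_reps) shape come from the commit message.

```python
from __future__ import annotations

import copy

# Attack names taken from the commit message; everything else is hypothetical.
SUPPORTED = {"lira", "qmia", "structural"}


def parse_attacks(attacks: list[tuple]) -> list[tuple[str, dict, int]]:
    """Normalise (name, params) or (name, params, n_reps) tuples."""
    if not attacks:
        raise ValueError("attacks must contain at least one entry")
    parsed = []
    for entry in attacks:
        if not isinstance(entry, tuple) or len(entry) not in (2, 3):
            raise ValueError(f"expected (name, params[, n_reps]), got {entry!r}")
        name, params = entry[0], entry[1]
        n_reps = entry[2] if len(entry) == 3 else 1  # assumed default
        if not isinstance(name, str) or name not in SUPPORTED:
            raise ValueError(f"unsupported attack: {name!r}")
        if not isinstance(params, dict):
            raise ValueError(f"params for {name!r} must be a dict")
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(n_reps, bool) or not isinstance(n_reps, int) or n_reps < 1:
            raise ValueError(f"n_reps for {name!r} must be a positive int")
        parsed.append((name, copy.deepcopy(params), n_reps))
    return parsed


parsed = parse_attacks([("lira", {"n_shadow_models": 100}, 5), ("qmia", {})])
```

Failing early in the constructor keeps a typo in an attack name from surfacing only after earlier sub-attacks have already run.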

Add _run_sub_attack() and the orchestration loop in _attack(). Each sub-attack runs in an isolated subdirectory under output_dir to prevent shadow model and report collisions between runs. MIA attacks (LiRA, QMIA) get report_individual=True injected automatically. Structural always computes record-level results. Sub-attack objects are collected for score extraction in Stage 3.
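One way to get the per-run isolation described here is to give every repetition its own directory. The sketch below assumes a `<output_dir>/<name>/rep_<index>` layout and a helper called `sub_attack_dir`; both are hypothetical, and the PR may name its subdirectories differently.

```python
import tempfile
from pathlib import Path


def sub_attack_dir(output_dir: str, name: str, rep: int) -> Path:
    """Return a fresh directory for one repetition of one sub-attack."""
    # Hypothetical layout: <output_dir>/<name>/rep_<index>
    path = Path(output_dir) / name / f"rep_{rep}"
    path.mkdir(parents=True, exist_ok=False)  # fail loudly on a collision
    return path


with tempfile.TemporaryDirectory() as root:
    dirs = [sub_attack_dir(root, "qmia", rep) for rep in range(2)]
    dirs.append(sub_attack_dir(root, "structural", 0))
    relative = [d.relative_to(root).as_posix() for d in dirs]
```

Because each run writes under its own path, shadow models and reports from one repetition cannot overwrite those of another.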

Add _extract_mia_scores() and _extract_structural_scores() with a field-mapping dict (_MIA_SCORE_FIELDS) for LiRA/QMIA score paths. Wire extraction into _attack() loop: scores collected immediately after each sub-attack run into mia_scores and structural_scores dicts, keyed by attack name with one list per repetition.

- Guard against sub-attack not running: check return value from attack() and raise RuntimeError with clear message if empty
- Reject empty attacks list in _parse_attacks with ValueError
- Use copy.deepcopy(params) instead of shallow dict(params) to prevent nested mutable values leaking between repetitions
- Add logging.basicConfig to match peer attack file conventions
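The difference between the shallow and deep copy can be seen with a nested parameter dict. The `model_kwargs` key is a made-up example, not a parameter of any SACRO-ML attack.

```python
import copy

# Hypothetical nested parameters for a sub-attack.
params = {"model_kwargs": {"max_depth": 3}}

# dict(params) copies only the outer dict; the inner dict is shared.
shallow = dict(params)
shallow["model_kwargs"]["max_depth"] = 99
leaked = params["model_kwargs"]["max_depth"]  # the original was mutated

# copy.deepcopy duplicates the nested dict as well.
params = {"model_kwargs": {"max_depth": 3}}
deep = copy.deepcopy(params)
deep["model_kwargs"]["max_depth"] = 99
isolated = params["model_kwargs"]["max_depth"]  # the original is untouched
```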

Implement _build_dataframe() with:
- Level 1 (within-attack): mean, std, consistency per MIA attack across n_reps; mean k / majority vote for structural reps
- Level 2 (cross-attack): arithmetic and geometric mean of MIA per-attack means; binary structural flag; n_vulnerable count
- NaN padding for structural columns on test records
- Epsilon-stabilised geometric mean to handle log(0)

Wire into _attack(): DataFrame stored on self.vulnerability_df after all sub-attacks complete and scores are extracted.
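The two aggregation levels can be sketched with the standard library alone. This is an illustration under stated assumptions, not the merged `_build_dataframe()`: the `EPS` and `THRESHOLD` values, the column names, and the definition of consistency (share of repetitions that agree with the majority flag) are all guesses, and the structural flag is left out.

```python
import math
from statistics import mean, pstdev

EPS = 1e-12      # assumed stabiliser; the PR's EPS_META value is not shown
THRESHOLD = 0.5  # assumed flagging threshold


def within_attack(reps: list[list[float]]) -> list[dict]:
    """Level 1: per-record mean, std and consistency across repetitions."""
    rows = []
    for scores in zip(*reps):  # one tuple of scores per record
        flags = [s >= THRESHOLD for s in scores]
        # Assumed definition: fraction of reps agreeing with the majority.
        agree = max(sum(flags), len(flags) - sum(flags)) / len(flags)
        rows.append(
            {"mean": mean(scores), "std": pstdev(scores), "consistency": agree}
        )
    return rows


def cross_attack(per_attack_means: list[list[float]]) -> list[dict]:
    """Level 2: arithmetic and geometric mean of the per-attack means."""
    rows = []
    for means in zip(*per_attack_means):
        # Adding EPS keeps log() finite when a mean score is exactly 0.
        geo = math.exp(mean(math.log(m + EPS) for m in means))
        rows.append(
            {
                "mia_mean": mean(means),
                "mia_gmean": geo,
                "n_vulnerable": sum(m >= THRESHOLD for m in means),
            }
        )
    return rows


lira = within_attack([[0.9, 0.7], [0.7, 0.4]])  # 2 reps, 2 records
qmia = within_attack([[0.8, 0.0]])              # 1 rep, 2 records
combined = cross_attack([[r["mean"] for r in lira], [r["mean"] for r in qmia]])
```

The geometric mean is pulled towards zero when any one attack gives a low score, so it behaves like a stricter "all attacks agree" signal than the arithmetic mean.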

- Clip MIA scores to [0, 1] during extraction to handle LiRA Carlini modes that produce unbounded log-likelihood ratios
- Document LiRA score convention: score = CDF under out-distribution, high values = evidence for membership (not against)
- Replace 'v is True' identity check with truthiness test 'if v:' to handle numpy bools correctly
- Round averaged k-anonymity to int for multi-rep structural runs (fractional k is not meaningful)
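Two of these fixes are easy to reproduce in isolation with NumPy. The input values are invented, and using `np.clip` for the clipping step is an assumption about how the extraction code does it.

```python
import numpy as np

# Invented raw scores, standing in for unbounded log-likelihood ratios.
raw = np.array([-3.2, 0.4, 7.5])
clipped = np.clip(raw, 0.0, 1.0)  # every value now lies in [0, 1]

# A NumPy boolean is a different object from Python's True singleton,
# so an identity check fails even though the value is truthy.
flag = np.bool_(True)
identity = flag is True
truthy = bool(flag)
```

An `if v:` test therefore treats Python and NumPy booleans the same way, whereas `v is True` silently skips the NumPy ones.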

Complete the MetaAttack pipeline:
- _compute_global_metrics: uses mia_mean as membership predictor with get_metrics() for AUC/TPR/Advantage; falls back to summary dict for structural-only configs
- _construct_metadata: enriches report with thresholds and key metrics
- _get_attack_metrics_instances: standard report structure with sub-attack summary and full DataFrame under "individual"
- CSV export: saves vulnerability_matrix.csv alongside JSON report
- _attack() now returns a proper report dict (no more NotImplementedError)
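To show what treating mia_mean as a membership predictor means, here is a small stand-in that computes AUC and TPR directly from per-record scores. It is not SACRO-ML's get_metrics(); the function names, the scores, and the 0.5 threshold are hypothetical.

```python
def auc_from_scores(scores, is_member):
    """Probability that a random member outscores a random non-member."""
    pos = [s for s, m in zip(scores, is_member) if m]
    neg = [s for s, m in zip(scores, is_member) if not m]
    if not pos or not neg:
        raise ValueError("need at least one member and one non-member")
    # Count pairwise wins; ties count as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def tpr_at_threshold(scores, is_member, threshold):
    """Share of true members whose score reaches the threshold."""
    pos = [s for s, m in zip(scores, is_member) if m]
    return sum(s >= threshold for s in pos) / len(pos)


mia_mean = [0.9, 0.4, 0.3, 0.6]            # invented ensemble scores
is_member = [True, True, False, False]      # ground-truth membership
auc = auc_from_scores(mia_mean, is_member)
tpr = tpr_at_threshold(mia_mean, is_member, 0.5)
```

A structural-only configuration has no such continuous score, which is consistent with the fallback to a summary dict mentioned above.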

10 test cases covering:
- Validation: unsupported attack, invalid tuple, empty list, bad n_reps
- Integration: QMIA + structural basic run, DataFrame shape and columns
- Structural NaN for test records
- Repeated runs: std column exists, consistency in [0, 1]
- Threshold effects: lower threshold flags more records
- Global metrics: AUC and TPR in [0, 1]
- Report structure: standard nested JSON keys
- Factory integration: factory.attack(target, "meta", ...) works
- CSV export: vulnerability_matrix.csv written and loadable

Demonstrates end-to-end usage: synthetic data, Target construction, MetaAttack with QMIA (2 reps) + structural, DataFrame inspection, summary statistics, and top-10 most vulnerable records.

shamykyzer · 2026-05-11T08:31:03Z

Hi @jim-smith, could you please review the changes I made to address your comments:

- 3-way behaviour flag (run_all, use_existing_only, fill_missing) per your sketch
- MetaAttack output appended to report_dir/report.json by default (keep_separate=True opts out)
- use_existing_only and fill_missing read both canonical single-file report.json and subdirectory layouts
- _make_pdf produces a PDF via report.create_meta_report with a bar chart of records flagged by N attacks
- CHANGELOG and README mention MetaAttack
- Graceful degradation across sub-attacks (narrow except, warnings on expected failures)
- Structural n_reps clamped to 1 (deterministic)
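The three behaviour modes can be read as a rule for which sub-attacks still need to be run. The sketch below infers the semantics from the mode names alone; `plan_runs` and its arguments are hypothetical and do not reflect the actual MetaAttack signature.

```python
def plan_runs(requested: list[str], existing: set[str], behaviour: str) -> list[str]:
    """Return the attacks that need a fresh run under the given mode."""
    if behaviour == "run_all":
        return list(requested)  # ignore any prior results
    if behaviour == "use_existing_only":
        return []  # never launch new runs
    if behaviour == "fill_missing":
        # Run only the attacks that have no stored result yet.
        return [name for name in requested if name not in existing]
    raise ValueError(f"unknown behaviour: {behaviour!r}")


requested = ["lira", "qmia", "structural"]
existing = {"qmia"}  # pretend a QMIA report is already on disk
to_run = {mode: plan_runs(requested, existing, mode)
          for mode in ("run_all", "use_existing_only", "fill_missing")}
```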

Follow-ups (separate PRs / issues):

- score vs member_prob standardisation: Standardise per-record score field name across attacks #450
- StructuralAttack ignores report_individual flag #451
- WorstCaseAttack to expose per-record scores #452
- Coverage gap (currently 87%)

Thanks.

Overview

[Diagram: architecture flow with behaviour modes]

[Diagram: aggregation pipeline]

[Diagram: class hierarchy]

shamykyzer · 2026-05-11T08:35:08Z

also -- thank you Jim for the thorough review and the time you put into walking through the code.

jim-smith · 2026-05-13T17:45:01Z

+        super().__init__(output_dir=output_dir, write_report=write_report)
+        # MetaAttack does not use shadow models; remove the empty directory
+        # created by the base class so the output directory stays clean.
+        with contextlib.suppress(OSError):


Just checking: this doesn't remove shadow models created by other attacks? That would be catastrophic!

shamykyzer · May 14, 2026

Good thing to check; that would be catastrophic if true. As I read it, the safety holds via two lines, but please gut-check me.

base class at attacks/attack.py:43:

self.shadow_path = os.path.normpath(f"{self.output_dir}/shadow_models")

So self.shadow_path is rooted at this MetaAttack instance's own output_dir; it can never resolve to another attack's directory.

then at meta_attack.py lines 131 to 134:

with contextlib.suppress(OSError):
    os.rmdir(self.shadow_path)

os.rmdir raises OSError on a populated directory, which suppress swallows, so the call does nothing rather than deleting whatever is in there.

I think the worst case is "remove the empty shadow_models/ this MetaAttack just created, otherwise do nothing".
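The behaviour argued for above can be checked in a throwaway directory. The paths and the placeholder file name are invented for the demonstration; only `os.rmdir` and `contextlib.suppress(OSError)` come from the code under review.

```python
import contextlib
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    populated = os.path.join(root, "populated", "shadow_models")
    empty = os.path.join(root, "empty", "shadow_models")
    os.makedirs(populated)
    os.makedirs(empty)
    # Simulate a shadow model saved by some other attack.
    with open(os.path.join(populated, "model_0.pkl"), "w") as fh:
        fh.write("placeholder")

    for path in (populated, empty):
        with contextlib.suppress(OSError):
            os.rmdir(path)  # only succeeds on an empty directory

    populated_survives = os.path.isdir(populated)
    empty_removed = not os.path.exists(empty)
```

The populated directory and its file are left alone, and only the empty one disappears.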

jim-smith · 2026-05-13T17:57:33Z

@shamykyzer I have made one comment in the example file. I will make a local copy of this branch and see what it gives me when I run it.

ssrhaso · 2026-05-22T11:17:46Z

added unit tests covering the previously-uncovered branches in meta_attack.py (defensive guards, report-scanning edge cases, sub-attack failure paths, non-numeric scores) and the finite-auc branch in create_meta_report.

takes meta_attack.py from 87% to 100% and should clear the codecov/patch and codecov/project checks. test-only, no changes to attack code :)

@shamykyzer

shamykyzer · 2026-05-25T08:21:15Z

hi @jim-smith, just wanted to let you know this PR is ready for review

the constants file is in, the behaviour kwarg is in the example and README, and hasaan added the coverage on the 22nd

the follow ups are each their own PR now, #460 structural respecting report_individual, #461 worstcase per record scores, #462 canary test parametrised over QMIA and LiRA (worstcase is a follow up), #463 lira score to member_prob rename

mind taking another look when you have a chance

ssrhaso and others added 25 commits March 13, 2026 18:48

feat: initial skeleton files with docstring and signatures

270f6b9

feat: QMIA CatBoost attack with multiclass and (x,y) conditioning

c550567

test: QMIA hinge score, multiclass, and attack tests

6e93c3f

chore: clean up unused files and ignore catboost artifacts

094a282

feat: QMIA benchmark scripts with formatted comparison tables

3c155a0

docs: add QMIA usage and benchmark documentation to README

d2b7074

Merge remote-tracking branch 'origin/main' into HEAD

fcd8a19

feat: QMIA hinge score and label remapping utilities

1ecc311

feat: add QMIAAttack quantile regression membership inference attack

9e63e1c

refactor: switch QMIA to HistGradientBoostingRegressor, fix review issues, remove CatBoost refs

e984a9d

fix: remove stale CatBoost references from README, factory test, and gitignore

a981711

fix: resolve ruff lint errors in benchmarks, tests, and summarize script

8489312

style: pre-commit fixes

c964ad5

Merge branch 'main' into 306-quantile-regression

45b41d5

fix: tighten factory test assertions, fix conftest RNG seed

293376c

Merge remote-tracking branch 'origin/306-quantile-regression' into 428-meta-attack

7e3415f

docs: add MetaAttack example script

4774fe0

shamykyzer self-assigned this Apr 12, 2026

shamykyzer added the in-progress label Apr 12, 2026

shamykyzer changed the title ~~428 meta attack~~ MetaAttack Apr 12, 2026

shamykyzer changed the title ~~MetaAttack~~ feat: MetaAttack model Apr 12, 2026

style: pre-commit fixes

5515e43

refactor: move MetaAttack constants to module level

bcad245

shamykyzer requested a review from jim-smith May 11, 2026 08:19

shamykyzer mentioned this pull request May 11, 2026

StructuralAttack ignores report_individual flag #451

Closed

shamykyzer mentioned this pull request May 11, 2026

WorstCaseAttack to expose per-record scores #452

Closed

jim-smith reviewed May 13, 2026

Comment thread examples/sklearn/meta_attack_example.py

jim-smith reviewed May 13, 2026

Comment thread sacroml/attacks/meta_attack.py Outdated

jim-smith reviewed May 13, 2026

Comment thread sacroml/attacks/meta_attack.py Outdated

jim-smith reviewed May 13, 2026

shamykyzer added 2 commits May 14, 2026 10:18

refactor: extract MetaAttack EPS_META and DEFAULT_MIA_THRESHOLD to attacks/constants.py

b5dd7b6

docs: add behaviour kwarg to MetaAttack example

34a499d

shamykyzer requested a review from jim-smith May 14, 2026 08:30

Merge branch 'main' into 428-meta-attack

ed088f0

This was linked to issues May 22, 2026

WorstCaseAttack to expose per-record scores #452

Closed

StructuralAttack ignores report_individual flag #451

Closed

Standardise per-record score field name across attacks #450

Closed

test(meta): cover MetaAttack/report branches to reach 100% patch coverage

38ca47c

shamykyzer added 2 commits May 25, 2026 08:12

test(meta): cover non-finite sub-attack AUC branch in create_meta_report

e69b5ee

docs: add behaviour kwarg to MetaAttack README example

4abeed4

shamykyzer mentioned this pull request May 25, 2026

test(canary): parametrise canary detection across QMIA and LiRA #462

Merged

shamykyzer and others added 2 commits May 26, 2026 17:03

fix(structural): gate per-record results behind report_individual flag (#460)

97c07e2

Merge branch 'main' into 428-meta-attack

02804f4

jim-smith approved these changes May 26, 2026

jim-smith merged commit 483b641 into main May 26, 2026
4 checks passed

jim-smith deleted the 428-meta-attack branch May 26, 2026 16:49

rpreen mentioned this pull request May 28, 2026

feat(worstcase): add report_individual flag for per-record member_prob #461

Merged

jim-smith merged 54 commits into main from 428-meta-attack

jim-smith May 13, 2026

shamykyzer May 14, 2026

3 participants
