feat: MetaAttack model #441
…sues, remove CatBoost refs
Add MetaAttack(Attack) with validated constructor, _parse_attacks(), and abstract method stubs. Register as "meta" in the attack factory. Supports (name, params, n_reps) tuples with validation against supported attacks (lira, qmia, structural). Loads k-anonymity threshold from ACRO config when not explicitly provided. Includes design spec and staged implementation plan.
Add _run_sub_attack() and the orchestration loop in _attack(). Each sub-attack runs in an isolated subdirectory under output_dir to prevent shadow model and report collisions between runs. MIA attacks (LiRA, QMIA) get report_individual=True injected automatically. Structural always computes record-level results. Sub-attack objects are collected for score extraction in Stage 3.
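A rough sketch of the per-run isolation described above, assuming one directory per (attack, repetition) pair. The `<name>_rep<rep>` naming scheme and the helper names `sub_attack_dir` and `plan_runs` are hypothetical and are not taken from meta_attack.py:

```python
import os
import tempfile


def sub_attack_dir(output_dir: str, name: str, rep: int) -> str:
    """Create and return a dedicated directory for one sub-attack run.

    The "<name>_rep<rep>" layout is an assumption made for illustration.
    """
    path = os.path.join(output_dir, f"{name}_rep{rep}")
    os.makedirs(path, exist_ok=True)
    return path


def plan_runs(output_dir, attacks):
    """Map every (name, rep) pair to its own directory under output_dir."""
    return {
        (name, rep): sub_attack_dir(output_dir, name, rep)
        for name, _params, n_reps in attacks
        for rep in range(n_reps)
    }


with tempfile.TemporaryDirectory() as root:
    # (name, params, n_reps) tuples, as in the commit description.
    runs = plan_runs(root, [("qmia", {}, 2), ("structural", {}, 1)])
    run_names = sorted(os.path.basename(p) for p in runs.values())
```

Because no two runs share a directory, shadow models and reports written by one repetition cannot overwrite those of another.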
Add _extract_mia_scores() and _extract_structural_scores() with a field-mapping dict (_MIA_SCORE_FIELDS) for LiRA/QMIA score paths. Wire extraction into _attack() loop: scores collected immediately after each sub-attack run into mia_scores and structural_scores dicts, keyed by attack name with one list per repetition.
- Guard against sub-attack not running: check return value from attack() and raise RuntimeError with clear message if empty
- Reject empty attacks list in _parse_attacks with ValueError
- Use copy.deepcopy(params) instead of shallow dict(params) to prevent nested mutable values leaking between repetitions
- Add logging.basicConfig to match peer attack file conventions
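To illustrate the deepcopy point, here is a small standalone example of how a shallow `dict(params)` copy shares nested values while `copy.deepcopy(params)` does not. The `shadow` / `n_models` keys are made-up parameter names, used only for the demonstration:

```python
import copy

# Hypothetical nested sub-attack parameters.
params = {"shadow": {"n_models": 4}}

# Shallow copy: the inner dict is still shared with `params`.
shallow = dict(params)
shallow["shadow"]["n_models"] = 99
leaked = params["shadow"]["n_models"]  # the original was mutated

# Deep copy: the inner dict is duplicated, so the original is untouched.
params = {"shadow": {"n_models": 4}}
deep = copy.deepcopy(params)
deep["shadow"]["n_models"] = 99
isolated = params["shadow"]["n_models"]
```

With the shallow copy, a change made during one repetition would be visible to the next; with the deep copy each repetition starts from the same unmodified parameters.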
Implement _build_dataframe() with:
- Level 1 (within-attack): mean, std, consistency per MIA attack across n_reps; mean k / majority vote for structural reps
- Level 2 (cross-attack): arithmetic and geometric mean of MIA per-attack means; binary structural flag; n_vulnerable count
- NaN padding for structural columns on test records
- Epsilon-stabilised geometric mean to handle log(0)

Wire into _attack(): DataFrame stored on self.vulnerability_df after all sub-attacks complete and scores are extracted.
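The two aggregation levels can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code in _build_dataframe(): the epsilon value, the use of population standard deviation, and the helper names `within_attack` and `geometric_mean_eps` are all hypothetical, and the consistency column is left out because its definition is not given here.

```python
import math
from statistics import mean, pstdev

EPS = 1e-12  # assumed stabiliser; the value used in the PR is not stated


def within_attack(rep_scores):
    """Level 1: per-record mean and std across repetitions of one attack.

    rep_scores holds one list of per-record scores per repetition.
    """
    per_record = list(zip(*rep_scores))
    return [mean(r) for r in per_record], [pstdev(r) for r in per_record]


def geometric_mean_eps(values, eps=EPS):
    """Geometric mean computed in log space; eps keeps log(0) finite."""
    return math.exp(mean(math.log(v + eps) for v in values))


# Two repetitions of one attack over two records (made-up scores).
qmia_mean, qmia_std = within_attack([[0.3, 0.4], [0.5, 0.6]])

# Level 2: combine per-attack means for each record (made-up values).
lira_mean = [0.9, 0.0]
arith = [mean(pair) for pair in zip(lira_mean, qmia_mean)]
geo = [geometric_mean_eps(pair) for pair in zip(lira_mean, qmia_mean)]
```

The geometric mean is pulled towards zero whenever any single attack assigns a record a low score, whereas the arithmetic mean is not, which is one reason to report both.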
- Clip MIA scores to [0, 1] during extraction to handle LiRA Carlini modes that produce unbounded log-likelihood ratios
- Document LiRA score convention: score = CDF under out-distribution, high values = evidence for membership (not against)
- Replace 'v is True' identity check with truthiness test 'if v:' to handle numpy bools correctly
- Round averaged k-anonymity to int for multi-rep structural runs (fractional k is not meaningful)
Complete the MetaAttack pipeline:
- _compute_global_metrics: uses mia_mean as membership predictor with get_metrics() for AUC/TPR/Advantage; falls back to summary dict for structural-only configs
- _construct_metadata: enriches report with thresholds and key metrics
- _get_attack_metrics_instances: standard report structure with sub-attack summary and full DataFrame under "individual"
- CSV export: saves vulnerability_matrix.csv alongside JSON report
- _attack() now returns a proper report dict (no more NotImplementedError)
10 test cases covering:
- Validation: unsupported attack, invalid tuple, empty list, bad n_reps
- Integration: QMIA + structural basic run, DataFrame shape and columns
- Structural NaN for test records
- Repeated runs: std column exists, consistency in [0, 1]
- Threshold effects: lower threshold flags more records
- Global metrics: AUC and TPR in [0, 1]
- Report structure: standard nested JSON keys
- Factory integration: factory.attack(target, "meta", ...) works
- CSV export: vulnerability_matrix.csv written and loadable
Demonstrates end-to-end usage: synthetic data, Target construction, MetaAttack with QMIA (2 reps) + structural, DataFrame inspection, summary statistics, and top-10 most vulnerable records.
Hi @jim-smith, could you please review these changes that I addressed:
Follow-ups (separate PRs / issues):
Thanks.
also, thank you Jim for the thorough review and the time you put into walking through the code.
    super().__init__(output_dir=output_dir, write_report=write_report)
    # MetaAttack does not use shadow models; remove the empty directory
    # created by the base class so the output directory stays clean.
    with contextlib.suppress(OSError):
just checking: this doesn't remove shadow models created by other attacks? that would be catastrophic!
good thing to check, that would be catastrophic if true. as i read it, the safety holds via two lines, but please gut-check me.
base class at attacks/attack.py:43:
    self.shadow_path = os.path.normpath(f"{self.output_dir}/shadow_models")
so self.shadow_path is rooted at this MetaAttack instance's own output_dir; it can never resolve to another attack's directory.
then at meta_attack.py lines 131 to 134:
    with contextlib.suppress(OSError):
        os.rmdir(self.shadow_path)
os.rmdir raises OSError on a populated directory, which suppress swallows, so the call does nothing rather than destructively deleting if anything is in there.
i think the worst case is "remove the empty shadow_models/ this MetaAttack just created, otherwise do nothing".
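a quick standalone way to sanity-check that reading, using temporary directories as stand-ins (the `empty`/`full` folder names and the `model_0.pkl` file are made up for the check and do not come from the repo):

```python
import contextlib
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    empty_dir = os.path.join(root, "empty", "shadow_models")
    full_dir = os.path.join(root, "full", "shadow_models")
    os.makedirs(empty_dir)
    os.makedirs(full_dir)

    # Put a placeholder file in one directory so it counts as populated.
    with open(os.path.join(full_dir, "model_0.pkl"), "w") as fh:
        fh.write("placeholder")

    # Same pattern as the snippet under review: rmdir only succeeds on an
    # empty directory, and the OSError for a populated one is suppressed.
    for path in (empty_dir, full_dir):
        with contextlib.suppress(OSError):
            os.rmdir(path)

    empty_removed = not os.path.exists(empty_dir)
    full_kept = os.path.exists(os.path.join(full_dir, "model_0.pkl"))
```

the empty directory goes away and the populated one is left untouched, which matches the worst case described above.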
@shamykyzer I have made one comment in the example file. Will make a local copy of this branch and see what it gives me when I run it.
added unit tests covering the previously-uncovered branches in meta_attack.py (defensive guards, report-scanning edge cases, sub-attack failure paths, non-numeric scores) and the finite-auc branch in create_meta_report. takes meta_attack.py from 87% to 100% coverage and should clear the codecov/patch and codecov/project checks. test-only, no changes to attack code :)
hi @jim-smith, just wanted to let you know this PR is ready for review. the constants file is in, and the follow-ups are each their own PR now: #460 structural respecting… mind taking another look when you have a chance?
Closes #428
Added a MetaAttack class that runs multiple privacy attacks on the same target and combines their per-record results into one DataFrame.
Outputs vulnerability_matrix.csv and a standard JSON report. Available via the factory:
factory.attack(target, "meta", attacks=[...])