Merged
54 commits
270f6b9
feat: initial skeleton files with docstring and signatures
ssrhaso Mar 13, 2026
c550567
feat: QMIA CatBoost attack with multiclass and (x,y) conditioning
shamykyzer Mar 28, 2026
6e93c3f
test: QMIA hinge score, multiclass, and attack tests
shamykyzer Mar 28, 2026
094a282
chore: clean up unused files and ignore catboost artifacts
shamykyzer Mar 28, 2026
3c155a0
feat: QMIA benchmark scripts with formatted comparison tables
shamykyzer Mar 28, 2026
d2b7074
docs: add QMIA usage and benchmark documentation to README
shamykyzer Mar 28, 2026
fcd8a19
Merge remote-tracking branch 'origin/main' into HEAD
shamykyzer Mar 28, 2026
1ecc311
feat: QMIA hinge score and label remapping utilities
shamykyzer Mar 28, 2026
9e63e1c
feat: add QMIAAttack quantile regression membership inference attack
ssrhaso Mar 30, 2026
e984a9d
refactor: switch QMIA to HistGradientBoostingRegressor, fix review is…
shamykyzer Mar 31, 2026
a981711
fix: remove stale CatBoost references from README, factory test, and …
ssrhaso Mar 31, 2026
8489312
fix: resolve ruff lint errors in benchmarks, tests, and summarize script
ssrhaso Apr 1, 2026
c964ad5
style: pre-commit fixes
pre-commit-ci[bot] Apr 1, 2026
45b41d5
Merge branch 'main' into 306-quantile-regression
ssrhaso Apr 1, 2026
293376c
fix: tighten factory test assertions, fix conftest RNG seed
shamykyzer Apr 1, 2026
7e3415f
Merge remote-tracking branch 'origin/306-quantile-regression' into 42…
shamykyzer Apr 12, 2026
e17e4e7
feat: add MetaAttack class skeleton and factory registration
shamykyzer Apr 12, 2026
61707ef
feat: implement sub-attack orchestration in MetaAttack
shamykyzer Apr 12, 2026
a902292
feat: implement per-record score extraction from sub-attacks
shamykyzer Apr 12, 2026
3b01eb7
fix: address review issues in MetaAttack stages 1-3
shamykyzer Apr 12, 2026
4454852
feat: build vulnerability DataFrame with two-level aggregation
shamykyzer Apr 12, 2026
eccd60b
fix: address Stage 4 review issues
shamykyzer Apr 12, 2026
822cf3d
feat: add global metrics, JSON report, and CSV export
shamykyzer Apr 12, 2026
b1db337
test: add MetaAttack test suite
shamykyzer Apr 12, 2026
4774fe0
docs: add MetaAttack example script
shamykyzer Apr 12, 2026
5515e43
style: pre-commit fixes
pre-commit-ci[bot] Apr 12, 2026
4654005
fix: harden QMIA regressor, calibration drift, margin rank
shamykyzer Apr 15, 2026
a968996
docs: fix smart quotes in QMIA README example
shamykyzer Apr 15, 2026
93c9ac2
docs: update citation.cff
shamykyzer Apr 15, 2026
ff802f0
style: pre-commit fixes
pre-commit-ci[bot] Apr 15, 2026
ad016c6
refactor: drop extract_true_label_probs, resolve ruff errors
shamykyzer Apr 15, 2026
ce8388e
test: tighten QMIA public FPR tolerance
shamykyzer Apr 15, 2026
3b335b6
fix: warn when QMIA skips non-sklearn target remapping
shamykyzer Apr 15, 2026
34213b9
fix: reject non-finite predict_proba in QMIA
shamykyzer Apr 15, 2026
215664c
perf: enable HGBR early stopping for QMIA fits with n >= 1000
shamykyzer Apr 16, 2026
21d67aa
Merge branch '306-quantile-regression' into 428-meta-attack
shamykyzer Apr 17, 2026
f0213a5
fix: resolve merge conflict in utils.py (merge main into 428-meta-att…
shamykyzer Apr 17, 2026
65762bb
style: fix ruff lint errors in meta_attack and test
shamykyzer Apr 17, 2026
1b9a03a
merge origin/main into 428-meta-attack
shamykyzer May 10, 2026
124f459
chore: remove QMIA files (out of scope)
shamykyzer May 11, 2026
2502eb5
feat: MetaAttack review-response and audit fixes
shamykyzer May 11, 2026
d9b50a6
feat: MetaAttack reporting (append-to-report.json + PDF)
shamykyzer May 11, 2026
75b517b
style: pre-commit fixes
pre-commit-ci[bot] May 11, 2026
25d0700
fix: MetaAttack reads canonical single-file report.json layout
shamykyzer May 11, 2026
0c11c8d
style: pre-commit fixes
pre-commit-ci[bot] May 11, 2026
bcad245
refactor: move MetaAttack constants to module level
shamykyzer May 11, 2026
b5dd7b6
refactor: extract MetaAttack EPS_META and DEFAULT_MIA_THRESHOLD to at…
shamykyzer May 14, 2026
34a499d
docs: add behaviour kwarg to MetaAttack example
shamykyzer May 14, 2026
ed088f0
Merge branch 'main' into 428-meta-attack
shamykyzer May 22, 2026
38ca47c
test(meta): cover MetaAttack/report branches to reach 100% patch cove…
ssrhaso May 22, 2026
e69b5ee
test(meta): cover non-finite sub-attack AUC branch in create_meta_report
shamykyzer May 25, 2026
4abeed4
docs: add behaviour kwarg to MetaAttack README example
shamykyzer May 25, 2026
97c07e2
fix(structural): gate per-record results behind report_individual fla…
shamykyzer May 26, 2026
02804f4
Merge branch 'main' into 428-meta-attack
jim-smith May 26, 2026
18 changes: 18 additions & 0 deletions CHANGELOG.md
@@ -3,12 +3,30 @@
## [Unreleased]

Changes:
* Feat: `MetaAttack`: aggregate per-record vulnerability across multiple privacy attacks (LiRA,
QMIA, Structural) into a unified vulnerability DataFrame with within-attack (mean, std,
consistency) and cross-attack (arithmetic/geometric MIA mean, structural flag, total
vulnerability count) aggregation. Supports three operating modes via `behaviour`:
`'run_all'` (fresh execution), `'use_existing_only'` (collate from pre-existing
`report.json` files without re-running — critical for attacks such as LiRA that may
take weeks on large model grids), and `'fill_missing'` (run only attacks not already
present). Outputs `vulnerability_matrix.csv` alongside the standard JSON report.
By default appends the MetaAttack section to an existing `report_dir/report.json`
(set `keep_separate=True` for a standalone file). PDF report includes a bar chart
of records grouped by the number of attacks flagging them. `use_existing_only`
and `fill_missing` scan both the canonical single-file `report_dir/report.json`
(multi-section, as produced when individual attacks append to the same file)
and any subdirectory-per-attack layout. Registered in the attack factory as
`"meta"`.
* Feat: `QMIAAttack`: membership inference attack via quantile regression (Bertran et al.,
NeurIPS 2023, arXiv:2307.03694). Trains a histogram-based quantile regressor
(`HistGradientBoostingRegressor`) on non-member hinge scores to learn per-sample
membership thresholds. A sample is predicted as a member when its observed score
exceeds the predicted threshold at quantile level (1 - alpha). No shadow models or
architecture knowledge required. Registered in the attack factory as `"qmia"`.
* Fix: `StructuralAttack` now respects the `report_individual` flag. Per-record
`record_level_results` and `attack_metrics["individual"]` are only populated when the
flag is set to `True`, matching the behaviour of `LIRAAttack` and `QMIAAttack`.

## Version 1.4.3 (Jan 29, 2026)

23 changes: 23 additions & 0 deletions README.md
@@ -100,6 +100,29 @@ Run the full benchmark comparing QMIA against WorstCase and LiRA:
python examples/sklearn/benchmark_qmia_full.py
```

## MetaAttack: Unified Per-Record Vulnerability Aggregation

`MetaAttack` runs multiple privacy attacks (LiRA, QMIA, Structural) on the same target and aggregates their per-record results into a single vulnerability DataFrame. Three operating modes are supported via the `behaviour` parameter:

* **`'run_all'`** (default) — run every specified attack from scratch.
* **`'use_existing_only'`** — read per-record scores from pre-existing `report.json` files without re-running anything. Useful when expensive attacks such as LiRA have already been run.
* **`'fill_missing'`** — load existing results and run only the attacks not yet present.

```python
from sacroml.attacks.meta_attack import MetaAttack
from sacroml.attacks.target import Target

target = Target(model=model, X_train=X_train, y_train=y_train, X_test=X_test, y_test=y_test)
meta = MetaAttack(
    attacks=[("lira", {}), ("qmia", {}), ("structural", {})],
    behaviour="run_all",  # alternatives: "use_existing_only", "fill_missing"
    output_dir="output_meta",
)
meta.attack(target)
```

The vulnerability matrix is saved as `vulnerability_matrix.csv` in `output_dir`.
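
The three `behaviour` modes can be read as a rule for deciding which attacks actually execute. The sketch below is only an illustration of that rule: `attacks_to_run` is a hypothetical helper, not part of `MetaAttack`, and raising `ValueError` for an unknown mode is an assumption.

```python
def attacks_to_run(requested: list[str], existing: set[str], behaviour: str) -> list[str]:
    """Return the attack names that would be executed (hypothetical helper)."""
    if behaviour == "run_all":
        # Ignore any existing results and run everything from scratch.
        return list(requested)
    if behaviour == "use_existing_only":
        # Collate existing reports only; nothing is re-run.
        return []
    if behaviour == "fill_missing":
        # Run only the attacks with no existing results.
        return [name for name in requested if name not in existing]
    raise ValueError(f"Unknown behaviour: {behaviour!r}")


requested = ["lira", "qmia", "structural"]
existing = {"lira"}  # e.g. an expensive LiRA run finished earlier

for mode in ("run_all", "use_existing_only", "fill_missing"):
    print(mode, "->", attacks_to_run(requested, existing, mode))
```

With an existing LiRA result, `fill_missing` would leave LiRA alone and run only the two cheaper attacks.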

## Documentation

See [API documentation](https://ai-sdc.github.io/SACRO-ML/).
104 changes: 104 additions & 0 deletions examples/sklearn/meta_attack_example.py
@@ -0,0 +1,104 @@
"""Example: run a MetaAttack combining QMIA and structural attacks.

Trains a RandomForest on synthetic data, wraps it in a Target, then
runs MetaAttack to produce a cross-attack vulnerability DataFrame.

Usage::

python examples/sklearn/meta_attack_example.py
"""

import logging

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from sacroml.attacks.meta_attack import MetaAttack
from sacroml.attacks.target import Target

logging.basicConfig(level=logging.INFO)

output_dir = "output_meta_attack"

if __name__ == "__main__":
    # --- Prepare target ---
    X, y = make_classification(
        n_samples=300,
        n_features=10,
        n_informative=5,
        n_classes=2,
        random_state=42,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.4, stratify=y, random_state=42
    )

    model = RandomForestClassifier(n_estimators=100, random_state=42)
    model.fit(X_train, y_train)

    target = Target(
        model=model,
        dataset_name="synthetic",
        X_train=X_train,
        y_train=y_train,
        X_test=X_test,
        y_test=y_test,
        X_train_orig=X_train,
        y_train_orig=y_train,
        X_test_orig=X_test,
        y_test_orig=y_test,
    )
    for idx in range(X.shape[1]):
        target.add_feature(f"feature_{idx}", [idx], "float")

    # --- Run MetaAttack ---
    meta = MetaAttack(
        attacks=[
            ("qmia", {}, 2),  # QMIA with 2 repetitions
            ("structural", {}),  # Structural (single run)
        ],
        behaviour="run_all",  # alternatives: "use_existing_only", "fill_missing"
        mia_threshold=0.5,
        output_dir=output_dir,
    )
    meta.attack(target)

    # --- Inspect results ---
    df = meta.vulnerability_df

    print("\n=== Vulnerability Matrix (first 10 records) ===")
    print(df.head(10).to_string())

    print("\n=== Summary Statistics ===")
    n_train = int(df["is_member"].sum())
    n_test = len(df) - n_train
    print(f"Training records: {n_train}")
    print(f"Test records: {n_test}")

    # MIA vulnerability
    if "qmia_vuln" in df.columns:
        n_qmia = int(df["qmia_vuln"].sum())
        print(f"QMIA vulnerable: {n_qmia}")

    # Structural vulnerability (training records only)
    if "struct_vuln" in df.columns:
        train_df = df[df["is_member"] == 1]
        n_struct = int(train_df["struct_vuln"].sum())
        print(f"Struct vulnerable: {n_struct} (of {n_train} training)")

    # Records vulnerable to all attacks
    max_attacks = int(df["n_vulnerable"].max())
    n_all = int((df["n_vulnerable"] == max_attacks).sum())
    print(f"Vulnerable to all: {n_all} (flagged by {max_attacks} attacks)")

    # Top-10 most vulnerable training records by MIA mean
    if "mia_mean" in df.columns:
        top10 = df[df["is_member"] == 1].nlargest(10, "mia_mean")[
            ["mia_mean", "mia_gmean", "n_vulnerable"]
        ]
        print("\n=== Top 10 Most Vulnerable Training Records ===")
        print(top10.to_string())

    print(f"\nReport saved to: {output_dir}/")
    print(f"CSV saved to: {output_dir}/vulnerability_matrix.csv")
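
The columns inspected by this example (`is_member`, `qmia_vuln`, `struct_vuln`, `mia_mean`, `n_vulnerable`) suggest a two-level aggregation. The sketch below shows one way such a table could be built with pandas. It is an assumption-laden illustration rather than the `MetaAttack` implementation: the `within_attack` helper, the `*_mean`/`*_std`/`*_consistency` column names, the definition of consistency as the fraction of repetitions above the threshold, and the hand-written scores are all hypothetical.

```python
import numpy as np
import pandas as pd

MIA_THRESHOLD = 0.5  # same value the example passes as mia_threshold


def within_attack(scores: np.ndarray, name: str, threshold: float) -> pd.DataFrame:
    """Aggregate repeated runs of one attack (rows: records, columns: repetitions)."""
    mean = scores.mean(axis=1)
    return pd.DataFrame(
        {
            f"{name}_mean": mean,
            f"{name}_std": scores.std(axis=1),
            # Assumed definition: share of repetitions that flag the record.
            f"{name}_consistency": (scores > threshold).mean(axis=1),
            f"{name}_vuln": (mean > threshold).astype(int),
        }
    )


# Hand-written per-record scores for four records (illustrative values only).
qmia_runs = np.array([[0.9, 0.7], [0.2, 0.4], [0.6, 0.3], [0.1, 0.1]])
lira_runs = np.array([[0.8], [0.3], [0.7], [0.6]])

df = pd.concat(
    [
        pd.DataFrame({"is_member": [1, 1, 1, 0]}),
        within_attack(qmia_runs, "qmia", MIA_THRESHOLD),
        within_attack(lira_runs, "lira", MIA_THRESHOLD),
        pd.DataFrame({"struct_vuln": [1, 0, 0, 0]}),
    ],
    axis=1,
)

# Cross-attack level: average the per-attack means and count the flags.
df["mia_mean"] = df[["qmia_mean", "lira_mean"]].mean(axis=1)
df["n_vulnerable"] = df[["qmia_vuln", "lira_vuln", "struct_vuln"]].sum(axis=1)

print(df[["is_member", "mia_mean", "n_vulnerable"]])
```

Keeping the per-attack columns next to the cross-attack summary lets a reviewer see which attack drove a high `n_vulnerable` count for a given record.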
34 changes: 34 additions & 0 deletions sacroml/attacks/constants.py
@@ -0,0 +1,34 @@
"""Shared numerical and default-value constants for the attacks package.

Centralising these here avoids duplication across attack modules and makes
the *why* of each magic number visible at a glance.

Notes
-----
A separate :data:`sacroml.attacks.utils.EPS` (``1e-16``) and an identical
``EPS`` in :mod:`sacroml.attacks.likelihood_attack` are kept independently
for now because they predate this module and migrating them is a wider
refactor. A follow-up PR can converge those onto a single constant defined
here once the call sites have been audited.
"""

from __future__ import annotations

EPS_META: float = 1e-10
"""Small positive offset added before ``log()`` in geometric-mean aggregation.

Looser than :data:`sacroml.attacks.utils.EPS` (``1e-16``) because the
geometric mean of MIA scores in :class:`~sacroml.attacks.meta_attack.MetaAttack`
does not need the same precision as normal-distribution CDF/PDF
calculations and benefits from a value comfortably above floating-point
denormals.
"""

DEFAULT_MIA_THRESHOLD: float = 0.5
"""Default cutoff above which a per-record membership-inference score is
flagged as vulnerable.

Used as the ``mia_threshold`` default for
:class:`~sacroml.attacks.meta_attack.MetaAttack` so the value can be
referenced symbolically from tests, examples, and documentation.
"""
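
To make the role of `EPS_META` concrete, here is a minimal sketch of a log-space geometric mean with the offset applied before `log()`. The `geometric_mean` helper and the sample scores are hypothetical; only the constant's value comes from the file above, and the real aggregation in `MetaAttack` may differ in detail.

```python
import numpy as np

EPS_META = 1e-10  # value taken from constants.py


def geometric_mean(scores: np.ndarray, eps: float = EPS_META) -> np.ndarray:
    """Row-wise geometric mean computed in log space (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    # Without eps, a single zero score would give log(0) = -inf.
    return np.exp(np.mean(np.log(scores + eps), axis=1))


# Rows are records, columns are per-attack MIA scores (illustrative values).
scores = np.array([[0.9, 0.9], [0.9, 0.0], [0.5, 0.5]])
gmean = geometric_mean(scores)
amean = scores.mean(axis=1)

for g, a in zip(gmean, amean):
    print(f"geometric={g:.6f}  arithmetic={a:.6f}")
```

The second record shows the design trade-off: one attack scoring zero pulls the geometric mean close to zero while the arithmetic mean stays at 0.45, so the two columns answer different questions about cross-attack agreement.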
2 changes: 2 additions & 0 deletions sacroml/attacks/factory.py
@@ -7,6 +7,7 @@
from sacroml.attacks.attack import Attack
from sacroml.attacks.attribute_attack import AttributeAttack
from sacroml.attacks.likelihood_attack import LIRAAttack
from sacroml.attacks.meta_attack import MetaAttack
from sacroml.attacks.qmia_attack import QMIAAttack
from sacroml.attacks.structural_attack import StructuralAttack
from sacroml.attacks.target import Target
@@ -19,6 +20,7 @@
registry: dict[str, type[Attack]] = {
"attribute": AttributeAttack,
"lira": LIRAAttack,
"meta": MetaAttack,
"qmia": QMIAAttack,
"structural": StructuralAttack,
"worstcase": WorstCaseAttack,