feat: new instance-based attack for data leakage in SVM/kNN models #431

Open
shamykyzer wants to merge 20 commits into main from new-attack-model

@shamykyzer shamykyzer commented Mar 27, 2026

Contributor
  • New InstanceBasedAttack detects training data leakage in models that store raw training instances: support vectors (SVC, NuSVC, OneClassSVM) and stored neighbours (KNeighborsClassifier, KNeighborsRegressor)
  • Compares stored instances against the training data via np.allclose and reports the first ten matches with feature previews, plus storage fraction and match fraction metrics
  • Unwraps sklearn.Pipeline so the comparison runs in the final estimator's feature space
  • Detects differentially private model variants and surfaces a mitigation note
  • Registered in the factory under "instance_based"
  • Match Fraction glossary updated to "A non-zero match fraction confirms data leakage" per Jim's review
  • Module-level constants extracted: INSTANCE_MATCH_ATOL = 1e-8, N_EXAMPLES = 10, N_FEATURE_PREVIEW = 10
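
For readers who want a concrete picture of the comparison described in the list above, here is a minimal sketch. It is not the code in this PR. The helper `find_matches` is hypothetical, `rtol=0.0` is an assumption (the description only states the absolute tolerance), and the two fraction definitions are one plausible reading of the metric names rather than the definitions used in sacroml. `np.isclose` is used row-wise, which gives the same element-wise test that `np.allclose` applies.

```python
import numpy as np

INSTANCE_MATCH_ATOL = 1e-8  # value quoted in the PR description


def find_matches(stored, X_train, atol=INSTANCE_MATCH_ATOL):
    """Hypothetical helper: return (stored_idx, train_idx) pairs that agree within atol."""
    pairs = []
    for i, row in enumerate(stored):
        # Assumption: rtol=0 so that only the absolute tolerance applies.
        close = np.all(np.isclose(X_train, row, rtol=0.0, atol=atol), axis=1)
        for j in np.flatnonzero(close):
            pairs.append((i, int(j)))
    return pairs


X_train = np.array([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0], [6.0, 7.0]])
# Stand-in for what a fitted model stores, e.g. SVC.support_vectors_.
stored = np.array([[2.0, 3.0], [6.0, 7.0 + 1e-10], [9.0, 9.0]])

pairs = find_matches(stored, X_train)

# Assumed definitions, for illustration only:
# match fraction   = stored instances that match a training record / stored instances
# storage fraction = stored instances / training records
matched_stored = {i for i, _ in pairs}
match_fraction = len(matched_stored) / len(stored)
storage_fraction = len(stored) / len(X_train)
```

In this toy data, two of the three stored rows match training records (one exactly, one within the tolerance), so the assumed match fraction is non-zero.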

Closes [New Feature Request] New Attack: Model contains training data #59
Closes #454

@shamykyzer shamykyzer self-assigned this Mar 27, 2026
@shamykyzer shamykyzer requested review from jim-smith and rpreen March 27, 2026 02:45
@shamykyzer shamykyzer marked this pull request as ready for review March 27, 2026 02:54
@codecov

codecov Bot commented Mar 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.66%. Comparing base (7176627) to head (a0c25cd).

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #431      +/-   ##
==========================================
+ Coverage   99.65%   99.66%   +0.01%     
==========================================
  Files          27       28       +1     
  Lines        3439     3632     +193     
==========================================
+ Hits         3427     3620     +193     
  Misses         12       12              

Comment thread sacroml/attacks/instance_based_attack.py Outdated
Comment thread sacroml/attacks/instance_based_attack.py Outdated
Comment thread sacroml/attacks/instance_based_attack.py Outdated

@jim-smith jim-smith left a comment

Contributor

@shamykyzer Just a couple of minor changes to make, please.
If you want to raise 'move dealing with pipelines into utils.py' as a separate issue and leave that code in here for now, that is fine.
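
For context on the pipeline handling mentioned in this comment, the sketch below shows one way to unwrap a scikit-learn `Pipeline` so that stored instances are compared in the final estimator's feature space. It is an illustration only: `unwrap_pipeline` is a hypothetical name, not the function in this PR or in utils.py, and it assumes every step before the last one is a fitted transformer.

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def unwrap_pipeline(model, X):
    """Hypothetical helper: return (final estimator, X in that estimator's feature space)."""
    if not isinstance(model, Pipeline):
        return model, np.asarray(X)
    final_estimator = model.steps[-1][1]
    if len(model.steps) == 1:
        return final_estimator, np.asarray(X)
    # model[:-1] is a Pipeline holding every step except the last one.
    return final_estimator, model[:-1].transform(X)


rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = (X[:, 0] > 0).astype(int)

pipe = Pipeline([("scale", StandardScaler()), ("svc", SVC(kernel="linear"))]).fit(X, y)
estimator, X_space = unwrap_pipeline(pipe, X)

# Support vectors live in the scaled space, so compare against X_space, not X.
sv = estimator.support_vectors_
hits = [
    bool(np.any(np.all(np.isclose(X_space, v, rtol=0.0, atol=1e-8), axis=1)))
    for v in sv
]
```

Comparing `sv` against the raw `X` instead of `X_space` would generally find no matches, which is why the unwrapping step matters.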

@shamykyzer shamykyzer requested a review from jim-smith May 18, 2026 14:06
@shamykyzer

Contributor Author

Hi @jim-smith, could you please review this PR so far?

Thanks a lot.

Comment thread sacroml/attacks/instance_based_attack.py
Comment thread sacroml/attacks/instance_based_attack.py Outdated
Comment thread sacroml/attacks/instance_based_attack.py
Comment thread sacroml/attacks/instance_based_attack.py
Comment thread sacroml/attacks/instance_based_attack.py
Comment thread sacroml/attacks/instance_based_attack.py
Comment thread sacroml/attacks/instance_based_attack.py Outdated
shamykyzer and others added 4 commits May 26, 2026 16:58
- add report_individual option, gated like StructuralAttack so the
  per-record block only appears under the 'individual' key when set
- record all matched instances (n_examples now limits PDF display only)
- replace bespoke example_matches with an InstanceBasedRecordLevelResults
  dataclass of parallel lists, consistent with other attacks
- give InstanceBasedAttackResults field defaults to trim the
  graceful-degradation construction sites
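
A minimal sketch of the shape this commit message describes: a record-level dataclass of parallel lists, result fields with defaults, and an 'individual' block that only appears when `report_individual` is set. All class, field, and function names below are hypothetical stand-ins, not the actual InstanceBasedRecordLevelResults or InstanceBasedAttackResults definitions.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class RecordLevelSketch:
    """Parallel lists: position k in each list describes the same match."""

    stored_index: list = field(default_factory=list)
    train_index: list = field(default_factory=list)


@dataclass
class AttackResultsSketch:
    """Defaults let a caller build an 'empty' result without listing every field."""

    n_stored: int = 0
    n_matches: int = 0
    match_fraction: float = 0.0
    record_level: RecordLevelSketch = field(default_factory=RecordLevelSketch)


def build_report(results, report_individual=False):
    """Summary metrics are always reported; per-record data only on request."""
    report = {
        "n_stored": results.n_stored,
        "n_matches": results.n_matches,
        "match_fraction": results.match_fraction,
    }
    if report_individual:
        report["individual"] = asdict(results.record_level)
    return report


empty = AttackResultsSketch()  # e.g. a degraded run where nothing could be compared
full = AttackResultsSketch(
    n_stored=3,
    n_matches=2,
    match_fraction=2 / 3,
    record_level=RecordLevelSketch(stored_index=[0, 1], train_index=[1, 3]),
)

summary_only = build_report(full)
with_records = build_report(full, report_individual=True)
```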
@shamykyzer shamykyzer requested a review from jim-smith May 29, 2026 13:34

@jim-smith jim-smith left a comment

Contributor

Still needs changes. I think the way it is implemented is possibly more efficient, and the overall question, "does this model contain training instances?", is answered correctly.

However, the way the record-level results are presented ("is this stored instance present in the training set?") is inconsistent with the way they are presented for other attacks (which would be "is this training record stored in the model?"). A quick change would be to create a new field,
individual_risk: np.ndarray = np.zeros(X_train.shape[0], dtype=int), and then use a for loop to set to 1 (True) the index of each training record you have stored in the individual-level results.
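
The suggestion above can be sketched as follows. This is an illustration, not code from the PR: `matched_train_indices` is a hypothetical stand-in for the training-record indices already held in the individual-level results.

```python
import numpy as np

X_train = np.arange(12, dtype=float).reshape(6, 2)

# Hypothetical: training-record indices found among the stored instances.
# A record may appear more than once if several stored rows match it.
matched_train_indices = [1, 4, 4]

# One entry per training record: 1 if the model stores it, 0 otherwise.
individual_risk = np.zeros(X_train.shape[0], dtype=int)
for idx in matched_train_indices:
    individual_risk[idx] = 1
```

Indexing by training record gives one flag per row of `X_train`, which matches the "is this training record stored in the model?" framing.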

Comment thread sacroml/attacks/instance_based_attack.py Outdated
Comment thread tests/attacks/test_instance_based_attack.py Outdated
Comment thread sacroml/attacks/instance_based_attack.py Outdated
@JessUWE JessUWE requested a review from jim-smith June 17, 2026 12:06
Development

Successfully merging this pull request may close these issues.

  • chore: define shared atol semantics across attacks
  • [New Feature Request] New Attack: Model contains training data

3 participants