
feat: select on PR-AUC, cross-validate with a custom splitter #15

Merged: ancongui merged 1 commit into main from feat/cv-and-prauc-selection on Jun 25, 2026
Conversation

@ancongui

Contributor

What & why

Closes two gaps in AutoML's model selection, using TDD on real data (breast_cancer):

1. PR-AUC is now a selectable CV metric

average_precision (PR-AUC) was already reported on holdout, but it was missing from the CV scoring map, so fit(metric="average_precision") silently fell back to selecting the winner on accuracy (scoring_name(...) → default). On imbalanced binary problems, that can pick the wrong model.

Now "average_precision" maps to scikit-learn's average_precision scorer (greater-is-better), so the leaderboard, the refit winner, and result.cv_scoring all reflect PR-AUC:

result = AutoML().fit(train, metric="average_precision")   # selected by PR-AUC, not accuracy
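To illustrate why the selection metric matters, here is a minimal sketch that scores one model under both scorers with plain scikit-learn. It does not use AutoML itself; the pipeline and the choice of LogisticRegression are assumptions made for the example, and only the scorer names "average_precision" and "accuracy" come from scikit-learn.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Illustrative model only; any classifier exposing predict_proba or
# decision_function can be scored with average_precision.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# "average_precision" is a built-in scikit-learn scorer name (greater is better).
pr_auc = cross_val_score(model, X, y, cv=5, scoring="average_precision")
acc = cross_val_score(model, X, y, cv=5, scoring="accuracy")

print(f"PR-AUC per fold:   {pr_auc.round(3)}")
print(f"accuracy per fold: {acc.round(3)}")
```

The two scorers rank candidates by different criteria, so a leaderboard built on one can order models differently from a leaderboard built on the other.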

2. AutoML(cv=...) accepts a scikit-learn splitter

cv was typed as int only, yet it is passed straight to cross_val_score, which already accepts any splitter. The type is now broadened to int | Any, and the cross-validation strategies that avoid silent leakage are documented and tested:

AutoML(cv=TimeSeriesSplit(n_splits=5))   # temporal: forward-chaining, no future leakage
AutoML(cv=StratifiedKFold(5, shuffle=True))
AutoML(cv=GroupKFold(n_splits=5))        # grouped: no group appears in both train & test
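The leakage guarantees behind these three splitters can be checked directly with scikit-learn. The sketch below uses synthetic data (the arrays and group labels are made up for the example) and calls cross_val_score rather than AutoML. Note that GroupKFold needs group labels, which cross_val_score takes through its groups parameter; whether AutoML.fit forwards groups is not shown in this PR and is not assumed here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import (
    GroupKFold,
    StratifiedKFold,
    TimeSeriesSplit,
    cross_val_score,
)

# Synthetic data: 20 ordered rows, balanced labels, 5 groups of 4 rows each.
X = np.arange(20, dtype=float).reshape(-1, 1)
y = np.array([0, 1] * 10)
groups = np.repeat(np.arange(5), 4)

# TimeSeriesSplit: every training index precedes every test index.
for train_idx, test_idx in TimeSeriesSplit(n_splits=4).split(X):
    assert train_idx.max() < test_idx.min()

# GroupKFold: a group never appears on both sides of a split.
for train_idx, test_idx in GroupKFold(n_splits=5).split(X, y, groups=groups):
    assert set(groups[train_idx]).isdisjoint(groups[test_idx])

# StratifiedKFold: each test fold keeps the overall class balance.
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for _, test_idx in skf.split(X, y):
    assert y[test_idx].mean() == 0.5

# Any of these splitter objects can be passed as cv to cross_val_score.
scores = cross_val_score(
    LogisticRegression(), X, y, cv=GroupKFold(n_splits=5), groups=groups
)
```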

Tests (TDD)

  • test_pr_auc_is_a_selectable_cv_metric, test_automl_selects_on_pr_auc: RED→GREEN (before the fix, the assertion failed as 'accuracy' == 'average_precision').
  • test_automl_accepts_a_cv_splitter, test_automl_accepts_a_time_series_splitter — lock the now-documented cv-splitter contract.
  • No regressions: evaluation + automl + ensemble + calibration suites green locally; ruff format/check + pyright(src) clean.
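A minimal sketch of the kind of check the PR-AUC tests perform, assuming a lookup-with-default scoring map. CV_SCORING, DEFAULT_SCORING, and scoring_name_sketch are hypothetical stand-ins and not the project's actual code; only get_scorer is a real scikit-learn call.

```python
from sklearn.metrics import get_scorer

# Hypothetical metric -> scikit-learn scorer-name map; the real map may differ.
CV_SCORING = {
    "accuracy": "accuracy",
    "roc_auc": "roc_auc",
    "average_precision": "average_precision",
}
DEFAULT_SCORING = "accuracy"


def scoring_name_sketch(metric: str) -> str:
    """Look up the CV scorer name, falling back to the default when unmapped."""
    return CV_SCORING.get(metric, DEFAULT_SCORING)


def check_average_precision_is_mapped() -> None:
    name = scoring_name_sketch("average_precision")
    # With the entry missing from the map, this comparison would see "accuracy".
    assert name == "average_precision"
    # get_scorer raises ValueError for names scikit-learn does not know.
    get_scorer(name)


def check_unmapped_metric_falls_back() -> None:
    # An unmapped name silently resolves to the default: the failure mode fixed here.
    assert scoring_name_sketch("not_a_metric") == DEFAULT_SCORING


check_average_precision_is_mapped()
check_unmapped_metric_falls_back()
```

A lookup with a default keeps unknown metric names from raising, but it also hides missing entries, which is why a test that pins the resolved scorer name is useful.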

Docs

docs/automl.md: new "cv accepts a splitter" and "select on PR-AUC for imbalanced binary" tips; the binary metric panel now lists average_precision + brier_score.

Two robust-selection gaps in AutoML:

- average_precision (PR-AUC) was reported on holdout but was NOT a selectable
  CV metric: fit(metric="average_precision") silently fell back to selecting on
  accuracy (scoring_name -> default). On imbalanced binary problems that picks
  the wrong winner. Now "average_precision" maps to sklearn's average_precision
  scorer, so the leaderboard, refit winner, and result.cv_scoring all reflect it.
- AutoML(cv=...) is typed int only, yet cross_val_score already accepts a
  splitter. Broaden to `int | Any` and document/test passing TimeSeriesSplit
  (forward-chaining, no future leakage), StratifiedKFold, GroupKFold.

TDD: PR-AUC tests RED->GREEN; splitter tests lock the now-documented contract.
Docs: automl.md gains "cv accepts a splitter" + "select on PR-AUC" tips and the
binary panel now lists average_precision + brier_score.
@ancongui merged commit e3dda4a into main on Jun 25, 2026
4 checks passed
@ancongui deleted the feat/cv-and-prauc-selection branch on June 25, 2026 19:31