
docs: advanced-AutoML sample + full doc/PDF coverage of the new features #16

Merged
ancongui merged 1 commit into main from docs/advanced-sample-and-feature-docs on Jun 25, 2026

@ancongui

Contributor

What & why

The latest features (explainability, calibration, stacking ensembles, PR-AUC selection, CV strategies, persisted audit trail) shipped with reference docs but were missing from the samples, the landing pages, and the PDF. This PR closes those gaps so every feature is demonstrated, discoverable, and in the guide.

New runnable sample (test-covered, real data)

samples/advanced_automl.py is one cohesive sample on the real breast_cancer dataset that exercises all the new capabilities:

  • StratifiedKFold cross-validation strategy
  • PR-AUC selection (metric="average_precision")
  • a calibrated stacking ensemble (calibrate=True, ensemble=True) → Brier score
  • global explainability (result.explain(...))
  • a persisted JSONL audit trail of every GenAI gate decision
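
For readers unfamiliar with the two scores named in this list, here is a minimal pure-Python sketch of what they measure. It is not taken from the sample or the library; the function names `average_precision` and `brier_score` are illustrative, and the average-precision version assumes binary 0/1 labels with no tied scores.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise PR-AUC: mean of the precision at each rank where a positive appears."""
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("average precision needs at least one positive label")
    # Rank examples from highest to lowest score (ties are not specially handled).
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_pos


def brier_score(y_true: Sequence[int], probs: Sequence[float]) -> float:
    """Mean squared gap between predicted probability and the 0/1 outcome (lower is better)."""
    if len(y_true) != len(probs) or not y_true:
        raise ValueError("y_true and probs must be non-empty and the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, y_true)) / len(y_true)


# Illustrative toy values, not results from the sample.
labels = [1, 0, 1, 0]
predicted = [0.9, 0.8, 0.7, 0.1]
ap = average_precision(labels, predicted)
brier = brier_score(labels, predicted)
```

Average precision rewards ranking positives ahead of negatives, which is why it suits model selection on imbalanced data. The Brier score instead penalises probabilities that sit far from the observed outcome, which is what calibration aims to improve.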

It auto-detects an LLM key: it uses the deterministic StaticFeatureProposer offline and the real AgentFeatureProposer when a key is present, so the same sample runs in CI and against a live model.
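
The key auto-detection described above can be sketched roughly as follows. This is an assumption-laden illustration, not the sample's actual code: the PR does not say which environment variables are checked, so `ANTHROPIC_API_KEY` is a guess based on the Anthropic key mentioned under testing. The helper `pick_proposer_name` is hypothetical and only returns the class name rather than constructing either proposer.

```python
import os
from typing import Mapping, Optional, Sequence

# Assumption: the variable name(s) the sample inspects are not stated in the PR.
ASSUMED_KEY_VARS: Sequence[str] = ("ANTHROPIC_API_KEY",)


def pick_proposer_name(
    env: Optional[Mapping[str, str]] = None,
    key_vars: Sequence[str] = ASSUMED_KEY_VARS,
) -> str:
    """Return which proposer class to use, based on whether an LLM key is set."""
    env = os.environ if env is None else env
    # A key counts as present only if it is set and not blank.
    has_key = any(env.get(name, "").strip() for name in key_vars)
    return "AgentFeatureProposer" if has_key else "StaticFeatureProposer"
```

Passing the environment as an argument keeps the decision testable without touching the real process environment, so a CI run without credentials takes the deterministic path.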

Tested for real (incl. a live LLM)

  • tests/samples/test_advanced_automl.py: offline smoke test (CI) + an @integration test against a real LLM.
  • All 3 real-LLM integration tests pass with a live Anthropic key (this sample + the existing showcase: feature engineering & agentic loop). The credential was used only as a transient, out-of-repo env var and never written to the repo or any output.

Docs + PDF

  • README.md, docs/README.md: Explainability added to the doc tables; AutoML rows note calibration/ensembling/PR-AUC/CV.
  • docs/index.md: enriched AutoML card + a new "Explainable & trustworthy" pillar.
  • docs/samples.md: five samples now, with the advanced sample documented (offline + real-LLM run lines).
  • docs/brief/firefly-datascience-complete-guide.pdf: regenerated (62pp). Ch.8 gains calibration/ensembling/PR-AUC/CV + explainability; Ch.14 the hands-on advanced selection; Ch.15 the persisted audit trail; glossary terms added. Rendered pages were visually verified.

Gates

ruff format --check + ruff check clean; no src changes this PR (docs/sample/test/PDF only), so existing suites are unaffected; the new offline sample test passes.

The latest features (explainability, calibration, stacking ensembles, PR-AUC
selection, CV strategies, persisted audit trail) shipped with reference docs but
were missing from the samples, the landing pages, and the PDF. This closes those
gaps so every feature is demonstrated, discoverable, and in the guide.

- samples/advanced_automl.py (NEW): one runnable, test-covered sample on the real
  breast_cancer data: StratifiedKFold CV, PR-AUC selection, a calibrated stacking
  ensemble, global explainability, and a persisted JSONL audit trail of every gate
  decision. Auto-detects an LLM key: deterministic StaticFeatureProposer offline,
  real AgentFeatureProposer when a key is present (so the same sample runs in CI
  and against a live model).
- tests/samples/test_advanced_automl.py (NEW): offline smoke test + an
  @integration test that runs the sample against a real LLM. All three real-LLM
  integration tests (this + the showcase) pass with a live key.
- README.md, docs/README.md: add Explainability to the doc tables; note
  calibration/ensembling/PR-AUC/CV on the AutoML rows.
- docs/index.md: enrich the AutoML card; add an "Explainable & trustworthy" pillar.
- docs/samples.md: five samples now, with the advanced sample documented.
- docs/brief/firefly-datascience-complete-guide.pdf: regenerated (62pp). Ch.8
  gains calibration/ensembling/PR-AUC/CV + explainability; Ch.14 the hands-on
  advanced selection; Ch.15 the persisted audit trail; glossary terms added.
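
As background on the "persisted JSONL audit trail" mentioned in the commit message, a JSONL file stores one JSON object per line, so decisions can be appended as they happen and replayed later. The sketch below uses only the standard library. The helper names and record fields (`gate`, `accepted`) are hypothetical and do not reflect the library's real schema.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def append_decision(path: Path, decision: Dict[str, Any]) -> None:
    """Append one decision as a single JSON line (append-only, so earlier lines are never rewritten)."""
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(decision, sort_keys=True) + "\n")


def read_decisions(path: Path) -> List[Dict[str, Any]]:
    """Load every recorded decision, skipping blank lines."""
    with path.open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

Each line parses independently, so a truncated final line can be detected and handled without losing the earlier records.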
@ancongui ancongui merged commit aecee5a into main Jun 25, 2026
4 checks passed
@ancongui ancongui deleted the docs/advanced-sample-and-feature-docs branch June 25, 2026 20:11