Development
Rana Faraz edited this page Jun 23, 2026 · 1 revision
git clone https://github.com/ranafaraz/EnsembleKit.git
cd EnsembleKit
python -m venv .venv
# Linux/macOS:
source .venv/bin/activate
# Windows:
.venv\Scripts\activate
pip install -e ".[dev]"
pytest -q  # 76 tests, all should pass

pip install -e ".[sklearn]"  # enables scikit-learn AUROC cross-check tests

# Full table + RESULTS.md
python -m evals.harness
# Dissociation gate (used in CI)
python -m evals.gate
# Interactive CLI
ensemblekit compare --regime het_competence
ensemblekit compare --regime corrupted
ensemblekit diversity
ensemblekit regimes

EnsembleKit/
  ensemblekit/
    synthesis/    -- Bayes label generator, base learner factory (z_k = a_k * s + noise)
    combiners/    -- average.py, weighted.py, robust.py, full.py, single.py
    regimes/      -- homogeneous.py, het_competence.py, corrupted.py
    eval/         -- auroc.py, gate.py, diversity.py
    cli.py        -- ensemblekit CLI entry point
  evals/
    harness.py    -- writes evals/RESULTS.md
    gate.py       -- asserts the 2x2 dissociation
  tests/          -- 76 pytest tests
  docs/           -- ARCHITECTURE.md, DECISIONS.md, demo.gif
  .env.example
  Dockerfile
  pyproject.toml
- Create ensemblekit/combiners/my_combiner.py implementing combine(log_odds: np.ndarray, labels: np.ndarray, rng) -> np.ndarray, which takes a (K, N) array of per-learner log-odds and returns an (N,) combined score.
- Register the combiner in ensemblekit/combiners/__init__.py under a string key (e.g., "my_combiner").
- Add tests in tests/test_combiners.py covering at least: output shape, that a perfect learner produces AUROC ~1.0 in the homogeneous regime, and that the combiner is deterministic given the same RNG.
- Run ensemblekit regimes to check it appears in the table with a non-degenerate AUROC.
- Add the key to the ENSEMBLEKIT_COMBINER accepted values in Configuration.
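
The steps above can be sketched as follows. This is a minimal illustration, not the project's own code: the median rule, the COMBINERS dictionary, and the auroc helper are all hypothetical stand-ins (the real registry in ensemblekit/combiners/__init__.py and the real AUROC code in ensemblekit/eval/auroc.py may look different). Only the combine signature and the (K, N) to (N,) shape contract come from the description above.

```python
import numpy as np


def combine(log_odds: np.ndarray, labels: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Hypothetical combiner: per-sample median of the K learner log-odds.

    labels and rng are accepted only to match the documented signature;
    this particular rule uses neither, so it is trivially deterministic.
    """
    log_odds = np.asarray(log_odds, dtype=float)
    if log_odds.ndim != 2:
        raise ValueError(f"expected a (K, N) array, got shape {log_odds.shape}")
    return np.median(log_odds, axis=0)  # collapse the learner axis -> (N,)


# Assumed registry shape: a plain dict from string key to callable.
COMBINERS = {"my_combiner": combine}


def auroc(scores: np.ndarray, y: np.ndarray) -> float:
    """Rank-based AUROC (Mann-Whitney U). Stand-in helper for the checks below;
    it does not average tied ranks, which is fine for continuous scores."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores), dtype=float)
    ranks[order] = np.arange(1, len(scores) + 1)
    n_pos = int(y.sum())
    n_neg = len(y) - n_pos
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


# The three checks the test list asks for, on synthetic data.
rng = np.random.default_rng(0)
y = np.arange(200) % 2                      # balanced labels
s = np.where(y == 1, 3.0, -3.0)             # perfectly separating signal
log_odds = s + 0.1 * rng.standard_normal((5, 200))

scores = combine(log_odds, y, np.random.default_rng(1))
assert scores.shape == (200,)                                   # output shape
assert auroc(scores, y) > 0.99                                  # near-perfect learners
assert np.array_equal(scores, combine(log_odds, y, np.random.default_rng(1)))  # determinism
```

In a real test module these assertions would live in tests/test_combiners.py as separate pytest functions and would use the project's own regime builders rather than the hand-made signal shown here.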
The learner generation model is z_k = a_k * s + noise_k. To change it:
- Edit or subclass the LearnerFactory in ensemblekit/synthesis/learners.py.
- The factory must expose generate(s: np.ndarray, rng) -> np.ndarray, returning (K, N) log-odds.
- If you add a new regime, create ensemblekit/regimes/my_regime.py implementing build_learners(s, y, rng) -> (log_odds, competence_hint), where competence_hint is the signal available to the competence estimator.
- Register it in ensemblekit/regimes/__init__.py and add a description to regimes/descriptions.py.
- Add tests that check that the regime does not change the Bayes label y, that the expected combiner fails, and that the fixed-axis combiner passes.
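
As a rough sketch of the two interfaces described above, the following shows a factory that implements z_k = a_k * s + noise_k and a regime function built on top of it. The class name LinearLearnerFactory, its constructor arguments, the competence values, and the choice to return the a_k vector as competence_hint are all assumptions made for illustration. Only the generate and build_learners signatures and the (K, N) output shape are taken from the text.

```python
import numpy as np


class LinearLearnerFactory:
    """Hypothetical factory for z_k = a_k * s + noise_k with Gaussian noise."""

    def __init__(self, competences, noise_std: float = 1.0):
        self.a = np.asarray(competences, dtype=float)   # a_k, shape (K,)
        self.noise_std = float(noise_std)

    def generate(self, s: np.ndarray, rng: np.random.Generator) -> np.ndarray:
        """Return (K, N) log-odds for a length-N signal s."""
        s = np.asarray(s, dtype=float)
        noise = self.noise_std * rng.standard_normal((self.a.size, s.size))
        # Broadcast (K, 1) * (1, N) so every learner scales the same signal.
        return self.a[:, None] * s[None, :] + noise


def build_learners(s: np.ndarray, y: np.ndarray, rng: np.random.Generator):
    """Hypothetical regime with unequal competences.

    y is read-only here: a regime must not alter the Bayes labels.
    The returned hint is assumed to be the true a_k vector.
    """
    factory = LinearLearnerFactory(competences=[2.0, 1.0, 0.5, 0.1])
    log_odds = factory.generate(s, rng)
    competence_hint = factory.a.copy()
    return log_odds, competence_hint


# Quick check of the shape contract and of label preservation.
rng = np.random.default_rng(0)
s = np.linspace(-2.0, 2.0, 100)
y = (s > 0).astype(int)
y_before = y.copy()

log_odds, hint = build_learners(s, y, rng)
assert log_odds.shape == (4, 100)
assert hint.shape == (4,)
assert np.array_equal(y, y_before)   # the regime left the labels untouched
```

Drawing all noise from the generator passed in, rather than creating one inside the factory, keeps the regime reproducible from a single seed chosen by the caller.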
GitHub Actions runs pytest -q and python -m evals.gate on Python 3.10, 3.11, and 3.12.
No secrets are required -- the benchmark is fully offline.
- Format with black, lint with ruff (configured in pyproject.toml).
- Type annotations are encouraged but not enforced.
- All random state must flow through np.random.default_rng(seed) for reproducibility -- never use np.random.seed() or module-level state.
- Keep combiners stateless: they receive all the data they need as arguments and return a score array. Side effects (logging, plotting) belong in the harness, not the combiner.
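
The random-state rule can be illustrated with a short sketch. The function names noisy_scores and run_once are hypothetical and exist only for this example. The point is that a Generator is created once from a seed at the entry point and then passed explicitly to everything that needs randomness.

```python
import numpy as np


def noisy_scores(s: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Hypothetical helper: draws noise only from the generator it is handed."""
    return s + rng.standard_normal(s.shape)


def run_once(seed: int) -> np.ndarray:
    """Hypothetical entry point: the only place a generator is constructed."""
    rng = np.random.default_rng(seed)
    s = np.linspace(-1.0, 1.0, 5)
    return noisy_scores(s, rng)


# Same seed gives identical output; a different seed gives a different draw.
assert np.array_equal(run_once(42), run_once(42))
assert not np.array_equal(run_once(42), run_once(43))
```

Because no function touches the legacy global state behind np.random.seed(), two runs cannot interfere with each other, and a test can reproduce any result from its seed alone.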