
feat: Add Adaptive Drift Scoring #104

Description

@ReginaldErzoah

Summary

Add adaptive drift scoring that adjusts how drift scores are interpreted based on dataset characteristics such as size and noise.


Motivation

A single fixed drift scoring scheme may not work equally well across all datasets.

For example:

  • small datasets may need more cautious scoring
  • noisy datasets may need less aggressive scoring
  • stable datasets may require stricter scoring

Adaptive drift scoring would let Dift report drift results that take these dataset characteristics into account.


Proposed Improvements

  • Use dataset size and distribution characteristics to adjust drift scoring
  • Improve score sensitivity for different dataset types
  • Preserve existing scoring behavior as the baseline
  • Add adaptive scoring metadata to reports
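
One possible shape for these rules, as a minimal sketch rather than a proposed final design. It assumes the existing drift score lies in [0, 1] and that a non-negative noise estimate is already available; `AdaptiveScore`, `adaptive_drift_score`, `shrink_rows`, and the specific factors are all hypothetical names and values chosen for illustration, not existing Dift APIs.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AdaptiveScore:
    """Hypothetical result type; field names are illustrative only."""
    base_score: float      # existing fixed score, kept unchanged as the baseline
    adaptive_score: float  # base score after the size and noise adjustments
    size_factor: float     # below 1 for small datasets (more cautious)
    noise_factor: float    # below 1 for noisy data, above 1 for stable data


def _clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def adaptive_drift_score(
    base_score: float,
    n_rows: int,
    noise: float,
    shrink_rows: int = 50,  # assumed constant: rows at which the score is halved
) -> AdaptiveScore:
    """Adjust a fixed drift score in [0, 1] by dataset size and noise."""
    if n_rows < 0 or noise < 0 or shrink_rows <= 0:
        raise ValueError("n_rows and noise must be >= 0, shrink_rows must be > 0")

    # Small datasets: shrink the score toward 0; the factor approaches 1 as rows grow.
    size_factor = n_rows / (n_rows + shrink_rows)

    # Stable data (noise near 0) is scored more strictly, noisy data less aggressively.
    # noise == 0.5 is treated as neutral here, which is an arbitrary assumption.
    noise_factor = _clamp(1.5 - noise, 0.5, 1.5)

    adaptive = _clamp(base_score * size_factor * noise_factor, 0.0, 1.0)
    return AdaptiveScore(base_score, adaptive, size_factor, noise_factor)


if __name__ == "__main__":
    # The dict form could be attached to a report as adaptive scoring metadata.
    print(asdict(adaptive_drift_score(0.4, n_rows=50, noise=0.5)))
```

Keeping `base_score` in the result is one way to preserve the existing behavior as the baseline while exposing the adjustment factors for reports.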

Suggested Files

Potential implementation areas:

dift/core/stats_diff.py
dift/core/risk.py
dift/reports/models.py
dift/reports/json_report.py
dift/reports/html_report.py
docs/thresholds.md

Suggested Tasks

  • Define adaptive scoring rules
  • Add adaptive scoring utility
  • Integrate adaptive scoring into drift analysis
  • Add report metadata
  • Add tests for small, medium, and noisy datasets
  • Update documentation
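
For the dataset-size and noise tests, seeded synthetic columns are one option. The sketch below uses only the standard library; `make_column`, `standardized_shift`, `SCENARIOS`, and the scenario sizes are hypothetical and not part of the existing test suite. It uses a standardized mean shift as a stand-in statistic, not Dift's actual drift score.

```python
import random
import statistics


def make_column(n_rows: int, mean: float, noise_sd: float, seed: int) -> list[float]:
    """Generate a reproducible numeric column with Gaussian noise."""
    rng = random.Random(seed)  # local RNG so tests do not touch global state
    return [rng.gauss(mean, noise_sd) for _ in range(n_rows)]


def standardized_shift(old: list[float], new: list[float]) -> float:
    """Absolute mean difference divided by the pooled standard deviation."""
    pooled_sd = ((statistics.pvariance(old) + statistics.pvariance(new)) / 2) ** 0.5
    return abs(statistics.fmean(new) - statistics.fmean(old)) / pooled_sd


# Assumed scenario definitions; the row counts and noise levels are placeholders.
SCENARIOS = {
    "small": {"n_rows": 30, "noise_sd": 1.0},
    "medium": {"n_rows": 5_000, "noise_sd": 1.0},
    "noisy": {"n_rows": 5_000, "noise_sd": 5.0},
}


def scenario_shift(name: str, true_shift: float = 1.0) -> float:
    """Apply the same true mean shift under one scenario and measure it."""
    params = SCENARIOS[name]
    old = make_column(mean=0.0, seed=1, **params)
    new = make_column(mean=true_shift, seed=2, **params)
    return standardized_shift(old, new)


if __name__ == "__main__":
    for name in SCENARIOS:
        print(name, round(scenario_shift(name), 3))
```

The same scenarios could be fed through a pytest parametrized test once the adaptive scoring utility exists, so that each dataset type gets its own assertion.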

How to Test

Run:

pytest
ruff check .

Run targeted tests:

pytest tests/test_stats_diff.py
pytest tests/test_risk.py

Manual validation:

dift examples/old_drift.csv examples/new_drift.csv --key id --report json --output report.json

Verify:

  • adaptive scores are generated
  • existing risk levels remain stable unless intentionally changed
  • JSON and HTML reports still render correctly
  • tests cover different dataset sizes
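
The risk-level stability check could be automated with a small comparison helper. This sketch assumes risk levels have already been extracted from a baseline report and a new report into plain column-to-level mappings; it does not assume anything about the actual JSON report schema, and `changed_risk_levels` is a hypothetical name.

```python
from typing import Optional


def changed_risk_levels(
    baseline: dict[str, str],
    candidate: dict[str, str],
) -> dict[str, tuple[Optional[str], Optional[str]]]:
    """Return columns whose risk level differs, as (before, after) pairs.

    A column missing from one side is reported with None on that side.
    """
    changes: dict[str, tuple[Optional[str], Optional[str]]] = {}
    for column in sorted(baseline.keys() | candidate.keys()):
        before = baseline.get(column)
        after = candidate.get(column)
        if before != after:
            changes[column] = (before, after)
    return changes


if __name__ == "__main__":
    # Illustrative values only; not taken from a real report.
    before = {"age": "low", "income": "medium"}
    after = {"age": "low", "income": "high"}
    print(changed_risk_levels(before, after))
```

An empty result means no risk level moved; any non-empty result can then be reviewed against the list of intentional changes.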

Documentation Impact

Update:

docs/statistical-analysis.md
docs/thresholds.md
docs/reports.md
docs/developer/architecture.md

Documentation should include:

  • what adaptive drift scoring means
  • how it differs from fixed scoring
  • when it is useful
  • limitations and interpretation guidance

Acceptance Criteria

  • Adaptive drift scoring is implemented
  • Scoring remains backward compatible where possible
  • Tests pass
  • Reports expose adaptive scoring metadata
  • Documentation is updated
