Releases: tdlabs-ai/tanml

v0.2.0: The Polishing Release

14 Mar 16:45

TanML v0.2.0: The Polishing Release

This release focuses on industrial-grade refinements, documentation clarity, and professional presentation.

✨ Key Improvements

  • Branding Consistency: Corrected capitalization for Scikit-learn, Statsmodels, and YData Profiling across all files.
  • Refined Terminology: Updated workflow steps to Data Preprocessing and Feature Power Ranking for better UI/Docs alignment.
  • Audit Ready: Removed the hyphen from "audit ready" and clarified the report generation descriptions.
  • Header Revamp: Tightened README header spacing and added a new OS Compatibility Badge.
  • Scientific Reproducibility: Added scripts/generate_demo_data.py with fixed seeds to ensure consistent validation results.
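
The contents of scripts/generate_demo_data.py are not shown in these notes, so the following is only a minimal sketch of the fixed-seed idea. The function name, column names, and value ranges are hypothetical. It shows why a fixed seed gives the same demo rows on every run.

```python
import random


def make_demo_rows(n_rows=5, seed=42):
    """Return synthetic rows; the same seed always yields the same rows."""
    # A local generator keeps the result independent of global random state.
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        age = rng.randint(18, 80)
        income = round(rng.uniform(20_000, 150_000), 2)
        # Hypothetical binary target, loosely tied to income.
        default = int(rng.random() < (0.4 if income < 50_000 else 0.1))
        rows.append({"age": age, "income": income, "default": default})
    return rows


if __name__ == "__main__":
    first = make_demo_rows()
    second = make_demo_rows()
    # Both calls use seed=42, so the two lists are equal.
    print(first == second)
```

Passing a different seed produces a different but equally repeatable dataset, which is useful when a validation result has to be re-run and compared.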

🛠️ Internal Cleanup

  • Removed tanml_runs clutter and redundant scripts from the main branch.
  • Updated all GitHub Workflows to use Trusted Publishing and fixed dynamic version detection for documentation.
  • Cleared legacy Node.js deprecation warnings in CI/CD pipelines.

v0.1.10

13 Jan 14:47

Release 0.1.10: UCI support, Python 3.13 fixes, enhanced duplicate/ou…

v0.1.8 - Official Launch 🚀

23 Dec 16:22

Official Release of TanML v0.1.8

🌟 Key Changes

  • Consolidated Output: All runs now saved to tanml_runs/.
  • UI Refactoring: Improved modularity and navigation.
  • Documentation: Updated README.md and pyproject.toml.
  • Clean History: Removed large legacy files.

Enjoy!

What's Changed

Full Changelog: v0.1.7...v0.1.8

TanML v0.1.7 Beta

11 Oct 00:57

TanML: Automated Model Validation Toolkit for Tabular Machine Learning

TanML validates tabular ML models with a zero-config Streamlit UI and exports an audit-ready, editable Word report (.docx). It covers data quality, correlation/VIF, performance, explainability (SHAP), and robustness/stress tests—built for regulated settings (MRM, credit risk, insurance, etc.).

  • Status: Beta (0.x)
  • License: MIT
  • Python: 3.8–3.12
  • OS: Linux / macOS / Windows (incl. WSL)

Table of Contents

  • Why TanML?
  • Install
  • Quick Start (UI)
  • What TanML Checks
  • Optional CLI Flags
  • Templates
  • Data Privacy
  • Troubleshooting
  • Contributing
  • License & Citation

Why TanML?

  • Zero-config UI: launch Streamlit, upload data, click Run—no YAML needed.
  • Audit-ready outputs: tables/plots + a polished DOCX your stakeholders can edit.
  • Regulatory alignment: supports common Model Risk Management themes (e.g., SR 11-7 style).
  • Works with your stack: scikit-learn, XGBoost/LightGBM/CatBoost, etc.

Install

pip install tanml

Quick Start (UI)

tanml ui

In the app

  1. Load data — upload a cleaned CSV/XLSX/Parquet (optional: raw or separate Train/Test).
  2. Select target & features — target auto-suggested; features default to all non-target columns.
  3. Pick a model — choose library/algorithm (scikit-learn, XGBoost, LightGBM, CatBoost) and tweak params.
  4. Run validation — click ▶️ Refit & validate.
  5. Export — click ⬇️ Download report to get a DOCX (auto-selects classification/regression template).

Outputs

  • Report: ./.ui_runs/<session>/tanml_report_*.docx
  • Artifacts (CSV/PNGs): ./.ui_runs/<session>/artifacts/*
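
When several sessions have been run, a small helper can locate the newest report. This is a sketch that assumes only the path pattern listed above; the function name latest_report is hypothetical and not part of TanML.

```python
from pathlib import Path
from typing import Optional


def latest_report(base_dir: str = ".ui_runs") -> Optional[Path]:
    """Return the most recently modified tanml_report_*.docx, or None.

    Searches exactly one level down, i.e. base_dir/<session>/.
    """
    reports = list(Path(base_dir).glob("*/tanml_report_*.docx"))
    if not reports:
        return None
    # Pick the file with the newest modification time.
    return max(reports, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    report = latest_report()
    print(report if report else "No report found under ./.ui_runs")
```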

What TanML Checks

  • Raw Data (optional): rows/cols, missingness, duplicates, constant columns

  • Data Quality & EDA: summaries, distributions

  • Correlation & Multicollinearity: heatmap, top-pairs CSV, VIF table

  • Performance

    • Classification: AUC, PR-AUC, KS, decile lift, confusion matrix
    • Regression: R², MAE, MSE/RMSE, error stats
  • Explainability: SHAP (auto explainer; configurable background size)

  • Robustness/Stress Tests: feature perturbations → delta-metrics

  • Model Metadata: model class, hyperparameters, features, training info
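
To make the "feature perturbations → delta-metrics" idea concrete, here is a minimal sketch using only the standard library. It is not TanML's implementation: the multiplicative noise scheme, the choice of accuracy as the metric, and all function names are assumptions made for illustration.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows where the model's prediction matches the label."""
    hits = sum(1 for row, y in zip(rows, labels) if predict(row) == y)
    return hits / len(rows)


def perturb(rows, feature, rel_noise, seed=0):
    """Copy rows, scaling one feature by a factor in [1 - rel_noise, 1 + rel_noise]."""
    rng = random.Random(seed)
    out = []
    for row in rows:
        noisy = dict(row)  # copy, so the input rows are not mutated
        noisy[feature] = row[feature] * (1 + rng.uniform(-rel_noise, rel_noise))
        out.append(noisy)
    return out


def delta_metric(predict, rows, labels, feature, rel_noise=0.1, seed=0):
    """Metric on perturbed data minus metric on the original data."""
    base = accuracy(predict, rows, labels)
    stressed = accuracy(predict, perturb(rows, feature, rel_noise, seed), labels)
    return stressed - base


if __name__ == "__main__":
    # Toy threshold "model" standing in for a fitted classifier.
    def model(row):
        return int(row["income"] < 50_000)

    rows = [{"income": 10_000}, {"income": 90_000}]
    labels = [1, 0]
    print(delta_metric(model, rows, labels, "income", rel_noise=0.1))
```

A delta close to zero suggests the model is insensitive to noise in that feature at the chosen level; a large negative delta flags a feature whose measurement error could degrade performance.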


Optional CLI Flags

Most users just run tanml ui. These flags help on shared machines and servers:

# Share on LAN
tanml ui --public

# Different port
tanml ui --port 9000

# Headless (server/CI; no auto-open browser)
tanml ui --headless

# Larger limit (e.g., 2 GB)
tanml ui --max-mb 2048

Env var equivalents (Linux/macOS bash):

TANML_SERVER_ADDRESS=0.0.0.0 TANML_PORT=9000 TANML_MAX_MB=2048 tanml ui

Windows PowerShell:

$env:TANML_SERVER_ADDRESS="0.0.0.0"; $env:TANML_PORT="9000"; $env:TANML_MAX_MB="2048"; tanml ui

Defaults: address 127.0.0.1, port 8501, limit 1024 MB, telemetry OFF.
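
The sketch below illustrates one way such settings could be resolved. The environment variable names and default values are taken from the text above. The precedence order (CLI value, then environment variable, then default) and the function name resolve_settings are assumptions, not documented TanML behavior.

```python
import os

# Defaults as listed above.
DEFAULTS = {"address": "127.0.0.1", "port": 8501, "max_mb": 1024}


def resolve_settings(cli=None, env=None):
    """Merge CLI values, environment variables, and defaults.

    Assumed precedence: CLI value > environment variable > default.
    """
    cli = cli or {}
    env = os.environ if env is None else env

    def pick(key, env_name, cast):
        if cli.get(key) is not None:
            return cast(cli[key])
        if env_name in env:
            return cast(env[env_name])
        return DEFAULTS[key]

    return {
        "address": pick("address", "TANML_SERVER_ADDRESS", str),
        "port": pick("port", "TANML_PORT", int),
        "max_mb": pick("max_mb", "TANML_MAX_MB", int),
    }


if __name__ == "__main__":
    # With no overrides set, this prints the defaults.
    print(resolve_settings())
```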


Templates

TanML ships DOCX templates (packaged in wheel & sdist):

  • tanml/report/templates/report_template_cls.docx
  • tanml/report/templates/report_template_reg.docx

Data Privacy

  • TanML runs locally; no data is sent to external services.
  • Telemetry is disabled by default (and can be forced off via --no-telemetry).
  • UI artifacts and reports are written under ./.ui_runs/<session>/ in your working directory.

Troubleshooting

  • Page didn’t open? Visit http://127.0.0.1:8501 or run tanml ui --port 9000.
  • Large CSVs are slow/heavy? Prefer Parquet; loading a CSV into a DataFrame can use several GB of RAM.
  • Artifacts missing? Check ./.ui_runs/<session>/artifacts/.
  • Corporate networks: use tanml ui --public to share on LAN.
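
If the page does not open, it can help to check whether anything is listening on the expected address and port before changing flags. The helper below is a generic TCP check using the standard library; it is not part of TanML, and the name is_listening is hypothetical. The default host and port match the defaults stated earlier.

```python
import socket


def is_listening(host="127.0.0.1", port=8501, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    if is_listening():
        print("Something is listening on 127.0.0.1:8501")
    else:
        print("Nothing is listening on 127.0.0.1:8501; is the UI running?")
```

A True result only shows that some process accepted the connection, not that it is the TanML UI.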

Contributing

We welcome issues and PRs!

  • Create a virtual environment and install dev extras:
    • python -m venv .venv && source .venv/bin/activate (or .venv\Scripts\activate on Windows)
    • pip install -e ".[dev]"
  • Format/lint: black . && isort .
  • Run tests: pytest

Before opening a PR, please describe the change and include a brief test or reproduction steps where applicable.


License & Citation

License: MIT. See LICENSE.
SPDX-License-Identifier: MIT

© 2025 Tanmay Sah and Dolly Sah. You may use, modify, and distribute this software with appropriate attribution.

How to cite

If TanML helps your work or publications, please cite:

Sah, T., & Sah, D. (2025). TanML: Automated Model Validation Toolkit for Tabular Machine Learning [Software]. Available at https://github.com/tdlabs-ai/tanml

Or in BibTeX (version-agnostic):

@misc{tanml,
  author = {Sah, Tanmay and Sah, Dolly},
  title  = {TanML: Automated Model Validation Toolkit for Tabular Machine Learning},
  year   = {2025},
  note   = {Software; MIT License},
  url    = {https://github.com/tdlabs-ai/tanml}
}

A machine-readable citation file (CITATION.cff) is included for citation tools and GitHub’s “Cite this repository” button.