feat(M90): Hardware Precision Baseline Library #193

Merged
hlin99 merged 1 commit into main from feat/m90-hw-baseline on Apr 6, 2026

@hlin99 (Member) commented Apr 6, 2026

Summary

Hardware Precision Baseline Library — maintains reference precision profiles for common GPU hardware configurations and classifies observed numerical differences as expected hardware variance or likely software bugs.

Changes

  • hw_baseline.py: HardwareProfile, PrecisionRange, BaselineDB, classify_difference(), formatting functions
  • 8 built-in profiles: A100-BF16-TP1, A100-FP16-TP1, H100-BF16-TP4, H100-FP8-TP4, H200-BF16-TP8, Gaudi2-BF16-TP1, Gaudi3-FP8-TP4, A100-INT8KV-TP2
  • CLI: baseline-db {list|show|export|import|find|classify} subcommands
  • 33 tests covering all functionality
  • ROADMAP.md updated
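
The classification described above can be illustrated with a minimal sketch. The real `hw_baseline.py` is not shown on this page, so the field names (`expected_max`, `suspicious_max`, `ranges`), the metric names, and the threshold values below are all assumptions made for illustration; only the class names, the function name, the profile name, and the four verdict strings come from the PR text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PrecisionRange:
    """Hypothetical bounds on one metric's absolute difference."""
    expected_max: float    # at or below this: treated as normal hardware variance
    suspicious_max: float  # above expected_max but at or below this: suspicious


@dataclass
class HardwareProfile:
    """Hypothetical profile: a name plus per-metric precision ranges."""
    name: str
    ranges: dict[str, PrecisionRange] = field(default_factory=dict)


def classify_difference(profile, observed):
    """Return one verdict per observed metric (assumed behavior, not the real API)."""
    verdicts = {}
    for metric, value in observed.items():
        rng = profile.ranges.get(metric)
        if rng is None:
            verdicts[metric] = "unknown"       # no baseline for this metric
        elif abs(value) <= rng.expected_max:
            verdicts[metric] = "expected"
        elif abs(value) <= rng.suspicious_max:
            verdicts[metric] = "suspicious"
        else:
            verdicts[metric] = "likely_bug"
    return verdicts


# Illustrative thresholds only; these are not the values shipped in the PR.
profile = HardwareProfile(
    name="A100-BF16-TP1",
    ranges={"max_abs_logit_diff": PrecisionRange(expected_max=0.05, suspicious_max=0.5)},
)
verdicts = classify_difference(
    profile, {"max_abs_logit_diff": 0.2, "kl_divergence": 1e-4}
)
print(verdicts)
```

In this sketch a metric with no baseline entry falls through to `unknown` rather than raising, which keeps a single missing range from hiding the verdicts for the other metrics.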

Closes #192

- hw_baseline.py module: HardwareProfile, PrecisionRange, BaselineDB, classify_difference()
- 8 built-in profiles: A100/H100/H200/Gaudi2/Gaudi3 across BF16/FP16/FP8/INT8-KV
- classify_difference() returns per-metric verdicts: expected/suspicious/likely_bug/unknown
- BaselineDB: list, get, add, remove, find, export/import JSON
- CLI: baseline-db {list|show|export|import|find|classify}
- 33 tests covering all dataclasses, DB ops, classification, formatting, serialization
- ROADMAP.md updated with M90 ✅
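
As a rough sketch of the `BaselineDB` operations listed above (list, get, add, remove, find, export/import JSON), the following shows one way such a store could round-trip through JSON. The method names, the simplified `Profile` fields, and the reading of `TP` as a tensor-parallel degree are assumptions; the actual class in the PR may differ.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Profile:
    """Simplified, hypothetical stand-in for HardwareProfile."""
    name: str
    gpu: str
    dtype: str
    tp: int  # assumed to mean tensor-parallel degree


class BaselineDB:
    """Hypothetical in-memory profile store keyed by profile name."""

    def __init__(self):
        self._profiles = {}

    def add(self, profile):
        self._profiles[profile.name] = profile

    def get(self, name):
        return self._profiles.get(name)

    def remove(self, name):
        # True if a profile was actually removed.
        return self._profiles.pop(name, None) is not None

    def list_names(self):
        return sorted(self._profiles)

    def find(self, **criteria):
        # Keep profiles whose attributes match every given keyword.
        return [
            p for p in self._profiles.values()
            if all(getattr(p, key) == value for key, value in criteria.items())
        ]

    def export_json(self):
        return json.dumps([asdict(p) for p in self._profiles.values()], indent=2)

    @classmethod
    def import_json(cls, text):
        db = cls()
        for item in json.loads(text):
            db.add(Profile(**item))
        return db


db = BaselineDB()
db.add(Profile(name="A100-BF16-TP1", gpu="A100", dtype="BF16", tp=1))
db.add(Profile(name="H100-FP8-TP4", gpu="H100", dtype="FP8", tp=4))

# Export to JSON and rebuild a fresh database from the text.
restored = BaselineDB.import_json(db.export_json())
print(restored.list_names())
```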

Closes #192
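
The `baseline-db {list|show|export|import|find|classify}` layout maps naturally onto `argparse` subparsers. The sketch below assumes `baseline-db` is the program name and invents every argument and option (`name`, `path`, `--gpu`, `--dtype`, `--metric`) for illustration; the PR does not document the actual flags.

```python
import argparse


def build_parser():
    """Build a hypothetical parser mirroring the six subcommands named in the PR."""
    parser = argparse.ArgumentParser(prog="baseline-db")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("list", help="list known profiles")

    show = sub.add_parser("show", help="show one profile")
    show.add_argument("name")

    export = sub.add_parser("export", help="write profiles to a JSON file")
    export.add_argument("path")

    # "import" is a Python keyword, so the parser object needs another variable name.
    import_cmd = sub.add_parser("import", help="load profiles from a JSON file")
    import_cmd.add_argument("path")

    find = sub.add_parser("find", help="search profiles by attribute")
    find.add_argument("--gpu")
    find.add_argument("--dtype")

    classify = sub.add_parser("classify", help="classify observed differences")
    classify.add_argument("name")
    classify.add_argument("--metric", action="append", default=[], metavar="NAME=VALUE")

    return parser


args = build_parser().parse_args(
    ["classify", "A100-BF16-TP1", "--metric", "kl_divergence=0.001"]
)
print(args.command, args.name, args.metric)
```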

@hlin99-Review-Bot left a comment

LGTM. Clean implementation — dataclasses, classification logic, CLI integration, and 33 tests all look solid. CI green across 3.10/3.11/3.12. Approve ✅

@hlin99-Review-BotX left a comment

Clean implementation — well-structured dataclasses, solid classification logic with proper severity thresholds, comprehensive 33 tests, and good CLI integration. 8 built-in profiles cover the key hardware configs. Approve ✅

@hlin99 merged commit c42dfb3 into main on Apr 6, 2026
5 checks passed
@hlin99 deleted the feat/m90-hw-baseline branch on April 6, 2026 at 12:27

Development

Successfully merging this pull request may close these issues.

feat(M90): Hardware Precision Baseline Library

3 participants