Models rot. Quantization errors, silent degradation, and "drift" can destroy your AI product's reliability overnight. You need a smoke detector.
NerfStatus is a scientifically grounded suite of probes that detects when a model is losing its edge.
| Component | Description |
|---|---|
| nerfprobe | The Python library. Install with `pip install nerfprobe`. 17 research-backed instruments (Math, Code, JSON, Style, Fact). |
| nerfstatus-core | Shared Logic. Core scoring algorithms and probe definitions. |
| hf-space | HuggingFace NerfStatus. Inference Quality Monitor. |

We don't just use vibes. We use instruments:
- Math Probe: arithmetic consistency.
- JSON Probe: syntax validity under stress.
- Fingerprint Probe: detecting model identity masquerading.
- Timing Probe: latency and token generation diagnostics.
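
To make the idea of an instrument concrete, here is a minimal sketch of what a math, JSON, and timing check could look like. It does not use the nerfprobe API: every name below (`ModelFn`, `math_probe`, `json_probe`, `timing_probe`, and the two stub models) is hypothetical, and the scoring is an assumption chosen for illustration. The fingerprint probe is left out because the text above does not say how it works.

```python
import json
import re
import time
from typing import Callable

# Hypothetical type: any function that maps a prompt to the model's text reply.
ModelFn = Callable[[str], str]


def math_probe(model: ModelFn, a: int, b: int, repeats: int = 3) -> float:
    """Ask for a + b several times; return the fraction of correct replies."""
    prompt = f"What is {a} + {b}? Reply with the number only."
    correct = 0
    for _ in range(repeats):
        # Take the first integer in the reply, if there is one.
        match = re.search(r"-?\d+", model(prompt))
        if match and int(match.group()) == a + b:
            correct += 1
    return correct / repeats


def json_probe(model: ModelFn, prompt: str) -> bool:
    """Return True if the reply parses as JSON, False otherwise."""
    try:
        json.loads(model(prompt))
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return True


def timing_probe(model: ModelFn, prompt: str) -> float:
    """Return the wall-clock seconds taken by a single call."""
    start = time.perf_counter()
    model(prompt)
    return time.perf_counter() - start


# Stub models standing in for a real endpoint (assumptions for this sketch).
def healthy_model(prompt: str) -> str:
    return "42" if "+" in prompt else '{"status": "ok"}'


def degraded_model(prompt: str) -> str:
    # Off-by-one arithmetic and truncated JSON simulate a degraded model.
    return "41" if "+" in prompt else '{"status": "ok"'


if __name__ == "__main__":
    json_prompt = "Return a JSON object with a status field."
    for name, model in [("healthy", healthy_model), ("degraded", degraded_model)]:
        print(name, math_probe(model, 19, 23), json_probe(model, json_prompt))
```

Because the probes only depend on a prompt-to-text callable, the same checks can be pointed at any provider by wrapping its client in a function of that shape. Running them on a schedule and comparing scores over time is what would turn these one-off checks into a drift signal.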

- Live Dashboard: nerfstatus.com
- Maintainer: @skew202 (Stefan Wiest)
