@nerfstatus

NerfStatus

Scientifically-grounded LLM degradation detection


The Solar Eclipse
Scientific LLM Degradation Detection

🔭 The Problem

Models rot. Quantization errors, silent degradation, and "drift" can destroy your AI product's reliability overnight. You need a smoke detector.

NerfStatus is a scientifically grounded suite of probes that detects when a model is losing its edge.


📦 The Toolkit

Component | Description
nerfprobe | The Python Library. pip install nerfprobe. 17 research-backed instruments (Math, Code, JSON, Style, Fact).
nerfstatus-core | Shared Logic. Core scoring algorithms and probe definitions.
hf-space | HuggingFace NerfStatus. Inference Quality Monitor.

🔬 Methodology

We don't just use vibes. We use Instruments:

  • Math Probe: arithmetic consistency.
  • JSON Probe: syntax validity under stress.
  • Fingerprint Probe: detecting model identity masquerading.
  • Timing Probe: latency and token generation diagnostics.
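
To make the instrument idea concrete, here is a minimal sketch of three of the probes listed above. It is not the nerfprobe API: the `Generate` callable, the function names, the prompts, and the toy `fake_model` are all hypothetical, chosen only to illustrate what "arithmetic consistency", "syntax validity", and "latency" checks could look like.

```python
import json
import time
from typing import Callable

# Hypothetical stand-in for a model call: prompt in, text out.
Generate = Callable[[str], str]


def math_probe(generate: Generate, a: int = 17, b: int = 25, trials: int = 3) -> float:
    """Return the fraction of trials in which the model gives the correct sum."""
    expected = str(a + b)
    prompt = f"What is {a} + {b}? Reply with the number only."
    hits = sum(generate(prompt).strip() == expected for _ in range(trials))
    return hits / trials


def json_probe(generate: Generate) -> bool:
    """Return True if the model's reply parses as a JSON object."""
    reply = generate('Return a JSON object with keys "name" and "age".')
    try:
        return isinstance(json.loads(reply), dict)
    except json.JSONDecodeError:
        return False


def timing_probe(generate: Generate, prompt: str = "Say hello.") -> float:
    """Return wall-clock seconds taken by a single model call."""
    start = time.perf_counter()
    generate(prompt)
    return time.perf_counter() - start


def fake_model(prompt: str) -> str:
    # Toy model used only to exercise the probes (an assumption, not a real LLM).
    if "+" in prompt:
        return "42"
    return '{"name": "Ada", "age": 36}'
```

In practice `generate` would wrap a real inference endpoint, and the scores would be compared against a baseline recorded when the model was known to be healthy; a drop in the math score or a rise in JSON failures is the kind of signal a degradation monitor looks for.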

🔗 Connect

Pinned

  1. nerfprobe-core (Public)

    Shared probe and scorer implementations for NerfStatus LLM degradation detection.

    Python

  2. nerfprobe (Public)

    Scientific LLM degradation detection. A suite of 17 research-backed probes to measure model quality drift.

    Python

