Releases: fabioconcina/claumon
v0.12.1
Bug fixes
- Daily history is now a continuous calendar. The "Daily Tokens (14 days)"
and cost charts skipped days with no Claude usage, packing active days
together so the date axis wasn't continuous. GetHistory now returns the full
window of N days ending today, zero-filling idle days so they render as empty
bars with their date labels. This also corrects a latent off-by-one where the
window spanned N+1 days.
- Stopped a self-inflicted rate-limit storm after a rejected token. When the
usage API rejected the access token with a 401, the poller marked auth expired
but the next credential reload reset it to OK on the still-future local
ExpiresAt, so polling resumed, drew another 401, and looped every 30s until
the API returned 429. The auth provider now remembers the rejected token and
stays expired until a genuinely different token is written to disk (i.e.
Claude Code refreshed it), so the poller waits quietly instead of hammering
the API.
v0.12.0
Forecast model → v1.2
- The projection interval was over-spread - the complement of the v0.11.1
fix, not a reversal. That release corrected an under-spread interval by
widening the rate-uncertainty term. This one corrects a different
component, the path-noise term: its §5 calibration fit the variance with
an unweighted regression of squared errors, which is heteroskedastic, so the
few long-horizon points dominated the fit and inflated it (and the
over-subtraction silently pinned the rate prior). On real data the over-spread
was path-dominated, so the two fixes touch orthogonal terms of the same
interval. v1.2 weights that regression by 1/Δt², calibrating the spread and
reviving the prior; net, out-of-sample coverage converges toward its 80%
target. The v1.1 spec is preserved under
internal/forecast/archive/v1.1/; the
math is documented in
internal/forecast/CHANGELOG.md.
- Removed the Low/Medium/High confidence badge. It scored how much recent
data the forecast had, not how reliable the forecast actually was, so it
could read "High" next to a wide interval and was uncorrelated with real
accuracy. The 80% CI already conveys the uncertainty.
New: claumon bench
- An out-of-sample benchmark for the forecast model: leave-one-session-out and
temporal-holdout protocols, CRPS/pinball proper scoring with coverage, MAE,
and bias breakdowns, segmented by engagement and horizon. Datasets are
reproducible: frozen fixtures exported from any device's store
(claumon bench export --db ...) and seeded synthetic regimes. A development
and validation tool; it does not affect the dashboard.
v0.11.1
Fixes
- Forecast 80% CI was systematically under-spread. Live monitoring
showed only ~25% of session outcomes falling inside the displayed band.
The §5 calibration regression was discarding its quadratic coefficient,
which carries the historical-average rate variance; v1.1 retains it and
uses it as a floor on the per-forecast rate uncertainty. CIs widen
accordingly, most visibly at long horizons. Forecast model version →
v1.1. The v1.0 spec is preserved under
internal/forecast/archive/v1.0/;
the math change is documented in
internal/forecast/CHANGELOG.md.
Diagnostics
claumon diagnostics now prints a "Spread sanity" block: mean squared
error of F, mean predicted variance, an underspread ratio
(1.0 = calibrated, > 1 = bands too narrow), and the components feeding
the new EffectiveRateVar. Use it after a few days of v1.1 data to check
the fix actually landed for your usage pattern.
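One way to read the underspread ratio is as mean squared error divided by mean predicted variance. That reading is an assumption inferred from the description above, not taken from the tool's source, and the replayPoint type and spreadSanity function are hypothetical names.

```go
package main

import "fmt"

// replayPoint is a hypothetical record of one replayed forecast.
type replayPoint struct {
	Forecast float64 // point forecast F, in percent
	Actual   float64 // realised utilization at reset, in percent
	PredVar  float64 // variance the model predicted for F
}

// spreadSanity returns the mean squared error, the mean predicted variance,
// and their ratio. ok is false when the ratio is undefined.
func spreadSanity(points []replayPoint) (mse, meanVar, ratio float64, ok bool) {
	if len(points) == 0 {
		return 0, 0, 0, false
	}
	for _, p := range points {
		e := p.Forecast - p.Actual
		mse += e * e
		meanVar += p.PredVar
	}
	n := float64(len(points))
	mse /= n
	meanVar /= n
	if meanVar <= 0 {
		return mse, meanVar, 0, false
	}
	// ratio > 1: errors are larger than the model claims, bands too narrow.
	return mse, meanVar, mse / meanVar, true
}

func main() {
	pts := []replayPoint{{50, 52, 1}, {40, 38, 1}}
	fmt.Println(spreadSanity(pts))
}
```

Under this reading, a ratio of 4 would mean the realised errors carry four times the variance the model predicted.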
Docs
- Model spec §3 picks up a paragraph documenting the Brownian-motion
simplification (utilization is monotone, BM isn't) and the conditions
under which the bias matters.
v0.11.0
Features
- Forecast trajectory modal. Click "Projected X% at reset" on a session or weekly gauge to open a modal showing the actual Monte Carlo trajectories the §6 simulator is producing: a fog of ~120 sampled paths, the empirical 10/90 percentile band traced from the same paths, the posterior mean line, observed snapshots so far, and a small first-passage histogram on the threshold line with the MC median ETA. New endpoint GET /api/forecast/sample?gauge=session|weekly re-runs the MC with trajectories collected; payload is subsampled in both the trajectory and time dimensions so weekly stays under ~200KB. The §6 MC core is shared with EstimateETA so the modal and the cached ETA summary always agree.
- Versioned forecast model. The forecast spec now carries a forecast.ModelVersion identifier (v1.0 today). It's surfaced on every /api/forecast and /api/forecast/sample response, in the modal subtitle, and in the diagnostics report. Retired specs will move to internal/forecast/archive/<version>/ when the model changes meaningfully; internal/forecast/CHANGELOG.md summarises each bump.
- claumon diagnostics subcommand. Replays the forecaster across past completed sessions and prints calibration metrics (80% CI coverage, MAE of F, ETA accuracy) stamped with the model version. Useful for deciding whether a model change actually improves accuracy before shipping it.
Internals
- CI workflows bumped to Node-24-compatible action versions (actions/checkout v4→v6, actions/setup-go v5→v6, goreleaser/goreleaser-action v6→v7); workflow Go version aligned to go.mod (1.26).
v0.10.1
Fixes
- Forecast lower bound no longer dips below current utilization. The 80% CI was being clipped to [0%, 100%] for display, but utilization within a reset window is monotone non-decreasing, so the Gaussian left tail going below u_now produced unphysical readouts like "current 6%, 80% CI: 0%-16%". The displayed bounds are now floored at u_now. The unclipped point forecast F is preserved for downstream ETA computation, and the model spec in internal/forecast/MODEL.pdf is updated to match.
v0.10.0
Forecasts on the rate-limit gauges
Each gauge now shows a projection of where utilization will land at reset, with an 80% credible band, an ETA to threshold (default 100%), and a LOW/MED/HIGH confidence pill. This is an empirical Bayes forecast: the prior on the rate and the path-noise variance are both estimated from past data and plugged in to a posterior update, rather than being marginalized under a hyperprior. The full spec lives in internal/forecast/MODEL.pdf; the gist:
- Rate posterior. Inside the open window, utilization is modeled as u(t) = u_now + r·t + W(t), with r an unknown per-hour rate and W Brownian path noise with per-hour variance σ². The current rate is fit by OLS on the last 30 minutes of snapshots; that OLS slope and its standard error are treated as a Gaussian likelihood r̂_OLS | r ~ N(r, SE²_OLS), then fused with a Gaussian empirical prior on r (mean μ₀, variance τ₀²) via the standard conjugate normal-normal update. The prior is refit daily from up to 200 completed past windows.
- Closed-form projection. Both the rate-uncertainty and path-noise pieces are Gaussian, so projected utilization at reset is F ~ N(u_now + r̂·ΔT, ΔT²·τ_post² + ΔT·σ²). The 80% credible interval is F ± z₀.₉·σ_F, clipped to [0%, 100%] for display. (Surfaced to users as "80% CI" for brevity.)
- Path-noise calibration. σ² is recovered by replaying the forecaster across past windows: at each replay point, the squared forecast error e² against the actual u_final is regressed against [ΔT, ΔT²] with no intercept. The linear coefficient is path noise; the quadratic coefficient absorbs rate-uncertainty contamination and is discarded. This is a method-of-moments estimator, then plugged into the posterior; the prior is refit once more with the noise correction applied to its sample variance.
- Monte Carlo ETA. For each threshold, 500 trajectories of the SDE are simulated at 5-minute steps, drawing one rate sample per trajectory from the posterior. The reported ETA is the median first-passage time with the 80% CI from the 10th/90th percentiles. If at least half the trajectories never cross before reset, the threshold is reported as unreachable (p_inf is exposed on the API payload).
- Confidence tag. Derived from effective sample size n_eff = min(n_recent, τ₀²/SE_OLS² + N_sessions). n_eff ≥ 50 → HIGH, ≥ 15 → MEDIUM, else LOW. Falls back to the prior alone when fewer than three recent snapshots exist.
The forecast is computed once per poll and shipped inside the existing usage SSE event; the same payload is also exposed at GET /api/forecast for pull-style clients. It is suppressed when fewer than two completed past windows exist, when the window has just reset, or when a drop in the recent snapshot series indicates a missed reset between polls.
Fixes
- Canonicalize reset_at timestamps at write time. The Claude API returns reset_at recomputed as now + remaining on each poll, so the same nominal window drifts by hundreds of milliseconds across polls and occasionally straddles a minute boundary. Snapshots were being written with these drifting strings, so GROUP BY session_reset_at shattered every window into singletons and downstream aggregation lost track of session boundaries. Reset times are now rounded to the nearest minute on write, and a one-time idempotent migration canonicalizes existing rows on startup.
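Rounding to the nearest minute can be sketched with the standard time package. The sketch assumes reset_at arrives as an RFC 3339 string and is stored in UTC; canonicalResetAt is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"time"
)

// canonicalResetAt parses an RFC 3339 timestamp (fractional seconds allowed)
// and returns it rounded to the nearest minute, formatted in UTC, so that
// drifting copies of the same nominal reset time collapse to one string.
func canonicalResetAt(raw string) (string, error) {
	t, err := time.Parse(time.RFC3339, raw)
	if err != nil {
		return "", err
	}
	return t.UTC().Round(time.Minute).Format(time.RFC3339), nil
}

func main() {
	for _, raw := range []string{
		"2026-04-01T17:59:59.870Z", // just before the minute boundary
		"2026-04-01T18:00:00.120Z", // just after it
	} {
		out, err := canonicalResetAt(raw)
		fmt.Println(out, err)
	}
}
```

Both inputs map to the same string, so a GROUP BY on the stored value sees one window instead of two. Applying the function to its own output returns the same value, which is what makes a one-time migration safe to re-run.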
v0.9.3
Performance
- Stop re-parsing every JSONL on every session change. The watcher callback called DiscoverDailyAggregates, which re-read and re-parsed every session file under ~/.claude/projects on each event. With many concurrent live sessions, this burned hundreds of MB/s of I/O and pegged a CPU core. Parsed per-day buckets are now memoized in-process and invalidated by (path, mtime, size), so only changed files are re-parsed.
v0.9.2
Fixed
- Poll loop no longer burns API requests after auth expires. When the API rejected the cached token with 401 and the keychain still held a stale token, authStatus could remain ok, so each wake-up hit the API four times before a 429 forced a 1h backoff. The provider is now force-marked expired on any 401 from the usage API, so subsequent polls skip the API entirely until credentials are refreshed externally.
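The skip-until-refreshed behaviour can be sketched with a provider that remembers the token that drew the 401. Treating "refreshed" as "a different token string appears on reload" is an assumption of this sketch, and authProvider, Reload, and pollOnce are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// authProvider tracks the current token and the last token the API rejected.
type authProvider struct {
	mu       sync.Mutex
	token    string
	rejected string
}

// Reload records the token found in the credential store. Only a token that
// differs from the rejected one clears the expired state.
func (p *authProvider) Reload(token string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.token = token
	if token != p.rejected {
		p.rejected = ""
	}
}

// usable returns the token and true unless it is missing or was rejected.
func (p *authProvider) usable() (string, bool) {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.token == "" || p.token == p.rejected {
		return "", false
	}
	return p.token, true
}

func (p *authProvider) markRejected(token string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.rejected = token
}

// pollOnce calls the API only when a usable token exists and reports whether
// a request was made. callAPI returns the HTTP status code.
func pollOnce(p *authProvider, callAPI func(token string) int) bool {
	token, ok := p.usable()
	if !ok {
		return false // skip the API entirely while expired
	}
	if callAPI(token) == 401 {
		p.markRejected(token)
	}
	return true
}

func main() {
	p := &authProvider{}
	api := func(token string) int {
		if token == "stale" {
			return 401
		}
		return 200
	}
	p.Reload("stale")
	fmt.Println(pollOnce(p, api)) // request made, token rejected
	p.Reload("stale")
	fmt.Println(pollOnce(p, api)) // skipped: same token reloaded
	p.Reload("fresh")
	fmt.Println(pollOnce(p, api)) // request made again
}
```

Comparing token values rather than a local expiry time means a reload of the same stale credential cannot flip the state back to OK.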
v0.9.1
Fixed
- Heatmap and today's tokens no longer include prior days — sessions resumed across midnight credited their full lifetime tokens to today's aggregates and the hourly heatmap. Both are now bucketed by each message's own timestamp.
- Session detail timestamps now show dates across day boundaries.
- Process liveness detection on Windows.
- Time-sensitive store tests that assumed March 2026 dates.
Changed
- Expanded tool call detail in the session view.
v0.9.0
Added
- Claude Design weekly usage gauge: the dashboard now surfaces the weekly Claude Design utilization (from the seven_day_omelette field in the OAuth usage API) alongside the existing Opus and Sonnet gauges. The card is hidden when usage is 0%.