hwLedger

Work state: SCAFFOLD · Progress: ███░░░░░░░ 25% · updated 2026-06-02

LLM capacity planner + fleet ledger + desktop inference runtime (Rust core + per-OS native GUIs). Pre-alpha Phase 0; ambitious 11-crate plan, 42 files committed so far (mostly docs/scaffold).

LLM capacity planner + fleet ledger + desktop inference runtime.

Not a financial ledger. hwLedger is an audit and provenance ledger for the hardware fleets that run machine learning workloads. It provides per-layer VRAM estimation for LLMs, reconciles predictions against live telemetry from inference engines (MLX, mistral.rs, llama.cpp, vLLM, TGI), and maintains an event-sourced audit log for heterogeneous compute fleets (Apple Silicon, NVIDIA/AMD, cloud rentals).

Status: pre-alpha, Phase 0 bootstrap. See PLAN.md for the implementation roadmap.

hwLedger is an Apache-2.0 desktop app + agent/server pair that:

Plans VRAM and throughput for any HF / GGUF / MLX / Ollama model, correctly handling dense, MoE, MLA, GQA, sliding-window, SSM/Mamba, and hybrid-attention architectures — with a slider UX over a live per-layer breakdown.
Reconciles predictions against live telemetry from MLX, mistral.rs, llama.cpp, vLLM, or TGI.
Runs inference locally on Apple Silicon via a forked oMlx sidecar with SSD-paged KV cache.
Ledgers a heterogeneous fleet — local NVIDIA/AMD boxes, Apple Silicon laptops, cheap cloud rentals (Vast.ai, RunPod, Lambda) — with a shared event-sourced audit log, dispatch planner, and spot-price-aware cost model.
Ships as per-OS native GUIs (SwiftUI / WinUI 3 / Qt 6 + Slint) over a shared Rust FFI core.
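The planning item above separates MoE from dense models. A minimal sketch of why that matters follows, assuming a hypothetical `MoeShape` type (none of these names come from hwLedger's crates): memory is sized by every expert that must stay resident, while a single decoded token only reads the experts the router selects.

```rust
/// Hypothetical description of a mixture-of-experts model.
/// All field names are illustrative assumptions.
#[derive(Debug, Clone, Copy)]
pub struct MoeShape {
    /// Parameters every token touches: attention, embeddings, router, norms.
    pub shared_params: u64,
    /// Parameters of one expert, summed over all layers.
    pub params_per_expert: u64,
    /// Total experts that must stay loaded.
    pub num_experts: u64,
    /// Experts the router selects per token (top-k).
    pub experts_per_token: u64,
}

impl MoeShape {
    /// Resident parameters: everything that has to sit in memory.
    pub fn resident_params(&self) -> u64 {
        self.shared_params + self.num_experts * self.params_per_expert
    }

    /// Active parameters: what one token actually reads.
    pub fn active_params(&self) -> u64 {
        self.shared_params + self.experts_per_token * self.params_per_expert
    }

    /// Weight memory is sized by resident parameters.
    pub fn weight_bytes(&self, bytes_per_param: f64) -> f64 {
        self.resident_params() as f64 * bytes_per_param
    }

    /// Rough upper bound on single-stream decode tokens/s when memory
    /// bandwidth is the bottleneck: each token streams the active weights once.
    pub fn decode_tokens_per_sec_bound(&self, bytes_per_param: f64, bandwidth_bytes_per_sec: f64) -> f64 {
        bandwidth_bytes_per_sec / (self.active_params() as f64 * bytes_per_param)
    }
}
```

With 1B shared parameters, eight 5B experts, and top-2 routing, `resident_params()` is 41B while `active_params()` is 11B. Sizing memory from the active count, or throughput from the resident count, would be off by almost 4x in this example.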

A hobbyist-sized fleet with enterprise bones.

Quickstart

Install the CLI from source — this is the fastest path to a working hwLedger today:

cargo install --path crates/hwledger-cli
hwledger --help

Web fallback (no compilation needed): A Streamlit web interface is available at apps/streamlit/. Run `cargo run -p hwledger-devtools -- up` to launch it locally at localhost:8501, along with the API server.

macOS DMG note: Native macOS DMG distribution is currently blocked on Apple Developer certificate renewal. See docs/reports/WP21-APPLE-DEV-SECRETS.md for setup instructions. Use the Streamlit fallback or the CLI above for now.

Why

Every existing public VRAM calculator (HF Accelerate, can-it-run-llm, LM Studio's gauge) gets MoE and MLA wrong — they under-count KV cache and over-count MoE throughput. hwLedger's math core is architecture-keyed: it dispatches per AttentionKind (MHA / GQA / MQA / MLA / Sliding / SSM / Hybrid / Sink) and treats resident-vs-active parameters separately for MoE. See PLAN.md §5.
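To make the architecture-keyed dispatch concrete, here is a minimal sketch of a KV-cache estimator that matches on an `AttentionKind` enum. The variant names follow the list above; the field layouts, the function names, and the omission of Hybrid and Sink are assumptions made for illustration, not hwLedger's actual types. The formulas are the commonly used per-token cache sizes for each family.

```rust
/// Attention families named in the README. Field layouts are illustrative
/// assumptions; Hybrid and Sink are left out of this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AttentionKind {
    /// Multi-head: one K/V pair per query head.
    Mha { heads: u64, head_dim: u64 },
    /// Grouped-query: K/V heads shared across groups of query heads.
    Gqa { kv_heads: u64, head_dim: u64 },
    /// Multi-query: a single shared K/V head.
    Mqa { head_dim: u64 },
    /// Multi-head latent attention: one compressed latent plus a RoPE key per token.
    Mla { kv_lora_rank: u64, rope_head_dim: u64 },
    /// Sliding window: at most `window` tokens are cached.
    Sliding { kv_heads: u64, head_dim: u64, window: u64 },
    /// State-space layer: fixed-size recurrent state, no per-token cache.
    Ssm { state_bytes: u64 },
}

/// Cache (or recurrent-state) bytes for one layer and one sequence.
pub fn kv_bytes_per_layer(kind: AttentionKind, seq_len: u64, bytes_per_elem: u64) -> u64 {
    match kind {
        // The leading 2 counts the separate K and V tensors.
        AttentionKind::Mha { heads, head_dim } => 2 * heads * head_dim * seq_len * bytes_per_elem,
        AttentionKind::Gqa { kv_heads, head_dim } => 2 * kv_heads * head_dim * seq_len * bytes_per_elem,
        AttentionKind::Mqa { head_dim } => 2 * head_dim * seq_len * bytes_per_elem,
        // No factor of 2: K and V are both rebuilt from the shared latent.
        AttentionKind::Mla { kv_lora_rank, rope_head_dim } => {
            (kv_lora_rank + rope_head_dim) * seq_len * bytes_per_elem
        }
        // Growth stops once the window is full.
        AttentionKind::Sliding { kv_heads, head_dim, window } => {
            2 * kv_heads * head_dim * seq_len.min(window) * bytes_per_elem
        }
        // Independent of sequence length.
        AttentionKind::Ssm { state_bytes } => state_bytes,
    }
}

/// Total for a model whose layers all share one attention kind.
pub fn kv_bytes_total(kind: AttentionKind, layers: u64, seq_len: u64, bytes_per_elem: u64) -> u64 {
    layers * kv_bytes_per_layer(kind, seq_len, bytes_per_elem)
}
```

For an 80-layer model with 8 KV heads of dimension 128 at 4,096 tokens in fp16, `kv_bytes_total` gives 1,342,177,280 bytes (1.25 GiB). The same shape treated as MHA with 64 heads comes out eight times larger, which is the kind of gap a calculator produces when it ignores the attention family.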

Architecture

Core: Rust workspace (hwledger-core, -arch, -ingest, -probe, -inference, -ledger, -fleet-proto, -agent, -server, -cli, -ffi)
Sidecar: sidecars/omlx-fork/ — fat fork of jundot/omlx, Apache-2.0
Native apps: apps/macos/ (SwiftUI + UniFFI + XCFramework), apps/windows/ (WinUI 3 + .NET 9 + csbindgen), apps/linux-qt/ (Qt 6 + cxx-qt + QML), apps/linux-slint/ (Rust-native)
Fleet wire: Axum + rustls mTLS for agents; russh + deadpool for SSH agentless; reqwest for Vast/RunPod/Lambda/Modal; tailscale status --json for tailnet discovery

See the component diagram in PLAN.md §4.1.

Substrate (per ADR-023 + ADR-035A)

KooshaPari/pheno-capacity — VRAM estimation, model-fit scoring, Chinchilla tokens, optimizer state. no_std-compatible Rust crate. Active since 2026-06-18 (L5-105). Streamlit Planner/WhatIf pages will consume this in Phase 2 (see docs/integrations/cost-model-migration.md).
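Two of the quantities listed above have well-known rules of thumb, sketched here with integer arithmetic only so the code would also compile in a `no_std` setting. This is not pheno-capacity's API: the function and type names are hypothetical. The assumptions are the Chinchilla heuristic of roughly 20 training tokens per parameter and the usual mixed-precision Adam accounting of 16 bytes per parameter.

```rust
/// Chinchilla rule of thumb: about 20 training tokens per parameter.
pub const CHINCHILLA_TOKENS_PER_PARAM: u64 = 20;

/// Approximate compute-optimal training tokens for a model size.
pub fn chinchilla_tokens(params: u64) -> u64 {
    params * CHINCHILLA_TOKENS_PER_PARAM
}

/// Per-parameter training memory under mixed-precision Adam (assumed layout).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TrainingBytes {
    /// fp16 working weights: 2 bytes per parameter.
    pub weights: u64,
    /// fp16 gradients: 2 bytes per parameter.
    pub gradients: u64,
    /// fp32 master weights, momentum, and variance: 4 + 4 + 4 bytes per parameter.
    pub optimizer_state: u64,
}

impl TrainingBytes {
    /// Sum of all three components (16 bytes per parameter).
    pub fn total(&self) -> u64 {
        self.weights + self.gradients + self.optimizer_state
    }
}

/// Memory for weights, gradients, and Adam state, excluding activations.
pub fn adam_mixed_precision_bytes(params: u64) -> TrainingBytes {
    TrainingBytes {
        weights: 2 * params,
        gradients: 2 * params,
        optimizer_state: 12 * params,
    }
}
```

For a 70B-parameter model, `chinchilla_tokens` returns 1.4 trillion tokens. For a 7B model, `adam_mixed_precision_bytes(...).total()` is 112 GB before any activation memory is counted.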

Dev setup

One-liner to build FFI + launch server, docs-site, and Streamlit:

cargo run -p hwledger-devtools -- up

See docs-site/getting-started/dev-setup.md for ports, log locations, and troubleshooting (FFI auto-build, Swift "engine missing" sheet, Streamlit hot-reload).

Documentation

PLAN.md — phased WBS + DAG + risks + reuse opportunities
PRD.md — product requirements (forthcoming)
ADR.md — index of architecture decisions (see docs/adr/)
CHARTER.md — scope + principles (forthcoming)
AGENTS.md — AI-agent operating notes (forthcoming)
docs/research/ — archived Haiku research briefs (oMlx, MLX IPC, inference engines, KV cache formulas, config ingestion, GPU telemetry, Swift/WinUI/Qt FFI, fleet wire, competitor survey)

Development status

| Phase | Status |
| --- | --- |
| P0 Foundation | in progress |
| P1 Math core | planned |
| P2 Ingestion + probe | planned |
| P3 macOS GUI MVP | planned |
| P4 Inference | planned (macOS only in MVP) |
| P5 Fleet | planned |
| P6 Windows GUI | deferred |
| P7 Linux GUI | deferred |
| WP21 macOS Release | code complete (waiting notarization creds) |

Tracked in AgilePlus: feature hwledger-v1-macos-mvp (see agileplus status).

WP21 deliverables (macOS distribution):

Codesigning infrastructure: READY (Developer ID cert installed, entitlements defined, scripts complete)
GitHub Actions release workflow: READY (release.yml deployed)
DMG + notarization flow: READY (scripts deployed, awaiting App Store Connect credentials)
Sparkle integration: READY (Package.swift updated, updater wired, key generation documented)
Documentation: READY (docs/reports/WP21-APPLE-DEV-SECRETS.md with step-by-step setup)

License

Apache-2.0. See LICENSE.

Rich Media

CLI Quickstart — `cargo install` + `hwledger --help`

Journey: install-cargo — Install hwledger from source with cargo, then verify version and help

Intent: Terminal prompt, about to run cargo install. Verified: pass.

Full recorded journey: apps/cli-journeys/manifests/install-cargo/manifest.verified.json

VRAM Plan — First Run (Llama 70B GQA)

Journey: first-plan — Run your first plan with colored output showing token distribution for 4 users

Intent: Full VRAM breakdown table — weights · KV cache · activations · overhead. Verified: overall score 0.92.

Annotated keyframe (VRAM fits indicator):

Full manifest: apps/cli-journeys/manifests/first-plan/manifest.verified.json
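The four columns of the breakdown table and the "VRAM fits" indicator can be pictured with a small struct. This is only a sketch: `VramBreakdown` and its methods are hypothetical names, and the real CLI may compute or round its rows differently.

```rust
/// Hypothetical per-plan VRAM breakdown, all values in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct VramBreakdown {
    pub weights: u64,
    pub kv_cache: u64,
    pub activations: u64,
    /// Runtime overhead such as allocator slack and framework buffers.
    pub overhead: u64,
}

impl VramBreakdown {
    /// Sum of all four components.
    pub fn total(&self) -> u64 {
        self.weights + self.kv_cache + self.activations + self.overhead
    }

    /// True when the whole plan fits in the device's VRAM.
    pub fn fits(&self, capacity_bytes: u64) -> bool {
        self.total() <= capacity_bytes
    }

    /// Bytes left over, or `None` when the plan does not fit.
    pub fn headroom(&self, capacity_bytes: u64) -> Option<u64> {
        capacity_bytes.checked_sub(self.total())
    }
}
```

A plan with 35 GB of weights, 5 GB of KV cache, 2 GB of activations, and 1 GB of overhead totals 43 GB: it fits an 80 GB device with 37 GB of headroom and does not fit a 40 GB one.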

Fleet Register — Add a Device

Journey: fleet-register — Register a new agent with the fleet, then verify it appears in fleet status

Intent: Device announces GPU inventory, receives mTLS cert, joins gossip network.

Annotated keyframe (registration confirmed):

Full manifest: apps/cli-journeys/manifests/fleet-register/manifest.verified.json

Traceability Report — Audit Log + Provenance Chain

Journey: traceability-report — Generate a markdown traceability report with coverage data and inspect the output

Full MP4 (richer quality): apps/cli-journeys/recordings/traceability-report/traceability-report.rich.mp4

Annotated keyframe (traceability runner start):

Full manifest: apps/cli-journeys/manifests/traceability-report/manifest.verified.json

VRAM Reconcile — Prediction vs Live Telemetry

Nearest recorded journey: plan-mla-deepseek — Show MLA classification and KV sequence invariance across 2K, 32K, 128K sequences (dedicated vram-reconcile journey not yet recorded; this shows the prediction side)

Full MP4: apps/cli-journeys/recordings/plan-mla-deepseek/plan-mla-deepseek.rich.mp4

Annotated keyframe (MLA classification + KV invariance):

Full manifest: apps/cli-journeys/manifests/plan-mla-deepseek/manifest.verified.json
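Since the dedicated reconcile journey is not recorded yet, the comparison step itself might look roughly like the sketch below. The `reconcile` function, the `Reconciliation` struct, and the tolerance parameter are assumptions for illustration; how hwLedger actually scores a prediction against telemetry is not shown in this README.

```rust
/// Result of comparing a predicted VRAM figure with an observed one.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Reconciliation {
    /// Observed minus predicted, in bytes (positive means under-prediction).
    pub delta_bytes: i128,
    /// Delta as a fraction of the prediction.
    pub relative_error: f64,
    /// Whether |relative_error| is inside the allowed tolerance.
    pub within_tolerance: bool,
}

/// Compare a prediction against live telemetry.
/// Returns `None` for a zero prediction, where relative error is undefined.
pub fn reconcile(predicted_bytes: u64, observed_bytes: u64, tolerance: f64) -> Option<Reconciliation> {
    if predicted_bytes == 0 {
        return None;
    }
    // i128 holds the difference of any two u64 values without overflow.
    let delta_bytes = observed_bytes as i128 - predicted_bytes as i128;
    let relative_error = delta_bytes as f64 / predicted_bytes as f64;
    Some(Reconciliation {
        delta_bytes,
        relative_error,
        within_tolerance: relative_error.abs() <= tolerance,
    })
}
```

A prediction of 40,000 MB against an observed 42,000 MB yields a relative error of 0.05, which passes a 10% tolerance and fails a 1% one.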

Inference Run — Local GGUF Model Load

Nearest recorded journey: ingest-local-gguf — Ingest a local GGUF model file and output JSON metadata (dedicated inference-run journey not yet recorded; this shows the ingest/load side)

Full manifest: apps/cli-journeys/manifests/ingest-local-gguf/manifest.verified.json

Fleet Probe — SSH Agentless Hardware Scan

Journey: probe-list — List all available probes in both table and JSON formats

Full MP4: apps/cli-journeys/recordings/probe-list/probe-list.rich.mp4

Annotated keyframe (probe table output):

Full manifest: apps/cli-journeys/manifests/probe-list/manifest.verified.json

Cost Model — Spot-Price Fleet Dispatch

Nearest recorded journey: probe-watch — Watch probe metrics update in real time (dedicated cost-model journey not yet recorded; this shows live fleet telemetry)

Annotated keyframe (probe watch start):

Full manifest: apps/cli-journeys/manifests/probe-watch/manifest.verified.json
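No cost-model journey is recorded yet, so the following is a sketch of the arithmetic a spot-price comparison typically rests on: converting an hourly rental price and a sustained throughput into a cost per million tokens. The function name and signature are hypothetical, and a real model would also account for interruptions and idle time.

```rust
/// Cost in USD to generate one million tokens on a rented node.
/// Returns `None` when throughput is not positive, price is negative,
/// or either input is not finite.
pub fn usd_per_million_tokens(price_per_hour_usd: f64, tokens_per_sec: f64) -> Option<f64> {
    if !price_per_hour_usd.is_finite() || !tokens_per_sec.is_finite() {
        return None;
    }
    if tokens_per_sec <= 0.0 || price_per_hour_usd < 0.0 {
        return None;
    }
    // Tokens produced in one hour at the sustained rate.
    let tokens_per_hour = tokens_per_sec * 3600.0;
    Some(price_per_hour_usd / tokens_per_hour * 1_000_000.0)
}
```

At $0.36 per hour and 100 tokens per second, a node produces 360,000 tokens per hour, so the cost is about $1.00 per million tokens. A node at twice the price but three times the throughput is cheaper per token, which is why hourly price alone is a poor ranking key.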

Audit Log — Event-Sourced Fleet Timeline

Journey: fleet-audit — Audit the fleet with a 3-agent limit to see agent metadata and status

Full MP4: apps/cli-journeys/recordings/fleet-audit/fleet-audit.rich.mp4

Annotated keyframe (fleet audit agent metadata):

Full manifest: apps/cli-journeys/manifests/fleet-audit/manifest.verified.json
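For readers unfamiliar with event sourcing, the idea behind the audit log is that fleet state is never stored directly; it is rebuilt by replaying an append-only list of events. The sketch below illustrates that pattern with invented event and type names (`FleetEvent`, `AuditLog`, `AgentState`), which are assumptions and not the schema used by hwledger-ledger.

```rust
use std::collections::BTreeMap;

/// Hypothetical fleet events; the real ledger's schema may differ.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FleetEvent {
    AgentRegistered { agent: String, vram_bytes: u64 },
    RunDispatched { agent: String, vram_bytes: u64 },
    RunFinished { agent: String, vram_bytes: u64 },
    AgentRetired { agent: String },
}

/// State derived from the log for one agent.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AgentState {
    pub total_vram: u64,
    pub reserved_vram: u64,
}

/// Append-only log: events are added, never edited or removed.
#[derive(Debug, Default)]
pub struct AuditLog {
    events: Vec<FleetEvent>,
}

impl AuditLog {
    pub fn append(&mut self, event: FleetEvent) {
        self.events.push(event);
    }

    /// Rebuild fleet state as it was after the first `upto` events.
    /// Events that name an unknown agent are ignored.
    pub fn replay(&self, upto: usize) -> BTreeMap<String, AgentState> {
        let mut state: BTreeMap<String, AgentState> = BTreeMap::new();
        for event in self.events.iter().take(upto) {
            match event {
                FleetEvent::AgentRegistered { agent, vram_bytes } => {
                    state.insert(agent.clone(), AgentState { total_vram: *vram_bytes, reserved_vram: 0 });
                }
                FleetEvent::RunDispatched { agent, vram_bytes } => {
                    if let Some(s) = state.get_mut(agent) {
                        s.reserved_vram = s.reserved_vram.saturating_add(*vram_bytes);
                    }
                }
                FleetEvent::RunFinished { agent, vram_bytes } => {
                    if let Some(s) = state.get_mut(agent) {
                        s.reserved_vram = s.reserved_vram.saturating_sub(*vram_bytes);
                    }
                }
                FleetEvent::AgentRetired { agent } => {
                    state.remove(agent);
                }
            }
        }
        state
    }
}
```

Because `replay` takes a cutoff, the same log answers both "what does the fleet look like now" and "what did it look like after event N", which is the property that makes an event-sourced log usable for audit and provenance questions.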

Fleet Dispatch — Assign Run to Cheapest Node

Nearest recorded journey: fleet-register — Register a new agent with the fleet (dedicated fleet-dispatch journey not yet recorded; registration shows the fleet membership side of dispatch)

Full manifest: apps/cli-journeys/manifests/fleet-register/manifest.verified.json
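Until a dispatch journey is recorded, the selection rule in this section's title ("assign run to cheapest node") can be sketched as a filter followed by a minimum. The `Node` struct and `cheapest_fitting` function are hypothetical; the actual dispatch planner presumably weighs more than free VRAM and hourly price.

```rust
/// Hypothetical view of a fleet node as seen by a dispatcher.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub name: String,
    pub free_vram_bytes: u64,
    /// Hourly price in USD; 0.0 for hardware that is already owned.
    pub price_per_hour_usd: f64,
}

/// Pick the cheapest node with enough free VRAM for the run.
/// Returns `None` when no node can hold it.
pub fn cheapest_fitting(nodes: &[Node], required_vram_bytes: u64) -> Option<&Node> {
    nodes
        .iter()
        // Drop nodes that cannot hold the plan at all.
        .filter(|n| n.free_vram_bytes >= required_vram_bytes)
        // total_cmp gives f64 a total order, so min_by never panics on NaN.
        .min_by(|a, b| a.price_per_hour_usd.total_cmp(&b.price_per_hour_usd))
}
```

Given a 16 GB laptop at $0.00, a 24 GB rental at $0.35, and an 80 GB rental at $1.20, a 20 GB run lands on the 24 GB rental, an 8 GB run stays on the laptop, and a 100 GB run is rejected with `None`.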

Model Ingest — HuggingFace / GGUF Config Auto-Parse

Journey: ingest-local-gguf — Ingest a local GGUF model file and output JSON metadata

See also: ingest-error journey (error path).

Annotated keyframe (ingest error path):

Full manifests: ingest-local-gguf · ingest-error

Telemetry Sync — Live GPU Stats

Nearest recorded journey: probe-watch — Watch probe metrics update in real time with 1-second refresh intervals (dedicated telemetry-sync journey not yet recorded)

Full manifest: apps/cli-journeys/manifests/probe-watch/manifest.verified.json

KV-Cache Plan — MLA vs GQA Breakdown

Journey: plan-mla-deepseek — Show MLA classification and KV sequence invariance across 2K, 32K, 128K sequences

Full MP4: apps/cli-journeys/recordings/plan-mla-deepseek/plan-mla-deepseek.rich.mp4

Intent: MLA latent projection compresses KV by 16x vs full-rank GQA — sequence length invariant.

Annotated keyframe:

Full manifest: apps/cli-journeys/manifests/plan-mla-deepseek/manifest.verified.json

Spot Price Scan — Cloud Rental Comparison

Nearest recorded journey: plan-hf-resolve — Plan via HF resolver: bare repo id, full HF URL, and gold fixture shortcut (dedicated spot-price-scan journey not yet recorded; HF resolve shows the model-to-hardware cost estimation entry point)

Full manifest: apps/cli-journeys/manifests/plan-hf-resolve/manifest.verified.json
