Advisor is a standalone extraction of the local pre-execution steering module that was originally embedded inside Hermes.
It does four things:
- builds a compact context packet for a task
- runs a local MLX model to produce structured steering advice
- validates and stores advice/output traces in SQLite
- exports basic metrics and training examples
This repo contains the advisor module itself, not the full Hermes runtime.
Repository layout:
- agent/advisor/: core advisor code
- scripts/: metrics/export helpers
- tests/agent/advisor/: focused tests
- docs/: existing architecture diagrams and images
Core modules:
- schemas.py: pydantic contracts for requests, advice, outcomes
- settings.py: local config/env loading
- context_builder.py: repo slice, candidate files, task typing, failure lookup
- runtime_mlx.py: MLX/MLX-LM inference wrapper
- validator.py: trims/dedupes model output into safe structured advice
- gateway.py: main entrypoint + optional FastAPI app factory
- trace_store.py: SQLite persistence for runs, outcomes, patterns
- metrics.py: summary metrics over stored runs
- labeling.py: JSONL training export helpers
- injector.py: renders advice into an injected hint block
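To illustrate the kind of cleanup described for validator.py (trimming and deduplicating model output), here is a minimal sketch. It is not the repo's implementation: the function name `clean_advice_items` and the `max_items` and `max_chars` limits are assumptions made for this example.

```python
from collections.abc import Iterable


def clean_advice_items(
    items: Iterable[str],
    max_items: int = 5,     # hypothetical cap on the number of advice lines
    max_chars: int = 200,   # hypothetical cap on the length of each line
) -> list[str]:
    """Trim, dedupe (case-insensitively), and cap free-text advice strings."""
    seen: set[str] = set()
    cleaned: list[str] = []
    for raw in items:
        # Collapse runs of whitespace and strip both ends.
        text = " ".join(raw.split())
        if not text:
            continue
        key = text.casefold()
        if key in seen:
            continue  # drop repeats, keeping the first occurrence
        seen.add(key)
        cleaned.append(text[:max_chars])
        if len(cleaned) == max_items:
            break
    return cleaned
```

Keeping the first occurrence and preserving input order means the model's own ranking of advice survives the cleanup, which is usually what a steering hint needs.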
- local advisor runtime
- trace store
- export scripts
- focused unit tests
- architecture docs copied from the earlier design work
- full auth/tenancy hardening for multi-tenant service use
- paper-faithful final results reporting
Requires Python 3.11+.
Create and activate a local virtualenv:
python3 -m venv .venv
source .venv/bin/activate
Install base deps:
pip install -e .
Install advisor runtime + dev deps:
pip install -e '.[advisor,dev]'
Run tests:
pytest tests/agent/advisor -q
Check the installed CLI:
advisor version
advisor serve --host 127.0.0.1 --port 8000
advisor operator-overview
advisor deployment-profile --mode hosted
advisor hardening-profile --mode hosted
advisor export-bundle --output-dir ./bundle
Current repo status:
- standalone package install works
- focused advisor test suite passes
- no Hermes-specific Python references remain in this repo
- config can now load from ADVISOR_CONFIG=/path/to/advisor.toml
- health checks now expose runtime/config state on /healthz
- CI workflow now runs lint-only on Python 3.12
- Ruff linting is configured and passes locally
- inference runtime now supports retries, timeout handling, warm-load, and fallback behavior
- reward weights now support named config presets (balanced, conservative, human-first) plus explicit overrides
- Phase 10 orchestration now supports executor/verifier plug-ins, deterministic A/B routing, replayable manifests, and optional second-pass review
- Phase 11 adds redacted packet exports, structured run-event logs, live metrics export, and audit reporting
- Phase 12 adds real HTTP/subprocess executor integrations, real verifier adapters, integration registry construction, and parity-tested baseline vs advisor-assisted execution
- Phase 13 adds frozen benchmark suites, benchmark run manifests, deterministic baseline-vs-advisor summaries, and ablation-friendly reporting
- Phase 14 adds persisted training manifests, checkpoint registry lifecycle, and benchmark-driven promotion/rollback decisions
- Phase 15 adds operator deployment profiles, run inspection endpoints, persistent background job queueing/resume, and retention enforcement with archival rotation
- Phase 16 adds a paper-faithful results-pass layer with canonical study summaries, ablation/transfer reporting, failure taxonomy, provenance coverage, and explicit paper-divergence reporting
- Phase 17 adds finished-product hardening with release gates, auth/tenancy/isolation profiles, backup/import-export bundle paths, alert summaries, and locked truth-surface contract versions
- GitHub CI installs .[dev] only, since MLX runtime deps are Apple-specific and not required for the test suite
See CONTRIBUTING.md for local setup, test, and lint workflow.
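The deterministic A/B routing mentioned in the status list is commonly done by hashing a stable identifier into a bucket, so the same task always lands in the same arm across replays. The sketch below shows that general technique; the function name `assign_arm`, the arm labels, and the `salt` parameter are assumptions for this example, and the repo's own routing may work differently.

```python
import hashlib


def assign_arm(task_id: str, advisor_fraction: float = 0.5, salt: str = "ab-v1") -> str:
    """Deterministically map a task id to the 'advisor' or 'baseline' arm."""
    if not 0.0 <= advisor_fraction <= 1.0:
        raise ValueError("advisor_fraction must be between 0 and 1")
    # Hash salt + id so that changing the salt reshuffles assignments.
    digest = hashlib.sha256(f"{salt}:{task_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as a number in [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "advisor" if bucket < advisor_fraction else "baseline"
```

Because the assignment depends only on the inputs, a replayed manifest reproduces the same split without storing per-task assignments.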
Export successful runs as JSONL:
python scripts/export_advisor_training_examples.py ./out/train.jsonl --min-quality-score 0.5
Summarize trace metrics:
python scripts/advisor_metrics_summary.py
See docs/ARCHITECTURE.md for the image index and diagram notes.
See docs/PRODUCTION_CHECKLIST.md for the staged production roadmap toward a generic, paper-faithful advisor implementation with reward-driven improvement.
This repo is grounded in the paper "How to Train Your Advisor: Steering Black-Box LLMs with Advisor Models".
See docs/PAPER_FOUNDATION.md for the repo-level design rules derived from that paper.
- How to Train Your Advisor: Steering Black-Box LLMs with Advisor Models
- arXiv abstract: https://arxiv.org/abs/2510.02453
- PDF: https://arxiv.org/pdf/2510.02453
Apache License 2.0. See LICENSE.