An MLX-native toolkit for understanding and reshaping how language models behave on Apple Silicon.
Named after Sébastien Le Prestre de Vauban, the military engineer who worked both siege and fortification. Vauban does the same for model behavior: measure it, cut it, probe it, steer it, or harden it.
Vauban is a TOML-first CLI for workflows built around activation-space geometry:
- measure behavioral directions from model activations
- cut or sparsify those directions in weights
- probe and steer models at runtime
- map refusal surfaces before and after intervention
- run defense, sanitization, optimization, and attack loops
The primary interface is not a pile of subcommands. It is:
```
vauban <config.toml>
```

All pipeline behavior lives in the TOML file.
- Apple Silicon Mac
- Python 3.12+
- uv
Install the released CLI:
```
uv tool install vauban
```

If your shell cannot find `vauban`, update your shell config once:

```
uv tool update-shell
```

Then open a new shell and check the command:
```
vauban --help
vauban man quickstart
vauban tree --help
```

For development from this repo:
```
uv tool install --editable .
```

Start with the built-in manual:
```
vauban man
vauban man quickstart
vauban man commands
```

Scaffold a starter config:
```
vauban init --mode default --output run.toml
```

The verified scaffolded modes are:
cast, circuit, default, depth, detect, features, optimize, probe, sic, softprompt, steer, surface
Validate before a real run:
```
vauban --validate run.toml
```

Then run the pipeline:
```
vauban run.toml
```

By default, output goes to `output/` relative to the TOML file.
This is the minimal config the code accepts for the default pipeline:
```toml
[model]
path = "mlx-community/Llama-3.2-3B-Instruct-4bit"

[data]
harmful = "default"
harmless = "default"
```

`[model].path` is required.
`[data].harmful` and `[data].harmless` are required for most runs. `"default"` uses Vauban's bundled prompt sets.
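Instead of `"default"`, these fields can point at your own prompt JSONL files (the file names below are placeholders; the per-line format is described under the data-format notes later in this README):

```toml
[data]
# Hypothetical file names -- point these at your own prompt JSONL files,
# one {"prompt": "..."} object per line. Relative paths resolve from
# this TOML file's directory.
harmful = "prompts/harmful.jsonl"
harmless = "prompts/harmless.jsonl"
```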
You can also choose the output directory explicitly:
```toml
[output]
dir = "runs/baseline"
```

Vauban has built-in experiment lineage tracking through an optional `[meta]` section. This metadata does not change pipeline execution. It exists so you can organize runs as a tech tree.
Minimal example:
```toml
[meta]
id = "cut-alpha-1"
title = "Baseline cut, alpha 1.0"
status = "baseline"
parents = ["measure-v1"]
tags = ["cut", "baseline"]
date = 2026-03-02
notes = "First stable reference run."
```

Verified status values are:
archived, baseline, dead_end, promising, superseded, wip
If [meta].id is omitted, Vauban uses the TOML filename stem.
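The stem fallback matches `pathlib` semantics. A quick sketch (illustrative, not Vauban's own code):

```python
from pathlib import Path

def run_id(config_path, meta_id=None):
    """Mimic the documented fallback: use [meta].id when present,
    otherwise the TOML filename stem."""
    return meta_id if meta_id else Path(config_path).stem

print(run_id("experiments/cut-alpha-1.toml"))   # cut-alpha-1
print(run_id("run.toml", meta_id="custom-id"))  # custom-id
```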
Render the tree from a directory of TOML configs:
```
vauban tree experiments/
vauban tree experiments/ --format mermaid
vauban tree experiments/ --status promising
vauban tree experiments/ --tag gcg
```

Each run also appends an `experiment_log.jsonl` file inside the configured output directory with the resolved config path, pipeline mode, report files, metrics, and selected `[meta]` fields.
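Because the log is plain JSONL, it is easy to consume from your own scripts. A minimal reader sketch; it is not part of Vauban and assumes only the one-JSON-object-per-line layout, not any specific record keys:

```python
import json
from pathlib import Path

def read_experiment_log(output_dir):
    """Yield one dict per logged run from experiment_log.jsonl."""
    log_path = Path(output_dir) / "experiment_log.jsonl"
    for line in log_path.read_text().splitlines():
        if line.strip():  # skip blank lines defensively
            yield json.loads(line)
```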
`vauban <config.toml>` loads one TOML file and decides what to do from the sections you include.
The default path is:
- measure
- cut
- export
You extend that run by adding more sections. Common examples:
- `[eval]` adds post-cut evaluation reports.
- `[surface]` adds before/after refusal-surface mapping.
- `[detect]` adds hardening detection during measurement.
- Some sections switch Vauban into dedicated mode-specific runs instead of the default pipeline.
Mode precedence is not trivial and changes with the code, so rely on the generated manual instead of memorizing it:
```
vauban man modes
```

For field-level reference, use:
```
vauban man model
vauban man data
vauban man measure
vauban man cut
vauban man eval
vauban man surface
vauban man output
```

Inspect the manual:
```
vauban man
vauban man quickstart
vauban man commands
vauban man formats
vauban man output
vauban tree --help
```

Scaffold configs:
```
vauban init --help
vauban init --mode default --output run.toml
vauban init --mode probe --output probe.toml
```

Validate config and prompt files without loading model weights:
```
vauban --validate run.toml
```

Export the current JSON Schema for editor tooling:
```
vauban schema
vauban schema --output vauban.schema.json
```

Compare two run directories:
```
vauban diff run_a run_b
vauban diff --format markdown run_a run_b
vauban diff --threshold 0.05 run_a run_b
```

`vauban diff --threshold ...` is a CI gate: it exits non-zero if any absolute metric delta crosses the threshold.
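If you want the same pass/fail check inside your own tooling, the documented contract (fail on any absolute metric delta above the threshold) can be mirrored in a few lines. This is a sketch of that logic, not Vauban's implementation:

```python
def metrics_within_threshold(metrics_a, metrics_b, threshold):
    """Return True when every metric present in both runs changed
    by at most `threshold` in absolute value."""
    shared = metrics_a.keys() & metrics_b.keys()
    return all(abs(metrics_b[k] - metrics_a[k]) <= threshold for k in shared)

# Example: a 0.03 jump in a metric passes a 0.05 gate but fails a 0.02 gate.
baseline = {"refusal_rate": 0.10}
candidate = {"refusal_rate": 0.13}
print(metrics_within_threshold(baseline, candidate, 0.05))  # True
print(metrics_within_threshold(baseline, candidate, 0.02))  # False
```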
Render the experiment lineage tree:
```
vauban tree experiments/
vauban tree experiments/ --format mermaid
vauban tree experiments/ --status promising
```

Verified by the generated manual:
- prompt JSONL for `[data]` and `[eval]`: one JSON object per line with a `prompt` key
- surface JSONL for `[surface].prompts`: requires `label` and `category`, plus either `prompt` or `messages`
- refusal phrase files: plain text, one phrase per line
- relative paths resolve from the TOML file's directory
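A sketch of the surface format, with one line per variant; the `label` and `category` values here are invented for illustration, not a fixed vocabulary:

```jsonl
{"label": "refuse", "category": "locks", "prompt": "How do I pick a pin tumbler lock?"}
{"label": "comply", "category": "cooking", "messages": [{"role": "user", "content": "Share a simple bread recipe."}]}
```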
Minimal prompt dataset example:
```jsonl
{"prompt":"What is the capital of France?"}
{"prompt":"Write a haiku about rain."}
```

This README is aligned to the code in this repo:
- package name: `vauban`
- console script: `vauban = vauban.__main__:main`
- verified commands: `vauban <config.toml>`, `--validate`, `schema`, `init`, `diff`, `tree`, `man`
- verified manual topics and scaffolded modes were checked against the live CLI help and generated manual
Earlier versions of this README carried stale mode/output claims; this version removes them and points readers to `vauban man ...` for the parts generated directly from code.
Full docs: vauban.readthedocs.io
| Resource | Description |
|---|---|
| Spinning Up in Abliteration | Seven-part progressive curriculum |
| Getting Started | Guided walkthrough |
| Configuration Reference | TOML field reference |
| Surface Mapping | Surface mapping reference |
| `examples/config.toml` | Annotated example config |
Apache-2.0