Sequential Probability Ratio Test (SPRT) tool for validating engine strength changes.
SPRT is a sequential statistical test used to determine whether a change to the engine gains strength, loses strength, or is neutral. It is used when tuning search algorithms, evaluation terms, and other parameters.
There are two ways to run SPRT: the native CLI (lightweight) and the web UI (visual).
The CLI is built directly into the sprt binary. It manages game pairs, subprocess engines, clocks, and adjudication, and it reports results.
Before making your changes, build the current source as the baseline:

    cargo build --release --features sprt --bin sprt

Copy or rename the binary so it doesn't get overwritten:

    # Windows
    copy target\release\sprt.exe target\release\sprt_old.exe

    # Linux/macOS
    cp target/release/sprt target/release/sprt_old

Edit the engine source code with whatever changes you want to test.
The CLI will automatically build the new binary from the current source:

    cargo run --release --bin sprt --features sprt -- run --old-bin target/release/sprt_old

| Option | Default | Description |
|---|---|---|
| `--new-bin <PATH>` | auto-build | Path to the new engine binary |
| `--old-bin <PATH>` | required | Path to the old (baseline) engine binary |
| `--tc <TC>` | `10+0.1` | Time control: base+inc (seconds), depth N, or fixed Ns |
| `--concurrency <N>` | logical CPU count | Number of parallel games |
| `--max-games <N>` | unlimited | Maximum games to run |
| `--min-games <N>` | `250` | Minimum games before SPRT can terminate |
| `--elo0 <F>` | `0.0` | H0 bound (Elo where new is NOT better) |
| `--elo1 <F>` | `5.0` | H1 bound (Elo where new IS better) |
| `--alpha <F>` | `0.05` | Type I error rate (false positive) |
| `--beta <F>` | `0.05` | Type II error rate (false negative) |
| `--adjudication <N>` | `0` | Eval difference (cp) to auto-adjudicate (0 = disabled) |
| `--max-moves <N>` | `300` | Max plies before forced draw |
| `--search-noise <N>` | `50` | Noise amplitude (cp) for first 8 ply |
| `--old-strength <N>` | `3` | Strength level for old engine (1-3) |
| `--games <PATH>` | none | Write game ICNs to a JSON file |
| `--results <PATH>` | none | Write results to a JSON file |
| `--variants <LIST>` | all except custom eval | Comma-separated variant list |
| `--verbose` | off | Print detailed game info |
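
To make the roles of `--elo0`, `--elo1`, `--alpha`, `--beta`, and `--min-games` concrete, here is a minimal Python sketch of a textbook SPRT stopping rule over win/draw/loss counts. It uses the common normal-approximation log-likelihood ratio and is only an illustration: the function names are hypothetical, and the actual statistic computed in `src/bin/sprt.rs` may differ (for example, it may score game pairs rather than single games).

```python
import math


def elo_to_score(elo):
    """Expected score for an Elo advantage under the logistic model."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))


def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's decision thresholds for the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


def llr(wins, draws, losses, elo0=0.0, elo1=5.0):
    """Normal-approximation LLR of H1 (elo1) against H0 (elo0)."""
    n = wins + draws + losses
    if n == 0:
        return 0.0
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result (win = 1, draw = 0.5, loss = 0).
    variance = (wins + 0.25 * draws) / n - score ** 2
    if variance <= 0.0:
        return 0.0
    s0, s1 = elo_to_score(elo0), elo_to_score(elo1)
    return (s1 - s0) * (2.0 * score - s0 - s1) / (2.0 * variance / n)


def decide(wins, draws, losses, elo0=0.0, elo1=5.0,
           alpha=0.05, beta=0.05, min_games=250):
    """Return 'accept H1', 'accept H0', or 'continue'."""
    if wins + draws + losses < min_games:
        return "continue"  # too few games to stop, whatever the LLR says
    lower, upper = sprt_bounds(alpha, beta)
    value = llr(wins, draws, losses, elo0, elo1)
    if value >= upper:
        return "accept H1"
    if value <= lower:
        return "accept H0"
    return "continue"


print(decide(400, 400, 200))
```

With the default `alpha` and `beta` of 0.05 the thresholds are symmetric, so the test stops once the evidence favours one hypothesis by a likelihood factor of roughly 19.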

For example, a short run at a fast time control that also saves the games:

    cargo run --release --bin sprt --features sprt -- run --old-bin target/release/sprt_old \
      --tc 1+0.01 \
      --concurrency 8 \
      --max-games 200 \
      --games games.json

Afterwards you can drop the games JSON into the ICN validator to catch illegal moves or bad terminations, though some discrepancies are expected in certain insufficient-material and Huygen mate cases.
For visual feedback and interactive configuration, use the browser-based SPRT. You'll need Node.js for this.

Before making changes, build the baseline WebAssembly package:

    wasm-pack build --target web --out-dir pkg-old

After making changes:

    cd sprt
    npm run dev

This builds the current source into sprt/web/pkg-new and starts the test server at http://localhost:3000.

- Open http://localhost:3000 in your browser
- Select bounds preset, time control, and concurrency
- Start the test
SPSA (Simultaneous Perturbation Stochastic Approximation) is used to automatically tune search and evaluation constants through self-play.
The tuner lives in src/bin/spsa.rs and uses a single feature gate for dynamic parameter injection during tuning:

    cargo run --release --bin spsa --features param_tuning -- run

`--params` controls which knobs are tuned:

| Selector | Meaning |
|---|---|
| `all` | Tune every exposed search and eval parameter |
| `search` | Tune only search parameters from `src/search/params.rs` |
| `eval` | Tune only evaluation parameters from `src/evaluation/base.rs` |
| `piece-values` | Tune only evaluation material / piece-value style knobs |
| `pawn,knight,...` | Tune only the explicitly named parameters |

| Option | Default | Description |
|---|---|---|
| `run --iterations <N>` | `100` | Number of SPSA iterations |
| `run --pairs <N>` | `400` | Paired openings per iteration; total games = pairs * 2 |
| `run --checkpoint-every <N>` | `1` | Save a checkpoint every N iterations |
| `run --resume <PATH>` | latest checkpoint | Resume from a specific checkpoint |
| `run --fresh` | off | Ignore checkpoints and start from defaults |
| `run --tc <TC>` | `3+0.03` | Time control: base+inc, depth N, or fixed Ns |
| `run --concurrency <N>` | `16` | Number of parallel game workers |
| `run --variants <LIST>` | default set | Comma-separated variant list |
| `run --adjudication <N>` | `2000` | Eval threshold for adjudication |
| `run --max-moves <N>` | `300` | Maximum plies before forced draw |
| `run --search-noise <N>` | `50` | Noise amplitude for first 8 ply |
| `run --params <SELECTOR>` | `all` | Parameter preset or comma-separated names |
| `run --config <PATH>` | none | Optional JSON override for bounds/defaults/c_end/r_end |
| `run --results <PATH>` | `sprt/spsa_final.json` | Final result JSON output |
| `run --games <PATH>` | off | Write latest iteration ICNs as JSON |
| `run --big-a <F>` | iterations / 10 | SPSA stability constant A |
| `run --alpha <F>` | `0.602` | SPSA learning-rate decay |
| `run --gamma <F>` | `0.101` | SPSA perturbation decay |
| `run --verbose` | off | Inherit search subprocess stderr |
| `list --params <SELECTOR>` | `all` | Print selected tunables with bounds, c_end, and r_end |
| `apply --input <PATH>` | latest checkpoint | Apply tuned constants back into Rust source |
| `revert --params <SELECTOR>` | `all` | Revert selected constants back to defaults |
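
As a rough illustration of how `--big-a`, `--alpha`, `--gamma`, `c_end`, and `r_end` usually fit together, the Python sketch below follows a convention common in match-based SPSA tuners: the perturbation size reaches `c_end` at the final iteration, and the ratio of learning rate to squared perturbation reaches `r_end`. Whether `src/bin/spsa.rs` uses exactly these formulas is an assumption, and `gain_schedule`, `spsa_step`, and `play_match` are hypothetical names.

```python
import random


def gain_schedule(iterations, c_end, r_end, big_a=None,
                  alpha=0.602, gamma=0.101):
    """Derive the SPSA constants (a, c, A) from end-of-run targets.

    Chosen so that at k == iterations: c_k == c_end and a_k / c_k**2 == r_end.
    """
    if big_a is None:
        big_a = iterations / 10  # mirrors the documented --big-a default
    c = c_end * iterations ** gamma
    a_end = r_end * c_end ** 2
    a = a_end * (big_a + iterations) ** alpha
    return a, c, big_a


def spsa_step(theta, k, a, c, big_a, play_match, lo, hi,
              alpha=0.602, gamma=0.101, rng=random):
    """One SPSA update of a single parameter at iteration k (1-based).

    play_match(plus, minus) is a stand-in for a self-play match; it should
    return a value in [-1, 1] that is positive when `plus` scored better.
    """
    a_k = a / (big_a + k) ** alpha        # learning rate, decays with k
    c_k = c / k ** gamma                  # perturbation size, decays with k
    delta = rng.choice((-1.0, 1.0))       # random perturbation direction
    plus = min(hi, max(lo, theta + c_k * delta))
    minus = min(hi, max(lo, theta - c_k * delta))
    result = play_match(plus, minus)
    theta = theta + (a_k / c_k) * result * delta
    return min(hi, max(lo, theta))


# Toy usage: pretend larger values always win, so theta should drift upward.
a, c, big_a = gain_schedule(100, c_end=4.0, r_end=0.002)
theta = 250.0
for k in range(1, 101):
    theta = spsa_step(theta, k, a, c, big_a,
                      lambda p, m: 1.0 if p > m else -1.0, 180.0, 340.0)
print(round(theta, 2))
```

In a real run each parameter gets its own perturbation direction and `play_match` is replaced by the paired games of one iteration; the toy objective here only shows the direction of the update.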
`--config` accepts a JSON object keyed by parameter name. Each entry can override any subset of `default`, `min`, `max`, `c_end`, and `r_end`.

    {
      "knight": { "min": 180, "max": 340, "c_end": 4.0, "r_end": 0.0020 },
      "bishop": { "default": 430, "c_end": 4.0, "r_end": 0.0015 },
      "razoring_linear": { "min": 300, "max": 650, "c_end": 16.0, "r_end": 0.0020 }
    }

Tune all exposed params on the default variant set:

    cargo run --release --bin spsa --features param_tuning -- run --pairs 100 --iterations 500 --concurrency 20

Tune only piece values at 5+0.1 across the default variants:

    cargo run --release --bin spsa --features param_tuning -- run --tc 5+0.1 --params piece-values

Tune only a hand-picked subset:

    cargo run --release --bin spsa --features param_tuning -- run --params pawn,knight,bishop,rook,mg_bishop_pair_bonus

Inspect the tunable set before a run:

    cargo run --bin spsa --features param_tuning -- list --params eval

Apply the latest checkpoint back into source:

    cargo run --bin spsa --features param_tuning -- apply

Checkpoints are saved to `sprt/spsa_checkpoints/` by default and resume automatically unless `--fresh` is passed.
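
Before starting a long run it can help to sanity-check a `--config` file. The Python sketch below shows one way the "override any subset" rule could work: each entry is layered over a built-in spec, unknown fields are rejected, and the merged default must stay inside its bounds. The built-in values and the `apply_overrides` helper are made up for illustration; the tuner's real defaults and validation rules live in the Rust source and may differ.

```python
import json

ALLOWED_FIELDS = {"default", "min", "max", "c_end", "r_end"}


def apply_overrides(base, overrides):
    """Return a copy of `base` with per-parameter overrides layered on top."""
    merged = {name: dict(spec) for name, spec in base.items()}
    for name, fields in overrides.items():
        if name not in merged:
            raise KeyError(f"unknown parameter: {name}")
        unknown = set(fields) - ALLOWED_FIELDS
        if unknown:
            raise ValueError(f"{name}: unsupported fields {sorted(unknown)}")
        merged[name].update(fields)
        spec = merged[name]
        if not spec["min"] <= spec["default"] <= spec["max"]:
            raise ValueError(f"{name}: default lies outside [min, max]")
    return merged


# Hypothetical built-in spec; these numbers are NOT the engine's real defaults.
base = {
    "knight": {"default": 250, "min": 150, "max": 400,
               "c_end": 5.0, "r_end": 0.002},
}
overrides = json.loads(
    '{"knight": {"min": 180, "max": 340, "c_end": 4.0, "r_end": 0.0020}}'
)
merged = apply_overrides(base, overrides)
print(merged["knight"])
```

Fields that the override omits, such as `default` here, keep their built-in value, which matches the "any subset" wording above.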

Relevant files:

- `src/bin/sprt.rs`: Native CLI (SPRT manager + search subprocess)
- `sprt.js`: Build and server script (web UI)
- `src/bin/spsa.rs`: Match-based SPSA CLI (runner + search subprocess + apply/revert)
- `web/`: Web UI for running SPRT tests
- `web/pkg-old/`: Baseline WebAssembly package
- `web/pkg-new/`: Modified WebAssembly package

Further reading:

- SPRT on Chess Programming Wiki
- SPSA on Chess Programming Wiki
- Stockfish Testing: production SPRT system