pi_crunch

High-performance Pi computation, verification, and benchmarking in Rust — inspired by y-cruncher.

pi_crunch computes π with the Chudnovsky algorithm evaluated by binary splitting (parallelised with rayon), independently verifies the result with Machin's formula, and writes a benchmark report comparing your machine to y-cruncher's published numbers.

It is fully self-contained: after git clone, the default build needs only cargo build --release — no C libraries, no system dependencies.


What it does

  • Compute π to a fixed digit count, or run a timed budget that doubles the digit count each round until either the time runs out or a digit ceiling is reached — whichever comes first (default: 30s or 1.2M digits).
  • Verify the digits independently via π/4 = 4·arctan(1/5) − arctan(1/239) in exact scaled-integer arithmetic — a completely different algorithm from the one that produced the digits, so agreement is strong evidence of correctness.
  • Benchmark — collects CPU/RAM/OS info and emits a per-round results table plus y-cruncher reference times.
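As a rough illustration of the scaled-integer idea behind the verifier, the sketch below evaluates the same Machin identity with plain i128 arithmetic and a naive term-by-term sum. It is not the project's code: the names arctan_inv and machin_pi_scaled and the guard-digit count are hypothetical, and i128 only reaches about 25 decimals, whereas the real verifier works on big integers.

```rust
/// arctan(1/x) multiplied by `scale`, using integer operations only.
/// Naive alternating series: sum over k of (-1)^k / ((2k+1) * x^(2k+1)).
fn arctan_inv(x: i128, scale: i128) -> i128 {
    let x2 = x * x;
    let mut power = scale / x; // scale / x^(2k+1), starting at k = 0
    let mut sum = 0i128;
    let mut k = 0i128;
    while power != 0 {
        let term = power / (2 * k + 1);
        if k % 2 == 0 {
            sum += term;
        } else {
            sum -= term;
        }
        power /= x2;
        k += 1;
    }
    sum
}

/// pi * 10^digits, truncated, via pi/4 = 4*arctan(1/5) - arctan(1/239).
/// GUARD extra digits absorb the truncation error of the integer divisions.
fn machin_pi_scaled(digits: u32) -> i128 {
    const GUARD: u32 = 5; // assumption: enough for this small precision
    let scale = 10i128.pow(digits + GUARD);
    let pi = 4 * (4 * arctan_inv(5, scale) - arctan_inv(239, scale));
    pi / 10i128.pow(GUARD)
}

fn main() {
    // Prints 31415926535897932384626433 (3 followed by 25 decimals).
    println!("{}", machin_pi_scaled(25));
}
```

Because no floating point is involved, the result can be compared digit for digit against the Chudnovsky output.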

Three files are written to the output directory:

| File | Contents |
| --- | --- |
| pi_digits.txt | The digits, 100 per line (10 groups of 10), each line ending with a running digit count |
| pi_verification.txt | Per-block PASS/FAIL report and overall verdict |
| pi_benchmark.txt | System info, round-results table, y-cruncher reference times |

Progress is printed to stderr, so stdout stays clean and the digit output remains redirectable.


Quick start

git clone https://github.com/iampram/pi_crunch
cd pi_crunch
cargo build --release
./target/release/pi_crunch            # timed run: 30s or 1.2M digits, whichever first (default)

On Windows, the binary is .\target\release\pi_crunch.exe.


Prerequisites

You need a Rust toolchain with a working linker. Install Rust via rustup. The big-integer dependencies are pure Rust, so the default build needs no C compiler — but every Rust program still needs a linker to produce an executable.

Linux

  • Rust via rustup.
  • A linker/C toolchain (usually already present): sudo apt install build-essential (Debian/Ubuntu) or the equivalent gcc/binutils package.
  • For --features gmp: sudo apt install libgmp-dev.

macOS

  • Rust via rustup.
  • Xcode Command Line Tools (provides the linker): xcode-select --install.
  • For --features gmp: brew install gmp.

Windows

Pick one of the following so the toolchain can link:

  • MSVC (default Rust on Windows): install the Build Tools for Visual Studio with the "Desktop development with C++" workload (this provides link.exe and the Windows SDK). VS Code alone is not sufficient.

  • GNU (no Visual Studio needed): install the self-contained GNU toolchain and use it for this project only — this leaves your global default (and other projects) untouched:

    rustup toolchain install stable-x86_64-pc-windows-gnu
    # from the pi_crunch directory:
    rustup override set stable-x86_64-pc-windows-gnu

    After this, plain cargo build / cargo run work here with no +toolchain prefix. To make GNU your machine-wide default instead, use rustup default stable-x86_64-pc-windows-gnu.

    The default build links fully self-contained on this toolchain — no MSYS2 or MinGW install required. (The dependencies deliberately avoid raw-dylib crates, which would otherwise need MinGW's dlltool/as.)

    Heads-up: if you build from a Unix-style shell (Git Bash / MSYS2) on the MSVC toolchain without the VS Build Tools installed, the linker error will mention link: extra operand / Try 'link --help'. That is the GNU coreutils link (a hard-link utility) shadowing MSVC's link.exe on PATH — the real fix is to install the VS C++ Build Tools or switch to the GNU toolchain as above, not to touch PATH.

  • For --features gmp on Windows: install MSYS2, then in the MSYS2 shell pacman -S mingw-w64-x86_64-gmp, and point the build at it:

    $env:GMP_LIB_DIR = "C:\msys64\mingw64\lib"
    cargo build --release --features gmp

Building

Default (pure-Rust, no system deps)

cargo build --release

With the GMP backend (faster, requires libgmp)

When enabled, the big-integer backend switches from num-bigint to rug (GMP), typically ~3–5× faster.

# Linux:   sudo apt install libgmp-dev
# macOS:   brew install gmp
# Windows: MSYS2 + pacman -S mingw-w64-x86_64-gmp, then set GMP_LIB_DIR
cargo build --release --features gmp

With the faster allocator (fast-alloc)

The program creates and frees a lot of big numbers while it works, and the default system memory manager can become a bottleneck when many threads do this at once. The fast-alloc feature swaps in mimalloc, a memory manager built for exactly this kind of multi-threaded churn — usually a small (single-digit-percent) speedup.

cargo build --release --features fast-alloc

Needs a C compiler. mimalloc is written in C, so this feature only builds if a C compiler is on your PATH (the MSVC Build Tools on Windows, or MSYS2's mingw-w64-gcc; gcc/clang on Linux/macOS). It's off by default so the standard build stays dependency-free. If you don't have a C compiler, just leave it off — the program runs fine without it.


Usage

All flags (defaults shown):

pi_crunch [OPTIONS]

  --mode <timed|fixed>      Run mode                              [default: timed]
  --seconds <N>             Wall-clock budget (timed mode)        [default: 30]
  --max-digits <N>          Digit ceiling (timed mode)            [default: 1200000]
  --digits <N>              Exact digit count (fixed mode)        [required if fixed]
  --threads <N>             Worker threads                        [default: all logical cores]
  --out-dir <PATH>          Output directory                      [default: .]
  --skip-verify             Skip verification (benchmark only)
  --verify-blocks <N>       Digits per verification block         [default: 1000]
  --verify-max-digits <N>   Cap on digits to verify (see note)    [default: 1000000]
  -h, --help                Print help
  -V, --version             Print version

Run examples (use .\target\release\pi_crunch.exe on Windows):

# Default timed run: stops at 30s or 1.2M digits, whichever comes first
./target/release/pi_crunch

# Timed run, 30-second budget
./target/release/pi_crunch --mode timed --seconds 30

# Timed run, but stop at 500k digits even if the budget allows more
./target/release/pi_crunch --mode timed --seconds 90 --max-digits 500000

# Compute exactly 1,000 digits (quick sanity check)
./target/release/pi_crunch --mode fixed --digits 1000

# Compute exactly 1 million digits
./target/release/pi_crunch --mode fixed --digits 1000000

# Pin to 4 threads
./target/release/pi_crunch --mode fixed --digits 1000000 --threads 4

# Write the three reports to /tmp
./target/release/pi_crunch --seconds 30 --out-dir /tmp

# Pure compute benchmark — no verification
./target/release/pi_crunch --skip-verify

# Smaller verification blocks (more granular report)
./target/release/pi_crunch --mode fixed --digits 50000 --verify-blocks 250

# Verify fewer digits to keep very large runs fast (final division still costs)
./target/release/pi_crunch --mode fixed --digits 5000000 --verify-max-digits 100000

# Verify more digits (slower) on a smaller run
./target/release/pi_crunch --mode fixed --digits 2000000 --verify-max-digits 2000000

# Everything at once
./target/release/pi_crunch --mode timed --seconds 45 --threads 8 \
    --out-dir ./out --verify-blocks 500 --verify-max-digits 250000

A note on --verify-max-digits

Machin's formula is evaluated by binary splitting, the same technique as the Chudnovsky path: the arctan series is summed via a divide-and-conquer tree of large multiplications. With num-bigint's Karatsuba/Toom multiplication this costs O(M(n)·log n), which is sub-quadratic, whereas a naive term-by-term sum divides an n-digit number per term and is O(n²). The remaining cost is dominated by a single final scaled division, which num-bigint performs in O(n²). Its constant factor is small, so verifying ~512k digits takes a few seconds and ~1,000,000 digits takes well under a minute. To keep very large runs responsive, verification is still capped at min(computed_digits, --verify-max-digits) (default 1,000,000).

  • Small and mid-size runs verify fully in well under a second.
  • Lower the cap (e.g. --verify-max-digits 100000) to trim the final-division cost on multi-million-digit runs.
  • Raise it (e.g. --verify-max-digits 5000000) to verify deeper, at the cost of time. Use --skip-verify to skip verification entirely.

When the cap truncates coverage, both the stderr log and pi_verification.txt note it, e.g. "verified first 1,000,000 of 5,000,000 digits".
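The cap rule above can be sketched in a few lines. This is a minimal illustration, not the project's code: verified_digits, with_thousands and coverage_note are hypothetical names, and only the min(...) rule and the wording of the note come from the text above.

```rust
/// Number of digits verification will cover: min(computed, cap).
fn verified_digits(computed: u64, verify_max_digits: u64) -> u64 {
    computed.min(verify_max_digits)
}

/// Format an integer with comma thousands separators, e.g. 1,000,000.
fn with_thousands(n: u64) -> String {
    let s = n.to_string();
    let mut out = String::new();
    for (i, ch) in s.chars().enumerate() {
        // Insert a comma whenever the remaining digit count is a multiple of 3.
        if i > 0 && (s.len() - i) % 3 == 0 {
            out.push(',');
        }
        out.push(ch);
    }
    out
}

/// Returns the truncation note only when the cap actually cut coverage.
fn coverage_note(computed: u64, verify_max_digits: u64) -> Option<String> {
    let verified = verified_digits(computed, verify_max_digits);
    (verified < computed).then(|| {
        format!(
            "verified first {} of {} digits",
            with_thousands(verified),
            with_thousands(computed)
        )
    })
}

fn main() {
    // A 5M-digit run with the default cap is only partially verified.
    if let Some(note) = coverage_note(5_000_000, 1_000_000) {
        println!("{note}");
    }
}
```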


Output files

pi_digits.txt

Pi Computation Result
=====================

Timestamp : 2026-06-09T14:23:01Z
Digits    : 1000000
Algorithm : Chudnovsky (binary splitting)
Compute   : 4213 ms
Threads   : 16
Backend   : pure-rust

3.
1415926535 8979323846 2643383279 5028841971 6939937510 5820974944 5923078164 0628620899 8628034825 3421170679  :  100
8214808651 3282306647 0938446095 5058223172 5359408128 4811174502 8410270193 8521105559 6446229489 5493038196  :  200
...
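The line layout shown above (10 groups of 10 digits, then a running count) could be produced along these lines. This is a sketch under assumptions: format_digit_lines is a hypothetical name, the input is taken to be ASCII decimal digits without the leading "3.", and the counter is printed without the right-alignment padding the real output may use.

```rust
/// Split the decimals of pi into lines of 100 digits, each line made of
/// space-separated groups of 10 and followed by "  :  <running count>".
fn format_digit_lines(decimals: &str) -> Vec<String> {
    let mut lines = Vec::new();
    let mut written = 0usize;
    for line in decimals.as_bytes().chunks(100) {
        let groups: Vec<&str> = line
            .chunks(10)
            .map(|g| std::str::from_utf8(g).expect("input must be ASCII digits"))
            .collect();
        written += line.len();
        lines.push(format!("{}  :  {}", groups.join(" "), written));
    }
    lines
}

fn main() {
    for line in format_digit_lines("14159265358979323846") {
        println!("{line}");
    }
}
```

A final partial line simply carries fewer groups; how the project pads such a line is not specified here.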

pi_verification.txt

Pi Verification Report
======================

Timestamp       : 2026-06-09T14:23:07Z
Digits verified : 1000000
Method          : Machin formula (integer arithmetic)
Duration        : 6841 ms
Result          : PASS

Block 1 [digits 1–1000] : PASS
Block 2 [digits 1001–2000] : PASS
...
════════════════════════════════════════
Verification: PASSED  1000000 digits
════════════════════════════════════════

A failing block reads: Block N [digits …] : FAIL — first mismatch at digit position XXXX, and the final verdict becomes FAILED.
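A block-wise comparison of this kind might look like the sketch below. The names BlockResult and compare_blocks are hypothetical, and reporting the mismatch as an absolute 1-based digit position (rather than an offset within the block) is an assumption.

```rust
#[derive(Debug, PartialEq)]
enum BlockResult {
    Pass,
    /// 1-based position of the first differing digit (assumed absolute).
    Fail { first_mismatch: usize },
}

/// Compare two digit strings in blocks of `block` digits.
/// Returns (first digit, last digit, result) per block, 1-based inclusive.
fn compare_blocks(
    computed: &str,
    reference: &str,
    block: usize,
) -> Vec<(usize, usize, BlockResult)> {
    assert!(block > 0, "block size must be positive");
    let c = computed.as_bytes();
    let r = reference.as_bytes();
    let n = c.len().min(r.len());
    let mut out = Vec::new();
    let mut start = 0;
    while start < n {
        let end = (start + block).min(n);
        let result = match (start..end).find(|&i| c[i] != r[i]) {
            Some(i) => BlockResult::Fail { first_mismatch: i + 1 },
            None => BlockResult::Pass,
        };
        out.push((start + 1, end, result));
        start = end;
    }
    out
}

fn main() {
    // The 8th digit is corrupted (5 replaced by 9), so block 2 fails.
    for (n, (first, last, result)) in
        compare_blocks("1415926935", "1415926535", 5).iter().enumerate()
    {
        println!("Block {} [digits {}-{}] : {:?}", n + 1, first, last, result);
    }
}
```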

pi_benchmark.txt

System header, a box-drawing round-results table (Digits, Compute (ms), Digits/sec, Est. RAM MB), summary, and the y-cruncher reference numbers.

RAM estimate. The Est. RAM MB column uses digits × 3.32193 × 1.5 × 3 bytes — log2(10) bits/digit, ×1.5 for division scratch, ×3 for the three big-integer trees (P, Q, R). It is an estimate, not a measured peak; actual usage depends on the backend and allocator.
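Taking the formula above at face value, with the result read as bytes as stated, the estimate can be reproduced like this. The function name estimated_ram_bytes is hypothetical, and converting to MB by dividing by 1,000,000 is an assumption; the report may round or convert differently.

```rust
/// Estimated working memory for a run, per the formula in the text:
/// digits x 3.32193 x 1.5 x 3.
fn estimated_ram_bytes(digits: u64) -> f64 {
    const LOG2_OF_10: f64 = 3.32193; // factor quoted in the text
    const DIVISION_SCRATCH: f64 = 1.5; // headroom for division scratch
    const TREES: f64 = 3.0; // the P, Q and R trees
    digits as f64 * LOG2_OF_10 * DIVISION_SCRATCH * TREES
}

fn main() {
    let bytes = estimated_ram_bytes(1_000_000);
    // Assumed conversion: 1 MB = 1,000,000 bytes.
    println!("{:.1} MB", bytes / 1_000_000.0);
}
```

For 1,000,000 digits this comes to roughly 14.9 million bytes.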


Performance notes

y-cruncher reference times (published at numberworld.org/y-cruncher):

| Digits | Fastest | Slowest |
| --- | --- | --- |
| 25M | 0.21s (Ryzen AI Max+ 395) | 1.09s (Ryzen 7 1800X) |
| 100M | 0.76s (Ryzen AI Max+ 395) | 9.27s (Core i3 8121U) |
| 1B | 10.0s (Ryzen AI Max+ 395) | 142s (Core i3 8121U) |

y-cruncher relies on its own hand-tuned big-number library with SIMD kernels (including AVX-512), not on GMP. The pure-Rust backend here will be slower; build with --features gmp for a closer comparison.

How it uses your CPU cores

The heavy lifting is the Chudnovsky series: a long sum of terms that all get combined into one fraction. Instead of adding the terms one-by-one, the program splits the sum down the middle, then splits each half again, and again — like a tournament bracket — until each piece is tiny. The tiny pieces are handed out to all your CPU cores at once, and the results are combined back up the bracket. Combining two pieces means multiplying very large numbers, and at the top of the bracket those multiplications are themselves shared across cores. (--threads lets you cap how many cores it uses; the default is all of them.)

There's one catch worth knowing, because it explains the timings you'll see:

The final step doesn't split. After the bracket is combined, recovering the actual digits of π needs one giant long division (and a square root). In this program that step runs on a single core, and with the pure-Rust backend its cost grows with the square of the digit count. For runs into the hundreds of thousands of digits and beyond, this single step dominates the total time, so adding more cores helps less than you might expect. This is the main reason the GMP backend (--features gmp) is much faster: it has a smarter, faster division built in. The fast-alloc feature shaves a little more off by reducing memory-management overhead, but doesn't change this fundamental limit.
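The bracket idea, and the sequential division at the end, can be sketched on a much smaller series. The example below is not the project's Chudnovsky code: it applies binary splitting to e = 1 + Σ 1/k! with u128 values, uses std::thread::scope in place of rayon, and the names split and e_decimals are made up for illustration.

```rust
use std::thread;

/// Binary splitting over the terms a+1..=b of the series for e - 1.
/// Returns (P, Q) such that P/Q = sum of 1/((a+1)(a+2)...k) for k in a+1..=b.
/// The top `depth` levels of the bracket run their two halves in parallel.
fn split(a: u32, b: u32, depth: u32) -> (u128, u128) {
    if b - a == 1 {
        return (1, b as u128); // a single term: 1/b
    }
    let m = a + (b - a) / 2;
    let ((pl, ql), (pr, qr)) = if depth > 0 {
        thread::scope(|s| {
            let left = s.spawn(|| split(a, m, depth - 1));
            let right = split(m, b, depth - 1);
            (left.join().expect("worker panicked"), right)
        })
    } else {
        (split(a, m, 0), split(m, b, 0))
    };
    // Combine two halves: one multiplication per merged fraction.
    (pl * qr + pr, ql * qr)
}

/// First `count` decimals of e from `terms` series terms (terms <= 30
/// keeps every intermediate value inside u128).
fn e_decimals(terms: u32, count: usize) -> String {
    let (p, q) = split(0, terms, 2);
    // The final step: one long division, digit by digit, on one thread.
    let mut rem = p % q;
    let mut out = String::new();
    for _ in 0..count {
        rem *= 10;
        out.push(char::from_digit((rem / q) as u32, 10).expect("digit < 10"));
        rem %= q;
    }
    out
}

fn main() {
    // Prints 2.71828182845904523536
    println!("2.{}", e_decimals(30, 20));
}
```

The merge step is where the large multiplications happen, and the digit loop at the end is the part that more threads cannot help with in this sketch.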


Running tests

cargo test                  # default pure-Rust backend
cargo test --features gmp   # GMP backend (requires libgmp)

The suite includes:

  • chudnovsky: the first 100 digits exactly match the known value; 1000 digits are cross-checked against the independent Machin verifier.
  • verify: Machin verification passes on correct input and pinpoints the first mismatch on corrupted input; the reference reproduces the known 100 digits.
  • output: line formatting (groups, spacing, right-aligned counter).
  • benchmark: thousands formatting, RAM estimate, table rendering.

Lint & format

cargo clippy -- -D warnings
cargo fmt
cargo fmt --check

Known limitations

  • A linker is required to build. The pure-Rust default needs no C libraries but, like any Rust binary, still needs a linker. On Windows that means either the MSVC Build Tools (link.exe) or the GNU toolchain — see Prerequisites. A link: extra operand error means a Unix-shell link is shadowing MSVC's link.exe; the fix is the toolchain, not PATH (see the heads-up there).
  • Verification uses binary splitting (sub-quadratic), but its final scaled division is still O(n²) in num-bigint, and it is capped by --verify-max-digits (default 1M). It is not meant to verify hundreds of millions of digits; use --skip-verify for pure compute benchmarks at that scale.
  • Pure-Rust is slower than y-cruncher. No SIMD/assembly; the gmp feature narrows the gap but still won't match y-cruncher's hand-tuned kernels.
  • RAM figures are estimates, not measured peaks (see the note above).
  • GMP on Windows requires MSYS2 and a correctly set GMP_LIB_DIR; it is the most involved build path.
  • The last computed digit may occasionally differ from a rounded reference by one ULP; guard digits make the reported digits stable, and verification covers the rest.

License

MIT. See Cargo.toml.
