The property methods and the compute engine (the executor). This repository is independent and stateless: nothing here imports the other repos, and it stores no information about any shard, pack, release, or lifecycle. It is used exactly twice per shard — compute (by a contributor) and verify (by a maintainer).
- Define each integer property as a small registered method.
- Provide the executor that computes shards and verifies them.
- Cut algorithm releases that downstream shards pin to.
Algorithms are a single flat directory — one method per file, the filename matching the column it produces. Pack, shard, and release names never appear in the layout. Shared helpers live in _lib. There are no work-order or manifest files here — those live in the Shards repository.
    integer_atlas_algos/          the installable package (atlas-algos)
      registry.py                 @property_method + the flat column registry
      context.py                  per-n memoized context (factorization, divisors)
      properties/                 one method per file; filename = column name (47 columns)
      _lib/                       shared helpers
        factorization.py          prime-table factorization
        multiplicative.py         sigma() shared by divisor functions
        blake3_py.py              pure-Python BLAKE3 fallback
      precomputed/                regenerable resource cache (not state)
        primes_le_31623.txt       base primes up to ceil(sqrt(1e9))
      executor/                   the stateless engine
        cli.py                    argparse CLI (compute / verify, estimate, resume)
        compute.py                chunked, resumable, crash-safe, streaming finalize
        verify.py                 sampled recompute + compare
        estimate.py               pre-run work estimate from static complexity
        manifest.py               work-order loading, draft manifest, hashing
        atomicio.py               atomic write / checkpoint primitives
        backends/                 csv_backend (stdlib) + parquet_backend (pyarrow)
    tools/                        bench.py, perfrun.py, make_work_order.py (dev only)
    tests/                        unittest suite + sample work-order manifests
    pyproject.toml                package metadata, console script, extras
    COMMANDS.md, INTERFACE.md     reference docs
Install it with pip install -e . and then run atlas-algos …, or run it straight from a source checkout with
python3 -m integer_atlas_algos.executor … (from the repository root).
All 47 properties are implemented. See INTERFACE.md for the complete command reference, output layout, exit codes, and the resume model.
To factor any n up to BOUND² you only need primes up to BOUND; with
BOUND = 31623 (≥ √1e9) that is ~3401 primes covering the whole 0..1e9 range.
precomputed/primes_le_31623.txt holds them — a deterministic, regenerable cache
(sieved on first use if missing), not state about any shard or pack. It bounds
worst-case factorization to ~3401 trial divisions regardless of n's size in range.
- A single shard is computed single-threaded (~3,700 rows/s/core in pure Python); scale by computing many shards concurrently, one process each.
- n is stored as int64 (covers 0..2^63 - 1).
- partition_count is practical only for small n (its values grow to hundreds of digits), so omit it from large ranges.
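
To make the growth of the partition function concrete, here is a small sketch that computes p(n) with Euler's pentagonal number recurrence and finds where the values stop fitting in a signed 64-bit integer. It is an illustration only: the function name partition_counts and the limit of 2000 are choices made for this example, and the repository's own partition_count method may be implemented differently.

```python
INT64_MAX = 2**63 - 1


def partition_counts(limit: int) -> list[int]:
    """Return [p(0), p(1), ..., p(limit)] via Euler's pentagonal recurrence."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total, k = 0, 1
        while True:
            g1 = k * (3 * k - 1) // 2  # generalized pentagonal number for +k
            if g1 > n:
                break
            sign = 1 if k % 2 else -1  # terms alternate +, +, -, -, ...
            total += sign * p[n - g1]
            g2 = k * (3 * k + 1) // 2  # generalized pentagonal number for -k
            if g2 <= n:
                total += sign * p[n - g2]
            k += 1
        p[n] = total
    return p


counts = partition_counts(2000)
first_overflow = next(n for n, value in enumerate(counts) if value > INT64_MAX)
print("p(100) =", counts[100])
print("first n with p(n) > int64 max:", first_overflow)
print("decimal digits in p(2000):", len(str(counts[2000])))
```

Python integers are arbitrary precision, so the sketch itself never overflows; the point is only that a fixed-width column cannot hold these values once n grows.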
- Signature is f(n, ctx) -> value. The method name becomes the column name. ctx is a memoized context of shared intermediates (factorization, divisor list, binary representation) declared via requires, so expensive work is computed once per number, not once per method.
- Metadata (canonical column id, dtype, nullable, zero/negative behavior, requires, test vectors) is registered next to the function, so the schema, column ids, and manifest columns are generated from code. A method declares only its own column; it says nothing about packs, shards, releases, or any other entity.
- A method may provide an optional vectorized fast-path for speed; the scalar form is always the verification ground truth.
- Test vectors live with the method and are reused by both the property-proposal gate and shard verification.
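
The contract above can be sketched in a few dozen lines. Everything here is an assumption made for illustration: the decorator is named property_method after the one mentioned in the layout, but its real parameters are defined in INTERFACE.md, and REGISTRY, Method, Context, compute_row, and the two example columns (divisor_count, divisor_sum) are hypothetical stand-ins rather than the package's actual API.

```python
from dataclasses import dataclass
from functools import cached_property
from typing import Callable

REGISTRY: dict[str, "Method"] = {}  # hypothetical flat column registry


@dataclass
class Method:
    func: Callable[[int, "Context"], int]
    dtype: str
    requires: tuple[str, ...]
    test_vectors: dict[int, int]


def property_method(*, dtype, requires=(), test_vectors=None):
    """Register f(n, ctx) under its own name, which becomes the column name."""
    def decorate(func):
        REGISTRY[func.__name__] = Method(func, dtype, tuple(requires), dict(test_vectors or {}))
        return func
    return decorate


class Context:
    """Shared intermediates for one n (assumed n >= 1), each computed at most once."""

    def __init__(self, n: int):
        self.n = n

    @cached_property
    def factorization(self) -> list[tuple[int, int]]:
        n, out, p = self.n, [], 2
        while p * p <= n:  # plain trial division keeps the sketch short
            if n % p == 0:
                exponent = 0
                while n % p == 0:
                    n //= p
                    exponent += 1
                out.append((p, exponent))
            p += 1
        if n > 1:
            out.append((n, 1))
        return out


@property_method(dtype="int64", requires=("factorization",), test_vectors={1: 1, 6: 4, 12: 6})
def divisor_count(n, ctx):
    result = 1
    for _, exponent in ctx.factorization:
        result *= exponent + 1
    return result


@property_method(dtype="int64", requires=("factorization",), test_vectors={1: 1, 6: 12, 12: 28})
def divisor_sum(n, ctx):
    result = 1
    for p, exponent in ctx.factorization:
        result *= (p ** (exponent + 1) - 1) // (p - 1)
    return result


def compute_row(n: int) -> dict[str, int]:
    ctx = Context(n)  # one context per n, shared by every column
    return {name: m.func(n, ctx) for name, m in REGISTRY.items()}


def run_test_vectors() -> None:
    for name, m in REGISTRY.items():
        for n, expected in m.test_vectors.items():
            assert m.func(n, Context(n)) == expected, (name, n)


run_test_vectors()
print(compute_row(12))  # {'divisor_count': 6, 'divisor_sum': 28}
```

Because both columns read ctx.factorization, the factorization of each n is computed once and reused, which is the point of declaring shared intermediates through requires.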
The executor is stateless — both verbs are pure functions of an input manifest that names a start, an end, and the requested columns. It does not interpret packs, the grid, or policy; if the manifest does not follow project conventions that is fine, and it errors only if a requested column name is unknown.
- compute --manifest <work-order> --out <shard>: run the requested column functions over [start, end] and write the shard, filling the output manifest's column types and hashes from the method metadata.
- verify --manifest <entry> --shard <file> --degree <fraction>: recompute a share of the requested columns (0.1 sampled … 1.0 full) and compare against the shard; report pass/fail.
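
The two verbs can be pictured as pure functions over in-memory rows. The sketch below is a simplification under stated assumptions, not the executor's implementation: COLUMNS holds two made-up column functions, [start, end] is treated as inclusive, the degree is interpreted as the fraction of rows to recompute, and sampling uses a fixed seed. File formats, manifests, and hashing are left out.

```python
import random

# Hypothetical stand-ins for registered column functions.
COLUMNS = {
    "is_even": lambda n: int(n % 2 == 0),
    "digit_sum": lambda n: sum(map(int, str(n))),
}


def compute(start: int, end: int, columns: list[str]) -> list[dict]:
    """Compute the requested columns for every n in [start, end] (assumed inclusive)."""
    unknown = [c for c in columns if c not in COLUMNS]
    if unknown:
        raise KeyError(f"unknown column(s): {unknown}")
    return [
        {"n": n, **{c: COLUMNS[c](n) for c in columns}}
        for n in range(start, end + 1)
    ]


def verify(rows: list[dict], columns: list[str], degree: float, seed: int = 0) -> bool:
    """Recompute a fraction `degree` of the rows and compare with the stored values."""
    if not 0.0 < degree <= 1.0:
        raise ValueError("degree must be in (0, 1]")
    if not rows:
        return True
    if degree == 1.0:
        sample = rows  # full recompute
    else:
        k = max(1, round(len(rows) * degree))
        sample = random.Random(seed).sample(rows, k)  # deterministic sample
    return all(row[c] == COLUMNS[c](row["n"]) for row in sample for c in columns)


cols = ["is_even", "digit_sum"]
shard = compute(10, 29, cols)
print(verify(shard, cols, degree=0.1))  # sampled check
print(verify(shard, cols, degree=1.0))  # full check
```

Neither function keeps anything between calls: the result depends only on the range, the column names, and (for verify) the rows handed in.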
Cut an algorithm release once enough methods have merged. It is stamped with the commit id and PR URLs, and is what a shard's algorithm_versions pins to. Only released methods are eligible for official shards; unreleased methods exist only for local side-loading.
Published to PyPI as integer-atlas-algos; install with pip install integer-atlas-algos
(add the parquet extra for Parquet output, hash for the native BLAKE3 speedup).
Factorization speed comes from the precomputed prime table; native libraries
(gmpy2, primesieve) and a column's optional vectorized fast-path can speed it further.
This repository stores no information about any shard, release, or lifecycle — work
orders and manifests live in the Shards repository. The executor sees a given shard
exactly twice: compute (by a contributor) and verify (by a maintainer).
Add an algorithm by dropping a file in properties/ (one method per file) with its
metadata and test vectors, then open a pull request. See INTERFACE.md
for the method contract.
Code: MIT — see LICENSE. If you use Integer Atlas, please cite it — see CITATION.cff.