rustfuzz is a high-performance fuzzy string matching library implemented entirely in Rust, published as rustfuzz on PyPI. It uses:
- Rust (via PyO3 + maturin) for all core fuzzy-matching algorithms
- Python (3.10+) for the public API surface — thin wrappers that re-export Rust symbols
- uv for Python package management
```
src/                 ← Rust source (compiled as rustfuzz._rustfuzz)
  lib.rs             ← PyO3 module root — fn _rustfuzz
  algorithms.rs      ← core algorithm implementations (Myers, LCS, Jaro, etc.)
  fuzz.rs            ← ratio, partial_ratio, token_*, WRatio, QRatio
  utils.rs           ← default_process
  types.rs           ← Seq type + Python object extraction helpers
  distance/
    mod.rs
    initialize.rs    ← Editop, Editops, Opcode, Opcodes, MatchingBlock, ScoreAlignment
    metrics.rs       ← all distance/similarity pyfunction wrappers
rustfuzz/            ← Python package (thin wrappers, imports from ._rustfuzz)
Cargo.toml           ← lib name = "_rustfuzz"
pyproject.toml       ← module-name = "rustfuzz._rustfuzz"
```
- `fuzz`: ratio, partial_ratio, partial_ratio_alignment, token_sort_ratio, token_set_ratio, token_ratio, partial_token_sort_ratio, partial_token_set_ratio, partial_token_ratio, WRatio, QRatio
- `process`: extract, extractOne, extract_iter, cdist
- `utils`: default_process
- Data types: Editop, Editops, Opcode, Opcodes, MatchingBlock, ScoreAlignment
- `distance` modules: Levenshtein, Hamming, Indel, Jaro, JaroWinkler, LCSseq, OSA, DamerauLevenshtein, Prefix, Postfix
- Per-metric (all modules): distance, similarity, normalized_distance, normalized_similarity, editops (where applicable), opcodes (where applicable)
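The relationship between the per-metric `distance` and `normalized_similarity` functions can be sketched in pure Python, using Levenshtein as the example. This is an illustrative reference implementation of the standard algorithm, not the library's Rust code:

```python
def levenshtein_distance(s1: str, s2: str) -> int:
    """Classic Wagner-Fischer dynamic-programming edit distance."""
    if len(s1) < len(s2):
        s1, s2 = s2, s1  # keep s2 as the shorter string (smaller DP row)
    prev = list(range(len(s2) + 1))
    for i, c1 in enumerate(s1, 1):
        curr = [i]
        for j, c2 in enumerate(s2, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (c1 != c2)))    # substitution
        prev = curr
    return prev[-1]

def normalized_similarity(s1: str, s2: str) -> float:
    """normalized_distance = distance / max(len); similarity is its complement."""
    longest = max(len(s1), len(s2))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - levenshtein_distance(s1, s2) / longest

print(levenshtein_distance("kitten", "sitting"))              # → 3
print(round(normalized_similarity("kitten", "sitting"), 3))   # → 0.571
```

The Rust implementations use faster bit-parallel variants (Myers), but must produce the same values as this reference.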
- All Python commands MUST use `uv run`; never use `.venv/bin/python` or bare `python`
- Tests: `uv run pytest tests/ -x -q`
- Benchmarks: `uv run pytest tests/test_benchmarks.py --benchmark-save=baseline`
- Benchmark regression: `uv run pytest tests/test_benchmarks.py --benchmark-compare=baseline --benchmark-compare-fail=mean:10%`
- Type checking: `uv run pyright`
- Smoke test: `uv run python -c "import rustfuzz; print(rustfuzz.__version__)"`
- `cargo check`: fast compilation check
- `uv run maturin develop --release`: build optimised `.so`
- `cargo check`: no Rust errors
- `uv run maturin develop --release`
- `uv run pytest tests/ -x -q`: all tests must pass
- `uv run pytest tests/test_benchmarks.py --benchmark-compare=baseline --benchmark-compare-fail=mean:10%`: no regressions
- `uv run pyright`: type checking passes
- No file should exceed 1000 lines of code
- Always run the full original test suite before committing
- Run e2e tests after implementing new features
- Create a branch for each feature/algorithm group
- Each metric module (`Levenshtein`, `Hamming`, etc.) must agree with its reference algorithm
- `cdist` consumes any scorer callable accepting `(str, str, **kwargs) -> float`
- The benchmark baseline is saved in `.benchmarks/`; commit it so CI can compare
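The scorer contract accepted by `cdist` can be illustrated with a plain-Python callable. `diff_ratio` below is a hypothetical example scorer built on the standard library, not part of rustfuzz:

```python
from difflib import SequenceMatcher

def diff_ratio(s1: str, s2: str, **kwargs) -> float:
    """A custom scorer matching the (str, str, **kwargs) -> float contract,
    scaled to 0-100 like the built-in ratio-style scorers."""
    return SequenceMatcher(None, s1, s2).ratio() * 100.0

# Assuming the contract above, any such callable should be usable as the
# scorer for rustfuzz.process.cdist, e.g.:
#   matrix = cdist(["appel"], ["apple", "banana"], scorer=diff_ratio)
print(diff_ratio("apple", "appel"))
```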
The CI automatically builds wheels for all platforms, generates a changelog, and publishes to PyPI when a git tag is pushed.
- Bump the version in `Cargo.toml` (the `version` field under `[package]`)
- Commit the version bump: `git commit -am "release: v0.X.Y"`
- Tag the commit: `git tag v0.X.Y`
- Push the tag: `git push origin main && git push origin v0.X.Y`
- CI will:
  - Run tests on all Python versions
  - Build wheels (linux, musllinux, macos, windows)
  - Generate changelog from conventional commits (via `git-cliff`)
  - Publish to PyPI
  - Create GitHub Release with changelog and wheel assets
Use Conventional Commits for automatic changelog categorization:
| Prefix | Category | Example |
|---|---|---|
| `feat:` | 🚀 Features | `feat: implement Jaro-Winkler in Rust` |
| `fix:` | 🐛 Bug Fixes | `fix: handle empty string in partial_ratio` |
| `perf:` | ⚡ Performance | `perf: use SIMD in Levenshtein` |
| `refactor:` | 🔧 Refactoring | `refactor: split distance module` |
| `docs:` | 📖 Documentation | `docs: update README` |
| `ci:` | 🔄 CI/CD | `ci: add Python 3.13 to matrix` |
| `chore:` | 📦 Miscellaneous | `chore: update deps` |
| `release:` | (skipped) | `release: v3.15.0` |
```bash
cargo check
uv run maturin develop --release
uv run pytest tests/ -x -q
uv run pytest tests/test_benchmarks.py --benchmark-compare=baseline --benchmark-compare-fail=mean:10%
uv run pyright
```