espeak-ng-rs

A pure-Rust port of eSpeak NG text-to-speech, built with a test-first, bottom-up approach.

The C library is used as an oracle: the Rust implementation must produce bit-identical output for every input.

Status

Module	Status	Notes
`encoding`	✅ Complete	All codepages (UTF-8, ISO-8859-*, KOI8-R, ISCII, …)
`phoneme`	✅ Complete	Phoneme table loader, IPA rendering, instruction scanner
`dictionary`	✅ Complete	Hash lookup, rule engine, suffix stripping, `SetWordStress`
`translate`	✅ Complete	Full text → IPA pipeline, multi-language, numbers, punctuation
`synthesize`	✅ Complete	Full harmonic synthesis reading espeak-ng binary phoneme data

319 tests passing — 27/27 IPA oracle comparisons + 10 synthesis integration tests.

Two synthesis paths are available:

Path	How	Quality
`Synthesizer::synthesize(ipa_str)`	Hand-coded IPA → 3-formant cascade	Works anywhere, generic voice
`Synthesizer::synthesize_codes(codes, phdata)`	Phoneme bytecodes → real espeak-ng frame data → harmonic synth	Requires espeak-ng data files, authentic espeak-ng character

Quick start

# Run all tests (unit + integration + oracle)
cargo test

# Run oracle comparison tests with verbose output
cargo test --test oracle_comparison -- --nocapture

# Run benchmarks (requires espeak-ng binary on PATH for C baseline)
./benches/bench.sh

Usage

// Text → IPA phonemes
let ipa = espeak_ng::text_to_ipa("en", "hello world")?;
assert_eq!(ipa, "həlˈəʊ wˈɜːld");

// More examples
espeak_ng::text_to_ipa("en", "42")?;          // "fˈɔːti tˈuː"
espeak_ng::text_to_ipa("en", "walked")?;      // "wˈɔːkt"
espeak_ng::text_to_ipa("en", "happily")?;     // "hˈapɪli"
espeak_ng::text_to_ipa("de", "schön")?;       // "ʃˈøːn"
espeak_ng::text_to_ipa("fr", "bonjour")?;     // "bɔ̃ʒˈuːɹ"

// Text → raw PCM (22050 Hz, mono, 16-bit) — not yet implemented
let (samples, rate) = espeak_ng::text_to_pcm("en", "hello world")?;

What is implemented

`encoding/`

Full UTF-8 encode/decode (utf8_decode_one, encode_one)
All eSpeak NG codepage tables: ISO-8859-1 through -16, KOI8-R, ISCII
Encoding::from_name() lookup matching C's encoding.c
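
To make the UTF-8 decoding step concrete, here is a minimal standalone sketch of decoding one code point from a byte slice. The function name `decode_one_sketch` and its signature are assumptions made for illustration; the crate's `utf8_decode_one` may take different arguments and handle malformed input differently.

```rust
/// Decode one UTF-8 scalar value from the front of `bytes`.
/// Returns the code point and the number of bytes consumed, or `None`
/// for empty, truncated, or malformed input.
/// Hypothetical helper, not the crate's actual API.
fn decode_one_sketch(bytes: &[u8]) -> Option<(u32, usize)> {
    let first = *bytes.first()?;
    // The lead byte fixes the sequence length and carries the top payload bits.
    let (len, init) = match first {
        0x00..=0x7F => return Some((first as u32, 1)),
        0xC2..=0xDF => (2, (first & 0x1F) as u32),
        0xE0..=0xEF => (3, (first & 0x0F) as u32),
        0xF0..=0xF4 => (4, (first & 0x07) as u32),
        _ => return None, // stray continuation byte or invalid lead byte
    };
    let tail = bytes.get(1..len)?; // `None` when the sequence is truncated
    let mut cp = init;
    for &b in tail {
        if b & 0xC0 != 0x80 {
            return None; // not a continuation byte
        }
        cp = (cp << 6) | (b & 0x3F) as u32;
    }
    // Reject overlong encodings, surrogates, and values above U+10FFFF.
    let min = match len {
        2 => 0x80,
        3 => 0x800,
        _ => 0x1_0000,
    };
    if cp < min || char::from_u32(cp).is_none() {
        return None;
    }
    Some((cp, len))
}

fn main() {
    // Walk the bytes of a word one code point at a time.
    let mut rest = "schön".as_bytes();
    while let Some((cp, used)) = decode_one_sketch(rest) {
        println!("U+{cp:04X} ({used} byte(s))");
        rest = &rest[used..];
    }
}
```

Returning the consumed length alongside the code point lets a caller advance through a buffer without re-scanning it.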

`phoneme/`

Binary phoneme table loader (ph_data files, phonindex)
Per-language table selection (select_table_by_name)
Phoneme attribute access: type, flags, mnemonic, program address
IPA string extraction via bytecode scanner (phoneme_ipa_string):
- Handles i_IPA_NAME instructions
- Correctly handles synthesis-only phonemes (first instruction ≥ i_FMT)
- Language-specific scanning depth to avoid bleed-through

`dictionary/`

Binary en_dict-format reader (Dictionary::from_bytes)
TransposeAlphabet decompression for Latin-script entries
Hash-based word lookup (hash_word, lookup)
Full rule engine (TranslateRules / MatchRule):
- Pre/post context matching (letter groups, syllable counts, stress, …)
- RULE_ENDING suffix detection with end_type and separated end_phonemes
- RULE_NO_SUFFIX, RULE_DOUBLE, RULE_LETTERGP, RULE_DOLLAR, …
- Score-based rule selection, condition bitmask, spell-word flag
SetWordStress — full port of the C function:
- Vowel stress array construction (GetVowelStress)
- All stress placement strategies (trochaic, iambic, left-to-right, …)
- $strend / $strend2 end-stress promotion
- Clause-level final-stress demotion
Suffix stripping (SUFX_I): re-translates stem with FLAG_SUFFIX_REMOVED, combining stem phonemes + suffix phonemes correctly
Word-final devoicing for German / Dutch / Afrikaans / Slovak / Slovenian / Albanian
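
The hash-based lookup listed above can be pictured as a small rolling hash that selects one of 1024 buckets. The sketch below is modelled on my recollection of `HashDictionary` in eSpeak NG's C sources. The multiplier, the masks, and the name `hash_word_sketch` are all assumptions and should be checked against the C oracle before relying on them.

```rust
/// 10-bit hash over the bytes of a word.
/// Sketch only: the constants are assumed, not taken from this crate.
fn hash_word_sketch(word: &[u8]) -> usize {
    let mut hash: u32 = 0;
    let mut chars: u32 = 0;
    for &c in word {
        if c == 0 {
            break; // mirror C string semantics: stop at NUL
        }
        hash = hash * 8 + c as u32;
        // Fold the high bits back into the low 10 bits.
        hash = (hash & 0x3ff) ^ (hash >> 8);
        chars += 1;
    }
    // Mix in the length and keep the result in 0..1024.
    ((hash + chars) & 0x3ff) as usize
}

fn main() {
    // Group a few words into buckets the way a hashed dictionary would.
    let mut buckets: Vec<Vec<&str>> = vec![Vec::new(); 1024];
    for word in ["hello", "world", "walked", "happily"] {
        buckets[hash_word_sketch(word.as_bytes())].push(word);
    }
    for (index, bucket) in buckets.iter().enumerate() {
        if !bucket.is_empty() {
            println!("bucket {index}: {bucket:?}");
        }
    }
}
```

A lookup then only has to scan the entries in one bucket instead of the whole dictionary.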

`translate/`

text_to_ipa(lang, text) → String public API
Tokeniser: words, digit strings, punctuation, clause boundaries
word_to_phonemes: dictionary lookup → suffix stripping → translation rules
Number-to-phonemes (English):
- Integers 0–999 999 999 999 via dict entries (_0–_19, _NX, _0C, _0M1, _0M2)
- NUM_1900 year format (1900 → "nineteen hundred")
- Decimal numbers: integer + point + individual digits
- END_WORD (||) markers preserved → correct inter-word spacing
IPA rendering (phonemes_to_ipa_full):
- Primary (ˈ) / secondary (ˌ) stress marks before vowels
- END_WORD → word-boundary space
- Language-specific overrides (English schwa, French 'r' → ʁ, …)
- Context-sensitive phonemes: d# → 't'/'d', z# → 's'/'z' based on voicing
- French liaison phoneme suppression at word-final position
- German word-final devoicing (Auslautverhärtung)
Multi-clause stress promotion (mirrors phonemelist.c)
Language routing: en, fr, de, es, and many more via data files
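
As an illustration of the word-final devoicing rule mentioned in the list above, the following sketch rewrites the last character of an IPA word. It is a simplification under stated assumptions: the function `devoice_final_sketch` is hypothetical, it operates on IPA characters rather than on the crate's internal phoneme codes, and it covers only a handful of obstruents.

```rust
/// Replace a word-final voiced obstruent with its voiceless counterpart.
/// Hypothetical helper covering an illustrative subset of consonants.
fn devoice_final_sketch(ipa_word: &str) -> String {
    let mut chars: Vec<char> = ipa_word.chars().collect();
    if let Some(last) = chars.last_mut() {
        *last = match *last {
            'b' => 'p',
            'd' => 't',
            'ɡ' | 'g' => 'k', // IPA script g (U+0261) and ASCII g
            'v' => 'f',
            'z' => 's',
            other => other, // vowels and voiceless sounds stay unchanged
        };
    }
    chars.into_iter().collect()
}

fn main() {
    // German "Tag": the underlying final /ɡ/ surfaces as [k].
    println!("{}", devoice_final_sketch("tˈaːɡ"));
    // A word ending in a nasal is left alone.
    println!("{}", devoice_final_sketch("ɡˈuːtən"));
}
```

Only the final segment is inspected, which is why the initial ɡ in the second word is untouched.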

Oracle test coverage

Test	Text	Expected IPA
`en_hello`	"hello"	hɛlˈəʊ
`en_hello_world`	"hello world"	hɛlˈəʊ wˈɜːld
`en_silent_e`	"cake"	kˈeɪk
`en_gh_digraph`	"night"	nˈaɪt
`en_silent_consonants`	"pneumonia"	njuːmˈəʊniə
`en_suffixes`	"walked", "happily", …	wˈɔːkt, hˈapɪli, …
`en_numbers_cardinal`	"0" … "1000000"	zˈiəɹəʊ … wˈɒn mˈɪliən
`en_numbers_with_decimal`	"3.14", "0.5"	θɹˈiː pɔɪnt wˈɒn fˈɔː, …
`en_sentence_period`	"Hello. Goodbye."	hɛlˈəʊ ɡʊdbˈaɪ
`en_comma`	"yes, no, maybe"	jˈɛs nˈəʊ mˈeɪbi
`de_guten_tag`	"guten Tag"	ɡˈuːtən tˈaːk
`de_umlauts`	"über", "schön", "müde"	ˈyːbɜ, ʃˈøːn, mˈyːdə
`de_ch_digraph`	"Bach", "ich"	bˈax, ˈɪç
`es_hola`	"hola"	ˈola
`es_ll_digraph`	"llamar"	ʎamˈaɾ
`fr_bonjour`	"bonjour"	bɔ̃ʒˈuːɹ
`fr_nasal_vowels`	"bon"	bˈɔ̃
`fr_liaison`	"les amis"	le-z amˈi

Testing approach

Tests are written before the implementation (TDD).

tests/
  encoding_integration.rs   22 golden-value tests for all encodings
  oracle_comparison.rs      27 tests comparing Rust ↔ C oracle output
  common/mod.rs             shared helpers (espeak_available, try_espeak_ipa, …)

Oracle tests use an assert_matches_oracle! macro with three outcomes:

Condition	Result
`espeak-ng` not on PATH	Skip with `[SKIP]` notice
Rust returns `NotImplemented`	Print C oracle value as a target, pass
Rust returns a real string	Must exactly match C oracle output

This means all comparison tests can be written now, run in any environment, and automatically start enforcing correctness as each module is implemented.
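
The three outcomes can be modelled as a small decision function. This is a sketch of the logic only, not the `assert_matches_oracle!` macro itself; the `Outcome` and `RustResult` types and the `compare_with_oracle` function are hypothetical names introduced for the example.

```rust
/// What a comparison test reports (hypothetical type).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The C oracle is unavailable, so nothing was compared.
    Skipped,
    /// Rust is not implemented yet; carry the oracle value as the target.
    Pending(String),
    /// Rust produced output and it matched the oracle exactly.
    Matched,
}

/// Stand-in for the Rust side's result (hypothetical type).
#[derive(Debug)]
enum RustResult {
    NotImplemented,
    Ipa(String),
}

/// `oracle` is `None` when `espeak-ng` is not on PATH.
fn compare_with_oracle(oracle: Option<&str>, rust: &RustResult) -> Result<Outcome, String> {
    let Some(expected) = oracle else {
        return Ok(Outcome::Skipped);
    };
    match rust {
        RustResult::NotImplemented => Ok(Outcome::Pending(expected.to_string())),
        RustResult::Ipa(actual) if actual.as_str() == expected => Ok(Outcome::Matched),
        RustResult::Ipa(actual) => Err(format!(
            "mismatch: rust={actual:?} oracle={expected:?}"
        )),
    }
}

fn main() {
    let rust = RustResult::Ipa("kˈeɪk".to_string());
    println!("{:?}", compare_with_oracle(Some("kˈeɪk"), &rust));
    println!("{:?}", compare_with_oracle(None, &rust));
    println!("{:?}", compare_with_oracle(Some("kˈeɪk"), &RustResult::NotImplemented));
}
```

Only the mismatch case is an error, which is what lets a test be written before the module behind it exists.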

Data directory

The crate reads compiled eSpeak NG data files at runtime. The data resolution order is:

ESPEAK_DATA_PATH environment variable
espeak-ng-data/ next to the running executable
espeak-ng-data/ in the current working directory
/usr/share/espeak-ng-data (system installation)
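
The resolution order above can be sketched with the standard library alone. The function names below are hypothetical, and the behaviour when `ESPEAK_DATA_PATH` is set but does not point at a directory (this sketch falls through to the next candidate) is an assumption rather than something the crate documents here.

```rust
use std::env;
use std::path::{Path, PathBuf};

/// Build the ordered list of candidate data directories.
/// Kept free of I/O so the ordering is easy to check.
fn candidate_dirs(
    env_override: Option<PathBuf>,
    exe_dir: Option<PathBuf>,
    cwd: Option<PathBuf>,
) -> Vec<PathBuf> {
    let mut out = Vec::new();
    if let Some(path) = env_override {
        out.push(path); // 1. ESPEAK_DATA_PATH
    }
    if let Some(dir) = exe_dir {
        out.push(dir.join("espeak-ng-data")); // 2. next to the executable
    }
    if let Some(dir) = cwd {
        out.push(dir.join("espeak-ng-data")); // 3. current working directory
    }
    out.push(PathBuf::from("/usr/share/espeak-ng-data")); // 4. system install
    out
}

/// Return the first candidate that exists as a directory, if any.
fn resolve_data_dir_sketch() -> Option<PathBuf> {
    let env_override = env::var_os("ESPEAK_DATA_PATH").map(PathBuf::from);
    let exe_dir = env::current_exe()
        .ok()
        .and_then(|exe| exe.parent().map(Path::to_path_buf));
    let cwd = env::current_dir().ok();
    candidate_dirs(env_override, exe_dir, cwd)
        .into_iter()
        .find(|dir| dir.is_dir())
}

fn main() {
    match resolve_data_dir_sketch() {
        Some(dir) => println!("data directory: {}", dir.display()),
        None => println!("no espeak-ng-data directory found"),
    }
}
```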

A complete copy of the compiled data directory (from eSpeak NG 1.52.0 + additional language files from 1.52.0.1) is bundled at espeak-ng-data/ in this repository. This makes the crate fully self-contained without requiring a system eSpeak NG installation.

The bundle contains:

114 compiled dictionaries (*_dict files), one per language
145 language definition files (lang/) — includes ps, rup, crh, mn not in 1.52.0
200 voice definition files (voices/) — includes asia/ps, ps voices
Binary phoneme data (phondata, phonindex, phontab, intonations)

For selective embedding, the repository also contains per-language dictionary crates under data-crates/espeak-ng-data-dict-<lang> in addition to the aggregate espeak-ng-data-dicts crate.

# Use bundled data explicitly
ESPEAK_DATA_PATH=/path/to/espeak-ng-rs/espeak-ng-data cargo test

Features

Feature	What it does
`c-oracle`	Links `libespeak-ng` via FFI; enables the `oracle` module for comparison tests and benchmarks. Requires `libespeak-ng` to be installed (`pkg-config: espeak-ng`).
`bundled-data`	Embeds the full eSpeak NG dataset via the aggregate data crates and enables `install_bundled_data()`.
`bundled-data-<lang>`	Embeds phoneme data plus a single language dictionary crate and enables selective installers such as `install_bundled_language()`.
`bundled-espeak`	Downloads eSpeak NG 1.52.0 from GitHub, builds it with CMake, and bakes the binary/data paths into the benchmarks. Requires `cmake`, a C compiler, `curl`/`wget`, `tar`.

# FFI oracle
cargo test --features c-oracle

# Full embedded data
cargo test --features bundled-data

# Selective embedded data
cargo test --features bundled-data-en,bundled-data-uk

# Selective bundled-data demo
cargo run --example bundled_data_selective_demo --features bundled-data-en,bundled-data-uk

# Bundled build (no system install needed)
cargo bench --features bundled-espeak
cargo bench --features bundled-espeak,c-oracle   # both

Selective bundled-data helpers exposed by the main crate:

espeak_ng::bundled_languages()
espeak_ng::has_bundled_language("uk")
espeak_ng::install_bundled_language(&data_dir, "uk")
espeak_ng::install_bundled_languages(&data_dir, &["en", "uk"])

Publishing checklist

Before anything is published, make sure tests are valid and passing.

# 1) Baseline test suite
cargo test

# 2) Oracle + bundled-espeak path
cargo test --features "c-oracle,bundled-espeak"

# 3) Optional selective bundled-data checks
cargo test --test bundled_data_selective --features bundled-data-en,bundled-data-de

# 4) Preview publish order/commands
python3 scripts/publish_all_crates.py

# 5) Dry-run publish checks (local changes allowed)
python3 scripts/publish_all_crates.py --execute --dry-run --allow-dirty

# 6) Actual publish (when ready)
python3 scripts/publish_all_crates.py --execute

scripts/publish_all_crates.py --execute enforces these preflight checks before any crate is published and aborts on first failure.

The same gates are enforced in CI on push/PR by .github/workflows/ci.yml.

Use PUBLISHING.md for full publication details.

Benchmarks

Metric	Rust	C subprocess	Speedup
First-phoneme latency	~606 ns	~5.5 ms	~9 000×
Synthesizer throughput	380× real-time	—	—
Resonator DSP (per sample)	3.2 ns	—	—
Encoding name lookup	3.0 ns	—	—

The Rust speedup over C subprocess comes entirely from eliminating process-spawn and shared-library initialisation overhead — the in-process dictionary lookup + rule engine returns the first phoneme in under a microsecond.
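
For reference, the speedup figure in the table follows directly from the two latency measurements. The snippet below simply redoes that division with the approximate values quoted above; the function name `speedup` is illustrative.

```rust
/// Ratio of a baseline latency to a candidate latency (same unit for both).
fn speedup(baseline_ns: f64, candidate_ns: f64) -> f64 {
    baseline_ns / candidate_ns
}

fn main() {
    let c_subprocess_ns = 5.5e6; // ~5.5 ms, from the table above
    let rust_ns = 606.0; // ~606 ns, from the table above
    // About 9 076, which the table rounds to ~9 000x.
    println!("{:.0}x", speedup(c_subprocess_ns, rust_ns));
}
```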

See BENCHMARK.md for the full Criterion HTML report.

./benches/bench.sh               # run + snapshot + generate BENCHMARK.md
./benches/bench.sh --no-run      # regenerate BENCHMARK.md from last run
./benches/bench.sh --filter resonator   # one group only

Benchmark groups:

Group	What is measured
`encoding/utf8_decode`	UTF-8 decode throughput across scripts and input sizes
`encoding/name_lookup`	`Encoding::from_name()` lookup latency
`synthesize/resonator`	Single resonator DSP filter tick (`Resonator::tick()`)
`text_to_ipa/rust`	Full Rust pipeline: text → IPA
`text_to_ipa/c_cli`	C subprocess baseline (process spawn included)
`latency/first_phoneme`	First-phoneme latency: Rust vs C subprocess
`text_to_ipa/ffi_vs_rust`	Rust vs C FFI baseline (`--features c-oracle`)
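
To give a sense of what a single resonator tick involves, here is a sketch of a standard two-pole digital resonator of the kind used in Klatt-style formant synthesizers. It is not the crate's `Resonator`: the struct `SketchResonator`, its fields, and its use of `f64` are assumptions for illustration.

```rust
use std::f64::consts::PI;

/// Two-pole resonator: y[n] = a*x[n] + b*y[n-1] + c*y[n-2].
/// Hypothetical type, not the crate's `Resonator`.
struct SketchResonator {
    a: f64,
    b: f64,
    c: f64,
    y1: f64, // previous output y[n-1]
    y2: f64, // output before that, y[n-2]
}

impl SketchResonator {
    /// Coefficients for a centre frequency and bandwidth, both in Hz.
    fn new(freq_hz: f64, bandwidth_hz: f64, sample_rate_hz: f64) -> Self {
        let r = (-PI * bandwidth_hz / sample_rate_hz).exp(); // pole radius
        let c = -(r * r);
        let b = 2.0 * r * (2.0 * PI * freq_hz / sample_rate_hz).cos();
        let a = 1.0 - b - c; // normalises the gain at 0 Hz to 1
        Self { a, b, c, y1: 0.0, y2: 0.0 }
    }

    /// Process one input sample and return one output sample.
    fn tick(&mut self, x: f64) -> f64 {
        let y = self.a * x + self.b * self.y1 + self.c * self.y2;
        self.y2 = self.y1;
        self.y1 = y;
        y
    }
}

fn main() {
    // Excite a 500 Hz formant with a single impulse and print the ringing.
    let mut formant = SketchResonator::new(500.0, 100.0, 22050.0);
    for n in 0..8 {
        let x = if n == 0 { 1.0 } else { 0.0 };
        println!("{n}: {:.6}", formant.tick(x));
    }
}
```

Each tick costs two multiply-adds plus one multiply and two state updates, which is consistent with a per-sample cost of a few nanoseconds.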

Project layout

espeak-ng-rs/
├── src/
│   ├── lib.rs              public API + module declarations
│   ├── error.rs            EspeakError enum, Result alias
│   ├── encoding/
│   │   ├── mod.rs          Encoding enum, TextDecoder, utf8_decode_one/encode_one
│   │   └── codepages.rs    ISO-8859-*, KOI8-R, ISCII lookup tables
│   ├── phoneme/
│   │   ├── mod.rs          PhonemeType, PhonemeFlags, PhonemeTable
│   │   ├── load.rs         Binary phoneme table loader
│   │   ├── table.rs        Table selection, mnemonic access
│   │   └── feature.rs      Phoneme feature extraction
│   ├── dictionary/
│   │   ├── mod.rs          Constants, flag definitions
│   │   ├── file.rs         Dictionary binary parser, group index
│   │   ├── lookup.rs       Hash-based word lookup
│   │   ├── rules.rs        MatchRule + TranslateRules engine
│   │   ├── stress.rs       SetWordStress, GetVowelStress
│   │   ├── phonemes.rs     Phoneme encoding helpers
│   │   └── transpose.rs    TransposeAlphabet decompression
│   ├── translate/
│   │   ├── mod.rs          Translator, text_to_ipa, word_to_phonemes,
│   │   │                   tokeniser, number-to-phonemes, IPA renderer
│   │   └── ipa_table.rs    Kirschenbaum → IPA lookup, mnemonic overrides
│   ├── synthesize/
│   │   ├── mod.rs          Synthesizer API, synthesize_codes() high-quality path
│   │   ├── engine.rs       IPA → cascade formant synthesizer (generic path)
│   │   ├── targets.rs      IPA → FormantTarget table (60 phonemes)
│   │   ├── phondata.rs     Binary SPECT_SEQ / frame_t parser from phondata
│   │   ├── bytecode.rs     Phoneme bytecode scanner (finds i_FMT address)
│   │   ├── wavegen.rs      Harmonic synthesizer (PeaksToHarmspect + wavegen loop)
│   │   └── sintab_data.rs  2048-entry sine lookup table (from sintab.h)
│   └── oracle/mod.rs       FFI to libespeak-ng  (feature = c-oracle)
├── tests/
│   ├── common/mod.rs
│   ├── encoding_integration.rs
│   ├── dictionary_integration.rs
│   └── oracle_comparison.rs
├── benches/
│   ├── vs_c.rs             Criterion benchmark suite
│   ├── bench.sh            Run benchmarks + generate BENCHMARK.md
│   └── results/            Criterion JSON + SVG snapshots (committed)
├── build.rs                pkg-config link (c-oracle) + CMake build (bundled-espeak)
├── Cargo.toml
├── BENCHMARK.md
└── README.md

Known limitations

Number translation covers English only; other languages fall back to spelling out digits.
Prefix stripping not yet implemented (very rare in English).
phonSWITCH (mid-word language switching) not yet handled.
Ordinal numbers ("1st", "2nd") not yet supported.

Licence

GPL-3.0-or-later — same as eSpeak NG.

Authors

Eugene Hauptmann

Copyright

This project is a from-scratch, pure-Rust reimplementation that does not copy C source from eSpeak NG, but it is licensed under the same terms: GPL-3.0-or-later.

Source: https://github.com/eugenehp/espeak-ng-rs
