Word substitution hallucinations during synthesis

**Environment:** VoXtream v0.2.0, herimor/voxtream2 model, RTX 3060 12GB, CUDA 12.8, Ubuntu Linux, Python 3.12

**Description:**
The model occasionally substitutes entirely unrelated words in the output. For example, the word "small" in the input text was spoken as "fish" in the audio output. These are not mispronunciations — they are completely different words with no phonetic similarity.

**Steps to reproduce:**
Intermittent — no reliable reproduction steps identified yet. Occurs during normal conversational synthesis with a voice-cloned reference WAV.

**Expected behavior:**
The spoken output should match the input text.
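
Since the substitutions are intermittent, one way to catch them could be to transcribe each generated clip with any speech recognizer and diff the transcript against the input text. The sketch below covers only the comparison step and uses the Python standard library. The helper names `normalize` and `find_substitutions` are hypothetical and not part of VoXtream, and the transcript is assumed to come from an external ASR tool of your choice.

```python
import difflib
import re


def normalize(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def find_substitutions(input_text, transcript):
    """Return word-level differences between the input text and a transcript.

    Each entry is (tag, expected_words, heard_words), where tag is one of
    'replace', 'delete', or 'insert' as reported by difflib.
    """
    expected = normalize(input_text)
    heard = normalize(transcript)
    matcher = difflib.SequenceMatcher(a=expected, b=heard, autojunk=False)
    mismatches = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            mismatches.append((tag, expected[i1:i2], heard[j1:j2]))
    return mismatches


if __name__ == "__main__":
    # Hypothetical example mirroring the "small" -> "fish" case above.
    # The transcript string stands in for real ASR output.
    print(find_substitutions("The small boat drifted away.",
                             "the fish boat drifted away"))
```

Logging the input text, the mismatches, and the reference WAV used for each flagged clip may help narrow down a reproducible case. ASR errors will produce false positives, so flagged clips still need a manual listen.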

Word substitution hallucinations during synthesis #14
