
Improving language detection #77

Open
a-nasstrom wants to merge 2 commits into Vexa-ai:main from Symfa-Inc:language_detection_algorithm
Conversation

@a-nasstrom

I improved the language detection algorithm by adding segment-level probability aggregation, weighted scoring, early stopping logic, and more robust handling of noisy or mixed-language audio.
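For readers skimming the thread, the aggregation described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in this PR: the function name `aggregate_language`, the `(probabilities, weight)` input shape, and the `min_segments` and `stop_threshold` defaults are all hypothetical, and the weight could be segment duration or a speech-confidence score.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical input shape: one (language -> probability, weight) pair per
# audio segment. The weight might be segment duration or a speech-confidence
# score; the PR description does not say which.
Segment = Tuple[Dict[str, float], float]


def aggregate_language(
    segments: Iterable[Segment],
    min_segments: int = 3,        # assumed: never stop before this many segments
    stop_threshold: float = 0.8,  # assumed: stop once the leader's score reaches this
) -> Optional[Tuple[str, float, int]]:
    """Return (language, weighted mean probability, segments used), or None."""
    totals: Dict[str, float] = defaultdict(float)
    total_weight = 0.0
    used = 0
    best: Optional[Tuple[str, float, int]] = None

    for probs, weight in segments:
        # Skip segments that carry no usable evidence (e.g. silence or noise).
        if weight <= 0 or not probs:
            continue

        # Weighted scoring: each segment contributes in proportion to its weight.
        for lang, p in probs.items():
            totals[lang] += weight * p
        total_weight += weight
        used += 1

        leader = max(totals, key=totals.get)
        best = (leader, totals[leader] / total_weight, used)

        # Early stopping: enough segments seen and the leader is clearly ahead.
        if used >= min_segments and best[1] >= stop_threshold:
            break

    return best
```

Averaging over segments, rather than trusting the first one, is what makes a single noisy or mixed-language segment less likely to flip the result. The early-stop check keeps the cost bounded on long recordings.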

@DmitriyG228 added this to the 0.7 patches milestone Feb 4, 2026
@DmitriyG228 added the area: transcription (Whisper/STT) label Feb 4, 2026
@DmitriyG228 removed this from the 0.7 patches milestone Feb 13, 2026
@DmitriyG228
Contributor

This PR has been open since November 2025 and is currently CONFLICTING with main. Are you still working on it?

  • If yes: happy to coordinate a rebase; ping us and we'll prioritize review.
  • If no: we'll close and surface the idea (segment-level probability aggregation for language detection) in a future groom cycle in case anyone else wants to pick it up.

No pressure either way — just avoiding an indefinite in-flight PR.

@DmitriyG228 added this to the 0.11 milestone Apr 25, 2026
Labels

area: transcription (Whisper/STT)
