Iterate on cross-platform support #132
Merged
* Adapt logic to pass tests on windows and macos
* Configure stdout/stderr to use UTF-8 before any output in main.py
* Skip lab file validation when registering an untranscribed dataset
* Skip dot-underscore macOS resource fork files during transcription
* Resolve conda executable path across platforms for MFA service calls
* Pin separate git dependencies to local branches
* Switch to editable for local deps
* Fix cache default based on dataset cached
* Pin forked speechbrain to patch problems
* Add lock file updates
* Fix purple alternating table rows by overriding system palette alternate-background-color
* Separate tasks for build by os, add windows-build
* Block terminal from opening during subprocesses in built exe
* Add windows voxkit icon for exe
* Add transcription verification w.r.t. expectation
* Store textgrids in local storage for all datasets and fix w2tg using wrong audio root for non-cached
* Add 2 additional dataset analysis methods
* Fix and expand dataset validation tests
* Fix model import path rewrite on Windows
* Fix training using wrong audio path for non-cached datasets
* Fix aggregation function resolution for compute_pllr
* Remove local task file
* Move some dependencies to remote
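The "skip dot-underscore macOS resource fork files" item can be illustrated with a small filter. This is only a sketch of the idea, not the code from this pull request: the function name `iter_audio_files` and the `AUDIO_SUFFIXES` set are hypothetical, chosen for illustration.

```python
from pathlib import Path

# Assumed set of audio extensions; the project's real list may differ.
AUDIO_SUFFIXES = {".wav", ".mp3", ".flac"}


def iter_audio_files(root: Path) -> list[Path]:
    """Return audio files under root, skipping hidden and AppleDouble files."""
    found = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # A leading "." covers ordinary hidden files as well as the
        # "._name" companions macOS writes on non-HFS/APFS volumes.
        if path.name.startswith("."):
            continue
        if path.suffix.lower() in AUDIO_SUFFIXES:
            found.append(path)
    return found
```

Without the name check, a file such as `._clip.wav` still has the suffix `.wav`, so a suffix-only filter would hand it to the transcriber even though it contains no audio.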
MFA 3.3.x's SQLite backend hits a multiprocessing race in collect_alignments on Windows where the word_interval_temp staging table disappears between worker connections, aborting alignment. Run `mfa server init` + `mfa server start` (both idempotent) before align and adapt so the bundled Postgres backend is used instead. Gated to win32 so macOS and Linux behavior is unchanged.
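A minimal sketch of the platform gate described above, assuming `mfa` is directly on `PATH` (the project resolves the conda executable instead, which is not shown here). The helper names `mfa_server_commands` and `ensure_mfa_server` are hypothetical; only the two `mfa server` subcommands come from the commit message.

```python
import subprocess
import sys
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], None]


def _run(cmd: Sequence[str]) -> None:
    # check=True raises CalledProcessError if the command fails.
    subprocess.run(list(cmd), check=True)


def mfa_server_commands(platform: str = sys.platform) -> list[list[str]]:
    """Commands to run before align/adapt; empty outside Windows."""
    if platform != "win32":
        return []
    # Both commands are described as idempotent, so running them
    # before every align/adapt call should be safe.
    return [["mfa", "server", "init"], ["mfa", "server", "start"]]


def ensure_mfa_server(run: Runner = _run, platform: str = sys.platform) -> int:
    """Run the server setup commands and return how many were issued."""
    commands = mfa_server_commands(platform)
    for cmd in commands:
        run(cmd)
    return len(commands)
```

Passing the runner and platform as parameters keeps the gate testable on any operating system without actually invoking MFA.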
Old install script added $HOME/.cargo/bin to PATH, but modern uv installs to $HOME/.local/bin, breaking Windows and Ubuntu jobs.
pytest-qt failed with 'libEGL.so.1: cannot open shared object file' on ubuntu-latest because Qt's runtime libraries aren't preinstalled and there's no display server available.
This pull request improves cross-platform compatibility (Windows, macOS, Linux) and updates dependency management for better reproducibility and packaging. It also adds two new analyzers for audio statistics, along with some cleanup and refactoring.
New Dataset Analyzers:
* `ClipDurationStatisticsAnalyzer`: computes total, average, min, and max clip durations per speaker, including a bar chart visualization.
* `AudioFormatProfileAnalyzer`: summarizes the dominant sample rate and channel count, and flags inconsistent files per speaker.
Windows and Cross-Platform Improvements:
* `main.py`: calls `multiprocessing.freeze_support()` at the top for PyInstaller compatibility, redirects `sys.stdout`/`sys.stderr` to `os.devnull` if missing, and sets UTF-8 encoding on Windows to prevent Unicode errors.
* `VoxKit.iss` and a Linux desktop entry for packaging and distribution.
Dependency Management and Packaging:
* `pyproject.toml`: specifies key dependencies (e.g., `pypllrcomputer`, `wav2textgrid`, `alignment-comparison-plots`, `speechbrain`) via `[tool.uv.sources]` for Windows compatibility and reproducibility.
* `pyproject.toml`: ignores build artifacts (`build/`, `dist/`).
Bug Fixes and Minor Improvements:
* `faster_whisper_engine.py`: skips hidden files when globbing for audio files.
* `mfa_engine.py`: better error reporting.
Developer Experience:
* `.pre-commit-config.yaml`: runs the `shredguard` check via `uv` for a consistent development environment.
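The `main.py` startup steps listed under the Windows improvements might look roughly like the sketch below. It is an illustration under assumptions, not the file from this pull request: the helper names `ensure_streams` and `force_utf8` are hypothetical, and `errors="replace"` is one possible choice of error handler.

```python
import multiprocessing
import os
import sys


def ensure_streams() -> None:
    """Give sys.stdout/sys.stderr a usable target when they are missing."""
    # Windowed (no-console) frozen apps can start with these set to None,
    # in which case any print() or logging call would fail.
    if sys.stdout is None:
        sys.stdout = open(os.devnull, "w", encoding="utf-8")
    if sys.stderr is None:
        sys.stderr = open(os.devnull, "w", encoding="utf-8")


def force_utf8(platform: str = sys.platform) -> None:
    """Switch the standard streams to UTF-8 on Windows only."""
    if platform != "win32":
        return
    for stream in (sys.stdout, sys.stderr):
        # Not every stream object supports reconfigure(), so probe for it.
        reconfigure = getattr(stream, "reconfigure", None)
        if reconfigure is not None:
            reconfigure(encoding="utf-8", errors="replace")


if __name__ == "__main__":
    # freeze_support() must run first in the main module of a frozen
    # Windows executable; it has no effect when the app is not frozen.
    multiprocessing.freeze_support()
    ensure_streams()
    force_utf8()
```

The order matters: the streams have to exist before they can be reconfigured, and both steps need to happen before anything is written.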