SONAR currently depends on `fairseq2>=0.5.2` for transformer architecture, tokenization, speech models, data loading, sequence generation, and model registry. This limits adoption.
Goal: replace all fairseq2 usage with PyTorch-native implementations and/or HuggingFace transformers, so that SONAR is installable with a standard `pip install` and compatible with a wider range of environments. This should also make SONAR easier to use for embedding-only downstream applications, while keeping the larger migration incremental.