evie-8/README.md

Nafula Evelyn Ouma

Data Engineer · Kampala, Uganda

I build data pipelines and storage systems for large-scale speech and NLP datasets, specializing in low-resource African languages. Data contributor on 4 published research papers. Pursuing GCP Associate Cloud Engineer certification.


Current focus

  • Automated ETL pipeline processing 12,000+ hours of speech data across 51 African languages
  • Scaling text data preparation from 40 to 75 African languages for LLM training

Stack: Python · SQL · GCP (BigQuery, Cloud Storage) · HuggingFace Datasets · YAML · pandas · Docker

Domain: Speech data · NLP · Low-resource African languages · Audio processing · ASR


Research contributions

| # | Paper | Venue | Year |
|---|---|---|---|
| 01 | How Much Speech Data Is Necessary for ASR in African Languages? | arXiv | 2025 |
| 02 | Sunflower: Expanding Coverage of African Languages in LLMs | arXiv | 2025 |
| 03 | SALT-31: A Machine Translation Benchmark for 31 Ugandan Languages | OpenReview | 2026 |
| 04 | Noise Mapping and Ambient Sound Recordings in Urban Uganda | ResearchGate | 2026 |

📧 info@evelynouma.com · LinkedIn · Portfolio

Pinned

  1. upgrade-lms Public

    Learning Management System

    TypeScript

  2. printf Public

    Creating our own printf function

    C

  3. magasin-ecommerce-store Public

    E-commerce website

    TypeScript

  4. ifiok1equere/simple_shell Public

    Project collaboration with Evelyn on building a simple shell.

    C

  5. SunbirdAI/kinyarwanda-asr-hackathon Public

    Entry to the ASR Kinyarwanda hackathon organised by Digital Umuganda

    Jupyter Notebook 1

  6. SunbirdAI/sunbird-ai-api Public

    Sunbird's API for language translation, speech to text and text to speech in African languages and English

    Python 9 5