VoiceBridge: Personalized Speech Assistant

VoiceBridge is an offline Android assistant for users with speech impairment (dysarthria and related conditions). It fine-tunes Gemma 4 on a single user's voice and runs on-device for private, low-latency voice control.

This repository is organized so a 3-person team can execute the hackathon plan in parallel:

ML Lead: dataset prep, Unsloth fine-tuning, WER/CER benchmarking, Hugging Face publish
Android Lead: data collection app, inference app, action dispatcher, latency benchmarks
Product/Video Lead: user studies, impact proof, video, writeup, submission packaging

Repository Structure

.
├── android/                       # Android Studio project
├── benchmarks/                    # WER/CER + latency report tooling
├── data/                          # 195-sentence prompt list and notes
├── docs/                          # execution and submission checklists
├── scripts/
├── training/                      # Kaggle/Colab training pipeline
├── web_demo/                      # browser demo and data collection
├── kaggle_writeup_template.md
├── model_card.md
└── speech_assistant_hackathon_guide.md

Quick Start

0) If you do not have Android Studio or a local GPU

Use the fallback path:

docs/no_android_low_compute_path.md

This gives you:

browser-based data collection (web_demo/),
typed command demo for video and testing,
cloud-only training (Kaggle/Colab) with lighter model options.

1) ML setup (Kaggle preferred)

Open training/train.ipynb in Kaggle and run all cells in order.

If running locally:

python3 -m venv .venv
source .venv/bin/activate
pip install -r training/requirements.txt
python3 training/train.py --help
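Before fine-tuning, the recordings need a fixed train/validation split so that WER/CER numbers stay comparable across runs. The following is a minimal sketch, not the repository's actual pipeline: `split_manifest`, the seed, and the `(audio path, transcript)` pair layout are all assumptions for illustration. The 800-item example mirrors the 720/80 sample counts used in the benchmark command in section 3.

```python
import random


def split_manifest(items, val_fraction=0.1, seed=13):
    """Shuffle items with a fixed seed and return (train, val) lists.

    Hypothetical helper: the name, default fraction, and seed are
    illustrative and are not taken from training/train.py.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be strictly between 0 and 1")
    shuffled = list(items)
    # A dedicated Random instance keeps the split reproducible without
    # touching the global random state.
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


if __name__ == "__main__":
    # Hypothetical (audio path, transcript) pairs standing in for real recordings.
    recordings = [(f"rec_{i:04d}.wav", f"prompt {i % 195}") for i in range(800)]
    train, val = split_manifest(recordings)
    print(len(train), len(val))
```

If the same sentence is recorded several times, splitting by sentence rather than by recording may give a more honest validation score, since the model would otherwise be evaluated on text it has already seen.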

2) Android setup

Open android/ in Android Studio (Meerkat or newer).
Sync Gradle.
Connect a physical Android device.
Run the app module.
Set MODEL_URL in android/app/src/main/java/com/voicebridge/speechassist/model/ModelDownloadConfig.kt, then rebuild so the app picks up the new URL.
Use Download Model in the app (or preload the model file manually).
Use DataCollectionActivity to capture sentence recordings.
Use typed demo mode for reliable command walkthroughs during integration.

2b) Browser demo setup (no Android tooling required)

python3 -m venv .venv
source .venv/bin/activate
pip install -r web_demo/requirements.txt
python3 web_demo/app.py

Then open http://localhost:7860.

3) Benchmarks

python3 benchmarks/benchmark_report.py \
  --baseline-wer 0.72 \
  --baseline-cer 0.55 \
  --finetuned-wer 0.14 \
  --finetuned-cer 0.09 \
  --model-name "gemma-4-e4b personalized" \
  --user-description "Adult with moderate dysarthria" \
  --num-train-samples 720 \
  --num-val-samples 80 \
  --inference-latency-ms 820 \
  --device-name "OnePlus 12"
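The WER and CER values passed to the report are edit-distance ratios: the number of word-level (or character-level) substitutions, insertions, and deletions needed to turn the model output into the reference, divided by the reference length. The sketch below shows one common way to compute them and to express the gain as a relative reduction. It is an illustration under that standard definition. Whether `benchmarks/benchmark_report.py` computes these values itself is not stated here, and the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (unit cost per edit)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edit distance over reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate: character-level edit distance over reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(baseline, finetuned):
    """Fraction of the baseline error removed by fine-tuning."""
    return (baseline - finetuned) / baseline
```

With the placeholder values from the command above, `relative_reduction(0.72, 0.14)` is about 0.81 for WER and `relative_reduction(0.55, 0.09)` is about 0.84 for CER.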

Week 1-4 Operating Plan

Use:

docs/hackathon_execution_checklist.md
docs/submission_checklist.md
docs/no_android_low_compute_path.md

These documents are scoped to a 4-week sprint and mirror speech_assistant_hackathon_guide.md.

Critical Notes

Keep all personally identifiable or sensitive user audio private.
Get explicit consent from the user featured in demo/video/writeup.
For on-device inference, place or download the quantized model into the app's external files directory.
The inference engine uses the LiteRT-LM Engine/Conversation runtime, with a legacy fallback.
Always validate on the target device, because audio backend and model support vary by model export and runtime version.

Deliverables

By submission day, ensure these are ready:

Public GitHub repository
Public Hugging Face adapter + GGUF
benchmarks/benchmark_report.json
Unlisted YouTube demo under 3 minutes
Final Kaggle writeup (based on template)
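The in-repository part of this list can be checked automatically before submission. The following is a small sketch, assuming the repository root as the working directory. `check_deliverables` and the `REQUIRED_FILES` list are hypothetical and cover only files named in this README. The Hugging Face, YouTube, and Kaggle items still need a manual check.

```python
import json
from pathlib import Path

# Assumed set of files to verify; adjust to match the final submission.
REQUIRED_FILES = [
    "README.md",
    "model_card.md",
    "benchmarks/benchmark_report.json",
]


def check_deliverables(repo_root="."):
    """Return a list of problems; an empty list means every check passed."""
    root = Path(repo_root)
    problems = []
    for rel in REQUIRED_FILES:
        path = root / rel
        if not path.is_file():
            problems.append(f"missing: {rel}")
        elif path.suffix == ".json":
            # Only confirms the report parses; it does not validate its fields.
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError as exc:
                problems.append(f"invalid JSON in {rel}: {exc}")
    return problems


if __name__ == "__main__":
    for problem in check_deliverables():
        print(problem)
```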
