6 changes: 3 additions & 3 deletions .github/workflows/release.yml
@@ -3,7 +3,7 @@ name: Release Build
 on:
   push:
     branches:
-      - releases
+      - release
     tags:
       - 'v*'
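
To illustrate what this trigger change means in practice, here is a minimal Python sketch of how a `push` event with both `branches` and `tags` filters is evaluated: a branch push is checked against the branch patterns and a tag push against the tag patterns. The helper names `glob_to_regex` and `push_triggers` are hypothetical, and the matcher handles only literal text, `*` and `**`; it is an approximation of GitHub's filter patterns, not their implementation.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified filter pattern into a regex.

    Assumption: only literals, `*` (any characters except `/`) and
    `**` (any characters including `/`) are supported here.
    """
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            parts.append(".*")
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def push_triggers(ref: str, branches=("release",), tags=("v*",)) -> bool:
    """Return True if a push to `ref` would match the filters above."""
    if ref.startswith("refs/heads/"):
        name, patterns = ref[len("refs/heads/"):], branches
    elif ref.startswith("refs/tags/"):
        name, patterns = ref[len("refs/tags/"):], tags
    else:
        return False
    return any(glob_to_regex(p).fullmatch(name) for p in patterns)


if __name__ == "__main__":
    for ref in ("refs/heads/release", "refs/heads/releases", "refs/tags/v1.2.0"):
        print(ref, push_triggers(ref))
```

Under these assumptions, a push to the old `releases` branch no longer starts the workflow after this change, while pushes to `release` and to any `v*` tag still do.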

@@ -79,7 +79,7 @@ jobs:

🍎 **Platform:** macOS (Apple Silicon & Intel)

- Automated build from releases branch.
+ Automated build from release branch.

## Installation
1. Download VoxKit-macOS.app.zip
@@ -95,7 +95,7 @@ jobs:
- Built from commit: ${{ github.sha }}

draft: false
- prerelease: false
+ prerelease: true

- name: Upload .app.zip to Release
uses: actions/upload-release-asset@v1
1 change: 1 addition & 0 deletions ARCHITECTURE.md
@@ -0,0 +1 @@
<TODO>
Empty file added CONTRIBUTING.md
Empty file.
214 changes: 34 additions & 180 deletions README.md
@@ -1,206 +1,60 @@
# TODO : Update README
# 🌉 VoxKit
---
[![Release](https://img.shields.io/github/v/release/BrainBehaviorAnalyticsLab/PyPLLR_GUI?label=Latest%20Release)](https://github.com/BrainBehaviorAnalyticsLab/PyPLLR_GUI/releases/latest) [![Downloads](https://img.shields.io/github/downloads/BrainBehaviorAnalyticsLab/PyPLLR_GUI/total)](https://github.com/BrainBehaviorAnalyticsLab/PyPLLR_GUI/releases) [![Tests](https://img.shields.io/github/actions/workflow/status/BrainBehaviorAnalyticsLab/PyPLLR_GUI/tests.yml?branch=main&label=tests)](https://github.com/BrainBehaviorAnalyticsLab/PyPLLR_GUI/actions/workflows/tests.yml) [![Project Management](https://img.shields.io/badge/Project-Jira%20Board-0052CC?logo=jira)](https://voxkit.atlassian.net/jira/software/projects/VOX/boards/2/)

[![Release](https://img.shields.io/github/v/release/BrainBehaviorAnalyticsLab/PyPLLR_GUI?label=Latest%20Release)](https://github.com/BrainBehaviorAnalyticsLab/PyPLLR_GUI/releases/latest)
[![Downloads](https://img.shields.io/github/downloads/BrainBehaviorAnalyticsLab/PyPLLR_GUI/total)](https://github.com/BrainBehaviorAnalyticsLab/PyPLLR_GUI/releases)
[![Project Management](https://img.shields.io/badge/Project-Jira%20Board-0052CC?logo=jira)](https://voxkit.atlassian.net/jira/software/projects/VOX/boards/2/)

VoxKit gives cross-functional speech pathology research teams access to state-of-the-art speech analysis tools and forced alignment, bridging the gap between cutting-edge AI/ML capabilities and clinical applications in speech pathology.

A modern PyQt6-based graphical user interface for phonetic alignment and PLLR (Phoneme-Level Likelihood Ratios) score extraction. This application provides an intuitive workflow for researchers working with speech data.
---

## Features

- **🎯 Intuitive Workflow**: Step-by-step interface with numbered navigation (1️⃣ Train Aligner → 2️⃣ Predict Alignments → 3️⃣ Extract PLLR Scores)
- **🔄 Multiple Alignment Methods**:
  - **MFA (Montreal Forced Aligner)**: Traditional acoustic model-based alignment
  - **Wav2TextGrid**: Neural network-based alignment using speech embeddings
- **📊 PLLR Score Extraction**: Automatic extraction of phoneme-level likelihood ratios
- **⚡ Background Processing**: Non-blocking operations with progress feedback
- **🎨 Modern UI**: Clean, professional interface with dark theme

## Prerequisites
## Development

### System Requirements
- **Python**: 3.11 or higher
- **Operating System**: macOS, Windows, or Linux
- **Memory**: At least 8GB RAM recommended
- **Storage**: ~5GB free space for models and dependencies
**Orientation:**

### Required Software
- **uv**: Modern Python package manager (`pip install uv`)
- **Conda/Miniconda**: For MFA environment management
- See [ARCHITECTURE](./ARCHITECTURE.md) for codebase terminology...
- See [RESEARCH](./RESEARCH.md) for papers and research background...
- See [CONTRIBUTING](./CONTRIBUTING.md) for contribution guidelines...
- See [Documentation](<TODO>) for the rendered documentation...

## Installation
**Prerequisites:**
- [python](https://www.python.org/downloads/release/python-31114/) programming language
- [uv](https://docs.astral.sh/uv/) package manager
- [git](https://git-scm.com/install/) version control

### 1. Clone the Repository
**Getting started:**
```bash
git clone <repository-url>
# Clone repository
git clone https://github.com/BrainBehaviorAnalyticsLab/PyPLLR_GUI.git
cd PyPLLR_GUI
```

### 2. Install Dependencies
```bash
# Install Python dependencies
uv sync
```

### 3. Set Up MFA

See the MFA installation guide: https://montreal-forced-aligner.readthedocs.io/en/latest/installation.html

## Usage

### Launch the Application
```bash
uv run main.py
```

### Workflow Overview

#### 1️⃣ Train Aligner (Coming Soon)
- Training interface for custom acoustic models
- Currently displays placeholder content

#### 2️⃣ Predict Alignments
1. **Select Alignment Model**:
   - **MFA**: Traditional acoustic model alignment (requires conda environment)
   - **Wav2TextGrid**: Neural network-based alignment (no setup required)

2. **Set Paths**:
   - **Data Corpus Path**: Directory containing WAV files and transcripts
   - **TextGrid Output Path**: Where alignment results will be saved

3. **Run Alignment**:
   - Click "Predict Alignments"
   - Monitor progress in the status area
   - Results saved as TextGrid files

#### 3️⃣ Extract PLLR Scores
1. **Set Input Paths**:
   - **TextGrid Path**: Directory containing alignment TextGrid files
   - **Wav/Lab Path**: Directory containing corresponding WAV files
# As easy as...

2. **Set Output Path**:
   - **Output Path**: Where PLLR scores will be saved
# (1) Browse developer commands
make help

3. **Extract Scores**:
   - Click "Extract PLLR"
   - Generates CSV files with phoneme-level and frame-level probabilities
# (2) Install pre-commit and initialize environment
make setup

## Data Format Requirements

### For Alignment (Step 2)
```
corpus_directory/
├── speaker1/
│   ├── audio1.wav
│   ├── audio1.lab          # Transcript text
│   ├── audio2.wav
│   └── audio2.lab
└── speaker2/
    ├── audio3.wav
    └── audio3.lab
```
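
The alignment layout above pairs each WAV file with a `.lab` transcript of the same name in the same speaker folder. A quick way to check a corpus before running alignment can be sketched as follows; the function name `find_unpaired_wavs` is hypothetical and is not part of the application.

```python
from pathlib import Path


def find_unpaired_wavs(corpus_dir):
    """Return WAV files that have no sibling .lab transcript.

    Assumption: transcripts sit next to their audio and share the
    same stem (audio1.wav <-> audio1.lab), as in the layout above.
    """
    corpus = Path(corpus_dir)
    return sorted(
        wav
        for wav in corpus.rglob("*.wav")
        if not wav.with_suffix(".lab").is_file()
    )


if __name__ == "__main__":
    # Replace the path with your own corpus directory.
    for wav in find_unpaired_wavs("corpus_directory"):
        print(f"missing transcript for {wav}")
```

An empty result means every WAV file found under the corpus directory has a matching transcript.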

### For PLLR Extraction (Step 3)
```
textgrid_directory/
├── audio1.TextGrid # From alignment step
├── audio2.TextGrid
└── audio3.TextGrid

wav_directory/
├── audio1.wav
├── audio2.wav
└── audio3.wav
# (3) Start app (developer mode)
make dev
```

## Output Files

### Alignment Output
- **TextGrid files**: Praat-compatible alignment files with phoneme timestamps
---

### PLLR Output
- **phonewise_proba.csv**: Phoneme-level likelihood ratios
- **framewise_proba.csv**: Frame-level probability distributions

## Troubleshooting

### Common Issues

#### MFA Alignment Fails
**Error**: `Could not find a model named "english_us_arpa"`
**Solution**: Ensure MFA models are downloaded:
```bash
conda run -n aligner mfa model download acoustic english_us_arpa
conda run -n aligner mfa model download dictionary english_us_arpa
```

#### Import Errors
**Error**: `ModuleNotFoundError` for torch/torchaudio
**Solution**: Install specific versions:
```bash
uv pip install torch==2.8.0 torchaudio==2.8.0
```

#### GUI Won't Start
**Error**: PyQt6 import issues
**Solution**: Reinstall dependencies:
```bash
uv sync --reinstall
```

#### Memory Issues
**Problem**: Application crashes with large datasets
**Solution**:
- Process data in smaller batches
- Ensure at least 8GB RAM available
- Close other memory-intensive applications

### Getting Help
1. Check the status messages in the GUI for specific error details
2. Verify all paths exist and are accessible
3. Ensure conda environment is properly activated for MFA operations
## Citation

## Development
If you use VoxKit in your research, please cite:

### Project Structure
```bibtex
<TODO>
```
PyPLLR_GUI/
├── main.py # Main application code
├── pyproject.toml # Project configuration
├── README.md # This file
└── .venv/ # Virtual environment (created by uv)
```

### Key Dependencies
- **PyQt6**: GUI framework
- **PyPLLRComputer**: PLLR score extraction
- **Wav2TextGrid**: Neural alignment
- **Montreal Forced Aligner**: Traditional alignment
- **torch/torchaudio**: Neural network backend

### Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test thoroughly
5. Submit a pull request
---

## License

[TODO: Add license information here]

## Citation

If you use this software in your research, please cite:

```
[TODO: Add citation details here]
```
See [LICENSE](LICENSE) for details...

## Support
---

For issues, questions, or feature requests:
- Open an issue on GitHub
- Check the troubleshooting section above
- Review the dependency documentation for PyPLLRComputer and Wav2TextGrid
>For questions or collaboration inquiries, please open an issue on GitHub.
Empty file added RESEARCH.md
Empty file.