A powerful CLI tool for analyzing audio samples, with a focus on ML training data quality checks.
Features:

- Comprehensive Audio Analysis: Extract a wide range of audio features (see the brief librosa sketch after this list), including:
  - MFCCs (Mel-frequency cepstral coefficients)
  - Spectral features (centroid, rolloff, flux, bandwidth)
  - Fundamental frequency (F0/pitch) analysis
  - Zero-crossing rate (ZCR)
  - Formants (F1, F2, F3)
  - Voice quality metrics (jitter, shimmer, HNR)
  - Energy metrics (RMS, peak levels, crest factor, energy entropy)
  - Quality indicators (SNR estimate, clipping detection, DC offset, speech-to-silence ratio)
- Rich Terminal UI: Beautiful, formatted output using the rich library
- Visualization: ASCII spectrogram in the terminal plus high-resolution PNG export
- Batch Processing: Analyze entire directories of audio files
- CSV Export: Export batch results for further analysis
- Multiple Formats: Support for WAV, MP3, FLAC, OGG, M4A
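Most of these features correspond to standard librosa calls. The snippet below is only a rough illustration of that kind of extraction, not audiolens's internal implementation; the file name and parameter values are placeholders.

```python
# Illustrative feature extraction with librosa; audiolens's internal pipeline
# and parameter choices may differ.
import numpy as np
import librosa

y, sr = librosa.load("audio.wav", sr=None)  # keep the file's native sample rate

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)        # (13, n_frames)
centroid = librosa.feature.spectral_centroid(y=y, sr=sr)  # per-frame spectral centroid (Hz)
zcr = librosa.feature.zero_crossing_rate(y)               # per-frame zero-crossing rate
rms = librosa.feature.rms(y=y)                            # per-frame RMS energy
f0 = librosa.yin(y, fmin=librosa.note_to_hz("C2"),
                 fmax=librosa.note_to_hz("C7"), sr=sr)    # fundamental frequency track

print("MFCC means:", np.round(mfcc.mean(axis=1), 2))
print(f"Spectral centroid: {centroid.mean():.1f} Hz")
print(f"Mean F0: {np.mean(f0):.1f} Hz")
```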
Install with pipx (recommended):

```bash
# Install pipx if you don't have it
pip install --user pipx

# Install audiolens globally
pipx install git+https://github.com/yourusername/audiolens.git

# Or from local source
pipx install /path/to/audiolens
```

Why pipx? It installs CLI tools in isolated environments, avoiding dependency conflicts.
Or, if the package is published on PyPI:

```bash
pipx install audiolens
```

To work on audiolens itself, install from source in a virtual environment:

```bash
git clone https://github.com/yourusername/audiolens.git
cd audiolens

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in editable mode
pip install -e .
```

A plain pip install from a clone also works:

```bash
# Clone and install
git clone https://github.com/yourusername/audiolens.git
cd audiolens
pip install --user .
```

audiolens requires Python 3.8+ and the following packages:
- librosa - Audio feature extraction
- numpy - Numerical computing
- scipy - Scientific computing
- rich - Terminal UI
- soundfile - Audio I/O
- matplotlib - Visualization
- praat-parselmouth - Voice quality metrics (optional but recommended)
All dependencies are installed automatically.
Analyze a single file:

```bash
audiolens analyze audio.wav
```

Add `--visualize` for a spectrogram:

```bash
audiolens analyze audio.wav --visualize
```

This displays an ASCII spectrogram in your terminal and saves a high-resolution PNG image.
Batch-process a directory, optionally exporting the results to CSV:

```bash
audiolens analyze /path/to/audio/files/

audiolens analyze /path/to/audio/files/ --output results.csv
```

Analysis parameters can be tuned from the command line:

```bash
audiolens analyze audio.wav --n-mfcc 20 --sample-rate 22050 --hop-length 256
```

Full command reference:

```text
audiolens analyze [OPTIONS] INPUT

Arguments:
  INPUT                Input audio file or directory

Options:
  -o, --output PATH    Output CSV file for batch results
  -v, --visualize      Display spectrogram visualization
  --sample-rate INT    Target sample rate (default: use file's native rate)
  --n-mfcc INT         Number of MFCCs to compute (default: 13)
  --hop-length INT     Hop length for frame-based analysis (default: 512)
  --n-fft INT          FFT window size (default: 2048)
  --formats STR        Comma-separated audio formats (default: wav,mp3,flac,ogg,m4a)
  --help               Show help message
```
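If you want to drive the CLI from a Python script (for example, as part of a data-prep pipeline), a minimal sketch using only the options documented above could look like this; the dataset path and output file name are placeholders.

```python
# Hypothetical wrapper script: invoke the audiolens CLI via subprocess.
# The dataset path and output file name are placeholders.
import subprocess
from pathlib import Path

dataset = Path("./dataset/train")

subprocess.run(
    [
        "audiolens", "analyze", str(dataset),
        "--output", "train_analysis.csv",
        "--formats", "wav",
    ],
    check=True,  # raise CalledProcessError if the analysis fails
)
```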
Each analysis reports the following metrics:

- Duration
- Sample rate
- Number of channels
- Spectral centroid (mean & std)
- Spectral rolloff (mean & std)
- Spectral bandwidth (mean & std)
- Spectral flux (mean & std)
- F0 fundamental frequency (mean, std, min, max)
- Formants F1, F2, F3 (mean values)
- Jitter (pitch variation)
- Shimmer (amplitude variation)
- HNR (Harmonics-to-Noise Ratio)
- RMS energy (mean & std)
- Peak amplitude
- Crest factor
- Energy entropy
- Zero-crossing rate
- SNR estimate
- Speech-to-silence ratio
- Clipping detection & percentage
- DC offset
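The quality indicators near the end of this list have simple signal-level definitions. The sketch below is one illustrative way to compute a few of them with numpy and soundfile; the exact formulas and thresholds audiolens uses may differ.

```python
# Illustrative quality indicators; audiolens's exact formulas may differ.
import numpy as np
import soundfile as sf

y, sr = sf.read("audio.wav")
if y.ndim > 1:
    y = y.mean(axis=1)  # mix down multi-channel audio for simplicity

dc_offset = float(np.mean(y))
peak = float(np.max(np.abs(y)))
rms = float(np.sqrt(np.mean(y ** 2)))
crest_factor = peak / rms if rms > 0 else float("inf")

# Treat samples at (or extremely close to) full scale as clipped
clipping_pct = 100.0 * float(np.mean(np.abs(y) >= 0.999))

print(f"DC offset:      {dc_offset:+.5f}")
print(f"Peak amplitude: {peak:.3f}")
print(f"Crest factor:   {crest_factor:.2f}")
print(f"Clipping:       {clipping_pct:.3f}% of samples")
```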
Screen an ML training dataset for common quality issues:

```bash
# Analyze all wav files in a directory and export to CSV
audiolens analyze ./dataset/train/ --output train_analysis.csv --formats wav
# Check for common issues
# The output will highlight:
# - Low SNR (< 10 dB = poor quality)
# - Clipping (> 0.1% clipped samples)
# - High DC offset (> 0.01)
```

Inspect a single speech recording in detail:

```bash
# Analyze a single speech file with full voice quality metrics
audiolens analyze speech.wav --visualize
# This will show:
# - Pitch characteristics (F0)
# - Formants (vowel quality)
# - Jitter & shimmer (voice stability)
# - HNR (voice clarity)
```
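The jitter, shimmer, and HNR values shown here come from Praat-style analysis (praat-parselmouth is listed as an optional dependency). As a rough illustration, this sketch applies the standard Praat recipes via parselmouth; the settings shown are Praat defaults, not necessarily the ones audiolens uses.

```python
# Standard Praat voice-quality recipe via praat-parselmouth (illustrative only;
# audiolens may use different settings internally).
import parselmouth
from parselmouth.praat import call

snd = parselmouth.Sound("speech.wav")
pitch = snd.to_pitch()
pulses = call([snd, pitch], "To PointProcess (cc)")

jitter = call(pulses, "Get jitter (local)", 0, 0, 0.0001, 0.02, 1.3)
shimmer = call([snd, pulses], "Get shimmer (local)", 0, 0, 0.0001, 0.02, 1.3, 1.6)

harmonicity = snd.to_harmonicity_cc()
hnr = call(harmonicity, "Get mean", 0, 0)

print(f"Jitter (local):  {jitter:.4f}")
print(f"Shimmer (local): {shimmer:.4f}")
print(f"Mean HNR:        {hnr:.1f} dB")
```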
Batch-analyze a set of recordings and compare them:

```bash
# Batch analyze and compare
audiolens analyze ./recordings/ --output comparison.csv
# Open comparison.csv in Excel/pandas to compare:
# - SNR across files
# - Spectral characteristics
# - Clipping issues
# - Energy levels
```

A short pandas sketch for screening this CSV appears after the use-case list below.

Typical use cases:

- Detect clipping, noise, and other artifacts
- Ensure consistent audio quality across dataset
- Identify outliers based on spectral features
- Validate SNR requirements
- Analyze pitch and formants
- Measure voice quality (jitter, shimmer, HNR)
- Detect speech vs. silence regions
- Verify audio processing pipelines
- Check for DC offset after processing
- Ensure no clipping introduced
- Validate spectral characteristics
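As referenced in the batch example above, a minimal pandas sketch for screening an exported CSV might look like the following. The column names used here ("file", "snr_db", "clipping_pct", "dc_offset") are assumptions for illustration; check the header of your own CSV for the actual names.

```python
# Hypothetical screening of audiolens batch results; the column names used
# here are assumptions, so check your CSV header for the real ones.
import pandas as pd

df = pd.read_csv("comparison.csv")

low_snr = df[df["snr_db"] < 10]               # poor-quality recordings
clipped = df[df["clipping_pct"] > 0.1]        # files with clipping
dc_issues = df[df["dc_offset"].abs() > 0.01]  # suspicious DC offset

print(f"{len(low_snr)} files below 10 dB SNR")
print(f"{len(clipped)} files with more than 0.1% clipped samples")
print(f"{len(dc_issues)} files with |DC offset| above 0.01")
print(low_snr[["file", "snr_db"]].sort_values("snr_db").head())
```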
Run the development checks with:

```bash
# Run tests
pytest tests/

# Format code
black src/

# Type check
mypy src/
```

Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Built with librosa for audio analysis
- Uses Praat-Parselmouth for voice quality features
- Terminal UI powered by rich
If you use audiolens in your research, please cite:
```bibtex
@software{audiolens,
  title  = {audiolens: Audio Analysis for ML Training Data},
  author = {Your Name},
  year   = {2025},
  url    = {https://github.com/yourusername/audiolens}
}
```

Planned improvements:

- Add more quality metrics (PESQ, STOI, etc.)
- Support for multi-channel analysis
- Integration with audio augmentation libraries
- Interactive TUI mode
- Plugin system for custom analyzers
- Real-time audio analysis mode
- Web-based dashboard for batch results