hhheath/audiolens

🎵 audiolens

A CLI tool for analyzing audio samples, focused on quality checks for ML training data.

Features

  • Comprehensive Audio Analysis: Extract a wide range of audio features including:

    • MFCCs (Mel-frequency cepstral coefficients)
    • Spectral features (centroid, rolloff, flux, bandwidth)
    • Fundamental frequency (F0/pitch) analysis
    • Zero-crossing rate (ZCR)
    • Formants (F1, F2, F3)
    • Voice quality metrics (jitter, shimmer, HNR)
    • Energy metrics (RMS, peak levels, crest factor, energy entropy)
    • Quality indicators (SNR estimate, clipping detection, DC offset, speech-to-silence ratio)
  • Rich Terminal UI: Beautiful, formatted output using the rich library

  • Visualization: ASCII spectrogram in terminal + high-resolution PNG export

  • Batch Processing: Analyze entire directories of audio files

  • CSV Export: Export batch results for further analysis

  • Multiple Formats: Support for WAV, MP3, FLAC, OGG, M4A

Installation

Recommended: Using pipx (for global CLI access)

# Install pipx if you don't have it
pip install --user pipx

# Install audiolens globally
pipx install git+https://github.com/hhheath/audiolens.git

# Or from local source
pipx install /path/to/audiolens

Why pipx? It installs each CLI tool in its own isolated environment, which avoids dependency conflicts with other Python packages.

Alternative: From PyPI (once published)

pipx install audiolens

For Development

git clone https://github.com/hhheath/audiolens.git
cd audiolens

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in editable mode
pip install -e .

Manual Installation

# Clone and install
git clone https://github.com/hhheath/audiolens.git
cd audiolens
pip install --user .

Dependencies

audiolens requires Python 3.8+ and the following packages:

  • librosa - Audio feature extraction
  • numpy - Numerical computing
  • scipy - Scientific computing
  • rich - Terminal UI
  • soundfile - Audio I/O
  • matplotlib - Visualization
  • praat-parselmouth - Voice quality metrics (optional but recommended)

All dependencies are installed automatically.

Usage

Analyze a single audio file

audiolens analyze audio.wav

Analyze with spectrogram visualization

audiolens analyze audio.wav --visualize

This displays an ASCII spectrogram in your terminal and saves a high-resolution PNG image.

Batch analyze a directory

audiolens analyze /path/to/audio/files/

Export batch results to CSV

audiolens analyze /path/to/audio/files/ --output results.csv

Custom analysis parameters

audiolens analyze audio.wav --n-mfcc 20 --sample-rate 22050 --hop-length 256

CLI Options

audiolens analyze [OPTIONS] INPUT

Arguments:
  INPUT                Input audio file or directory

Options:
  -o, --output PATH    Output CSV file for batch results
  -v, --visualize      Display spectrogram visualization
  --sample-rate INT    Target sample rate (default: use file's native rate)
  --n-mfcc INT         Number of MFCCs to compute (default: 13)
  --hop-length INT     Hop length for frame-based analysis (default: 512)
  --n-fft INT          FFT window size (default: 2048)
  --formats STR        Comma-separated audio formats (default: wav,mp3,flac,ogg,m4a)
  --help               Show help message
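
The --hop-length, --n-fft and --sample-rate options together set the time and frequency resolution of every frame-based feature. The sketch below works through that arithmetic. The frame-count formula assumes centered framing as used by default in librosa; whether audiolens frames audio the same way is an assumption, and `frame_stats` is a hypothetical helper, not part of the tool.

```python
def frame_stats(n_samples, sample_rate, hop_length=512, n_fft=2048):
    """Relate STFT parameters to time and frequency resolution."""
    return {
        # Centered framing yields one frame per hop, plus one.
        "n_frames": 1 + n_samples // hop_length,
        # Time step between consecutive frames, in milliseconds.
        "hop_ms": 1000.0 * hop_length / sample_rate,
        # Duration covered by one analysis window, in milliseconds.
        "window_ms": 1000.0 * n_fft / sample_rate,
        # Spacing between adjacent FFT bins, in Hz.
        "bin_hz": sample_rate / n_fft,
    }


# Three seconds of audio at 22050 Hz with --hop-length 256.
stats = frame_stats(n_samples=22050 * 3, sample_rate=22050, hop_length=256)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Halving the hop length roughly doubles the number of frames, while a larger FFT window narrows the bin spacing at the cost of time resolution.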

Output Features

File Information

  • Duration
  • Sample rate
  • Number of channels

Spectral Features

  • Spectral centroid (mean & std)
  • Spectral rolloff (mean & std)
  • Spectral bandwidth (mean & std)
  • Spectral flux (mean & std)
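
As a rough illustration of what the spectral centroid measures, the sketch below computes the magnitude-weighted mean frequency of one frame. It uses a naive DFT with no window function so that it runs on the standard library alone; audiolens itself presumably relies on librosa, and the function names here are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (only suitable for short frames)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_centroid(frame, sample_rate):
    """Magnitude-weighted mean frequency of one frame, in Hz."""
    mags = magnitude_spectrum(frame)
    n = len(frame)
    freqs = [k * sample_rate / n for k in range(len(mags))]
    total = sum(mags)
    if total == 0:
        return 0.0  # silent frame: centroid is undefined, report 0
    return sum(f * m for f, m in zip(freqs, mags)) / total


# A 1 kHz sine sampled at 8 kHz falls exactly on DFT bin 8 of a 64-sample frame.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(64)]
centroid = spectral_centroid(frame, sample_rate)
print(f"centroid: {centroid:.1f} Hz")
```

The reported mean and std are then statistics of this per-frame value across the whole file.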

Pitch & Voice Quality

  • F0 fundamental frequency (mean, std, min, max)
  • Formants F1, F2, F3 (mean values)
  • Jitter (cycle-to-cycle variation in pitch period)
  • Shimmer (cycle-to-cycle variation in amplitude)
  • HNR (Harmonics-to-Noise Ratio)
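
Jitter and shimmer are both commonly defined as a "local" perturbation: the mean absolute difference between consecutive cycles, divided by the mean value. The sketch below applies that definition to made-up cycle measurements. It is not the code audiolens uses (those metrics are attributed to praat-parselmouth above), and `local_perturbation` is a hypothetical name.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Made-up per-cycle measurements for illustration only.
periods = [0.0100, 0.0102, 0.0099, 0.0101]  # seconds per glottal cycle
peak_amps = [0.50, 0.52, 0.49, 0.51]        # peak amplitude per cycle

jitter = local_perturbation(periods)
shimmer = local_perturbation(peak_amps)
print(f"jitter: {100 * jitter:.2f}%  shimmer: {100 * shimmer:.2f}%")
```

Lower values indicate a steadier voice; perfectly periodic input gives 0 for both.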

Energy & Dynamics

  • RMS energy (mean & std)
  • Peak amplitude
  • Crest factor
  • Energy entropy
  • Zero-crossing rate
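
The energy metrics above have simple textbook definitions, sketched below for a whole signal using only the standard library. audiolens may compute them per frame or with different conventions, so treat `energy_metrics` as a hypothetical illustration.

```python
import math


def energy_metrics(samples):
    """Whole-signal RMS, peak, crest factor and zero-crossing rate."""
    n = len(samples)
    rms = math.sqrt(sum(x * x for x in samples) / n)
    peak = max(abs(x) for x in samples)
    # Crest factor: how far the peak rises above the average level.
    crest = peak / rms if rms > 0 else float("inf")
    # Count sign changes between neighbouring samples.
    crossings = sum(1 for a, b in zip(samples, samples[1:])
                    if (a >= 0) != (b >= 0))
    return {"rms": rms, "peak": peak, "crest_factor": crest,
            "zcr": crossings / (n - 1)}


# One second of a 100 Hz sine at half scale, sampled at 8 kHz.
sr = 8000
sine = [0.5 * math.sin(2 * math.pi * 100 * t / sr) for t in range(sr)]
metrics = energy_metrics(sine)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A pure sine has a crest factor of about 1.414 (the square root of 2); heavily compressed or clipped audio moves toward 1, while very dynamic recordings score much higher.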

Quality Indicators

  • SNR estimate
  • Speech-to-silence ratio
  • Clipping detection & percentage
  • DC offset
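
Two of these indicators are easy to reproduce by hand. The sketch below assumes samples normalized to the range [-1, 1] and treats anything at or above 0.999 in magnitude as clipped; that threshold and both function names are assumptions, not audiolens internals.

```python
def clipping_percentage(samples, threshold=0.999):
    """Percentage of samples whose magnitude reaches the clip threshold."""
    clipped = sum(1 for x in samples if abs(x) >= threshold)
    return 100.0 * clipped / len(samples)


def dc_offset(samples):
    """Mean sample value; a healthy recording sits close to 0."""
    return sum(samples) / len(samples)


# Tiny made-up signal: two full-scale samples and a positive bias.
signal = [0.2, 1.0, -1.0, 0.3, 0.1, 0.0, 0.25, 0.15]
clip_pct = clipping_percentage(signal)
offset = dc_offset(signal)
print(f"clipping: {clip_pct:.1f}%  dc offset: {offset:.3f}")
```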

Examples

Check audio quality for ML dataset

# Analyze all wav files in a directory and export to CSV
audiolens analyze ./dataset/train/ --output train_analysis.csv --formats wav

# Check for common issues
# The output will highlight:
# - Low SNR (< 10 dB = poor quality)
# - Clipping (> 0.1% clipped samples)
# - High DC offset (> 0.01)
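
The thresholds listed above can also be applied to the exported CSV in a short script. This sketch uses the standard csv module and an in-memory sample so it runs as-is. The column names (`file`, `snr_db`, `clipping_pct`, `dc_offset`) are hypothetical, so check the header of your own export and adjust them.

```python
import csv
import io


def flag_issues(rows, min_snr_db=10.0, max_clip_pct=0.1, max_dc=0.01):
    """Map each problematic file to the list of thresholds it violates."""
    flagged = {}
    for row in rows:
        issues = []
        if float(row["snr_db"]) < min_snr_db:
            issues.append("low SNR")
        if float(row["clipping_pct"]) > max_clip_pct:
            issues.append("clipping")
        if abs(float(row["dc_offset"])) > max_dc:
            issues.append("high DC offset")
        if issues:
            flagged[row["file"]] = issues
    return flagged


# Stand-in for the exported file; the values are made up.
sample_csv = io.StringIO(
    "file,snr_db,clipping_pct,dc_offset\n"
    "a.wav,25.3,0.0,0.001\n"
    "b.wav,7.9,0.4,0.002\n"
    "c.wav,18.0,0.0,-0.05\n"
)
flagged = flag_issues(csv.DictReader(sample_csv))
for name, issues in flagged.items():
    print(f"{name}: {', '.join(issues)}")
```

To run it against a real export, replace `sample_csv` with `open("train_analysis.csv", newline="")`.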

Analyze speech recordings

# Analyze a single speech file with full voice quality metrics
audiolens analyze speech.wav --visualize

# This will show:
# - Pitch characteristics (F0)
# - Formants (vowel quality)
# - Jitter & shimmer (voice stability)
# - HNR (voice clarity)

Compare audio files

# Batch analyze and compare
audiolens analyze ./recordings/ --output comparison.csv

# Open comparison.csv in Excel/pandas to compare:
# - SNR across files
# - Spectral characteristics
# - Clipping issues
# - Energy levels

Use Cases

ML Training Data Quality

  • Detect clipping, noise, and other artifacts
  • Ensure consistent audio quality across dataset
  • Identify outliers based on spectral features
  • Validate SNR requirements
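
One simple way to identify outliers from batch results is a z-score over a single feature, such as mean spectral centroid. The sketch below is a generic technique, not something audiolens performs itself; the file names, values, and the threshold of 2.0 are made up for illustration.

```python
import statistics


def zscore_outliers(values_by_file, threshold=2.0):
    """Return files whose value lies more than `threshold` std devs from the mean."""
    values = list(values_by_file.values())
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return []  # all files identical: nothing stands out
    return sorted(name for name, value in values_by_file.items()
                  if abs(value - mean) / stdev > threshold)


# Made-up mean spectral centroids (Hz) for six files.
centroids_hz = {
    "a.wav": 1500, "b.wav": 1550, "c.wav": 1480,
    "d.wav": 1520, "e.wav": 1510, "f.wav": 4200,
}
outliers = zscore_outliers(centroids_hz)
print(outliers)
```

With very small datasets a single extreme value inflates the standard deviation, so a median-based measure may be more robust in practice.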

Speech Analysis

  • Analyze pitch and formants
  • Measure voice quality (jitter, shimmer, HNR)
  • Detect speech vs. silence regions

Audio Processing Validation

  • Verify audio processing pipelines
  • Check for DC offset after processing
  • Ensure no clipping introduced
  • Validate spectral characteristics

Development

Running tests

pytest tests/

Code formatting

black src/

Type checking

mypy src/

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit your changes (git commit -m 'Add some AmazingFeature')
  4. Push to the branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Citation

If you use audiolens in your research, please cite:

@software{audiolens,
  title = {audiolens: Audio Analysis for ML Training Data},
  author = {Your Name},
  year = {2025},
  url = {https://github.com/hhheath/audiolens}
}

Roadmap

  • Add more quality metrics (PESQ, STOI, etc.)
  • Support for multi-channel analysis
  • Integration with audio augmentation libraries
  • Interactive TUI mode
  • Plugin system for custom analyzers
  • Real-time audio analysis mode
  • Web-based dashboard for batch results
