This repository contains Python scripts for analyzing audio signals and extracting features for speech recognition and other machine learning tasks. The code demonstrates recording audio, visualizing waveforms, generating spectrograms, and extracting Mel-Frequency Cepstral Coefficients (MFCCs).
Additionally, this repository includes a detailed example of the Fourier Transform in both LaTeX and PDF formats, providing a mathematical explanation and visualization of how time-domain signals transform into the frequency domain.
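As a rough illustration of the time-to-frequency transformation described above, the following sketch builds a synthetic two-tone signal and recovers its component frequencies with NumPy's FFT. The sample rate, duration, and tone frequencies are assumptions chosen for the example and are not taken from the repository's scripts.

```python
import numpy as np

# Assumed parameters for illustration only
sample_rate = 8000  # samples per second
duration = 1.0      # seconds

# Time axis and a signal made of two sine tones (440 Hz and 1000 Hz)
t = np.arange(int(sample_rate * duration)) / sample_rate
signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

# Real-input FFT: returns only the non-negative frequency bins
spectrum = np.fft.rfft(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / sample_rate)

# Magnitude of each bin, scaled by the number of samples
magnitude = np.abs(spectrum) / len(signal)

# The two strongest bins should sit at the two tone frequencies
top_two = np.sort(freqs[np.argsort(magnitude)[-2:]])
print(top_two)
```

Because the signal lasts exactly one second, the frequency bins are spaced 1 Hz apart and both tones fall exactly on a bin, which keeps spectral leakage out of this toy case.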
- Record and save audio as a WAV file.
- Visualize time-domain waveforms using Matplotlib.
- Generate spectrograms to analyze frequency variations over time.
- Extract MFCCs, a compact representation of an audio signal's short-term spectral envelope, for use as machine learning features.
- Understand the Fourier Transform with provided LaTeX and PDF documentation.
- Speech recognition
- Sentiment analysis
- Speaker identification
- Audio classification
- Python 3.x
- Libraries:
  - numpy
  - matplotlib
  - librosa
  - sounddevice
  - scipy
Install all dependencies with:
pip install -r requirements.txt