This project combines audio transcription and smart editing into a convenient end-to-end user tool. The smart editing feature parses the transcription, adds appropriate paragraph breaks and corrects grammatical / sentence errors. Users are able to manually edit the text to their liking at any stage of processing. It uses the following models:
- Transcription: OpenAI's Whisper model
- Post-transcription smart editing: Microsoft's Phi-3 Mini 128K Instruct OpenRouter.AI API
- Clone the repository.
- Install python3 and pip.
- Inside the project root folder, run
python -m venv venvto create a virtual environment. - Activate the virtual environment: run
source ./venv/bin/activate. - Inside the project root folder, run
pip install -r requirements.txtto install the necessary packages. - To download the Whisper model, run
pip install --upgrade --no-deps --force-reinstall git+https://github.com/openai/whisper.git. Follow the instructions on https://github.com/openai/whisper for more information.
- Run
python -m flask --app tool run --port 8000 - Open the web app on
localhost:8000in your Chrome browser.
- Flask
- Python
- HTML
- JavaScript
- CSS