Whisper GUI

Transcribe your audio locally with Whisper, the speech-to-text model by OpenAI.

Installation

macOS

Download the DMG image from the releases page and open it. You can run the application straight from the image or copy it to the Applications folder for quicker access.

The first time you open the DMG image or the application, you will most likely get an untrusted-developer warning. Because the project is non-commercial, I have neither the funding nor the wish to register with Apple as a developer. To bypass the warning and use the application:

Open System Settings
Navigate to the Privacy & Security settings
Allow the application to run (enter your password if prompted)

To uninstall the application, move the .app file and ~/.cache/whisper to the Trash.

Building from source

Install uv
Clone repository

git clone https://github.com/loginchik/whisper-app.git
cd whisper-app

Configure virtual environment

uv sync --all-groups

Build app

make build_english

Usage

An internet connection is required only to download a model the first time it is used; everything else runs locally on your machine.

Launch application

Choose a model and add the audio files you want to process.

Choose model

tiny, base and small are powerful enough for most tasks and run on almost any PC with at least 4 GB of RAM. For more demanding cases you can try larger models, but keep your machine's resource limits in mind. You can always check About models or Whisper to review a model's requirements.

Previously used models are ready to use almost immediately, unless you have manually deleted them from the cache directory. Models that must be downloaded first are marked with a red icon.

For example, when tiny and base are available locally, small, medium and large are marked and will be downloaded before first use.
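The same check can be approximated outside the app by looking into the cache directory. This is a minimal sketch, assuming each downloaded model is stored as a single `<name>.pt` file directly in `~/.cache/whisper` (the default location used by the openai-whisper package; the app may organise its cache differently). The function name `cached_models` is hypothetical and not part of the project.

```python
from pathlib import Path


def cached_models(cache_dir: Path = Path.home() / ".cache" / "whisper") -> list[str]:
    """Return the names of model files found in the cache directory.

    Assumption: each downloaded model is a single `<name>.pt` file.
    """
    if not cache_dir.is_dir():
        return []  # nothing has been downloaded yet
    return sorted(p.stem for p in cache_dir.glob("*.pt"))


if __name__ == "__main__":
    # Models missing from this list would have to be downloaded first.
    print(cached_models())
```

Any model whose name does not appear in the returned list would need an internet connection on first use.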

Add audio files and configure task

For each audio file, it is recommended to specify the language so that Whisper starts with relevant context. Presets are predefined task settings (created by ChatGPT) that can help you handle common tasks. If you are not sure which preset to choose, use universal.

Besides presets, there are separate options you can set manually:

Setting	Purpose	Effect on performance
Word timestamps	Force Whisper to export every word's timing and probability in the resulting Excel file	Decreases performance and increases processing time
Prompt	Give Whisper additional context about the audio contents	A well-formed prompt usually improves transcription quality
Condition on previous text	Take previously processed audio files into consideration when transcribing this one	May lead to hallucinations
FP16	Run inference in half precision (16-bit floats)	More energy-efficient mode, which can worsen transcription quality
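For readers who know the openai-whisper Python package, the settings in the table resemble arguments of its `transcribe` method. The sketch below is an illustration under that assumption, not the application's actual code: the helper names `build_transcribe_options` and `transcribe` are hypothetical, and the app's real mapping may differ.

```python
from typing import Any, Optional


def build_transcribe_options(
    language: Optional[str] = None,
    word_timestamps: bool = False,
    prompt: Optional[str] = None,
    condition_on_previous_text: bool = True,
    fp16: bool = True,
) -> dict[str, Any]:
    """Collect the settings from the table into keyword arguments.

    Assumption: the keys mirror the arguments of `transcribe` in the
    openai-whisper package; the app's real mapping may differ.
    """
    options: dict[str, Any] = {
        "word_timestamps": word_timestamps,
        "condition_on_previous_text": condition_on_previous_text,
        "fp16": fp16,
    }
    # Optional settings are only passed when the user filled them in.
    if language:
        options["language"] = language
    if prompt:
        options["initial_prompt"] = prompt
    return options


def transcribe(path: str, model_name: str = "base", **options: Any) -> str:
    """Run Whisper on one file (requires the openai-whisper package)."""
    import whisper  # imported lazily so the helper above works without it

    model = whisper.load_model(model_name)
    return model.transcribe(path, **options)["text"]
```

A call could then look like `transcribe("interview.mp3", "small", **build_transcribe_options(language="en"))`, where the file name is only a placeholder.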

Start transcription and wait for it to complete

While a task is running, most application features are frozen.

Transcribed files are exported to the Downloads folder and can be located with a double click.

Contributing

This is a non-commercial project for personal use; contributions are welcome. For major changes, please open an issue first to discuss what you would like to change.

Troubleshooting

If the application crashes several times in a row, consider it a bug. To have it fixed, open an issue and include a step-by-step description of your actions together with the log files from the ~/.cache/whisper/logging directory.
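Collecting the log files for an issue can be scripted. The following is a minimal sketch that zips everything under the logging directory mentioned above; the function name `bundle_logs` and the output file name `whisper-logs.zip` are hypothetical choices, not something the application provides.

```python
import zipfile
from pathlib import Path


def bundle_logs(
    log_dir: Path = Path.home() / ".cache" / "whisper" / "logging",
    archive: Path = Path.home() / "Downloads" / "whisper-logs.zip",
) -> int:
    """Zip every file found under log_dir and return how many were added."""
    if not log_dir.is_dir():
        return 0  # no log directory, so nothing to attach
    files = sorted(p for p in log_dir.rglob("*") if p.is_file())
    if not files:
        return 0  # empty directory, so no archive is created
    archive.parent.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in files:
            # Store paths relative to log_dir to keep the archive tidy.
            zf.write(path, path.relative_to(log_dir).as_posix())
    return len(files)


if __name__ == "__main__":
    print(f"Added {bundle_logs()} log file(s) to the archive")
```

Review the archive before attaching it to an issue, since log files may contain file names or paths from your machine.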

License

GPL v3
