Video Transtool is a utility app to transcode and transcribe video files using FFmpeg and OpenAI's API.
Live demo: Video Transtool
DISCLAIMER: Due to Vercel Functions limits, the live version can only process videos smaller than 5MB.
If you want to process larger files, you can run the app locally or deploy it to your own server. In that case, you must use your own OpenAI API key.
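The size limit above could be checked before a file is sent for processing. The following is a minimal sketch, not the app's actual code: the constant and function names are hypothetical, and it assumes "5MB" means 5 * 1024 * 1024 bytes.

```typescript
// Assumption: "5MB" is read as 5 * 1024 * 1024 bytes; the deployed app may use a different cutoff.
const LIVE_DEMO_MAX_BYTES = 5 * 1024 * 1024;

// Hypothetical helper: returns true when a file of the given size is
// strictly smaller than the live-demo limit.
function fitsLiveDemoLimit(sizeInBytes: number): boolean {
  return sizeInBytes >= 0 && sizeInBytes < LIVE_DEMO_MAX_BYTES;
}
```

In a browser, the size in bytes is available as the `size` property of a `File` object, so a check like `fitsLiveDemoLimit(file.size)` could run before the upload starts.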
Install dependencies:
pnpm install
Run the development server:
pnpm run dev
Create a .env.local file in the root directory with the following variables:
OPENAI_API_KEY=your_openai_api_key
This app was developed as a workaround for problems encountered when transcribing some recordings with Whisper. Some browsers, devices, or configurations produce videos with codecs that Whisper does not support, which leads to errors or poor transcriptions.
With FFmpeg, we can transcode the video to a format that Whisper supports, which yields better transcriptions. The transcoded file can also be smaller, which is useful for storage or sharing. After all, audio transcription doesn't need high video quality.
FFmpeg is integrated into this app through the @ffmpeg/ffmpeg package, which provides FFmpeg's functionality as a WebAssembly module.
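To make the transcoding step more concrete, here is a minimal sketch of how an FFmpeg argument list for a speech-oriented output might be built. Everything in it is an assumption for illustration: the `AudioPreset` shape, the `SPEECH_PRESET` values, and the `buildTranscodeArgs` name are hypothetical, and the sketch drops the video track entirely, which may differ from what the app's hardcoded presets do. It also assumes the FFmpeg build in use includes the `libmp3lame` encoder.

```typescript
// Hypothetical preset shape; the app's real hardcoded presets are not shown in this README.
interface AudioPreset {
  codec: string;      // FFmpeg audio encoder name, e.g. "libmp3lame"
  bitrate: string;    // audio bitrate, e.g. "64k"
  sampleRate: number; // sample rate in Hz
  channels: number;   // 1 = mono, 2 = stereo
}

// Assumed values for speech: mono, 16 kHz, low bitrate.
const SPEECH_PRESET: AudioPreset = {
  codec: "libmp3lame",
  bitrate: "64k",
  sampleRate: 16000,
  channels: 1,
};

// Builds the FFmpeg arguments that discard the video stream (-vn)
// and re-encode the audio track according to the preset.
function buildTranscodeArgs(
  input: string,
  output: string,
  preset: AudioPreset = SPEECH_PRESET,
): string[] {
  return [
    "-i", input,
    "-vn",
    "-c:a", preset.codec,
    "-b:a", preset.bitrate,
    "-ar", String(preset.sampleRate),
    "-ac", String(preset.channels),
    output,
  ];
}
```

Calling `buildTranscodeArgs("input.webm", "output.mp3")` produces the arguments of the command `ffmpeg -i input.webm -vn -c:a libmp3lame -b:a 64k -ar 16000 -ac 1 output.mp3`. In recent versions of @ffmpeg/ffmpeg (0.12.x), an array of this form is what the `exec()` method of an `FFmpeg` instance accepts, after the input file has been written to its virtual file system.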
The transcription is handled by OpenAI's API, which is used through the @ai-sdk/openai package with the Whisper model.
For now, the transcoding presets are hardcoded. Adding UI elements to customize these parameters would be a good next step.
Upload a video file and wait for the transcription to complete.
MIT