From Spoken Voice to Textual Output in Seconds - Whisper's Guide
The very same people behind ChatGPT have created another AI-based tool you can use today to boost your productivity. We’re referring to Whisper, a voice-to-text solution that eclipsed all similar solutions that came before it.
You can use Whisper in your programs or the command line. And yet, that defeats its very purpose: typing without a keyboard. If you need to type to use it, why use it to avoid typing? Thankfully, you can now use Whisper through a desktop GUI. Even better, it can also transcribe your voice almost in real time. Let’s see how you can type with your voice using Whisper Desktop.
What Is OpenAI’s Whisper?
OpenAI’s Whisper is an Automatic Speech Recognition system (ASR for short) or, to put it simply, a solution for converting spoken language into text.
However, unlike older dictation and transcription systems, Whisper is an AI solution trained on 680,000 hours of speech in various languages. Whisper offers impressive accuracy and, on top of being multilingual, it can also translate speech from other languages into English.
More importantly, it’s free and available as open source. Thanks to that, many developers have forked its code into their own projects or created apps that rely on it, like Whisper Desktop.
If you’d prefer the “vanilla” version of Whisper and the versatility of the terminal instead of clunky GUIs, check our article on how to turn your voice into text with OpenAI’s Whisper for Windows.
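For a taste of the terminal route, here is a minimal Python sketch that assembles a command line for the original Whisper. It assumes the open-source openai-whisper package is installed and puts a `whisper` command on your PATH. The helper names `build_whisper_command` and `run_whisper` and the file name `memo.mp3` are made up for illustration, and the `medium` model is only chosen to match the model used later in this guide.

```python
import shlex
import subprocess
from typing import List, Optional


def build_whisper_command(
    audio_path: str,
    model: str = "medium",
    task: str = "transcribe",
    language: Optional[str] = None,
) -> List[str]:
    """Assemble the argument list for the `whisper` command-line tool."""
    # "transcribe" keeps the spoken language; "translate" outputs English text.
    if task not in ("transcribe", "translate"):
        raise ValueError(f"Unsupported task: {task!r}")
    command = ["whisper", audio_path, "--model", model, "--task", task]
    if language:
        # Naming the language up front skips automatic language detection.
        command += ["--language", language]
    return command


def run_whisper(audio_path: str, **options) -> subprocess.CompletedProcess:
    """Run Whisper on one file (assumes `whisper` is installed and on PATH)."""
    return subprocess.run(build_whisper_command(audio_path, **options), check=True)


# Show the command that would be executed, without running it.
print(shlex.join(build_whisper_command("memo.mp3", language="en")))
```

Calling `run_whisper("memo.mp3", task="translate")` would launch the real tool. The sketch only prints the command so you can inspect it first.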
Are Whisper and Whisper Desktop the Same?
Despite its official-sounding name, Whisper Desktop is a third-party GUI for Whisper, made for everyone who’d prefer to click buttons instead of typing commands.
Whisper Desktop is a standalone solution that doesn’t rely on an existing Whisper installation. As a bonus, it uses an alternative, optimized implementation of Whisper, so it should perform better than the original version.
Are you at the other end of the spectrum, looking not for an easier way to use Whisper than the terminal but for ways to build it into your own solutions? Rejoice, for OpenAI has opened access to the ChatGPT and Whisper APIs.
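If you are curious what calling the hosted Whisper API might look like, the following is a rough sketch rather than a definitive integration. It assumes the official `openai` Python package is installed, that your API key is available in the `OPENAI_API_KEY` environment variable, and that uploads are capped at 25 MB. The function name `transcribe_via_api` is hypothetical.

```python
from pathlib import Path

# Assumption: the hosted API rejects uploads larger than 25 MB.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def transcribe_via_api(audio_path: str, model: str = "whisper-1") -> str:
    """Send one audio file to the hosted Whisper model and return its text."""
    path = Path(audio_path)
    # Fail early on local problems before any network request is made.
    if not path.is_file():
        raise FileNotFoundError(f"No such audio file: {path}")
    if path.stat().st_size > MAX_UPLOAD_BYTES:
        raise ValueError(f"{path.name} exceeds the assumed 25 MB upload limit")

    # Imported lazily so the checks above work without the package installed.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment
    with path.open("rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text
```

Unlike Whisper Desktop, this route sends your audio to OpenAI’s servers and is billed per use, so it suits apps and scripts more than everyday dictation.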
Download & Install Whisper Desktop
Although Whisper Desktop is easier to use than the command-line Whisper, its installation is more convoluted than repeatedly clicking Next in a wizard.
- Visit Whisper Desktop’s official GitHub page. Look on the right, and click on the latest version under Releases.
- Under Assets, click WhisperDesktop.zip and download it to your PC.
- Extract the downloaded archive to a folder and use your file manager to visit it. Inside you will find the Whisper Desktop application. Double-click on it to run it.
- You also need a Whisper language model in GGML binary format. Whisper Desktop will provide you with two links for acquiring one. Skip the second link for generating your own model since it’s a more complicated process. Click on Hugging Face to open that page in your default browser, from where you can download a ready-to-use file.
- The version of Whisper Desktop we used while writing this article provided a link to an obsolete repository at Hugging Face. If you run into the same problem, look for the link to the new location on that page. Click on it to visit the new repository.
- Click on the link that will take you to the available models.
- From that list, click on either ggml-medium.bin or ggml-medium.en.bin, depending on whether you want multilingual or English-only support in Whisper.
- Finally, you should have reached your destination. Notice the line stating that this file is stored with Git LFS and is too big to display, but you can still download it. Click on download to do precisely that.
- When the file completes downloading, use your favorite file manager (File Explorer will do) to move the downloaded language model file into the same folder as Whisper Desktop.
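Before launching the app, you may want to double-check that everything ended up in the same folder. The steps above can be sanity-checked with a small Python sketch like the one below. The model file names come from the list above, while the executable name `WhisperDesktop.exe` and the helper names are assumptions, so adjust them to match what you actually extracted.

```python
from pathlib import Path
from typing import List

# Assumption: the extracted application file is called WhisperDesktop.exe.
APP_NAME = "WhisperDesktop.exe"


def model_filename(english_only: bool) -> str:
    """Return the medium model's file name for the chosen language support."""
    return "ggml-medium.en.bin" if english_only else "ggml-medium.bin"


def missing_files(folder: str, english_only: bool = False) -> List[str]:
    """List the expected files that are not present in the given folder."""
    root = Path(folder)
    expected = [APP_NAME, model_filename(english_only)]
    return [name for name in expected if not (root / name).is_file()]
```

For example, `missing_files(r"C:\Tools\WhisperDesktop")` (a made-up path) returns an empty list when both the app and the multilingual model are in place, or the names of whatever is still missing.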
Transcribing With Whisper Desktop
Transcribing with Whisper Desktop is easy, but you may still need one or two clicks to use the app.
Rerun Whisper Desktop. Does it (still) miss the correct path to your downloaded language model? Click on the button with the three dots on the right of the field and manually select the file you downloaded from Hugging Face.
From this spot, you can also use the drop-down menu next to Model Implementation to choose if you want to run Whisper on your GPU (GPU), on both the CPU and GPU (Hybrid), or only on the CPU (Reference).
The Advanced button leads to more options that affect how Whisper will run on your hardware. However, since the button clearly states they are advanced, we suggest you only tweak them if you are troubleshooting or know what you are doing. Setting the wrong option values here can impose a performance penalty or render the app unusable.
Click on OK to move to the app’s main interface.
- Title: From Spoken Voice to Textual Output in Seconds - Whisper's Guide
- Author: Joseph
- Created at : 2024-08-15 16:14:31
- Updated at : 2024-08-16 16:14:31
- Link: https://windows11.techidaily.com/from-spoken-voice-to-textual-output-in-seconds-whispers-guide/
- License: This work is licensed under CC BY-NC-SA 4.0.