Comparison of Voice Coding / Speech Recognition options in 2025:

| Speech engine | My review |
|---|---|
| OpenAI Whisper | Whisper has no command mode but excellent dictation (the best dictation of anything I've ever used!). It has no editing abilities and no mouse control. It includes punctuation, but doesn't come with any way to adjust the punctuation when its guess is wrong. It's open source, free, and cross platform, but you need a desktop GPU or similar for the better quality models (the "medium" model is great). |
| Picovoice | Picovoice has a minimal command mode but good dictation, similar to Whisper-Base but with a smaller model. It's a library/API that's closed source and only free for private use. |
| Kaldi-Active-Grammar + Dragonfly | This combo (using Caster or custom scripts) has extremely powerful command mode and customisation abilities, but crappy dictation accuracy, good editing abilities, minimal auto punctuation, doesn't come with mouse control but can be integrated with a separate mouse control system, is open source, free, and cross platform. |
| Numen Voice | Numen has very powerful command mode and customisation abilities, but crappy dictation accuracy, good editing abilities, minimal auto punctuation, doesn't come with mouse control but can be integrated with a separate mouse control system, is open source, free, and cross platform. |
| Dragon NaturallySpeaking | DNS has a crappy command mode, good dictation accuracy, good editing abilities, good auto punctuation, and crappy mouse control. It's closed source, costs hundreds of dollars, only works on Windows (and sort of works on OS X), and is very unreliable (it crashes very often and occasionally needs a complete re-install!). |
| Talon | Talon has a good command mode, probably good dictation accuracy (I think it uses Facebook wav2vec), probably decent editing abilities, and good eye-tracking mouse control. It's partly open and partly closed source, and cross platform. It's free for OS X, but I think it currently expects some payment for the latest features on Windows or Linux. It doesn't handle my accent / monotone voice / short utterances well though. |
| Serenade.ai | Serenade has good command mode, good dictation mode, some editing abilities, no mouse control, is open source, free, and cross platform. |
| Nerd-dictation | Nerd-dictation (2021) has no command mode, and only weak dictation: https://github.com/KarlHaines82/nerd-dictation-gui |
| April ASR | April ASR (2025) has minimal support and weak recognition, but it might improve in the future since it's still in beta: https://github.com/abb128/april-asr |
| CMU Sphinx 4 / CMU PocketSphinx | CMU Sphinx has been around for a long time but it's always had quite unreliable accuracy for me. |

(Note that I have a somewhat unusual Australian English accent with a very monotone voice, and I don't speak as clearly as the average person, so other people might get different results from these.)