Why Steno Is Better Than Apple Dictation for Mac

Every Mac ships with Apple Dictation built in. Press the microphone key, speak, and your words appear on screen. It works. But if you have ever tried to use it for anything beyond a quick text message, you have probably run into its limitations: awkward pauses, missed words, dictation that mysteriously stops after 30 seconds, and results that require almost as much editing as typing from scratch.

Steno was built to solve exactly these problems. It is a native macOS menu bar app that uses OpenAI's Whisper speech recognition model, hosted on Groq's inference hardware, to deliver voice-to-text that is faster and more accurate than Apple Dictation and works in more apps. Here is how they compare.

Speed: Groq Whisper vs. On-Device Processing

Apple Dictation uses a combination of on-device and server-side processing. The on-device model is fast but less accurate, and the server-side model introduces noticeable latency. You often see a delay of one to three seconds before your words appear, and longer pauses can cause the dictation to stop entirely.

Steno takes a fundamentally different approach. When you release the hotkey, your audio is sent to Groq's hosted Whisper API, which runs on custom LPU (Language Processing Unit) hardware designed specifically for inference speed. The result is that your transcription typically returns in under a second, even for longer passages. There is no creeping word-by-word animation. You speak, you release, and your entire text appears at once, fully formed and correctly punctuated.
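To make the "release, then send" flow concrete, here is a minimal Python sketch of what a single-shot request to a hosted Whisper endpoint could look like. The URL and model name follow Groq's publicly documented OpenAI-compatible API as best understood here. The function names, the WAV format, and the custom user agent are assumptions for illustration, not Steno's actual implementation.

```python
import json
import urllib.request
import uuid

# Assumed values: Groq's OpenAI-compatible transcription endpoint and model name.
GROQ_URL = "https://api.groq.com/openai/v1/audio/transcriptions"
MODEL = "whisper-large-v3"


def build_transcription_request(audio: bytes, api_key: str,
                                filename: str = "clip.wav") -> urllib.request.Request:
    """Build one multipart/form-data POST carrying the whole recording."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{MODEL}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        GROQ_URL,
        data=head + audio + tail,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            # Hypothetical client name; some hosted APIs reject urllib's default agent.
            "User-Agent": "dictation-sketch/0.1",
        },
    )


def transcribe(audio: bytes, api_key: str) -> str:
    """Send the recording and return the transcribed text (needs network access)."""
    with urllib.request.urlopen(build_transcription_request(audio, api_key)) as resp:
        return json.load(resp)["text"]
```

Because the whole clip goes up in one request, the text comes back in one piece, which is why there is nothing to animate word by word.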

This difference matters more than you might expect. When dictation is slow, you unconsciously start speaking more carefully, pausing to check the screen, losing your train of thought. When it is fast, you can think out loud naturally and trust that the text will be there when you look.

Accuracy: Whisper Large-v3 vs. Apple's Model

Apple's speech recognition has improved significantly over the years, but it still struggles with technical vocabulary, proper nouns, numbers mixed with text, and accents that deviate from standard American English. OpenAI's Whisper model was originally trained on 680,000 hours of multilingual audio data, and the large-v3 variant that Steno uses was trained on a substantially larger dataset. It handles accents, jargon, and complex sentences with noticeably higher accuracy.

In practical terms, this means fewer corrections. A paragraph dictated with Apple Dictation might need four or five edits. The same paragraph through Steno typically needs zero or one. Over the course of a workday, that difference adds up to a substantial amount of saved time.

Punctuation and Formatting

Apple Dictation has traditionally relied on spoken punctuation commands: "period," "comma," "new paragraph." (Recent macOS versions can insert some punctuation automatically.) Speaking commands breaks your flow and makes you think about formatting instead of content. Steno's Whisper-based transcription returns fully punctuated text, inferring sentence boundaries and questions from what you said. You never have to say "comma" again.

Works Everywhere, Not Just Text Fields

One of the most frustrating limitations of Apple Dictation is that it only works in standard text input fields. Try to use it in a terminal, a code editor, a web app with a custom text input, or certain Electron apps, and it simply does not activate. Apple Dictation is tightly coupled to the macOS text input system, which means any app that implements text input differently is out of luck.

Steno works differently. It captures your audio, transcribes it, and then pastes the result at your cursor position. This means it works in any application that accepts pasted text: Terminal, VS Code, Figma, Slack, Discord, Notion, Google Docs in Chrome, even obscure internal tools. If you can paste into it, Steno can dictate into it.
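The paste step is simple enough to sketch in Python. The source only says that Steno pastes at the cursor, so the mechanism below is an assumption: it uses the macOS `pbcopy` and `osascript` command-line tools, and the function name `paste_at_cursor` is hypothetical.

```python
import subprocess
from typing import Callable

# AppleScript that sends Cmd+V to whichever app is frontmost.
PASTE_SCRIPT = 'tell application "System Events" to keystroke "v" using command down'


def paste_at_cursor(text: str, run: Callable[..., object] = subprocess.run) -> None:
    """Put text on the clipboard, then simulate Cmd+V (macOS only)."""
    # pbcopy reads stdin and replaces the current clipboard contents.
    run(["pbcopy"], input=text.encode("utf-8"), check=True)
    # Sending keystrokes requires Accessibility permission for the calling process.
    run(["osascript", "-e", PASTE_SCRIPT], check=True)
```

Because the target app only ever sees an ordinary paste, it does not need to support the macOS text input system at all. One trade-off of this approach is that it overwrites whatever was on the clipboard.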

The Hold-to-Speak Model

Apple Dictation uses a toggle model. You press a key to start, speak, then either press the key again or wait for it to time out. This creates an awkward interaction pattern. Did you remember to stop dictation? Is it still listening? Why did it stop in the middle of your sentence?

Steno uses a hold-to-speak model. You hold down your chosen hotkey, speak for as long as you need, and release when you are done. The recording starts immediately on key-down and stops immediately on key-up. There is no ambiguity about whether dictation is active. There is no timeout that cuts you off mid-sentence. You have complete, tactile control over exactly when the microphone is listening.
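The hold-to-speak behavior amounts to a two-state machine, sketched below in Python. The class and method names are hypothetical, and real key events and microphone input are replaced by plain method calls so the logic is easy to follow.

```python
from typing import Callable, List, Optional


class HoldToSpeakRecorder:
    """Buffers audio only while the hotkey is physically held down."""

    def __init__(self, on_finished: Callable[[bytes], None]) -> None:
        self._on_finished = on_finished
        self._chunks: Optional[List[bytes]] = None  # None means "not recording"

    @property
    def recording(self) -> bool:
        return self._chunks is not None

    def key_down(self) -> None:
        # Key repeat can deliver several key-down events; only the first starts a take.
        if self._chunks is None:
            self._chunks = []

    def feed(self, chunk: bytes) -> None:
        # Audio that arrives while the key is up is dropped, never buffered.
        if self._chunks is not None:
            self._chunks.append(chunk)

    def key_up(self) -> None:
        # Releasing the key ends the take and hands the audio to the callback.
        if self._chunks is None:
            return
        audio, self._chunks = b"".join(self._chunks), None
        self._on_finished(audio)
```

There is no timer anywhere in this design, so nothing can cut a take short. The only way out of the recording state is the key-up event.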

This model also has a privacy advantage. The microphone is only active while you are physically holding the key. There is no possibility of accidental recording, no "always listening" mode, no ambient audio being processed in the background.

Transcription History

Apple Dictation has no history feature. Once you dictate text and move on, there is no record of what you said or when you said it. If you accidentally overwrite a dictation or close a window, that text is gone.

Steno maintains a searchable history of your last 100 transcriptions, stored locally on your Mac in ~/.steno/stats.json. You can review what you dictated, when you dictated it, and copy any previous transcription back to your clipboard. This is invaluable when you dictated something brilliant but accidentally pasted it in the wrong window, or when you need to recall what you said in a message earlier in the day.
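A capped, searchable local history like this takes only a few lines of Python. The 100-entry limit comes from the description above. The JSON layout (a list of objects with `text` and `at` fields) and the class name are assumptions, since the actual format of stats.json is not documented here.

```python
import json
from collections import deque
from datetime import datetime, timezone
from pathlib import Path

MAX_ENTRIES = 100  # limit stated in the article; the file layout is a guess


class TranscriptionHistory:
    def __init__(self, path: Path) -> None:
        self.path = path
        saved = json.loads(path.read_text()) if path.exists() else []
        # deque(maxlen=...) drops the oldest entry once the cap is reached.
        self.entries = deque(saved, maxlen=MAX_ENTRIES)

    def add(self, text: str) -> None:
        """Append a timestamped entry and rewrite the file on disk."""
        self.entries.append(
            {"text": text, "at": datetime.now(timezone.utc).isoformat()}
        )
        self.path.parent.mkdir(parents=True, exist_ok=True)
        self.path.write_text(json.dumps(list(self.entries), indent=2))

    def search(self, needle: str) -> list:
        """Case-insensitive substring match over stored transcriptions."""
        needle = needle.lower()
        return [e for e in self.entries if needle in e["text"].lower()]
```

Calling `TranscriptionHistory(Path.home() / ".steno" / "stats.json")` would point the sketch at the location mentioned above. Everything stays on the local disk, and nothing in the sketch touches the network.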

Menu Bar Native, Not a Full App

Apple Dictation is integrated into macOS, which sounds like an advantage until you realize that it means you have little control over it. Your shortcut choices are limited to what System Settings offers (by default the microphone key or a double-press of the function key), you cannot see a visual indicator of recording status in the menu bar, and beyond language and microphone source there is little you can configure.

Steno lives in your menu bar as a small, unobtrusive icon. It shows your recording status at a glance. You can configure your preferred hotkey, adjust settings, view history, and manage your account all from the menu bar dropdown. It is a focused tool that does one thing extremely well without getting in the way of your workflow.

Privacy and Data Handling

Apple Dictation sends your audio to Apple's servers for processing (unless you have specifically enabled on-device-only dictation, which sacrifices accuracy). Apple's privacy policy covers this data, but you are still sending your voice to a third party with limited transparency about retention and use.

Steno sends your audio to Groq's Whisper API for transcription. The audio is processed and immediately discarded. No audio is stored on Groq's servers after transcription. Your transcription history is stored locally on your Mac, never on any server. Your API key is stored in the macOS Keychain, the same encrypted store that holds your saved passwords.

Pricing: Free Tier and Pro

Apple Dictation is free, which is its strongest advantage. Steno offers a free tier so you can try it without commitment. For power users, Steno Pro is $4.99 per month, which gives you unlimited dictation, priority transcription speed, and access to all features. For anyone who dictates more than a few minutes per day, the time saved easily justifies the cost.

The Bottom Line

Apple Dictation is good enough for the occasional quick message. But if voice-to-text is a meaningful part of your workflow, its limitations will slow you down. Steno is faster, more accurate, works in every application, gives you complete control over when the microphone is active, and keeps a history of everything you dictate. It is the dictation tool that Apple Dictation should have been.

Download Steno free at stenofast.com and try it for yourself. Most people never go back to Apple Dictation after their first day with Steno.