Every Mac ships with a built-in dictation feature. It is free, it requires no installation, and it works reasonably well. So why would anyone use a third-party voice-to-text tool like Steno? That is a fair question, and we want to answer it honestly.
Apple Dictation is a solid baseline. For some users, it is genuinely all they need. But it was designed as a general-purpose accessibility feature, not a productivity tool for people who want to replace a significant portion of their daily typing with voice input. The differences become apparent quickly once you start using dictation as a core part of your workflow.
The Side-by-Side Comparison
| Feature | Apple Dictation | Steno |
|---|---|---|
| Activation | Double-press Fn key or click mic icon | Hold any configurable hotkey |
| Transcription speed | 1-3 seconds | Sub-second (200-500ms) |
| Accuracy | Good for short phrases | State-of-the-art AI transcription |
| Voice commands | Basic (new line, period) | Extensive (punctuation, select all, undo, formatting) |
| Text snippets | No | Yes — save and expand frequently used phrases |
| Dictation history | No | Yes — full searchable history |
| WPM tracking | No | Yes — tracks your dictation speed |
| Works offline | Yes (on Apple Silicon) | No (requires internet) |
| Price | Free (built-in) | Free |
| Installation | None required | One-time PKG install |
Where Apple Dictation Falls Short
Activation Friction
Apple Dictation requires you to either double-press the Fn (Globe) key or click the microphone icon in a text field. The double-press method works, but it is a toggle: you perform the gesture once to start dictation and again to stop it. There is an awkward pause while the system initializes, and if you start speaking too quickly, the first few words get clipped.
Steno uses a hold-to-speak model. Press and hold your hotkey, speak, release. Recording starts the instant you press and stops the instant you release. There is no toggling, no initialization delay, and no ambiguity about whether the system is listening. It is the difference between a walkie-talkie and a phone call — one is immediate, the other requires setup.
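To make the contrast concrete, here is a minimal sketch of the two interaction models as tiny state machines. It is purely illustrative: the class and method names are hypothetical, and it does not reflect how Steno or Apple Dictation is actually implemented.

```python
from dataclasses import dataclass


@dataclass
class HoldToSpeak:
    """Hypothetical model: recording mirrors the physical key state."""
    recording: bool = False

    def key_down(self) -> None:
        # Recording starts the moment the hotkey is pressed.
        self.recording = True

    def key_up(self) -> None:
        # Recording stops the moment the hotkey is released.
        self.recording = False


@dataclass
class Toggle:
    """Hypothetical model: each activation gesture flips the state."""
    recording: bool = False

    def activate(self) -> None:
        # The same gesture both starts and stops dictation, so the
        # current state depends on how many gestures came before.
        self.recording = not self.recording
```

In the hold-to-speak sketch, whether the system is listening can always be read off the key itself. In the toggle sketch, it depends on the history of gestures, which is where the ambiguity described above comes from.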
Transcription Speed
Apple Dictation typically takes one to three seconds after you stop speaking to finalize the transcription. During that time, you may see words change as the system reconsiders its initial guesses. This "shimmer" effect is distracting and makes it hard to know when your text is finalized.
Steno delivers transcribed text in 200 to 500 milliseconds after you release the key. The text appears once, in its final form, with no shimmer or post-processing corrections. For short dictations like a Slack reply or a commit message, the text appears almost as soon as you release the key.
No History or Tracking
Apple Dictation has no memory. Once your text is transcribed and inserted, there is no record of what you dictated. If you accidentally overwrite something or want to reference a dictation from earlier in the day, you are out of luck.
Steno maintains a full dictation history accessible from the menu bar. You can scroll through everything you have dictated, copy previous entries, and see timestamps. It also tracks your words per minute over time, so you can see how your dictation speed and habits evolve.
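For readers curious what words-per-minute tracking over a dictation history amounts to, the arithmetic can be sketched as follows. This is an assumption-laden illustration, not Steno's actual data model: the `Dictation` record, its fields, and both helper functions are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Dictation:
    """Hypothetical history entry: what was said, when, and for how long."""
    text: str
    started_at: datetime
    duration_seconds: float


def words_per_minute(entry: Dictation) -> float:
    """Words spoken divided by minutes of recording (0.0 if no duration)."""
    if entry.duration_seconds <= 0:
        return 0.0
    word_count = len(entry.text.split())
    return word_count * 60.0 / entry.duration_seconds


def search(history: list[Dictation], query: str) -> list[Dictation]:
    """Case-insensitive substring search over past dictations."""
    needle = query.lower()
    return [entry for entry in history if needle in entry.text.lower()]
```

Under these assumptions, a four-word dictation recorded over two seconds works out to 4 × 60 / 2 = 120 words per minute.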
Limited Voice Commands
Apple Dictation supports basic commands like "new line," "period," and "comma." But the list is short, and the commands are not always reliably recognized during natural speech.
Steno supports a broader set of voice commands that integrate naturally into dictation. Beyond punctuation, you can use commands for text manipulation like "select all" and "undo," making it possible to perform light editing without touching the keyboard.
Where Apple Dictation Wins
We want to be honest about where Apple's built-in option has genuine advantages:
No Installation Required
Apple Dictation is already on your Mac. You do not need to download anything, grant special permissions, or configure settings. For someone who wants to try voice dictation for the first time with zero commitment, the built-in option is the lowest-friction starting point.
Offline Support
On Apple Silicon Macs, Apple Dictation can run entirely on-device without an internet connection. Steno currently requires an internet connection for transcription, since it uses cloud-based state-of-the-art speech recognition to achieve higher accuracy. If you frequently work without internet access, this is a meaningful consideration.
Deep System Integration
Apple Dictation has some system-level integrations that third-party apps cannot replicate, such as inline dictation within certain Apple apps where the microphone icon appears directly in the text field. These are minor conveniences, but they exist.
When Apple Dictation Is Enough
If you only dictate occasionally — a quick text message here, a short note there — and you are not concerned about speed or accuracy for longer passages, Apple Dictation handles those tasks just fine. It is a reasonable tool for light, infrequent use.
It is also the right choice if you work primarily offline or if you want to try voice input for the first time without installing anything new.
When You Need Steno
Steno is built for people who want voice-to-text to be a core part of their daily workflow, not an occasional convenience. If any of these describe you, the difference will be immediately apparent:
- You dictate frequently throughout the day — emails, messages, notes, documentation.
- You care about transcription speed and want text to appear in under a second.
- You want a hold-to-speak interaction model that feels instant and natural.
- You need dictation history to reference or reuse previous transcriptions.
- You want text snippets to expand frequently used phrases.
- You track your productivity and want WPM statistics.
- You use voice commands beyond basic punctuation.
The best way to decide is to try both. Apple Dictation is already on your Mac, and Steno installs in 30 seconds. Use each for a day and see which one fits how you actually work. The difference in speed and reliability becomes obvious within the first few dictations.