Google Docs has a built-in voice typing feature. On paper, this sounds like the ideal solution for Mac users who want to dictate documents. In practice, Google's voice typing has significant limitations that make it frustrating for sustained use. There is a better approach: using a dedicated native macOS dictation tool like Steno that works in Google Docs and every other application on your Mac.
This article compares Google Docs' built-in voice typing with Steno's native Mac approach, examines the tradeoffs, and explains why a system-level tool is the better long-term investment for anyone who dictates regularly.
Google Docs Voice Typing: What It Offers
Google Docs includes a voice typing feature accessible from the Tools menu or via the keyboard shortcut Cmd+Shift+S. When activated, a microphone icon appears in the document, and Google transcribes your speech in real time using its cloud-based speech recognition engine.
The feature works. For casual use, it is adequate. But it has several limitations that become apparent quickly when you try to use it as a primary dictation tool.
Chrome Only
Google Docs voice typing only works in the Chrome browser. If you use Safari, Firefox, Arc, Brave, or any other browser, the feature is not available. This is a significant limitation on a platform where many users prefer alternative browsers, especially on macOS where Safari is the default and offers superior battery life and system integration.
Google Docs Only
The voice typing feature is effectively exclusive to Google Docs. Apart from speaker notes in Google Slides, it does not work in Google Sheets, Gmail, Google Chat, or any other Google product. It certainly does not work in non-Google applications like Notion, Slack, VS Code, or your terminal. If you learn to rely on Google's voice typing, you have a dictation tool that works in essentially one application.
Toggle-Based Activation
Google's voice typing uses a toggle model. You click the microphone icon (or press the shortcut) to start listening, and click it again to stop. Between those two actions, the microphone is continuously active, transcribing everything it hears. There is no tactile feedback about whether dictation is on or off, and it is easy to forget that the microphone is still listening.
The toggle model also means that ambient noise, side conversations, and background sounds are all captured and (poorly) transcribed into your document. You may end up with phantom text from a colleague's phone call or the television in the next room.
Mediocre Accuracy
Google's speech recognition model is competent but not state-of-the-art. It was designed for general-purpose recognition across Google's product ecosystem, not specifically optimized for dictation accuracy. Compared to OpenAI's Whisper large-v3 model, Google's recognition produces more errors, particularly with technical vocabulary, proper nouns, and accented speech.
Spoken Punctuation Required
Google Docs voice typing requires you to speak punctuation commands: "period," "comma," "new line," "question mark." This breaks your flow of thought and forces you to think about formatting while you are trying to compose content. It is the equivalent of someone interrupting you mid-sentence to ask about grammar. The cognitive overhead is real.
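To make the overhead concrete, here is a minimal sketch of the kind of substitution a command-based system has to perform. It is an illustration only, not Google's implementation; the COMMANDS table and the apply_spoken_punctuation function are hypothetical names invented for this example.

```python
import re

# Hypothetical command table covering the four commands named above.
COMMANDS = {
    "period": ".",
    "comma": ",",
    "question mark": "?",
    "new line": "\n",
}

# Longest commands first so "question mark" is matched as one unit.
_PATTERN = re.compile(
    r"\s*\b("
    + "|".join(re.escape(c) for c in sorted(COMMANDS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def apply_spoken_punctuation(transcript: str) -> str:
    """Replace spoken commands (and the space before them) with symbols."""
    text = _PATTERN.sub(lambda m: COMMANDS[m.group(1).lower()], transcript)
    # Drop the space that would otherwise follow an inserted line break.
    return re.sub(r"\n[ \t]+", "\n", text)


spoken = "hello team comma the report is ready period can you review it question mark"
print(apply_spoken_punctuation(spoken))
```

The speaker had to say three extra commands for two short sentences. A naive substitution like this one also rewrites the word "period" when it is meant literally, which is part of why command-based punctuation feels brittle.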
Steno: A Different Approach
Steno is a native macOS menu bar app that uses OpenAI's Whisper large-v3 model running on Groq's ultra-fast LPU hardware. Rather than being a feature inside one application, it works at the system level across every application on your Mac, including Google Docs.
Hold-to-Speak vs. Toggle
The most significant difference is the interaction model. Steno uses hold-to-speak: you hold down your chosen hotkey, speak, and release when you are done. The microphone is only active while the key is held. There is no ambiguity, no forgotten-microphone problem, and no phantom transcriptions from background noise.
The hold-to-speak model also creates a natural rhythm for dictation. You think about what you want to say, press the key, speak, and release. The pause between one release and the next press gives you a moment to collect your thoughts for the next passage. This rhythm tends to produce better-structured prose than continuous dictation because each press-speak-release cycle corresponds to a coherent unit of thought.
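The difference between the two interaction models can be sketched as two tiny state machines. This is a toy simulation under stated assumptions, not code from Steno or Google Docs; the class and method names are hypothetical.

```python
class ToggleMic:
    """Toggle model: the same action starts and stops listening."""

    def __init__(self):
        self.listening = False
        self.captured = []

    def press_shortcut(self):
        self.listening = not self.listening

    def hear(self, sound):
        # Everything audible while the toggle is on gets transcribed.
        if self.listening:
            self.captured.append(sound)


class HoldToSpeakMic:
    """Hold-to-speak model: listening only while the key is held down."""

    def __init__(self):
        self.listening = False
        self.captured = []

    def key_down(self):
        self.listening = True

    def key_up(self):
        self.listening = False

    def hear(self, sound):
        if self.listening:
            self.captured.append(sound)


# Same scenario for both: dictate one sentence, then a colleague's call starts.
toggle = ToggleMic()
toggle.press_shortcut()
toggle.hear("Draft the intro paragraph")
toggle.hear("colleague's phone call")  # user forgot to toggle off

hold = HoldToSpeakMic()
hold.key_down()
hold.hear("Draft the intro paragraph")
hold.key_up()  # releasing the key is the natural end of the utterance
hold.hear("colleague's phone call")

print(toggle.captured)
print(hold.captured)
```

In the toggle case the stray sound ends up in the captured list because nothing forced the microphone off. In the hold case, stopping is a side effect of finishing the sentence.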
Automatic Punctuation
Whisper large-v3 handles punctuation inference natively. Commas, periods, question marks, exclamation points, and other punctuation are inferred from the phrasing and rhythm of your speech, so a brief pause within a thought typically becomes a comma and the end of a thought becomes a period. You never need to say "period" or "comma." This alone makes the dictation experience dramatically more natural than Google's approach.
Sub-Second Transcription
When you release the Steno hotkey, your audio is sent to Groq's Whisper API, which processes it on custom LPU hardware. The transcription returns in under a second, and your text appears at the cursor position in Google Docs. Compare this to Google's approach, where text appears word by word with a one-to-two-second delay, and errors accumulate in real time as you watch.
Steno's approach is actually less distracting because you are not watching text appear and errors accumulate while you are trying to speak. You focus entirely on speaking, then see the final result all at once when you release.
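The release-then-transcribe step can be sketched with nothing but the Python standard library. This is a sketch under stated assumptions, not Steno's actual code: it assumes Groq's OpenAI-compatible transcription endpoint and the whisper-large-v3 model ID as publicly documented, a WAV clip already recorded in memory, and an API key supplied by the caller. The function names are hypothetical.

```python
import json
import urllib.request
import uuid

# Assumed endpoint and model ID (OpenAI-compatible Groq API).
GROQ_URL = "https://api.groq.com/openai/v1/audio/transcriptions"
MODEL = "whisper-large-v3"


def build_transcription_request(audio: bytes, api_key: str,
                                filename: str = "clip.wav") -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the model name and audio."""
    boundary = uuid.uuid4().hex
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{MODEL}\r\n"
    ).encode()
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + audio + closing
    return urllib.request.Request(
        GROQ_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            # Explicit user agent; some API gateways reject urllib's default.
            "User-Agent": "dictation-sketch/0.1",
        },
    )


def transcribe(audio: bytes, api_key: str) -> str:
    """Send the clip and return the transcribed text (performs a network call)."""
    request = build_transcription_request(audio, api_key)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read())["text"]
```

Because the whole clip is uploaded in one request after the key is released, the text comes back as a single finished passage rather than a stream of partial guesses.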
Works Everywhere, Not Just Google Docs
Because Steno operates at the macOS level, the same dictation tool that works in Google Docs also works in Gmail, Slack, Notion, VS Code, your terminal, Apple Pages, Microsoft Word, Obsidian, and any other application that accepts paste. You learn one tool and one hotkey, and it works everywhere.
This universality is the strongest argument for a system-level dictation tool over an application-specific one. Your workflow is not confined to Google Docs. You write in many different applications throughout the day, and having consistent dictation across all of them eliminates the friction of switching between tools that work differently.
Head-to-Head Comparison
Accuracy
Steno uses Whisper large-v3. The original Whisper models were trained on 680,000 hours of multilingual audio data, and large-v3 was trained on a considerably larger dataset. Large-v3 consistently outperforms Google's speech recognition on benchmarks and in real-world use, particularly for technical vocabulary, accented speech, and noisy environments. Fewer errors mean less editing, which means faster overall throughput.
Speed
Google Docs voice typing transcribes in real time, which sounds fast but actually means you are watching errors appear and accumulate live. Steno transcribes your entire passage at once after you release the hotkey, delivering the complete result in under a second. The actual time from speaking to having final text is shorter with Steno because there is no error-correction phase of watching and re-dictating.
Browser and App Support
Google voice typing: Chrome only, Google Docs only. Steno: every browser, every application. There is no contest here.
Punctuation
Google: you must speak punctuation commands. Steno: punctuation is automatic. For natural, flowing dictation, automatic punctuation is essential.
Privacy
Both tools send audio to servers for processing. Google processes audio through its cloud infrastructure with its standard data handling policies. Steno sends audio to Groq's Whisper API, where it is processed and immediately discarded with no retention. Steno's transcription history is stored locally on your Mac in ~/.steno/, not on any server.
Cost
Google Docs voice typing is free. Steno offers a free tier and Steno Pro at $4.99 per month. The cost of Steno Pro is justified by the improvement in accuracy, the elimination of spoken punctuation, the sub-second transcription speed, and the ability to use dictation across every application, not just Google Docs.
Using Steno in Google Docs
The workflow is straightforward:
- Open your Google Doc in any browser.
- Click where you want text to appear.
- Hold the Steno hotkey and speak.
- Release the hotkey. Text appears at the cursor.
Steno works with Google Docs in Chrome, Safari, Firefox, Arc, Brave, and every other browser. It does not require any browser extensions or plugins, only the macOS permissions the app itself requests, such as microphone access. It does not modify Google Docs in any way. It simply pastes text where your cursor is.
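A paste-based insertion like the one described above can be approximated on macOS with two built-in command-line tools: pbcopy to fill the clipboard and osascript to send Cmd+V to the frontmost application. This is a minimal sketch of the general technique, not a description of how Steno does it. The function names are hypothetical, and this particular approach requires granting Accessibility permission to the process that sends the keystroke.

```python
import subprocess

# AppleScript that asks System Events to press Cmd+V in the frontmost app.
PASTE_SCRIPT = 'tell application "System Events" to keystroke "v" using command down'


def paste_steps(text: str):
    """Return the (argv, stdin) pairs needed to paste text at the cursor."""
    return [
        (["pbcopy"], text.encode("utf-8")),       # 1. copy text to the clipboard
        (["osascript", "-e", PASTE_SCRIPT], None),  # 2. trigger a normal paste
    ]


def paste_at_cursor(text: str) -> None:
    """Run the steps on a Mac; overwrites whatever was on the clipboard."""
    for argv, stdin in paste_steps(text):
        subprocess.run(argv, input=stdin, check=True)
```

Since the target application only ever sees an ordinary paste, nothing in the browser or in Google Docs needs to know that dictation was involved.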
Working with Google Docs Features
Steno's paste-based approach means it works naturally with Google Docs' collaboration features. Other editors see your text appear in real time, just as they would with any paste. Comments, suggestions, and version history all work normally because the text is standard pasted content.
For formatting, use a hybrid approach: dictate your content with Steno, then apply Google Docs formatting (headings, bold, lists) with keyboard shortcuts or the toolbar. Separating content from formatting is a more efficient workflow than trying to do both simultaneously.
Getting Started
Download Steno free at stenofast.com. Try dictating in Google Docs alongside Google's built-in voice typing and compare the experience. Most users find that after experiencing automatic punctuation, sub-second transcription, and hold-to-speak control, they never go back to Google's built-in option. Steno Pro at $4.99 per month gives you unlimited dictation across Google Docs and every other application on your Mac.