You have spent years learning to type. You have internalized the QWERTY layout, built muscle memory for common words, and maybe even trained with typing tutors to push your speed higher. If you are a reasonably fast typist, you clock in around 60-80 words per minute. If you are exceptional, maybe 100-120 WPM. But here is the thing: you hit a ceiling long ago, and no amount of practice will meaningfully raise it. The keyboard has a speed limit imposed by human biomechanics, and you are already close to yours.
Voice sits well above that limit. The average person speaks at 120-150 words per minute in normal conversation. Practiced speakers reach 160-180 WPM. And unlike typing speed, which requires years of deliberate practice to improve, speaking speed is something you already have. You have been speaking at this rate since childhood. The only missing piece was a way to convert that speech into text accurately enough to be useful. That piece now exists.
The Biomechanics of Typing Speed
To understand why the keyboard has a ceiling, consider what typing actually requires of your body. Each keystroke involves a specific finger moving to a specific key, pressing down with enough force to register, and returning to the home row. Complex words require alternating hands, stretching to reach keys outside the home row, and coordinating modifier keys for capitals and symbols.
The fastest typists in the world, competitive speed typists, top out around 200 WPM in short bursts. But sustained professional typing rarely exceeds 100-120 WPM, and this level requires extraordinary training and practice. For the vast majority of people, practical typing speed falls between 40 and 70 WPM.
More importantly, typing speed has diminishing returns on practice. Going from 30 WPM to 60 WPM is relatively straightforward with a few months of dedicated practice. Going from 60 WPM to 90 WPM takes significantly more effort. Going from 90 WPM to 120 WPM requires the kind of deliberate practice that competitive typists engage in, which most people have neither the time nor the inclination to pursue.
The Error Factor
Raw WPM numbers are misleading without accounting for errors. A typist who can produce 80 WPM with a 5% error rate is effectively typing at a lower net speed because they have to stop and correct mistakes. At higher speeds, error rates tend to increase, creating a natural governor that limits practical throughput. Many people find that their "comfortable" typing speed, the speed at which they make few enough errors to maintain flow, is 20-30% below their peak speed.
The Physics of Speaking Speed
Speech operates on entirely different biomechanics. Speaking involves coordinated movements of the diaphragm, vocal cords, tongue, lips, and jaw. These movements are controlled by some of the most developed motor circuits in the human brain, refined by millions of years of evolutionary pressure on communication.
The result is that speaking is effortless in a way that typing never can be. You do not think about the physical mechanics of speech. You think about what you want to say, and the words come out. This effortlessness is why speaking speed is so much higher than typing speed and why it does not require training to achieve.
The Numbers
Here is a direct comparison of text input speeds across methods:
- Handwriting: 15-25 WPM
- Hunt-and-peck typing: 20-35 WPM
- Average touch typing: 40-60 WPM
- Fast touch typing: 70-100 WPM
- Professional/competitive typing: 100-150 WPM
- Average speaking: 120-150 WPM
- Fast speaking: 150-180 WPM
Notice that average speaking speed is already in the range of professional typing speed. Most people can speak faster than even very good typists can type. And while typing speed requires years of practice to develop, speaking speed is your baseline.
But What About Accuracy?
The historical argument against voice input was accuracy. If the speech recognition system misrecognizes 15% of your words, the time you save speaking is consumed by corrections. This was a valid argument until recently.
Modern AI models like OpenAI's Whisper achieve word error rates below 5% on most English speech. For clear speech in a reasonably quiet environment, error rates drop to 2-3%. At this level, dictation is not just faster than typing in raw WPM. It is faster in net WPM after accounting for corrections, because the correction burden is minimal.
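For readers who want to check these figures against their own dictation, here is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance between what you said and what was transcribed, divided by the number of words you said. The function name `word_error_rate` is made up for this illustration, and the lowercase-and-split normalization is a simplifying assumption. Published benchmarks normalize punctuation and numbers more carefully.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from transcript
            insertion = curr[j - 1] + 1  # extra word in transcript
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of four spoken words.
print(word_error_rate("the quick brown fox", "the quick brown box"))  # 0.25
```

A 3% error rate on this measure means roughly 3 words in every 100 need a fix, which is the figure used in the comparison that follows.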
Consider the math for producing 500 words of text:
- Typing at 60 WPM: 8.3 minutes of typing plus corrections, roughly 9-10 minutes total.
- Dictating at 140 WPM with 3% error rate: 3.6 minutes of speaking plus 15 corrections taking about 5 seconds each, roughly 4.8 minutes total.
Voice input is approximately twice as fast even after factoring in error correction. For 500 words, that saves roughly 4 to 5 minutes. Over 5,000 words (a typical knowledge worker's daily text output), the savings come to roughly 40 to 50 minutes.
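The arithmetic is easy to rerun with your own numbers. The sketch below uses the figures from this article (60 WPM typing, 140 WPM dictation, a 3% error rate, 5 seconds per correction). The function names are made up for this example, and the model is deliberately simple: it assumes every misrecognized word costs one fixed-length correction.

```python
def typing_minutes(words: int, wpm: float) -> float:
    """Raw typing time, before any allowance for fixing typos."""
    return words / wpm


def dictation_minutes(words: int, wpm: float, error_rate: float,
                      seconds_per_fix: float) -> float:
    """Speaking time plus time spent correcting misrecognized words."""
    speaking = words / wpm
    corrections = words * error_rate  # expected number of words to fix
    return speaking + corrections * seconds_per_fix / 60


WORDS = 500
typing = typing_minutes(WORDS, wpm=60)
dictation = dictation_minutes(WORDS, wpm=140, error_rate=0.03, seconds_per_fix=5)

print(f"Typing (before corrections): {typing:.1f} min")    # 8.3 min
print(f"Dictation (with corrections): {dictation:.1f} min")  # 4.8 min
```

Plug in your own typing speed and a measured error rate to see where your break-even point sits. If corrections take you longer than 5 seconds each, the gap narrows.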
The Cognitive Advantage
Speed is the obvious benefit, but the cognitive advantage of voice input may be even more significant. When you type, there are two parallel processes: thinking about what to say and executing the physical typing. These processes compete for cognitive resources, especially when the typing requires attention (reaching for uncommon keys, fixing typos, managing formatting).
When you speak, the physical execution of speech is essentially automatic. Your conscious attention is fully available for content: what to say, how to phrase it, what comes next. Many people report that they think more clearly and express ideas more completely when speaking than when typing, because there is no mechanical process competing for their attention.
This cognitive freedom is particularly valuable for creative and analytical work. Writers find that dictated first drafts are more fluid. Analysts find that dictated summaries are more thorough. Professionals find that dictated emails are more direct and natural-sounding. The quality of output often improves alongside the speed, which is rare for productivity tools.
When Typing Still Wins
Voice input is not universally superior. There are specific tasks where the keyboard remains the better tool:
- Code: Programming languages are not spoken languages. The syntax of code, with its brackets, semicolons, operators, and precise formatting, is designed for keyboard entry. While you can dictate code comments and documentation, the code itself is best typed.
- Precise editing: Moving the cursor to a specific position, selecting a word, and replacing it with another are fine-grained operations that are faster with a keyboard and mouse than with voice commands.
- Structured data entry: Filling out forms, entering numbers, and working with spreadsheets all involve short, precise inputs that do not benefit from voice speed.
- Noisy environments: Open offices, coffee shops, and other environments with significant background noise reduce dictation accuracy and make speaking impractical.
The optimal workflow is not voice or keyboard. It is voice and keyboard. Use voice for generating text: emails, messages, documents, notes, descriptions, and any other prose. Use the keyboard for editing, code, and structured input. This hybrid approach gives you the speed advantage of voice where it matters most while retaining the precision of the keyboard where it is needed.
How Steno Makes Voice Input Practical
The theoretical speed advantage of voice over keyboard only matters if the tool implementing it is practical enough for daily use. This is where many dictation tools fall short. They require too many steps to activate, only work in certain applications, or have latency that breaks the flow of speaking.
Steno eliminates these friction points. It lives in your menu bar and activates with a single hotkey hold. It works in every application on your Mac through Accessibility APIs. The Groq Whisper API returns transcriptions fast enough that text appears within a beat of releasing the hotkey. There is no setup per application, no mode switching, and no workflow interruption.
The hold-to-speak model deserves specific attention in the context of speed. Unlike always-on dictation that requires you to say "stop listening" or click a button, Steno's model is physical and immediate. Hold the key, speak, release. The key release is the signal to transcribe. There is no ambiguity, no delay in stopping, and no wasted time managing the microphone state.
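To make the hold-to-speak idea concrete, here is a minimal sketch of the pattern as a tiny state machine. It is an illustration only, not Steno's actual implementation: the class `HoldToSpeakSession`, its methods, and the `transcribe` callback are all hypothetical names, and the transcription step is stubbed out rather than calling a real speech API.

```python
from typing import Callable, List, Optional


class HoldToSpeakSession:
    """Records audio only while the hotkey is held; transcribes on release."""

    def __init__(self, transcribe: Callable[[bytes], str]) -> None:
        self._transcribe = transcribe  # hypothetical speech-to-text callback
        self._chunks: List[bytes] = []
        self._recording = False

    @property
    def recording(self) -> bool:
        return self._recording

    def key_down(self) -> None:
        # Ignore key-repeat events while the key is already held.
        if self._recording:
            return
        self._chunks = []
        self._recording = True

    def feed_audio(self, chunk: bytes) -> None:
        # Audio arriving while the key is up is simply dropped.
        if self._recording:
            self._chunks.append(chunk)

    def key_up(self) -> Optional[str]:
        # Releasing the key is the one and only "stop" signal.
        if not self._recording:
            return None
        self._recording = False
        audio = b"".join(self._chunks)
        self._chunks = []
        return self._transcribe(audio) if audio else None


# Usage with a stand-in transcriber that just reports the audio length.
session = HoldToSpeakSession(lambda audio: f"<{len(audio)} bytes transcribed>")
session.key_down()
session.feed_audio(b"\x00\x01")
session.feed_audio(b"\x02")
print(session.key_up())  # <3 bytes transcribed>
```

The point of the design is that there are only two states, and the physical key position is the source of truth for both. Nothing needs to guess when you have finished speaking.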
Making the Switch
If you want to break through your keyboard speed ceiling, here is a practical plan:
- Week 1: Download Steno from stenofast.com and use it exclusively for email and chat replies. This builds the basic habit without the pressure of producing polished content.
- Week 2: Extend to notes, journal entries, and quick documents. Start dictating anything that is more than two sentences long.
- Week 3: Apply voice to your primary writing work: blog posts, reports, documentation, or whatever you produce most. Use the dictate-then-edit workflow.
- Week 4 and beyond: Voice becomes your default for text generation. You will find yourself reaching for the hotkey instinctively whenever you have something to say.
The keyboard has been the primary text input device for over a century. It is a remarkable tool, but it has a speed ceiling that no amount of practice can overcome. Your voice does not have that ceiling. With Steno, you can finally type as fast as you think. The free tier gets you started, and Steno Pro at $4.99/month unlocks unlimited use for when voice becomes your primary input method.