Voice input is the ability to speak and have your words appear as text in whatever application you are using — your email client, your notes app, your code editor, your browser. When it works well, it feels like a superpower: thoughts become text at the speed of speech, with no typing required. When it works poorly, it is a frustrating collection of half-heard words and missed phrases that sends you back to the keyboard.

On Mac, the gap between "works well" and "works poorly" is largely determined by which voice input solution you choose and how you configure it. This guide explains the key factors that make voice input effective and how to set it up so it becomes a genuine part of your daily workflow.

The Two Models of Voice Input

Voice input on Mac operates in one of two models: toggle mode or push-to-talk mode. Understanding the difference is essential to understanding why most people's first experience with voice input is disappointing.

Toggle Mode

Toggle mode is what Apple's built-in dictation uses. You press a key or button to turn dictation on, speak as much as you want, and press again to turn it off. This sounds convenient but creates a significant practical problem: you have to actively manage when the system is listening. If you get distracted after pressing the button and before finishing your thought, the microphone stays open and picks up everything in your environment. Toggle mode also makes it hard to alternate seamlessly between voice and typing, because keystrokes made while dictation is on can be interpreted as dictation controls rather than as ordinary typing.

Push-to-Talk (Hold-to-Speak) Mode

Push-to-talk voice input works the way walkie-talkies work: you hold a button while speaking and release it when you are done. The microphone is only active while the button is held, which means the system is never accidentally listening, and you can transition from voice to keyboard instantly by releasing the key. This model is dramatically more practical for daily use. It integrates naturally with keyboard-based workflows because your hands never leave the keyboard area — you just hold a key and speak.

Steno uses this push-to-talk model exclusively. You hold a configurable global hotkey, speak, and release. The transcribed text appears at your cursor. There is no toggle, no activation UI, and no management overhead. The control is as direct and tactile as pressing a key.
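The difference between the two models can be made concrete with a small simulation. The Python sketch below is an illustrative model only, not Steno's or Apple's implementation; the function names and the event format are hypothetical. It measures how long the microphone stays open in each mode for the same four-second utterance when the user gets distracted and forgets to switch dictation off.

```python
def push_to_talk_open_seconds(events):
    """Total open-mic time when the mic is live only while the key is held.

    events: list of (time_in_seconds, "down" | "up") tuples for the hotkey.
    """
    total = 0.0
    started = None
    for t, kind in events:
        if kind == "down" and started is None:
            started = t              # key pressed: start listening
        elif kind == "up" and started is not None:
            total += t - started     # key released: stop listening
            started = None
        # repeated "down" events (key auto-repeat) are ignored
    return total


def toggle_open_seconds(tap_times, session_end):
    """Total open-mic time when each tap flips listening on or off.

    tap_times: times (in seconds) at which the toggle key was tapped.
    session_end: time at which we stop observing; a mic left on stays
    open until then.
    """
    total = 0.0
    started = None
    for t in tap_times:
        if started is None:
            started = t              # first tap: mic on
        else:
            total += t - started     # second tap: mic off
            started = None
    if started is not None:          # user never tapped "off"
        total += session_end - started
    return total


# Same 4-second utterance in both modes.
held = push_to_talk_open_seconds([(0.0, "down"), (4.0, "up")])
forgotten = toggle_open_seconds([0.0], session_end=60.0)
print(held, forgotten)
```

In the push-to-talk case, releasing the key is the only way to finish speaking, so the open-mic time always equals the time the key was held. In the toggle case, the open-mic time depends on remembering the second tap.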

Why "Global" Matters

Many voice input tools only work in specific applications. Browser extensions only activate in your browser. App-specific dictation features only work within that app. But your work spans many applications — you switch between email, chat, documents, notes, code, and browsers throughout the day. Having a different voice input tool for each application is not a real solution.

Global voice input means the same hotkey works in every application on your Mac. You do not configure it per-application or switch tools depending on what you are using. The hotkey works in Mail, Slack, VS Code, Notion, Obsidian, Safari, Chrome, Terminal, and anything else that accepts text. This system-level operation is one of the core advantages of a native Mac app like Steno over browser extensions or web-based voice tools.

Setting Up Voice Input for Real Work

Choose a Comfortable Hotkey

The hotkey you choose for voice input matters more than most people expect. You want something that is easy to hold while your hands are in their natural typing position. Many Steno users choose a function key, the right Command key, or a combination like Option+Space. The key should be reachable without moving your hand significantly from home row position. If reaching the hotkey requires awkward hand movements, you will unconsciously avoid using voice input for short dictations.

Position Your Microphone Correctly

Voice input accuracy scales directly with audio quality. Your Mac's built-in microphone is adequate for quiet environments but struggles with background noise. AirPods and similar in-ear headphones deliver consistently better audio because the microphone is close to your mouth regardless of your head position. If you dictate at a desk, a small USB condenser microphone pointed toward your face provides excellent results without the hassle of headphones.

Build the Habit in Stages

Voice input only becomes a habit if you use it consistently enough that the behavior becomes automatic. The best way to reach that point is to start with one specific use case and master it before expanding. Most successful voice typists start with a single task — typically email replies or daily notes — and use voice input exclusively for that task for two weeks. By the time two weeks have passed, the hold-speak-release cycle is automatic and easily transfers to other applications.

Applications Where Voice Input Excels

Email

Email is the ideal training ground for voice input because messages are conversational, length is predictable, and errors are easy to spot during a quick review before sending. A typical email body contains 50 to 150 words. At conversational speaking speed, that takes 20 to 60 seconds to dictate. Over 20 to 30 emails a day, dictating instead of typing can recover 30 to 60 minutes of keyboard time, every day.
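The email arithmetic can be checked with a short Python sketch. The speaking rate of 150 words per minute is inferred from the figures above (50 words in 20 seconds); the typing rate of 40 words per minute is an assumption, not a figure from this guide, and the estimate ignores time spent reviewing and correcting the transcript.

```python
SPEAKING_WPM = 150  # implied by "50 to 150 words in 20 to 60 seconds"
TYPING_WPM = 40     # assumed typing speed; adjust to your own


def seconds_to_produce(words, words_per_minute):
    """Seconds needed to produce `words` at a given words-per-minute rate."""
    return words * 60 / words_per_minute


def daily_minutes_saved(emails_per_day, words_per_email,
                        typing_wpm=TYPING_WPM, speaking_wpm=SPEAKING_WPM):
    """Minutes per day saved by dictating each email instead of typing it."""
    typed = seconds_to_produce(words_per_email, typing_wpm)
    spoken = seconds_to_produce(words_per_email, speaking_wpm)
    return emails_per_day * (typed - spoken) / 60


# Dictation time for the shortest and longest email in the range.
print(seconds_to_produce(50, SPEAKING_WPM))    # 20.0
print(seconds_to_produce(150, SPEAKING_WPM))   # 60.0

# A mid-range day: 25 emails of 100 words each.
print(round(daily_minutes_saved(25, 100), 1))  # 45.8
```

Under these assumptions a mid-range day lands inside the 30 to 60 minute estimate. A faster typist or shorter emails would reduce the saving, so it is worth substituting your own numbers.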

Meeting Notes

Typing notes during or immediately after a meeting is awkward because your attention is divided between the conversation and the keyboard. Voice input lets you keep your eyes on the meeting while capturing key points by speaking briefly at intervals. During a pause in the discussion, hold the hotkey, speak your summary of the last few minutes, and release. The notes are captured with your attention on the room, not the keyboard.

Writing and Long-Form Content

Any writing task that involves more than a few sentences benefits from voice input at the drafting stage. Speaking a first draft is faster and often produces more natural prose than typing one. The internal editor that makes typing feel laborious is less active when speaking, which means more words reach the page before self-doubt intervenes.

Search and Quick Commands

One underrated use of voice input is for search queries and quick commands. Rather than typing a search into your browser or app, hold the hotkey and speak it. For multi-word searches, this can be faster than typing because you do not need to pause to ensure correct spelling of every word.

Getting Started

Steno is available at stenofast.com and works on any Mac running macOS 13 or later. The installation takes about 30 seconds. The free tier includes enough daily dictation to evaluate whether global voice input fits your workflow. Once you have used hold-to-speak voice input in all your applications for a week, returning to toggle-only dictation feels like a genuine step backward.

Voice input is not a replacement for the keyboard. It is an upgrade that makes the keyboard better by giving you a faster way to generate text and a more deliberate way to refine it.