Every iPhone keyboard has a microphone button at the bottom of the keyboard. Tap it, speak, and iOS transcribes your words. It works, but it has always felt like an afterthought: finicky, slow to activate, prone to cutting off mid-sentence, and inconsistent in accuracy across apps. If you have tried iPhone's speak to text and found yourself going back to typing on glass, the problem is not voice input as a technology. It is Apple's implementation.

Meaningful improvements are available. With the right setup, speak to text on iPhone can be faster and more reliable than typing, especially for anything longer than a sentence.

How iPhone's Built-In Speak to Text Works

The microphone button on the default iOS keyboard activates Apple's on-device speech recognition. When you tap it, a waveform animation appears and iOS begins listening. The transcription is visible in the text field as you speak. When you stop speaking and pause for a moment, iOS stops listening and the microphone indicator disappears. You can then resume by tapping the microphone again or switching to typing.

The Core Problems

The fundamental issue is the tap-to-activate, auto-stop model. iOS detects speech pauses and automatically ends the listening session when it thinks you have finished. This is a problem because natural speech contains frequent pauses — between thoughts, when choosing a word, or simply while breathing. These pauses cause iOS to stop listening prematurely, cutting off your sentence and requiring you to tap the microphone again to resume.

The resulting experience is choppy: speak a few words, pause, see the microphone deactivate, tap again, speak, pause, deactivate, tap. For short messages this is merely inconvenient. For anything longer it becomes genuinely frustrating. The cognitive overhead of managing the microphone state competes with the cognitive work of composing what you want to say.
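The auto-stop behavior described above can be sketched as a small simulation. This is a toy model, not Apple's implementation: the function name, the pause durations, and the 1.5-second timeout are all assumptions chosen for illustration, since the actual timeout iOS uses is not stated here.

```python
from typing import List, Tuple

# Each entry is (word, seconds of silence after the word).
# The values are invented for illustration, not measured speech.
Utterance = List[Tuple[str, float]]


def split_into_sessions(utterance: Utterance, silence_timeout: float) -> List[List[str]]:
    """Group words into listening sessions under a tap-to-activate, auto-stop model.

    A pause at least as long as `silence_timeout` ends the current session,
    so the speaker must tap the microphone again before the next word.
    """
    sessions: List[List[str]] = []
    current: List[str] = []
    for word, pause_after in utterance:
        current.append(word)
        if pause_after >= silence_timeout:
            sessions.append(current)
            current = []
    if current:
        sessions.append(current)
    return sessions


utterance: Utterance = [
    ("send", 0.2), ("the", 0.1), ("report", 1.8),  # pause while choosing a word
    ("to", 0.2), ("Priya", 2.5),                   # pause between thoughts
    ("by", 0.1), ("Friday", 0.0),
]

ASSUMED_TIMEOUT = 1.5  # hypothetical value, in seconds

sessions = split_into_sessions(utterance, ASSUMED_TIMEOUT)
print(len(sessions))  # 3 sessions, so 3 taps for one seven-word sentence
print(sessions)
```

In this model, two ordinary pauses turn one sentence into three separate listening sessions, which is the speak, pause, tap cycle in miniature.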

Accuracy is a second issue. Apple's on-device speech recognition is good for standard conversational English but struggles with names, technical terms, less common words, and non-standard accents. In noisy environments — commuting, walking outside, in a coffee shop — accuracy degrades significantly.

What Better iPhone Speak to Text Looks Like

A better speak to text experience on iPhone uses a hold-to-speak interaction rather than tap-and-wait. You press and hold a microphone button while speaking — the system listens only while the button is held — and release when you are done. There are no premature cutoffs, no accidental transcription of background sounds, and no repeated tap-to-restart cycles. The microphone is active for exactly as long as you intend it to be.
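As a rough sketch of the hold-to-speak interaction, the model below captures audio only between a press and a release. The class and method names are hypothetical and the "audio" is just labeled strings; it illustrates the gating logic only and does not represent how any particular keyboard is implemented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class HoldToSpeakRecorder:
    """Toy model: audio chunks are kept only while the button is held."""

    held: bool = False
    buffer: List[str] = field(default_factory=list)

    def press(self) -> None:
        # Start a fresh capture when the button goes down.
        self.held = True
        self.buffer = []

    def feed(self, chunk: str) -> None:
        # Audio arriving while the button is up is discarded.
        if self.held:
            self.buffer.append(chunk)

    def release(self) -> List[str]:
        # Stop listening and hand back everything captured during the hold.
        self.held = False
        captured, self.buffer = self.buffer, []
        return captured


recorder = HoldToSpeakRecorder()
recorder.feed("background chatter")   # ignored: button not held
recorder.press()
recorder.feed("send the report")
recorder.feed("<two-second pause>")   # a pause does not end the session
recorder.feed("to Priya by Friday")
captured = recorder.release()
recorder.feed("door slam")            # ignored: button released
print(captured)
```

The design point is that the end of the session is decided by the release, not by a silence detector, so a long pause inside the hold is simply part of the capture.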

Accuracy is the second improvement. Modern server-side speech models perform substantially better than on-device alternatives on diverse vocabulary, accents, and noisy audio. The trade-off is that the audio is processed over the network, but for most users this is an acceptable trade given the accuracy improvement.

The Steno iPhone Keyboard

Steno includes an iPhone keyboard extension that brings the hold-to-speak interface to iOS. After installing Steno from the App Store and enabling the Steno keyboard in iOS Settings (General > Keyboard > Keyboards > Add New Keyboard), you can switch to it with the globe button on the keyboard. The Steno keyboard includes a large microphone button in the center of the keyboard row. Press and hold it while speaking; release to insert the transcribed text at your cursor.

The Steno keyboard works in any iOS app that accepts keyboard input — Messages, Mail, Notes, WhatsApp, Telegram, Signal, Slack, and any other app where you would normally type. You switch to the Steno keyboard when you want to dictate and switch back to the default keyboard when you prefer to type. The switch takes one tap.

Accuracy on iPhone

Steno's speech model handles specialized vocabulary significantly better than iOS's built-in recognition. Technical terms, proper nouns, industry-specific language, and non-standard accents are handled with higher consistency. For users who frequently dictate domain-specific content — clinical notes, legal commentary, software specifications — the accuracy difference is substantial enough to save real editing time.

Best Use Cases for iPhone Speak to Text

iMessage and Messaging Apps

Messaging is the highest-frequency speak to text use case on iPhone because messages are short, conversational, and suffer the most from the tedium of glass keyboard typing. Holding the microphone button, speaking a complete message, and releasing takes five to ten seconds. The equivalent typing for a 30-word message takes 20 to 30 seconds with a glass keyboard. Over dozens of messages a day, this difference is significant.
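The time comparison comes down to simple arithmetic: seconds equals words times 60 divided by words per minute. The sketch below works through a 30-word message. The rates of 150 words per minute for speech and 40 words per minute for glass-keyboard typing are illustrative assumptions, not measurements, and your own rates will differ.

```python
def seconds_for(words: int, words_per_minute: float) -> float:
    """Time in seconds to produce `words` at a steady rate."""
    return words * 60 / words_per_minute


MESSAGE_WORDS = 30
ASSUMED_SPEAKING_WPM = 150  # illustrative assumption
ASSUMED_TYPING_WPM = 40     # illustrative assumption

speaking_seconds = seconds_for(MESSAGE_WORDS, ASSUMED_SPEAKING_WPM)
typing_seconds = seconds_for(MESSAGE_WORDS, ASSUMED_TYPING_WPM)

print(speaking_seconds)  # 12.0
print(typing_seconds)    # 45.0
print(typing_seconds - speaking_seconds)  # seconds saved per message
```

Substitute your own speaking and typing rates to see how the gap changes; the per-message saving multiplies across every message sent in a day.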

Email on the Go

Reading and replying to email on iPhone is a common workflow for commuters and people between meetings. Typing a substantive email reply on a glass keyboard is genuinely difficult and produces shorter, less complete replies than the situation warrants. Speaking the reply with hold-to-speak dictation produces longer, more useful replies without adding time — because speaking is faster than glass keyboard typing.

Notes and Quick Capture

iPhone is often the device you have with you when an idea or observation occurs to you away from your desk. Hold the microphone button and speak the idea into your notes app — Apple Notes, Notion, Obsidian, or any other notes application — and the capture is complete in seconds. This works for any length of note, from a one-sentence reminder to a page of observations from a meeting or site visit.

Search Queries

Search queries on iPhone are typically multi-word phrases that take several seconds to type accurately on a glass keyboard. Speaking them is faster and produces fewer typos. The same hold-to-speak interaction works in any search field — Safari, Maps, the App Store, or any in-app search bar.

Tips for Better iPhone Speak to Text

Getting Started

Steno is available for iPhone in the App Store. If you also use a Mac, the same Steno account covers both Mac and iPhone, giving you a consistent speak to text experience across both devices. Visit stenofast.com for links to both the Mac app and the iPhone keyboard.

Glass keyboards are a compromise. We carry pocket computers capable of understanding speech, yet we still peck at tiny letters one at a time. Speak to text done right makes that compromise feel unnecessary.