Voice messages have become a conversational norm. They feel more personal than text, they are faster to send when your hands are occupied, and they convey tone in ways that written messages cannot. But receiving them is another matter: playing a two-minute voice message in a meeting, on public transport, or at your desk in an open office requires headphones, privacy, or the patience to wait until later. Converting voice messages to text removes that constraint.
Here is a platform-by-platform guide to the fastest ways to convert voice messages to readable text, plus the broader workflow upgrade that makes sending and receiving voice messages more productive overall.
iMessage Voice Messages on iPhone
Apple introduced automatic transcription for audio messages in iMessage with iOS 17. When you receive an audio message in iMessage, a small transcript appears below the waveform in the conversation bubble. For clear audio in quiet conditions, this works well and requires no action from you — the transcript appears automatically.
Transcription quality depends on audio conditions. Messages recorded in a noisy environment, spoken with a strong accent, or set against loud background music may produce inaccurate transcripts. In those cases, the transcript is still useful for getting the gist quickly, but you may need to listen to the full audio for precise details.
On Mac, iMessage audio messages also show transcripts if you are running a recent version of macOS. The transcript appears in-line and can be selected, copied, and used like any other text.
WhatsApp Voice Messages
WhatsApp rolled out voice message transcripts broadly in late 2024. To enable the feature, go to Settings > Chats > Voice message transcripts and select your preferred transcript language. Once enabled, a "Transcribe" option appears on voice messages in conversations, and tapping it generates a text version of the message.
WhatsApp transcription works on-device, which means your voice messages are processed locally and not sent to WhatsApp's servers for transcription — a meaningful privacy benefit. Accuracy is generally good for standard speech in supported languages, though technical vocabulary and proper nouns may be mishandled.
On WhatsApp for Mac and Windows, availability depends on the app version you are running. Where the desktop app supports transcripts, the feature works the same way as on mobile; if the option is missing, transcribe the message on your phone instead.
Telegram Voice Messages
Telegram has offered voice message transcription in its Premium plan for several years. Premium subscribers see a small text icon next to voice messages in conversations; tapping it transcribes the message in place. The transcript is displayed in a collapsible panel below the voice message so the audio is still accessible if you want to listen.
Telegram's transcription supports a wide range of languages and handles accented speech reasonably well. For users who communicate across language barriers, transcription pairs naturally with Telegram's message translation: once a voice message is text, it can be translated like any other message, which makes it practical for multilingual teams to exchange voice messages.
General Approach: Download and Transcribe
For voice messages on platforms that do not include built-in transcription, or when the built-in transcript is inaccurate and you need a better result, a manual approach works almost anywhere:
- Long-press or right-click the voice message and save or forward the audio file
- Upload the audio file to a dedicated transcription service
- Receive the transcript and copy the text you need
This approach works on nearly every platform because voice message apps generally let you either save the audio directly or share it to another app. The trade-off is a few extra steps compared to built-in transcription, but it gives you access to higher-quality transcription tools when accuracy matters.
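For readers who prefer to script the steps above, here is a minimal Python sketch of the "save, then transcribe" flow. It is an illustration under stated assumptions, not a reference to any particular service: the extension list is a guess at common formats for saved voice messages, and `fake_transcriber` is a hypothetical stand-in that you would replace with a call to whichever transcription tool you actually use.

```python
from pathlib import Path
from typing import Callable

# Assumption: extensions commonly seen on saved voice messages.
# Adjust this set for the platforms you use.
AUDIO_EXTENSIONS = {".ogg", ".opus", ".m4a", ".caf", ".mp3", ".wav"}


def transcribe_saved_message(path: Path, transcriber: Callable[[bytes], str]) -> str:
    """Read a saved voice message and hand its bytes to a transcriber.

    `transcriber` is any function that takes raw audio bytes and returns text.
    It is a placeholder for a local or cloud tool of your choice.
    """
    if path.suffix.lower() not in AUDIO_EXTENSIONS:
        raise ValueError(f"Not a recognised audio file: {path.name}")
    audio = path.read_bytes()
    return transcriber(audio).strip()


def fake_transcriber(audio: bytes) -> str:
    # Hypothetical stand-in: a real transcriber would return the spoken words.
    return f"[{len(audio)} bytes of audio would be transcribed here]"
```

Passing the transcriber in as a function keeps the choice of tool separate from the file handling, so you can point sensitive messages at a local tool and everything else at a cloud service without changing the rest of the script.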
Sending Voice Messages More Productively
The flip side of receiving voice messages is sending them, and this is where a dictation app like Steno changes the workflow. Instead of recording a voice message and sending audio that the recipient has to play, you can use voice dictation to compose a text message with your voice. The input is just as fast, but the recipient gets readable text.
With Steno on your iPhone (available as a custom keyboard) or Mac, you hold the microphone key, speak your message, and it appears as text in the message compose field. You get the ease of speaking without handing the recipient an audio file they have to find a suitable moment to play.
This is especially useful in professional contexts where voice messages feel informal, in group chats where audio messages disrupt the reading flow, and in situations where the recipient might not have headphones available.
Privacy Considerations
Before converting a voice message to text, consider the content of the message and the privacy implications of the tool you are using. Messages with sensitive personal information, financial details, health information, or confidential business content should be transcribed using tools that process audio locally rather than sending it to a cloud server.
iOS's built-in iMessage transcription processes audio on-device. WhatsApp's transcription is also on-device. When using third-party transcription services for voice messages from any platform, read the privacy policy before uploading content you would not want stored on someone else's servers.
The Bigger Picture
Voice messages are a natural form of communication, but they do not have to remain as audio files. Both the sending and receiving sides are better with text: sending with dictation so the recipient can read rather than listen, receiving with automatic transcription so the message is searchable and quotable. The tools to do both exist today and are free or low-cost.
If you want to upgrade the sending side by dictating your messages as text, download Steno at stenofast.com for Mac or find the iOS keyboard in the App Store.
Voice messages should give the sender the convenience of speaking and the recipient the convenience of reading. When both sides are handled well, voice communication finally works at the speed of text.