Both Steno and Superwhisper are dedicated voice-to-text apps for macOS that use OpenAI's Whisper model. They solve the same fundamental problem: let you speak instead of type, anywhere on your Mac. But they take very different approaches to get there. Steno routes audio through Groq's cloud infrastructure for sub-second transcription. Superwhisper runs Whisper models locally on your machine using CoreML. That single architectural decision shapes everything else about the two apps, from speed and accuracy to pricing and privacy.

This page provides an honest, side-by-side comparison of both tools so you can decide which one fits your workflow. We will cover features, performance, pricing, and the tradeoffs you should know about before choosing.

How They Work

Steno is a 2MB native Swift app that lives in your menu bar. You hold a hotkey, speak, and release. Your audio is sent to Groq's servers, which run Whisper large-v3 on specialized hardware and return the transcription in under a second. The text is then typed into whatever app has focus. Steno also offers an offline fallback mode using Apple's built-in speech recognition, so you are never completely stuck without internet.
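The cloud-first flow with an offline fallback can be sketched in a few lines of Python. This is an illustrative sketch only, not Steno's actual code: the names `transcribe_with_fallback`, `fake_cloud`, and `fake_offline` are hypothetical, and the transcribers are passed in as plain callables so the example runs without network access. It also assumes that a failed cloud request surfaces as an `OSError`, which a real client may not do.

```python
from typing import Callable

# Hypothetical type: a transcriber takes raw audio bytes and returns text.
Transcriber = Callable[[bytes], str]


def transcribe_with_fallback(
    audio: bytes, cloud: Transcriber, offline: Transcriber
) -> tuple[str, str]:
    """Return (text, source), preferring the cloud transcriber.

    Assumption: a failed cloud request raises OSError (the base class of
    Python's ConnectionError and TimeoutError). A real client may differ.
    """
    try:
        return cloud(audio), "cloud"
    except OSError:
        # No network, or the request failed: use the on-device recognizer.
        return offline(audio), "offline"


# Stand-in transcribers for demonstration; they do not call any real service.
def fake_cloud(audio: bytes) -> str:
    raise ConnectionError("network unreachable")


def fake_offline(audio: bytes) -> str:
    return "hello world"


if __name__ == "__main__":
    print(transcribe_with_fallback(b"\x00\x01", fake_cloud, fake_offline))
```

Returning the source alongside the text lets the caller tell the user when the lower-accuracy offline path was used.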

Superwhisper takes the opposite approach. It downloads Whisper models directly to your Mac and runs them locally using Apple's CoreML framework. You can choose from multiple model sizes, from the tiny model (around 75MB) to the large model (over 1.5GB). Larger models are more accurate but slower to process. Superwhisper also offers a cloud mode for users who want faster results, but its primary selling point is local, on-device transcription.

Both apps work system-wide, meaning you can dictate into any text field in any application. Both are macOS-only. Beyond that, the differences are significant.

Feature Comparison

Feature | Steno | Superwhisper
Speech model | Whisper large-v3 (via Groq cloud) | Whisper tiny to large (local CoreML)
Processing | Cloud (Groq) + offline (Apple Speech) | Local (CoreML) + optional cloud
Latency | Sub-second (200-500ms) | 1-8 seconds depending on model size
Interaction model | Hold-to-speak hotkey | Toggle-based recording
Voice commands | Yes (new line, select all, undo, etc.) | No
Text snippets | Yes (custom abbreviation expansions) | No
Smart rewrite | Yes (AI-powered text cleanup) | No
Dictation history | Yes (searchable, with WPM stats) | No
Languages | 50+ via Whisper | 50+ via Whisper
Model selection | Fixed (large-v3 for best accuracy) | User-selectable (tiny, small, medium, large)
App size | ~2MB | 200MB+ (varies by downloaded models)
Pricing | Free tier; Pro $4.99/mo or $34.99/yr | $9.99/mo (no free tier)
Privacy | Audio sent to Groq; offline mode available | Fully local by default
Built with | Native Swift | Native Swift

Speed and Latency

This is where the architectural difference matters most. Steno sends your audio to Groq, which runs Whisper large-v3 on custom LPU hardware optimized specifically for inference speed. The result comes back in 200 to 500 milliseconds for most utterances. The hold-to-speak model makes this feel even faster: you release the hotkey and text appears almost instantly.

Superwhisper processes audio on your Mac's CPU and Neural Engine. With the tiny model, transcription takes one to two seconds. With the medium or large model, it can take four to eight seconds depending on your hardware and the length of the audio clip. On an M1 or M2 Mac, results are reasonable. On an Intel Mac or under heavy CPU load, the wait can be noticeable.
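The local speed-versus-accuracy tradeoff amounts to picking the most accurate model that still fits the wait you can tolerate. The sketch below is a hypothetical illustration, not part of either app: `LocalModel`, `MODELS`, and `pick_model` are made-up names, and the latencies are simply the upper ends of the ranges quoted in this section, not measurements. The small model is left out because no figure is given for it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocalModel:
    name: str
    worst_case_seconds: float  # upper end of the range quoted in the text


# Ordered from least to most accurate. Figures come from the text above
# (tiny: 1-2 s; medium or large: 4-8 s) and are assumptions, not benchmarks.
MODELS = [
    LocalModel("tiny", 2.0),
    LocalModel("medium", 8.0),
    LocalModel("large", 8.0),
]


def pick_model(budget_seconds: float) -> Optional[LocalModel]:
    """Return the most accurate model whose worst case fits the budget."""
    fitting = [m for m in MODELS if m.worst_case_seconds <= budget_seconds]
    return fitting[-1] if fitting else None


if __name__ == "__main__":
    for budget in (1.0, 3.0, 8.0):
        choice = pick_model(budget)
        print(budget, choice.name if choice else "no local model fits")
```

With a three-second budget only the tiny model qualifies, which is the practical reason many local users settle for a smaller model.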

For short messages and quick replies, both tools are usable. For long-form dictation where you are speaking continuously, Steno's sub-second turnaround creates a much more fluid experience. You never find yourself waiting for text to catch up to your speech.

Accuracy

Both apps use the same underlying Whisper model family, so baseline accuracy is similar. However, Steno always uses the large-v3 variant, which is the most accurate. Superwhisper lets you choose, but the large model is slow on local hardware, so many users settle on the medium or small model for practical speed and give up some accuracy in exchange.

In practice, Steno tends to produce cleaner transcriptions because it can afford to run the largest model every time. Cloud processing removes the accuracy-versus-speed tradeoff that local users face. For developers dictating technical terms, code-related vocabulary, and mixed-language sentences, the large-v3 model handles edge cases better than smaller variants.

Privacy: Cloud vs Local

This is the strongest argument for Superwhisper. When you use Superwhisper in local mode, your audio never leaves your Mac. There is no server, no third-party API, and no data retention policy to worry about. For users handling sensitive information such as medical records, legal documents, or confidential business communications, fully local processing provides peace of mind that no cloud service can match.

Steno sends audio to Groq for transcription. Groq processes the audio and returns text; it does not store or train on your data. Steno also offers an offline mode using Apple's on-device speech recognition, but accuracy in offline mode is lower than in the cloud Whisper mode. If absolute privacy is your top priority and you are willing to accept slower transcription, Superwhisper's local-first approach is the better fit.

Productivity Features

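This is where Steno pulls ahead significantly. Beyond raw transcription, Steno includes several features designed for users who dictate heavily throughout the day: voice commands (new line, select all, undo, and similar), text snippets (custom abbreviation expansions), smart rewrite (AI-powered text cleanup), and a searchable dictation history with WPM stats.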

Superwhisper focuses primarily on the transcription engine itself. It does a good job at its core task, but it does not offer voice commands, snippet expansion, AI rewriting, or history tracking. If you are looking for a complete voice-typing productivity system rather than just a transcription tool, Steno offers more out of the box. You can explore similar comparisons on our best voice-to-text for Mac page.

Pricing Comparison

Steno offers a free tier with daily usage limits, so you can try the app and see whether voice typing fits your workflow before paying anything. The Pro plan costs $4.99 per month or $34.99 per year and unlocks unlimited dictation, voice commands, smart rewrite, and priority transcription. You can also check our free voice-to-text guide for more options.

Superwhisper costs $9.99 per month with no free tier. You get a short trial period, but after that you must subscribe to continue using the app. There is no annual discount option publicly listed.

At half the monthly price with a free tier included, Steno is the more accessible option. The cost difference adds up: over a year, Steno Pro costs $34.99 on the annual plan versus $119.88 for twelve months of Superwhisper. That is a difference of nearly $85 per year for comparable transcription quality.
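The yearly figures are easy to check. The short Python sketch below reproduces the arithmetic from the prices quoted on this page; the variable and function names are illustrative, and `Decimal` is used so the cents come out exact rather than as floating-point approximations.

```python
from decimal import Decimal

# Prices quoted in this comparison, in US dollars.
STENO_PRO_MONTHLY = Decimal("4.99")
STENO_PRO_ANNUAL = Decimal("34.99")
SUPERWHISPER_MONTHLY = Decimal("9.99")


def yearly_cost(monthly_price: Decimal) -> Decimal:
    """Cost of twelve monthly payments."""
    return monthly_price * 12


superwhisper_year = yearly_cost(SUPERWHISPER_MONTHLY)
steno_year_monthly_billing = yearly_cost(STENO_PRO_MONTHLY)
savings_vs_annual_plan = superwhisper_year - STENO_PRO_ANNUAL

if __name__ == "__main__":
    print("Superwhisper, 12 months:", superwhisper_year)
    print("Steno Pro, 12 monthly payments:", steno_year_monthly_billing)
    print("Saved with Steno Pro annual plan:", savings_vs_annual_plan)
```

On monthly billing Steno Pro comes to $59.88 a year, which is where the "half the price" comparison comes from; the $84.89 gap applies to the annual plan.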

App Size and System Resources

Steno is approximately 2MB because all the heavy computation happens on Groq's servers. It installs in seconds, takes up almost no disk space, and uses minimal RAM and CPU while idle.

Superwhisper requires 200MB or more depending on which models you download. The large model alone is over 1.5GB. During transcription, Superwhisper uses significant CPU and Neural Engine resources, which can impact battery life on laptops and slow down other tasks running on your machine.

For users who are mindful about system resources, or who work on older machines, Steno's lightweight footprint is a clear advantage.

Pros and Cons

Steno Strengths

  • Sub-second latency for instant-feeling dictation
  • Always uses the most accurate Whisper large-v3 model
  • Voice commands, text snippets, smart rewrite, and history
  • Free tier available, Pro at $4.99/mo (half Superwhisper's monthly price)
  • Tiny 2MB app with minimal system resource usage
  • Hold-to-speak hotkey prevents accidental recordings

Steno Weaknesses

  • Cloud mode requires internet and sends audio to Groq
  • Offline mode uses Apple Speech (lower accuracy than Whisper)
  • No option to choose different Whisper model sizes
  • macOS only (no Windows or Linux)

Superwhisper Strengths

  • Fully local processing with no cloud dependency
  • Maximum privacy: audio never leaves your device
  • Choose from multiple Whisper model sizes
  • Works completely offline with full Whisper accuracy

Superwhisper Weaknesses

  • Slower transcription (1-8s depending on model)
  • More expensive at $9.99/mo with no free tier
  • 200MB+ app size, 1.5GB+ with large model
  • No voice commands, snippets, rewrite, or history
  • Heavy CPU and Neural Engine usage during transcription

The Verdict

Steno and Superwhisper are both well-built macOS voice typing apps, but they serve different users. If privacy is your non-negotiable priority and you want everything processed locally on your device, Superwhisper is the right choice. It handles local Whisper inference well and gives you control over model selection.

For everyone else, Steno is the stronger option. It is faster, cheaper, lighter, and comes with productivity features that Superwhisper simply does not offer. The hold-to-speak interaction, voice commands, text snippets, smart rewrite, and dictation history make Steno a complete voice-typing system rather than just a transcription engine. At half the price with a free tier to get started, the barrier to trying Steno is virtually zero.

If you are coming from Apple Dictation and want something more powerful, or if you have been considering Wispr Flow as another alternative, Steno is worth trying first. The free tier lets you experience the speed and accuracy difference without committing to a subscription.

Frequently Asked Questions

Is Steno or Superwhisper better for voice typing on Mac?

Steno is faster (sub-second latency via Groq cloud), cheaper ($4.99/mo vs $9.99/mo), and includes voice commands, text snippets, smart rewrite, and dictation history. Superwhisper is better if you need fully local, on-device processing with no cloud dependency and want to choose between different Whisper model sizes.

What is the difference between local and cloud transcription?

Local transcription runs the Whisper model directly on your Mac using CoreML, meaning audio never leaves your device. Cloud transcription sends audio to a server (Steno uses Groq) for processing and returns the text. Cloud is typically faster and more accurate with large models, while local offers maximum privacy and works without internet.

How much do Steno and Superwhisper cost?

Steno offers a free tier with daily limits plus a Pro plan at $4.99/month or $34.99/year. Superwhisper costs $9.99/month with no free tier. Over a year, Steno Pro costs $34.99 on the annual plan versus $119.88 for Superwhisper, a difference of nearly $85.

Which app is better for developers and power users?

Steno is built for power users who write thousands of words daily. It offers voice commands, text snippets, smart rewrite to clean up dictated text, and a full dictation history. Superwhisper focuses on the transcription engine with customizable model sizes but fewer productivity features.

Try Steno Free on Your Mac

Sub-second voice typing with voice commands, snippets, and smart rewrite. No credit card required.
