Key Takeaways
- Speech is 3x faster than typing — 161 WPM vs 53 WPM in Stanford’s controlled study
- Speech had 20.4% fewer errors than keyboard input in the same study
- Average typing speed is 52 WPM — based on 168,000 participants (Aalto University)
- After editing, dictation still yields ~55 WPM: about 2.5x the 22 WPM typing speed measured in the same clinical study
- Whisper Large-v3 hits 2.4% word error rate — making the “correction tax” minimal
- Best approach: dictate first drafts, type edits — combines speed with precision
My Test: 1,000 Words, Two Methods, Timed
Before I get to the academic research, here is what happened when I actually tested this myself. I wrote a 1,000-word blog draft about Whisper AI model sizes. First by typing, then by dictating with EmberType using the Large-v3 model.
Typing: 14 minutes, 12 seconds. I am a 90 WPM typist on a good day, but composition speed is always slower than copy-typing speed because you are thinking about what to write. My effective composition rate was about 70 WPM.
Dictation (no AI cleanup): 8 minutes, 47 seconds for the raw dictation. Then 6 minutes, 15 seconds fixing errors, rephrasing awkward sentences, and cleaning up filler words the transcription captured. Total: 15 minutes, 2 seconds. Slower than typing.
Dictation (with EmberType AI cleanup): Same 8 minutes, 47 seconds dictating. But the AI cleanup removed filler words and fixed punctuation automatically, cutting my correction time to 1 minute, 52 seconds. Total: 10 minutes, 39 seconds. That is 25% less time than typing, which is what I mean by "25% faster" in the rest of this article.
The takeaway: raw dictation speed is meaningless without accounting for correction time. And the correction time depends almost entirely on the quality of your AI model and post-processing. Now, here is what the academic research says.
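For anyone who wants to check the arithmetic, here is a minimal sketch that turns the three timings above into total time and effective words per minute. The helper names (`to_seconds`, `effective_wpm`) are illustrative only, and the 1,000-word count is treated as exact.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a minutes:seconds timing into total seconds."""
    return minutes * 60 + seconds


def effective_wpm(words: int, total_seconds: int) -> float:
    """Words per minute over the whole session, corrections included."""
    return words / (total_seconds / 60)


WORDS = 1000  # length of the test draft

typing = to_seconds(14, 12)
raw_dictation = to_seconds(8, 47) + to_seconds(6, 15)      # dictate + manual fixes
cleaned_dictation = to_seconds(8, 47) + to_seconds(1, 52)  # dictate + fixes after AI cleanup

for label, secs in [
    ("Typing", typing),
    ("Dictation, no cleanup", raw_dictation),
    ("Dictation, AI cleanup", cleaned_dictation),
]:
    print(f"{label}: {secs // 60}:{secs % 60:02d} total, {effective_wpm(WORDS, secs):.1f} WPM")

# Share of typing time saved by dictation with AI cleanup
time_saved = 1 - cleaned_dictation / typing
print(f"Time saved vs typing: {time_saved:.0%}")
```

Note that 25% less time corresponds to roughly a one-third higher words-per-minute rate, so the two ways of stating the gain are not interchangeable.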
The Stanford Study: 3x Faster (With a Caveat)
The most cited study on dictation vs typing speed comes from a 2016 collaboration between Stanford University, the University of Washington, and Baidu. They recruited 32 participants to transcribe phrases using both speech recognition and a mobile keyboard on an iPhone 6 Plus.
The results were decisive:
| Metric | Speech Input | Keyboard Input |
|---|---|---|
| Words per minute | ~161 WPM | ~53 WPM |
| Speed advantage | 3.0x faster | Baseline |
| Error rate (English) | 20.4% lower | Baseline |
| Error rate (Mandarin) | 63.4% lower | Baseline |
Speech was not just faster — it was more accurate. In English, speech input had a 20.4% lower error rate than keyboard typing. In Mandarin, the gap was even wider at 63.4%.
The Caveat Nobody Mentions
This study used a mobile keyboard, not a desktop keyboard. That matters a lot. Mobile typing at 53 WPM is realistic, but desktop typists regularly hit 60-90 WPM. The 3x headline number is comparing speech against thumb-typing, not against a full keyboard.
This is why my personal test matters. At 90 WPM, I am well above the 52 WPM average and among the fastest typists. Dictation still beat me, but by 25%, not by a factor of three. For a 50 WPM typist, the advantage would be much larger. The speed benefit of dictation scales inversely with typing skill: the slower you type, the more dictation helps.
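A rough sketch of that inverse relationship follows. It assumes every writer reaches the same effective dictation rate of about 94 WPM (the rate from the 1,000-word test with AI cleanup), which is a simplification, and the composition speeds in the loop are illustrative values rather than measured ones.

```python
def time_saved(typing_wpm: float, dictation_wpm: float) -> float:
    """Fraction of writing time saved by dictating instead of typing."""
    return 1 - typing_wpm / dictation_wpm


# Assumption: effective dictation rate after corrections, taken from the
# 1,000-word test above. Real rates will vary by speaker and model.
DICTATION_WPM = 94

for typing_wpm in (40, 52, 70):  # illustrative composition speeds
    saved = time_saved(typing_wpm, DICTATION_WPM)
    print(f"{typing_wpm} WPM typist: about {saved:.0%} less time when dictating")
```

The slower the typing rate passed in, the larger the saving, which is the pattern described above.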
Average Typing Speeds: The Real Numbers
To understand the dictation vs typing comparison, you need accurate typing speed data. The most comprehensive study comes from Aalto University’s 2018 analysis of 168,000 participants and 136 million keystrokes. This wasn’t a small lab study — it’s one of the largest typing datasets ever collected.
| Group | Average WPM | Notes |
|---|---|---|
| Overall average | 52 WPM | 168,000 participants |
| Hunt-and-peck typists | 27 WPM | Using 2–5 fingers |
| Touch typists (average) | 40–60 WPM | All 10 fingers, eyes on screen |
| Legal professionals | 60.6 WPM | Fastest professional group |
| Ages 18–30 | 60–80 WPM | Grew up with keyboards |
| Fastest 5% of all typists | 80+ WPM | Approaching physical limits |
One of the study’s most interesting findings: fast typists use “rollover typing” for 40–70% of their keystrokes. This means they press the next key before fully releasing the previous one, creating overlapping key presses. It’s a technique that develops unconsciously with practice, and it’s a major factor separating 40 WPM typists from 80+ WPM typists.
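To make "rollover" concrete, here is a minimal sketch that measures it from key-down and key-up timestamps. The `KeyPress` type, the sample timings, and the exact definition (share of consecutive key pairs that overlap) are assumptions for illustration; the study's own metric may be defined differently.

```python
from typing import NamedTuple


class KeyPress(NamedTuple):
    key: str
    down_ms: float  # time the key was pressed
    up_ms: float    # time the key was released


def rollover_ratio(presses: list[KeyPress]) -> float:
    """Share of consecutive key pairs where the next key is pressed
    before the previous key has been released."""
    ordered = sorted(presses, key=lambda p: p.down_ms)
    pairs = list(zip(ordered, ordered[1:]))
    if not pairs:
        return 0.0
    overlapping = sum(1 for prev, nxt in pairs if nxt.down_ms < prev.up_ms)
    return overlapping / len(pairs)


# Hypothetical timings for typing "then"
sample = [
    KeyPress("t", 0, 90),
    KeyPress("h", 70, 150),   # pressed while "t" is still down
    KeyPress("e", 160, 230),  # pressed after "h" was released
    KeyPress("n", 220, 300),  # pressed while "e" is still down
]
print(f"Rollover ratio: {rollover_ratio(sample):.0%}")
```

In this made-up sample, two of the three consecutive pairs overlap, which would fall inside the 40–70% range reported for fast typists.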
Even among the fastest typists, though, the ceiling is around 100–120 WPM for sustained composition (as opposed to copying text, which can be faster). That’s still well below comfortable speaking speed.
Dictation Speed: The Number Everyone Gets Wrong
Here is the mistake most articles make: they quote raw speaking speed (150 WPM) as if that is your dictation speed. It is not. Dictation speed has two components: how fast you speak and how much time you spend fixing what the AI got wrong. Ignoring the second number is like measuring driving speed without counting time stuck in traffic.
Raw Speaking Speed
| Speaking Mode | WPM Range | Context |
|---|---|---|
| Conversational speech | ~150 WPM | Natural, unstructured talking |
| Dictation-optimized | 100–130 WPM | Slower, clearer enunciation |
| Careful/technical dictation | 80–100 WPM | Complex vocabulary, pausing for thought |
| Presentation/lecture pace | 120–140 WPM | Structured, audience-facing |
Most people who dictate regularly settle into a 100–130 WPM range. This is slower than natural conversation because you are composing in real time. You pause to think about structure, choose words deliberately, and sometimes restart a sentence. In my testing, I clocked my sustained dictation pace at about 115 WPM — which matches what I see from experienced EmberType users.
Effective Speed After Corrections (The Real Number)
This is where it gets interesting. A 2025 multi-country study on automated speech recognition (ASR) in medical documentation found that, after accounting for editing time, the median effective dictation speed was 55.42 WPM. That is about 2.5x the 22 WPM average typing speed observed in their clinical setting, and roughly on par with the 52 WPM general average.
But that 55 WPM figure reflects older ASR technology and manual correction workflows. In my own testing with Whisper Large-v3 and EmberType's AI cleanup, effective speed after corrections was closer to 95 WPM for long-form prose. The improvement is almost entirely because modern AI cleanup catches filler words, fixes punctuation, and cleans up disfluencies automatically — work that used to be done manually.
The Correction Tax: The Only Number That Matters
I call it the "correction tax" because it is the hidden cost that dictation advocates never mention and dictation skeptics always overestimate. Here is how it actually breaks down based on my experience building and testing dictation software daily:
- Experienced dictator + Large-v3 + AI cleanup: 10–15% of total time editing. Effective speed: 85–115 WPM. This is where I am now.
- Intermediate dictator or smaller model: 20–30% of time editing. Effective speed: 55–80 WPM. Still significantly faster than average typing.
- Beginner or poor recognition: 40–50% of time editing. Effective speed: 30–50 WPM. At this level, you might as well type.
The difference between those tiers is almost entirely about the AI model. With the Tiny Whisper model (7.6% word error rate), I spent more time correcting than I saved by dictating. With Large-v3 (2.4% WER) plus AI cleanup, correction time dropped by 70%. The model you choose determines whether dictation saves time or wastes it.
My 1,000-word test proved this precisely. Raw dictation without AI cleanup: 15 minutes total (slower than typing). With AI cleanup: 10 minutes 39 seconds (25% faster). The feature that seemed like a nice-to-have turned out to be the thing that makes dictation actually worth doing for fast typists.
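One way to reason about the correction tax is as a simple formula: if a fraction f of the total session goes to editing, effective speed is raw dictation speed times (1 - f). The sketch below applies that to the timings from the 1,000-word test with AI cleanup. The function names are illustrative, and the model ignores differences in raw speaking speed between tiers, so it will not reproduce the tier ranges above exactly.

```python
def edit_fraction(dictation_seconds: float, editing_seconds: float) -> float:
    """Share of the total session spent on corrections (the 'correction tax')."""
    return editing_seconds / (dictation_seconds + editing_seconds)


def effective_wpm(raw_wpm: float, tax: float) -> float:
    """Effective speed once the correction tax is paid."""
    if not 0 <= tax < 1:
        raise ValueError("tax must be in [0, 1)")
    return raw_wpm * (1 - tax)


# Timings from the 1,000-word test with AI cleanup:
# 8:47 dictating, 1:52 correcting.
dictating = 8 * 60 + 47
editing = 1 * 60 + 52

raw = 1000 / (dictating / 60)          # raw dictation speed in WPM
tax = edit_fraction(dictating, editing)

print(f"Raw speed: {raw:.0f} WPM")
print(f"Correction tax: {tax:.1%}")
print(f"Effective speed: {effective_wpm(raw, tax):.0f} WPM")
```

For that test the tax works out to roughly 17.5% of the session, giving an effective speed of about 94 WPM.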
Whisper AI Accuracy by Model Size
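OpenAI’s Whisper is the AI model that powers most modern offline dictation tools, including EmberType. Accuracy varies significantly by model size. The parameter counts below match the original Whisper research paper; the word error rates are approximate and depend on the test set, and Large-v3 was released after that paper was published: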
| Model | Parameters | Word Error Rate | Best For |
|---|---|---|---|
| Tiny | 39M | ~7.6% | Quick notes, low-power devices |
| Base | 74M | ~5.7% | Casual dictation |
| Small | 244M | ~3.4% | Good balance of speed and accuracy |
| Medium | 769M | ~2.9% | Professional dictation |
| Large-v3 | 1.55B | ~2.4% | Maximum accuracy |
The practical difference between these models is significant. At 7.6% WER (Tiny), you’re correcting roughly 1 in 13 words. At 2.4% WER (Large-v3), you’re correcting roughly 1 in 42 words. Over a 2,000-word article, that’s the difference between fixing 152 errors and fixing 48.
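The same arithmetic can be run for any model and article length. This is a small sketch using the approximate WER figures from the table above; it treats WER as "share of words needing a fix", which is a simplification, since WER counts substitutions, insertions, and deletions against a reference transcript.

```python
# Approximate word error rates from the table above
WER = {
    "Tiny": 0.076,
    "Base": 0.057,
    "Small": 0.034,
    "Medium": 0.029,
    "Large-v3": 0.024,
}


def expected_errors(word_count: int, wer: float) -> int:
    """Rough number of corrections for a text of the given length."""
    return round(word_count * wer)


def words_per_error(wer: float) -> float:
    """Average number of words between errors."""
    return 1 / wer


ARTICLE_WORDS = 2000

for model, wer in WER.items():
    print(
        f"{model}: ~{expected_errors(ARTICLE_WORDS, wer)} corrections, "
        f"about 1 in {words_per_error(wer):.0f} words"
    )
```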
On Apple Silicon Macs, the Large-v3 model runs efficiently enough for real-time dictation. EmberType supports all Whisper model sizes, letting you choose the accuracy level that fits your hardware. After my 1,000-word test, my recommendation is unequivocal: use Large-v3 if your Mac can handle it, and keep AI cleanup turned on. The combined improvement is not incremental. In my test, it was the difference between dictation being slower than typing and dictation being 25% faster.
When Dictation Wins (Based on My Daily Use)
I use both methods every day building EmberType. Here is where dictation consistently beats typing in my workflow, not just in theory:
- Long-form prose (this article, for instance). I dictated the first draft of this 3,000-word piece in about 25 minutes. Typing it would have taken 45+. The longer the content, the bigger the advantage. Anything over 500 words, dictation wins.
- Email. Most emails are conversational in tone. Dictating them feels completely natural because you are essentially talking to the person. I dictate 80% of my emails now.
- Brainstorming and idea capture. When I need to dump ideas out of my head without worrying about structure, speaking removes the physical bottleneck entirely. My fingers cannot keep up with a fast brainstorm. My voice can.
- Documentation. Writing docs for EmberType features, support responses, blog outlines. Anything where the thinking is done and I just need to get words on screen.
- Accessibility needs. For users with carpal tunnel, RSI, or other conditions that make typing painful, dictation is not just faster — it is the only sustainable option for extended writing.
- First drafts of anything. The "dictate first, edit later" workflow is faster than typing-and-editing simultaneously. Every time. I have tested this enough to be certain.
When Typing Wins (And I Say This As Someone Who Sells Dictation Software)
I would lose credibility if I pretended dictation is always better. Here is when I reach for the keyboard instead of my voice:
- Writing code. I write Swift all day. Dictating `func viewDidLoad() { super.viewDidLoad() }` is absurd. Symbols, brackets, indentation, precise syntax — this is keyboard territory. I do not even attempt to dictate code.
- Formatted content. Tables, spreadsheets, LaTeX, markdown with precise formatting. Anything where structure matters as much as words. Typing gives you direct control that voice cannot match.
- Noisy environments. When I tested dictation in a coffee shop, my WER jumped from 2% to about 8%. The correction tax wiped out the speed advantage. If the environment is noisy, type.
- Short edits. Fixing a typo, changing a word, tweaking a sentence. For anything under 10 words, activating dictation is slower than just pressing keys. I keep one hand on the keyboard at all times while dictating.
- Shared offices. Talking aloud in a quiet shared workspace is not practical. Social context overrides speed advantages.
- Dense technical content. When I write about Whisper model architectures with specific parameter counts and benchmark numbers, typing is more precise. Custom dictionaries help, but some content is inherently keyboard-shaped.
The Verdict: It Depends on What You Are Writing
The academic data says dictation is 2–3x faster than typing. My personal testing at 90 WPM says the advantage is closer to 25% for fast typists using good AI, and much larger for average typists. Both are correct — the gap depends entirely on your typing speed, your AI model, and what you are writing.
Here is the practical framework I use every day:
| Scenario | Recommendation | Expected Speed Gain |
|---|---|---|
| First drafts (500+ words) | Dictate | 2–3x faster |
| Editing and revisions | Type | More precise control |
| Emails and messages | Dictate | 2x faster |
| Code and formatting | Type | Symbols require keyboard |
| Brainstorming | Dictate | 3x+ faster (captures flow) |
| Short edits (<10 words) | Type | Faster than activating mic |
| Noisy environment | Type | Accuracy drops in noise |
| Physical pain/RSI | Dictate | Only sustainable option |
The most productive approach is not choosing one or the other. It is using both. Dictate first drafts for speed, then switch to keyboard for editing and precision. This hybrid workflow is how I write everything for EmberType — blog posts, documentation, emails, support replies. I dictate the bulk and type the edits.
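For readers who think in code, the framework can be sketched as a small decision function. The function name, its parameters, and especially the priority order between conflicting conditions are assumptions made for illustration; the table itself does not rank the scenarios.

```python
def recommend(
    word_count: int,
    *,
    is_code_or_formatted: bool = False,
    noisy: bool = False,
    typing_is_painful: bool = False,
) -> str:
    """Return "dictate" or "type" for a writing task.

    Assumed priority: physical pain first, then content type and
    environment, then length.
    """
    if typing_is_painful:
        return "dictate"  # only sustainable option for extended writing
    if is_code_or_formatted:
        return "type"     # symbols and structure need the keyboard
    if noisy:
        return "type"     # accuracy drops in noise
    if word_count < 10:
        return "type"     # faster than activating the mic
    return "dictate"      # drafts, emails, brainstorming


print(recommend(800))                             # long first draft
print(recommend(5))                               # quick typo fix
print(recommend(300, noisy=True))                 # coffee shop
print(recommend(200, is_code_or_formatted=True))  # code or tables
```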
If you have not tried modern AI dictation with a good model, the data (and my personal testing) suggest you are leaving significant productivity on the table. The key insight from my 1,000-word experiment: AI cleanup is what makes dictation faster than typing for fast typists. Without it, dictation is a wash. With it, dictation wins decisively. The model and the post-processing matter more than your speaking speed.
