VoiceInput vs Wispr Flow

Wispr Flow is a polished English-first dictation tool with strong AI rewrite features. VoiceInput targets Chinese and mixed-language typing speed, ships sub-1.5-second latency, and turns every utterance into a searchable memory you can revisit a month later.

One-line verdict

If you write English emails all day and want AI to clean up your tone on the fly, Wispr Flow's rewrite features are excellent. If you type Chinese or mixed Chinese-English, want under-1.5-second latency, and want everything you said to be retrievable later — VoiceInput is built for that.

Side-by-side comparison

| Dimension | VoiceInput | Wispr Flow |
| --- | --- | --- |
| End-to-end latency (Chinese) | ~1.4s | ~3-10s |
| End-to-end latency (English) | ~1.5s | ~1.5-3s |
| Mixed-language (CJK + English) | 200+ brand hotwords + pinyin disambiguation | English-first; CJK is best-effort |
| Local / fully offline | Yes: SenseVoice / Paraformer / Apple | Cloud only |
| AI tone rewrite (formal / casual / shorter) | Constrained: fix homophones, drop fillers, add punctuation. Won't rewrite tone unless asked. | Strong rewrite presets (email, Slack, tweet) |
| Memory layer (search past dictations) | Built-in. Local archive, full-text search, app + time + tags, export. | No queryable history |
| AI persona review | 7 built-in personas re-read your week. Weekly Big5 sketch + 3-5 quotes. | N/A |
| Local typography engine | CJK-Latin spacing, brand casing, units, all in <5ms, zero LLM call | N/A |
| Privacy | Local audio + text. Only recording-length seconds reported (toggleable). | Audio sent to cloud. Standard SaaS retention. |
| Free tier | 100% Local + BYOK forever. No card, no rate limit on local tier | Rate-limited free tier |
| Paid tier | Cloud: $5/mo · $79/yr · $49 lifetime | $12/mo |
| BYOK (your own LLM key) | DeepSeek / Kimi / OpenAI / any OpenAI-compatible endpoint | No |
| Distribution | Direct DMG, Sparkle auto-update | Direct DMG |

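Where Wispr Flow wins

AI tone rewrite. Its rewrite presets (email, Slack, tweet) turn English dictation into polished output in one step, which VoiceInput does not attempt unless asked.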

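Where VoiceInput wins

Chinese latency (~1.4s vs ~3-10s), mixed CJK-English handling, a fully offline mode, a searchable memory layer, BYOK support, and a free local tier with no rate limit.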

The tone-rewrite question

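Wispr Flow's rewrite features (make this email more formal, make this tweet shorter) are the product's signature. VoiceInput intentionally doesn't do this: its AI tidy is limited to fixing homophones, dropping fillers, and adding punctuation, and it won't rewrite tone unless asked.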

This is a deliberate split. If you want voice → polished output instantly, Wispr Flow wins. If you want voice → fast clean text + retrievable memory, VoiceInput wins.
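One way to picture a "constrained" tidy step is a guard that accepts the cleaned text only when it stays close to what was actually said. The sketch below is a hypothetical illustration, not VoiceInput's implementation: the function `accept_tidy` and the 0.7 similarity threshold are assumptions, and the similarity measure is Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def accept_tidy(raw: str, tidied: str, min_similarity: float = 0.7) -> str:
    """Return the tidied text only if it stays close to the raw transcript.

    Small edits (fillers dropped, punctuation added) keep the similarity high.
    A full tone rewrite shares little text with the original, so it is rejected
    and the raw transcript is kept instead.
    """
    similarity = SequenceMatcher(None, raw, tidied).ratio()
    return tidied if similarity >= min_similarity else raw


raw = "um so the meeting is at three"
light = "So the meeting is at three."
rewrite = "Dear team, please be advised that our meeting will begin promptly at 3 PM. Best regards."

print(accept_tidy(raw, light))    # So the meeting is at three.
print(accept_tidy(raw, rewrite))  # um so the meeting is at three
```

The threshold is a tuning knob: higher values allow only punctuation-level edits, lower values let more aggressive cleanup through.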

Speed: why 1.4 seconds matters

Latency under 1.5 seconds is the threshold where dictation stops feeling like a tool and starts feeling like typing. Above 3 seconds, you wait — and the flow breaks.

VoiceInput's pipeline:

  1. Persistent ASR connection — no handshake per utterance.
  2. Streaming partial transcripts — text starts arriving before you stop talking.
  3. Local typography engine handles formatting in <5ms (no LLM call).
  4. Optional AI tidy fires only after pause detection — never blocks the first injection.
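The four steps above can be sketched as a small asyncio loop. This is an illustration under stated assumptions, not the real pipeline: `fake_asr_stream`, `format_locally`, `dictate`, and `demo` are hypothetical names, the ASR connection is simulated by an async generator, and the end of the stream stands in for pause detection.

```python
import asyncio
from typing import Any, AsyncIterator, Callable, Coroutine, List, Optional


async def fake_asr_stream(chunks: List[str]) -> AsyncIterator[str]:
    """Stand-in for a persistent streaming ASR connection (hypothetical)."""
    for chunk in chunks:
        await asyncio.sleep(0)  # yield control, as a network read would
        yield chunk


def format_locally(text: str) -> str:
    """Placeholder for the local typography pass; here it only trims whitespace."""
    return text.strip()


async def dictate(
    partials: AsyncIterator[str],
    inject: Callable[[str], None],
    tidy: Optional[Callable[[str], Coroutine[Any, Any, str]]] = None,
) -> "Optional[asyncio.Task[str]]":
    """Inject each partial as it arrives; schedule the optional tidy afterwards."""
    injected: List[str] = []
    async for partial in partials:
        piece = format_locally(partial)
        inject(piece)  # injection never waits on an LLM call
        injected.append(piece)
    if tidy is None:
        return None
    # Stream end stands in for pause detection; tidy runs in the background.
    return asyncio.create_task(tidy("".join(injected)))


async def demo() -> List[str]:
    events: List[str] = []

    async def slow_tidy(text: str) -> str:
        await asyncio.sleep(0.01)  # pretend LLM round trip
        events.append(f"tidy:{text}")
        return text

    task = await dictate(
        fake_asr_stream(["今天 ", "开会"]),
        inject=lambda piece: events.append(f"inject:{piece}"),
        tidy=slow_tidy,
    )
    events.append("dictate returned")
    if task is not None:
        await task
    return events


print(asyncio.run(demo()))
# ['inject:今天', 'inject:开会', 'dictate returned', 'tidy:今天开会']
```

The ordering of the events is the point: both injections and the return from `dictate` happen before the slow tidy step finishes, so the tidy step adds no latency to the first text the user sees.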

Wispr Flow's pipeline batches audio in chunks before sending — built for transcription accuracy on English, not real-time CJK throughput.

Memory layer: a different product category

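Most dictation apps end at "speech became text." VoiceInput treats every utterance as data worth keeping: dictations go into a local archive with full-text search, filters by app, time, and tags, and export.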
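As a rough picture of what a local, searchable dictation archive involves, here is a minimal sketch using Python's built-in `sqlite3` module. The schema, the function names (`open_archive`, `save`, `search`), and the sample rows are all invented for illustration; the source does not describe VoiceInput's actual storage format. Substring matching via SQLite's `instr` is used instead of a tokenizing index because Chinese text has no spaces between words.

```python
import sqlite3
from typing import List, Optional, Sequence, Tuple


def open_archive() -> sqlite3.Connection:
    """Create an in-memory archive; a real one would live in a file on disk."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE dictations ("
        "id INTEGER PRIMARY KEY, spoken_at TEXT, app TEXT, tags TEXT, body TEXT)"
    )
    return conn


def save(conn: sqlite3.Connection, spoken_at: str, app: str,
         tags: Sequence[str], body: str) -> None:
    """Store one dictation; tags are kept as a comma-joined string."""
    conn.execute(
        "INSERT INTO dictations (spoken_at, app, tags, body) VALUES (?, ?, ?, ?)",
        (spoken_at, app, ",".join(tags), body),
    )


def search(conn: sqlite3.Connection, query: str, app: Optional[str] = None,
           since: Optional[str] = None, tag: Optional[str] = None
           ) -> List[Tuple[str, str, str]]:
    """Substring search over dictation text, optionally filtered by app, time, and tag."""
    sql = "SELECT spoken_at, app, body FROM dictations WHERE instr(body, ?) > 0"
    params: List[str] = [query]
    if app is not None:
        sql += " AND app = ?"
        params.append(app)
    if since is not None:
        sql += " AND spoken_at >= ?"  # ISO 8601 strings sort chronologically
        params.append(since)
    if tag is not None:
        sql += " AND instr(',' || tags || ',', ?) > 0"
        params.append(f",{tag},")
    sql += " ORDER BY spoken_at"
    return conn.execute(sql, params).fetchall()


conn = open_archive()
save(conn, "2024-05-01T09:00:00", "Mail", ["work"], "明天和 Kimi 团队开会")
save(conn, "2024-05-20T14:30:00", "Slack", ["work", "idea"], "把 GitHub 仓库迁移到新组织")
save(conn, "2024-06-02T10:00:00", "Notes", ["idea"], "周末试试 Kimi 的长文本")

print(search(conn, "Kimi", since="2024-06-01"))
# [('2024-06-02T10:00:00', 'Notes', '周末试试 Kimi 的长文本')]
```

Because everything sits in one local database, a query like "what did I say about Kimi last month" needs no network access.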

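Privacy

VoiceInput keeps audio and text local; the only thing reported is recording length in seconds, and that can be toggled off. Wispr Flow sends audio to the cloud, with standard SaaS retention.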

Who should pick which

Pick VoiceInput if

You write Chinese (or mixed CJK-English) daily, want sub-1.5-second latency, value an offline option, and want every dictation to be searchable next month. Free forever covers most real usage.

Pick Wispr Flow if

You write English all day, want strong AI rewrite presets (formal email, casual Slack, tweet shorter), and don't mind cloud-only + $12/mo.

FAQ

Is VoiceInput a Wispr Flow alternative?

Yes. Both are menu-bar dictation apps. VoiceInput is faster on Chinese (~1.4s vs 3-10s), works fully offline, and every dictation archives into a searchable memory layer with AI persona reviews. Wispr Flow focuses on English tone rewrite.

Which handles mixed Chinese-English better?

VoiceInput. It combines Volcengine ASR tuned for Mandarin, 200+ brand hotwords (Cursor, Kimi, GitHub), and pinyin disambiguation in the LLM cleanup prompt. Wispr Flow is English-first.

Does VoiceInput have an offline mode?

Yes. Three on-device ASR engines (SenseVoice, Paraformer, Apple) plus a local typography engine. Toggle off cloud — nothing leaves your Mac.

Can VoiceInput rewrite my emails the way Wispr Flow does?

Not by design. AI tidy is constrained to homophones, fillers, and punctuation. Tone shaping happens in the memory layer (7 personas re-read your week), not at the moment of dictation. If you want voice → polished output instantly, pick Wispr Flow.

What does each cost?

Wispr Flow Pro is $12/mo. VoiceInput Local + BYOK are free forever. Optional Cloud tier is $5/mo, $79/yr, or $49 lifetime.

Try VoiceInput free

Free forever (100% local). No account, no API key, no setup. macOS 14+.

Download v0.47.0 · 21 MB