Can VoiceInput run fully offline like Superwhisper?

Yes. VoiceInput ships three local ASR engines (SenseVoice, Paraformer, Apple) and a local-only typography engine that handles spacing and casing in under 5ms. Toggle off cloud features and nothing leaves your Mac.

VoiceInput vs Superwhisper

Q: Is VoiceInput a Superwhisper alternative?

Yes. Both are macOS menu-bar dictation apps that drop transcribed text at your cursor. VoiceInput differs in three ways: end-to-end latency on Chinese is lower (~1.4s vs ~3-6s), every utterance auto-archives into a searchable local memory layer with AI personas, and the Local and bring-your-own-key (BYOK) tiers are free forever, with no subscription.

Q: Which is more accurate for mixed Chinese-English?

VoiceInput. It uses Volcengine ASR tuned for Mandarin, plus a local list of 200+ brand hotwords (Cursor, Kimi, GitHub, etc.) and pinyin disambiguation injected into the LLM cleanup prompt. Superwhisper relies on Whisper variants, which have historically underperformed on CJK/Latin code-switching in real-time settings.

Q: How much does each cost?

Superwhisper is $84/year (Pro) or $8.49/month. VoiceInput has no subscription on its 100% Local tier and Bring-Your-Own-Key tier (free forever). The optional Cloud tier is $9/mo, $79/yr, or $49 lifetime.

Superwhisper is the polished Whisper desktop wrapper for English-first users. VoiceInput is built around Chinese typing speed and turns every dictation into a searchable, persona-reviewable memory — not just text dropped at the cursor.

One-line verdict

If you're an English-first user who wants the cleanest Whisper desktop wrapper, Superwhisper is excellent. If you type Chinese (or mixed Chinese-English) all day, want under-1.5-second latency, and want everything you said to be searchable next month — VoiceInput is built for that.

Side-by-side comparison

Dimension	VoiceInput	Superwhisper
End-to-end latency (Chinese)	~1.4s	~3-6s
End-to-end latency (English)	~1.5s	~1.5-3s
Mixed-language (CJK + English code/brand)	200+ hotwords + pinyin disambiguation	Generic Whisper, frequent miscaps
Local / fully offline	Yes — SenseVoice / Paraformer / Apple	Yes — Whisper local models
Cloud option	Volcengine streaming (audio direct, no relay)	OpenAI Whisper / cloud Whisper variants
Memory layer (search past dictations)	Built-in. Every line archives with app, time, tags. Full-text search, export.	No. Text drops, then it's gone.
AI persona review	7 built-in personas (Boss, Coach, Therapist, Editor…) re-read your week. Weekly Big5 sketch.	N/A
Local typography engine	CJK-Latin spacing, brand casing, unit spacing — all handled in <5ms, zero LLM call	N/A
Privacy boundary	Audio and text stay local; only the recording length in seconds is reported (toggleable)	Audio sent to cloud (Pro tier); local mode keeps everything offline
Free tier	100% Local + BYOK forever — no card	7-day trial, then paid
Paid tier	Cloud: $9/mo · $79/yr · $49 lifetime	$8.49/mo · $84/yr
Distribution	Direct DMG, Sparkle auto-update	Direct DMG, auto-update
Open API key (BYOK)	DeepSeek / Kimi / OpenAI / any OpenAI-compatible endpoint	Limited — Whisper-only routing

Where Superwhisper wins

English polish. Whisper's English transcription is industry-standard. If you only speak English, Superwhisper's output is hard to beat.
Single-purpose simplicity. No memory features, no personas — just dictate and move on. Some people prefer that.
Mature AI prompt library. Superwhisper has built-in prompt presets for email tone, tweet style, etc. VoiceInput's AI tidy stays focused on transcription correction.

Where VoiceInput wins

Speed on Chinese. Volcengine streaming plus local typography gives sub-1.5-second latency on Mandarin and mixed CJK-English. Superwhisper's Whisper backbone is slower for the same input.
Memory layer. Every voice line auto-archives. A month later you can search "what did I say about onboarding?" — Superwhisper drops the text and forgets.
AI personas. The same line is re-read by Boss, Coach, Therapist, and Editor personas, with a weekly Big5 sketch drawn from your real talk. This is a different product category, not just dictation.
Free tier covers real usage. 100% Local and BYOK paths are free forever. Superwhisper's free tier is a 7-day trial.
BYOK flexibility. Drop in any OpenAI-compatible endpoint (DeepSeek, Kimi, your own server). Pay model providers directly, ~$0-2/month for typical use.
Chinese typography. Half-width spacing between CJK and Latin, brand casing (Cursor, Kimi, API stay correct), unit spacing — all handled by a local rules engine in under 5ms.
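The typography rules in the last point can be illustrated with a few regular expressions. This is only a sketch of the idea, not VoiceInput's actual engine: the function name `tidy`, the `BRAND_CASING` table, and the unit list are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical brand table; the real hotword list is not shown on this page.
BRAND_CASING = {"cursor": "Cursor", "kimi": "Kimi", "github": "GitHub", "api": "API"}

# Basic CJK Unified Ideographs block, used inside a character class.
CJK = r"\u4e00-\u9fff"

# Assumed unit list, for illustration only.
UNITS = r"(?:GB|MB|KB|ms|kg|km|cm)"


def tidy(text: str) -> str:
    """Apply spacing, brand casing, and unit spacing with local rules only."""
    # 1. Half-width space between CJK characters and Latin letters or digits.
    text = re.sub(rf"([{CJK}])([A-Za-z0-9])", r"\1 \2", text)
    text = re.sub(rf"([A-Za-z0-9])([{CJK}])", r"\1 \2", text)

    # 2. Restore brand casing on whole ASCII words found in the table.
    def fix_brand(match: re.Match) -> str:
        word = match.group(0)
        return BRAND_CASING.get(word.lower(), word)

    text = re.sub(r"[A-Za-z]+", fix_brand, text)

    # 3. Space between a number and a known unit.
    text = re.sub(rf"(\d)({UNITS})\b", r"\1 \2", text)
    return text
```

Because every step is a plain string substitution, nothing here needs a network call or a model, which is consistent with the idea of a rules engine that runs locally.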

Speed: where the 1.4 seconds come from

VoiceInput's end-to-end pipeline is built for one number: time from button release to text landing at your cursor. The path:

Hold right Option and audio streams to ASR (Volcengine) over a persistent connection, so there is no handshake on each utterance.
ASR returns partial transcripts in real time. The local typography engine fixes spacing, casing, and unit formatting in <5ms (no LLM call).
If AI tidy is enabled, the cleaned text is sent to the LLM with a constrained prompt (three jobs only: homophones, fillers, punctuation). A result with confidence below 0.5 is discarded and the original text is kept.
Text is injected at the cursor via the Accessibility API. A clipboard fallback triggers if the target field rejects the injection.
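The last two steps of the path above can be sketched as a small amount of control flow. This is an illustration under stated assumptions, not the app's code: `finalize`, `deliver`, and `TidyResult` are hypothetical names, the LLM and injection mechanisms are passed in as plain callables, and "the original" is assumed to mean the typography-cleaned text from before the LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Optional

CONFIDENCE_THRESHOLD = 0.5  # from the text: below this, keep the original


@dataclass
class TidyResult:
    text: str          # LLM-corrected text
    confidence: float  # assumed to be a score in [0, 1]


def finalize(
    transcript: str,
    typography: Callable[[str], str],
    llm_tidy: Optional[Callable[[str], TidyResult]] = None,
) -> str:
    """Run local typography, then optionally gate an LLM cleanup on confidence."""
    cleaned = typography(transcript)
    if llm_tidy is None:  # AI tidy disabled
        return cleaned
    result = llm_tidy(cleaned)
    if result.confidence < CONFIDENCE_THRESHOLD:
        return cleaned  # low confidence: fall back to the pre-LLM text
    return result.text


def deliver(
    text: str,
    inject: Callable[[str], bool],
    copy_to_clipboard: Callable[[str], None],
) -> str:
    """Try direct injection first; use the clipboard if the field rejects it."""
    if inject(text):
        return "injected"
    copy_to_clipboard(text)
    return "clipboard"
```

Keeping the confidence gate outside the LLM call means a bad or uncertain rewrite can never replace what the speaker actually said.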

Superwhisper's Whisper-based pipeline batches audio into chunks of 200ms-1s before processing — fundamentally a different architecture, optimized for English transcription accuracy over real-time latency.

Memory layer: the real product difference

Most dictation apps stop at "speech becomes text." VoiceInput treats each utterance as data worth keeping:

Tool layer (SPEAK). Hold-to-talk. Text lands at the cursor. Same job Superwhisper does.
Data layer (RECALL). Every line archives locally with the source app, timestamp, and auto-tags. Full-text search across months of dictation. Export to Markdown / JSON / CSV.
Memory layer (REFLECT). 7 built-in personas (or your own) re-read your week. AI-picked quotes worth echoing. Weekly Big5 sketch — patterns you wouldn't see yourself.
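To make the data layer concrete, here is a minimal sketch of an archive that stores each line with its source app, timestamp, and tags, and supports search and export. The `Archive` and `Utterance` names and the field layout are assumptions for illustration, and the substring scan stands in for whatever full-text index the app really uses.

```python
import csv
import io
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime


@dataclass
class Utterance:
    text: str
    app: str        # source app the text was dictated into
    timestamp: str  # ISO 8601 string
    tags: list = field(default_factory=list)


class Archive:
    """In-memory stand-in for a local dictation archive."""

    def __init__(self):
        self._rows = []

    def add(self, text, app, tags=(), when=None):
        when = when or datetime.now()
        row = Utterance(text, app, when.isoformat(timespec="seconds"), list(tags))
        self._rows.append(row)
        return row

    def search(self, query):
        # Naive case-insensitive substring match, not a real full-text index.
        q = query.casefold()
        return [r for r in self._rows if q in r.text.casefold()]

    def to_json(self):
        return json.dumps([asdict(r) for r in self._rows], ensure_ascii=False, indent=2)

    def to_csv(self):
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow(["timestamp", "app", "tags", "text"])
        for r in self._rows:
            writer.writerow([r.timestamp, r.app, ";".join(r.tags), r.text])
        return buf.getvalue()

    def to_markdown(self):
        return "\n".join(f"- {r.timestamp} ({r.app}): {r.text}" for r in self._rows)
```

A query such as `archive.search("onboarding")` then returns every stored line that mentions the word, along with the app and time it was spoken in.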

If you mostly use voice for quick text injection, you don't need the bottom two layers. If you talk through real decisions and want them retrievable later, no Whisper wrapper has built this.

Privacy

Both apps offer a local-only mode. The difference is what leaves your Mac when you opt into cloud:

VoiceInput cloud. Audio streams directly to Volcengine ASR, with no proxy server and no audio storage. Besides that stream, only one number leaves your Mac: recording length in seconds (for the global pulse counter, toggleable in Settings). API keys live in macOS Keychain.
Superwhisper cloud. Audio routes through Whisper API providers. Standard OpenAI / equivalent retention policies apply.

Who should pick which

Pick VoiceInput if

You type Chinese or mixed Chinese-English daily, want sub-1.5-second latency, and want to be able to search what you said last month. Free forever covers most real usage.

Pick Superwhisper if

You only speak English, you want a polished Whisper wrapper, you don't need a memory layer, and $84/year fits your budget.

FAQ

Is VoiceInput a Superwhisper alternative?

Yes. Both are macOS menu-bar dictation apps. VoiceInput differs in three ways: ~1.4s end-to-end on Chinese (vs 3-6s), every utterance auto-archives into a searchable local memory layer with AI personas, and the Local + BYOK tiers are free forever.

Which is more accurate for mixed Chinese-English?

VoiceInput. Volcengine ASR tuned for Mandarin, plus 200+ brand hotwords (Cursor, Kimi, GitHub) and pinyin disambiguation injected into LLM cleanup. Superwhisper's Whisper backbone underperforms on real-time code-switching CJK/Latin.

Can VoiceInput run fully offline?

Yes. Three local ASR engines: SenseVoice, Paraformer, Apple. Local typography engine handles formatting in <5ms. Toggle off cloud — nothing leaves your Mac.

How much does each cost?

Superwhisper: $84/yr or $8.49/mo. VoiceInput: 100% Local + BYOK free forever. Optional Cloud tier $9/mo, $79/yr, or $49 lifetime.
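For readers weighing the paid plans, the arithmetic behind the prices quoted above is short enough to write out. The prices come from this page; the helper names are hypothetical and the sketch ignores taxes and any future price changes.

```python
import math

# Prices as listed on this page (USD).
VOICEINPUT = {"monthly": 9.00, "yearly": 79.00, "lifetime": 49.00}
SUPERWHISPER = {"monthly": 8.49, "yearly": 84.00}


def months_to_break_even(one_time: float, monthly: float) -> int:
    """Whole months of monthly billing needed to match a one-time price."""
    return math.ceil(one_time / monthly)


def yearly_saving(plan: dict) -> float:
    """How much the yearly plan saves over twelve monthly payments."""
    return round(plan["monthly"] * 12 - plan["yearly"], 2)


if __name__ == "__main__":
    print(months_to_break_even(VOICEINPUT["lifetime"], VOICEINPUT["monthly"]))  # 6
    print(yearly_saving(VOICEINPUT))    # 29.0
    print(yearly_saving(SUPERWHISPER))  # 17.88
```

On these numbers, the VoiceInput lifetime price is matched by the sixth month of monthly Cloud billing.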

Can I migrate my Superwhisper history into VoiceInput?

Not yet. Superwhisper doesn't keep a queryable history, so there is nothing structured to migrate. VoiceInput captures every dictation from the moment you start using it.

Try VoiceInput free

Free forever (100% local). No account, no API key, no setup. macOS 14+.

Download v0.79.0 · 23.1 MB