A macOS menu-bar voice input. Hold the hotkey, speak — text streams into your cursor in real time, and every sentence sediments into a searchable, private memory layer that lives only on your Mac.
Local engines and Bring-Your-Own-Key paths are free forever, no subscription. Pro removes all cloud quotas with zero configuration.
Three on-device ASR engines (SenseVoice / Paraformer / Apple). Nothing leaves your Mac.
Plug in any OpenAI-compatible API key (DeepSeek / Kimi / OpenAI / your local OpenAI-compatible server). Unlimited tidy, you control the cost.
Zero-config cloud ASR + cloud AI tidy. Removes the 60 min/mo + 50 tidy/day quota. For people who just want it to work.
Input on top, voice archive in the middle, AI memory underneath.
Hold, speak, release. Mixed-language, homophones, fillers handled silently. Under 1.4s.
Every line archives locally with source app, time, tags. Search, filter, export.
7 personas review your week. A weekly MBTI sketch. 3–5 quotes worth echoing.
7 built-in personas plus your own. Contrast itself is the memory.
Introspection up, decisions slower. Three returns to the "sunk cost" theme.
"Speed is iron law — no feature may add perceived latency."
Casing, punctuation, and unit spacing — handled locally in <5ms, no LLM call.
LLMs handle semantic judgment (homophones, fillers). The local engine handles format (brand casing, punctuation spacing, unit spacing). Two layers; both wins.
45+ brand names auto-corrected. Spacing after English commas/periods. Half-width parentheses kept for ASCII. Every rule toggleable.
Pin source app at record-start. Switch windows — still lands right, with clipboard fallback.
Stop talking, wait 1–2 seconds, polished text lands in one shot. You never see the raw.
Local pinyin injected into prompt. Homophone pairs no longer confused.
Your edits on AI output extract as candidate rules. Accept from menu bar.
AI models, dev tools, Apple products built-in. "Cursor" stays "Cursor".
More text, more transparent. Breathing glow hints AI is working. Original never disappears.
Audio, text and history all live on your Mac. One single number leaves — seconds per recording.
Audio and text land in the app's own directory, auto-backed up on launch. Uninstall takes it all with you.
Each recording sends only its length to the global pulse. No identity, IP, content, or context. Toggle off in Settings.
API keys sit in macOS Keychain, never on our servers. ASR runs directly against Volcengine, nothing persisted.
We don't pile features — we only ship what we believe earns its place.
Cloud recognition / AI cleanup is faster: connection reuse makes text appear sooner after release in most cases. Translation / bilingual is more accurate: proper nouns (company / product names) are recognized better.
New AI Translation: speak and get the translation directly, or original + translation side by side, 50+ languages — switch "Tidy / Translate / Bilingual" from the menu bar. AI cleanup is more accurate: fixed the occasional case where it answered your question instead of just tidying what you said.
Cloud speech recognition starts up faster — pressing right Option begins recognition almost instantly. Fixed occasional response stalls.
AI cleanup speed massively improved — release-to-text feels generation-leap fast again. New-version updates are now harder to miss: after 24h the banner turns red, after 48h it auto-restarts to finish the update. Onboarding redesigned with a full-size keyboard and three-phase animation (press → speak → release & text appears) so first-time users know which key at first glance. Dashboard adds a one-click fix entry when Accessibility permission becomes stale.
AI cleanup is noticeably faster — release the key and polished text appears almost instantly, no configuration needed. An all-new onboarding flow helps first-time users get up and running right away. Long-sentence direct insert is more reliable and produces more complete results. Recording status is clearer and more polished, proper-noun recognition is sharper, and sign-in on certain networks is fixed.
AI tidy response is markedly faster — the release-and-see-text loop feels much snappier. Common AI company / tool names (Anthropic / OpenAI / DeepSeek / Cursor / TypeLess) now hit correct spelling far more reliably. Chinese idioms that get misheard (e.g. 头痛医头脚痛医脚 / 实事求是) are auto-restored. Chat & email contexts preserve your spoken tone — no longer forced into stiff business prose.
Fixed local-insert occasional text duplication (when ASR re-identified a sentence mid-utterance, the cursor showed duplicated content). AI tidy more accurate: company names / tech terms / homophone corrections improved; output stays closer to your original phrasing. New learning loop: after you manually correct an AI-tidy output, the system remembers the proper-noun mapping and gets it right next time automatically.
Yes. ASR is Volcengine, LLM can be Doubao / DeepSeek / Kimi / OpenAI. Full control over account and bill. Typical: CNY 5–20/month.
On Chinese scenarios, much faster end-to-end (1.4s vs 3–10s). And it's not just input — everything you say becomes a searchable memory.
macOS 14.0+, Apple Silicon + Intel. 22.6 MB DMG, non-App-Store, Sparkle auto-update.
Microphone, Input Monitoring, Accessibility. Granted once via the onboarding page.
No. Prompt constrains LLM to three jobs: fix homophones, drop fillers, add punctuation. Confidence < 0.5 keeps the original. Double-tap right Option to bypass AI.
Yes. Markdown / JSON / CSV export. Copy the DB file to the same path on a new Mac.
Yes. Clear API config, all memory features stop. Local typography engine keeps running.
Honest, side-by-side comparisons against the tools you're probably also evaluating.
Faster on Chinese, plus a memory layer Superwhisper doesn't have. Read the side-by-side →
Different categories: Wispr rewrites tone, VoiceInput keeps memory. Compare →
No 60-second cap, AI cleanup, mixed-language handling, full archive. See full →
Download · grant three permissions · hold right Option. Thirty seconds to get it.
Download VoiceInput_v0.73.0.dmg · 22.6 MB