VoicePaste

A voice input tool for macOS & Windows — trigger with a hotkey, speak, auto-paste.

中文 | English

Demo

Features

Lightweight: ~20 MB installer; ~120 MB idle on macOS packaged builds. Online ASR peaks around ~150 MB; local models load on demand and vary by model/backend.
Data Security: All data stored locally; API keys are applied by the user, giving you full control.
Fully Customizable: Ready-to-use defaults with all parameters exposed for fine-tuning.
ASR Dual Engine (Online / Local): Online — ByteDance Doubao streaming ASR; Local — powered by sherpa-onnx, with CPU / CUDA / CoreML acceleration and model-dependent memory usage.
LLM Support: Built-in support for 8 LLM providers — DeepSeek, OpenAI, Anthropic, Gemini, OpenRouter, SiliconFlow, Ollama, and OpenAI-compatible APIs.
Streaming Output: For local models without native streaming, VAD-based segmentation with simulated streaming output delivers results in real time.
Multi-Scenario Text Polishing: Built-in templates for general cleanup, translation, email drafting, and more — customizable prompts with per-template hotkey bindings.
Hotwords: Multi-group hotword libraries to boost domain-specific term accuracy; automatically restores original formatting (capitalization, special characters, etc.), no manual corrections needed.
Cross-Platform: macOS (Apple Silicon / Intel) and Windows.
Customizable Hotkeys: Bind independent hotkeys for different scenarios (general, translation, formatting, etc.) with support for toggle (press to start, press again to stop) and hold (hold to speak, release to stop) modes.
Enhanced Experience: Audio feedback sounds, real-time waveform animation.
Apple Signed & Notarized: macOS builds are signed and notarized with an Apple Developer certificate — no Gatekeeper warnings on install (Windows builds are currently unsigned).

Quick Start

Download

Go to GitHub Releases and download the latest version for your platform.

Platform	Installer Filename
macOS (Apple Silicon)	`VoicePaste_{version}_aarch64.dmg`
macOS (Intel)	`VoicePaste_{version}_x64.dmg`
Windows (x64)	`VoicePaste_{version}_x64-setup.exe` / `.msi`

Configuration Guide

Type	Link
General Setup	EN / 中文
Doubao Streaming ASR	EN / 中文
Local Models	EN / 中文

Model Support & Capabilities

ASR Models

Type	Model	Guide	Size	Peak Memory (macOS Activity Monitor)	Languages	Streaming	Hotwords	Punctuation	ITN	Model ID
Online	Doubao Streaming ASR 2.0	EN / 中文	-	~150 MB	Chinese + English mix, dialects	✅️	✅️	✅️	✅️	-
Local	SenseVoice	EN / 中文	158 MB	~580 MB	ZH / EN / JP / KO / Cantonese	☑️ Simulated streaming	☑️ via LLM	☑️ via punctuation model	☑️ via LLM	sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09
Local	Zipformer (ZH + EN bilingual)	EN / 中文	150 MB	~465 MB	Chinese + English	✅️	✅️	☑️ via punctuation model	☑️ via LLM	sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20
Local	FunASR-Nano	EN / 中文	948 MB	~2.5 GB	Chinese + English, 7 dialects	☑️ Simulated streaming	✅️	✅️	✅️	sherpa-onnx-funasr-nano-int8-2025-12-30
Local	Qwen3-ASR-0.6B	EN / 中文	938 MB	Not tested	30 languages, Chinese dialects, lyrics, rap	☑️ Simulated streaming	✅️	✅️	✅️	sherpa-onnx-qwen3-asr-0.6B-int8-2026-03-25

Notes

✅️ Native model capability, ☑️ Achieved through software composition
Idle memory is ~120 MB on macOS packaged builds, measured from the Activity Monitor Memory column across the VoicePaste process group. Models are loaded on demand during recognition.
Local models without native streaming output use built-in VAD (Voice Activity Detection) for audio segmentation with simulated streaming; optional punctuation restoration model available.
Memory data was measured locally on Mac mini (Apple Silicon). Results may vary with system load, macOS memory compression, restart state, backend, and cache state. See the performance test report for full details.

LLM

Provider	Supported
OpenAI	✅️
DeepSeek	✅️
Anthropic	✅️
OpenRouter	✅️
SiliconFlow	✅️
Gemini	✅️
Ollama	✅️
OpenAI-Compatible	✅️

FAQ

Not working on macOS?

VoicePaste requires Microphone and Accessibility permissions to function.

Microphone Permission

Settings page → System Permissions → Click "Request Permission"
System Settings → Privacy & Security → Microphone, ensure VoicePaste is authorized
If previously denied, reset via Terminal and re-authorize:

tccutil reset Microphone com.yolanda.voicepaste

Accessibility Permission

System Settings → Privacy & Security → Accessibility, ensure VoicePaste is authorized
If reinstalled after deletion, re-add it manually

Docs

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 191 Commits
.claude/skills/github-release		.claude/skills/github-release
.github/workflows		.github/workflows
assets		assets
docs		docs
schemas		schemas
scripts		scripts
src-tauri		src-tauri
web		web
.env.example		.env.example
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CHANGELOG.zh.md		CHANGELOG.zh.md
CLAUDE.md		CLAUDE.md
GUIDANCE.md		GUIDANCE.md
GUIDANCE.zh.md		GUIDANCE.zh.md
LICENSE		LICENSE
README.md		README.md
README.zh.md		README.zh.md
biome.json		biome.json
config.yaml.example		config.yaml.example
hotwords.json		hotwords.json
package.json		package.json
pnpm-lock.yaml		pnpm-lock.yaml
pnpm-workspace.yaml		pnpm-workspace.yaml
prompts.json		prompts.json
registry.json		registry.json
tsconfig.json		tsconfig.json
vite.config.ts		vite.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

VoicePaste

Demo

Features

Quick Start

Download

Configuration Guide

Model Support & Capabilities

ASR Models

LLM

FAQ

Not working on macOS?

Docs

License

About

Uh oh!

Releases 12

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

VoicePaste

Demo

Features

Quick Start

Download

Configuration Guide

Model Support & Capabilities

ASR Models

LLM

FAQ

Not working on macOS?

Docs

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 12

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages