that-yolanda/voicepaste

VoicePaste

A voice input tool for macOS & Windows — trigger with a hotkey, speak, auto-paste.

中文 | English

Demo

VoicePaste Demo

Features

  • Lightweight: ~20 MB installer; ~120 MB idle memory in packaged macOS builds. Online ASR peaks at ~150 MB; local models load on demand, and their memory use varies by model and backend.
  • Data Security: All data is stored locally; you obtain and supply your own API keys, so you stay in full control.
  • Fully Customizable: Ready-to-use defaults with all parameters exposed for fine-tuning.
  • ASR Dual Engine (Online / Local): Online — ByteDance Doubao streaming ASR; Local — powered by sherpa-onnx, with CPU / CUDA / CoreML acceleration and model-dependent memory usage.
  • LLM Support: Built-in support for 8 LLM providers — DeepSeek, OpenAI, Anthropic, Gemini, OpenRouter, SiliconFlow, Ollama, and OpenAI-compatible APIs.
  • Streaming Output: For local models without native streaming, VAD-based segmentation with simulated streaming output delivers results in real time.
  • Multi-Scenario Text Polishing: Built-in templates for general cleanup, translation, email drafting, and more — customizable prompts with per-template hotkey bindings.
  • Hotwords: Multi-group hotword libraries boost accuracy on domain-specific terms and automatically restore each term's original formatting (capitalization, special characters, etc.), so no manual corrections are needed.
  • Cross-Platform: macOS (Apple Silicon / Intel) and Windows.
  • Customizable Hotkeys: Bind independent hotkeys for different scenarios (general, translation, formatting, etc.) with support for toggle (press to start, press again to stop) and hold (hold to speak, release to stop) modes.
  • Enhanced Experience: Audio feedback sounds, real-time waveform animation.
  • Apple Signed & Notarized: macOS builds are signed and notarized with an Apple Developer certificate — no Gatekeeper warnings on install (Windows builds are currently unsigned).
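
The Hotwords feature above restores a term's original formatting after recognition. The sketch below shows one simple way such a step could work, assuming the recognizer returns terms with the wrong capitalization. `restore_hotwords` is a hypothetical helper written for illustration, not VoicePaste's actual implementation.

```python
import re
from typing import Iterable


def restore_hotwords(text: str, hotwords: Iterable[str]) -> str:
    """Replace case-insensitive matches with the hotword's canonical spelling.

    Hypothetical helper: the matching rules are an assumption, not taken
    from the VoicePaste source.
    """
    # Longest terms first, so a longer hotword wins over a shorter one
    # that is a prefix of it.
    ordered = sorted({w for w in hotwords if w}, key=len, reverse=True)
    if not ordered:
        return text

    canonical = {w.lower(): w for w in ordered}
    # re.escape keeps special characters such as "+" or "-" literal.
    # The lookarounds stop a hotword from matching inside a longer word.
    pattern = re.compile(
        r"(?<!\w)(" + "|".join(re.escape(w) for w in ordered) + r")(?!\w)",
        re.IGNORECASE,
    )
    return pattern.sub(
        lambda m: canonical.get(m.group(1).lower(), m.group(1)), text
    )


if __name__ == "__main__":
    words = ["OpenAI", "DeepSeek", "sherpa-onnx"]
    print(restore_hotwords("we tested openai and deepseek with sherpa-onnx", words))
    # prints: we tested OpenAI and DeepSeek with sherpa-onnx
```

Matching whole terms only is a deliberately conservative choice: it avoids rewriting fragments of unrelated words, at the cost of missing terms the recognizer split or misspelled.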

Quick Start

Download

Go to GitHub Releases and download the latest version for your platform.

| Platform | Installer Filename |
|---|---|
| macOS (Apple Silicon) | VoicePaste_{version}_aarch64.dmg |
| macOS (Intel) | VoicePaste_{version}_x64.dmg |
| Windows (x64) | VoicePaste_{version}_x64-setup.exe / .msi |
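
As a rough illustration of the naming scheme in the table, the sketch below picks the installer filename for the current machine. `installer_filename` is a hypothetical helper and `"1.2.3"` is a placeholder version, not a real release number. Only the `-setup.exe` name is produced for Windows, because the table does not spell out the full `.msi` filename.

```python
import platform
from typing import Optional


def installer_filename(version: str,
                       system: Optional[str] = None,
                       machine: Optional[str] = None) -> str:
    """Map an OS and CPU architecture to the installer name from the table.

    Hypothetical helper for illustration. `system` and `machine` default to
    the values Python reports for the current host.
    """
    system = system or platform.system()
    machine = (machine or platform.machine()).lower()

    if system == "Darwin":
        if machine == "arm64":       # Apple Silicon
            return f"VoicePaste_{version}_aarch64.dmg"
        if machine == "x86_64":      # Intel
            return f"VoicePaste_{version}_x64.dmg"
    elif system == "Windows" and machine in ("amd64", "x86_64"):
        return f"VoicePaste_{version}_x64-setup.exe"

    raise ValueError(f"No VoicePaste installer listed for {system}/{machine}")


if __name__ == "__main__":
    # "1.2.3" is a placeholder; check GitHub Releases for the real version.
    print(installer_filename("1.2.3", system="Darwin", machine="arm64"))
```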

Configuration Guide

| Type | Link |
|---|---|
| General Setup | EN / 中文 |
| Doubao Streaming ASR | EN / 中文 |
| Local Models | EN / 中文 |

Model Support & Capabilities

ASR Models

| Type | Model | Guide | Size | Peak Memory (macOS Activity Monitor) | Languages | Streaming | Hotwords | Punctuation | ITN | Model ID |
|---|---|---|---|---|---|---|---|---|---|---|
| Online | Doubao Streaming ASR 2.0 | EN / 中文 | - | ~150 MB | Chinese + English mix, dialects | ✅️ | ✅️ | ✅️ | ✅️ | - |
| Local | SenseVoice | EN / 中文 | 158 MB | ~580 MB | ZH / EN / JP / KO / Cantonese | ☑️ Simulated streaming | ☑️ via LLM | ☑️ via punctuation model | ☑️ via LLM | sherpa-onnx-sense-voice-zh-en-ja-ko-yue-int8-2025-09-09 |
| Local | Zipformer (ZH + EN bilingual) | EN / 中文 | 150 MB | ~465 MB | Chinese + English | ✅️ | ✅️ | ☑️ via punctuation model | ☑️ via LLM | sherpa-onnx-streaming-zipformer-bilingual-zh-en-2023-02-20 |
| Local | FunASR-Nano | EN / 中文 | 948 MB | ~2.5 GB | Chinese + English, 7 dialects | ☑️ Simulated streaming | ✅️ | ✅️ | ✅️ | sherpa-onnx-funasr-nano-int8-2025-12-30 |
| Local | Qwen3-ASR-0.6B | EN / 中文 | 938 MB | Not tested | 30 languages, Chinese dialects, lyrics, rap | ☑️ Simulated streaming | ✅️ | ✅️ | ✅️ | sherpa-onnx-qwen3-asr-0.6B-int8-2026-03-25 |

Notes

  • ✅️ = native model capability; ☑️ = achieved through software composition (for example, via an LLM or a punctuation model).
  • Idle memory is ~120 MB in packaged macOS builds, measured from the Memory column in Activity Monitor and summed across the VoicePaste process group. Models are loaded on demand during recognition.
  • Local models without native streaming output use the built-in VAD (Voice Activity Detection) to segment audio and simulate streaming; an optional punctuation restoration model is available.
  • Memory figures were measured locally on a Mac mini (Apple Silicon). Results may vary with system load, macOS memory compression, restart state, backend, and cache state. See the performance test report for full details.
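
The simulated streaming described in the notes above can be pictured as: cut the audio at pauses, then run a non-streaming recognizer on each segment as soon as it closes. The sketch below is a minimal illustration under stated assumptions. It uses a plain energy threshold as a stand-in for a real VAD model, and `recognize` is a hypothetical callback standing in for the offline ASR call. None of the names or defaults come from VoicePaste.

```python
from typing import Callable, Iterable, Iterator, List

Frame = List[float]  # one short chunk of audio samples in the range [-1, 1]


def frame_energy(frame: Frame) -> float:
    """Mean squared amplitude of a frame (0.0 for an empty frame)."""
    return sum(s * s for s in frame) / len(frame) if frame else 0.0


def segment_by_vad(frames: Iterable[Frame],
                   threshold: float = 0.01,
                   min_silence_frames: int = 3) -> Iterator[List[Frame]]:
    """Yield a speech segment once enough consecutive silent frames follow it.

    The energy threshold is an assumed stand-in for a real VAD model.
    Silent frames are dropped rather than kept inside the segment.
    """
    segment: List[Frame] = []
    silence_run = 0
    for frame in frames:
        if frame_energy(frame) >= threshold:
            segment.append(frame)
            silence_run = 0
        elif segment:
            silence_run += 1
            if silence_run >= min_silence_frames:
                yield segment
                segment, silence_run = [], 0
    if segment:  # flush trailing speech when the audio ends
        yield segment


def simulated_stream(frames: Iterable[Frame],
                     recognize: Callable[[List[Frame]], str],
                     **vad_options) -> Iterator[str]:
    """Emit one partial result per closed segment, imitating streaming output."""
    for segment in segment_by_vad(frames, **vad_options):
        yield recognize(segment)


if __name__ == "__main__":
    speech, silence = [0.5] * 4, [0.0] * 4
    audio = [speech, speech, silence, silence, silence, speech, silence]
    # A fake recognizer that only reports how many frames it received.
    for partial in simulated_stream(audio, lambda seg: f"<{len(seg)} frames>"):
        print(partial)
```

The user sees text after each pause instead of after the whole recording. The trade-off is that `min_silence_frames` sets both the added latency and how readily a phrase is split in two.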

LLM

| Provider | Supported |
|---|---|
| OpenAI | ✅️ |
| DeepSeek | ✅️ |
| Anthropic | ✅️ |
| OpenRouter | ✅️ |
| SiliconFlow | ✅️ |
| Gemini | ✅️ |
| Ollama | ✅️ |
| OpenAI-Compatible | ✅️ |

FAQ

Not working on macOS?

VoicePaste requires Microphone and Accessibility permissions to function.

Microphone Permission

  1. In VoicePaste, open the Settings page → System Permissions and click "Request Permission"
  2. In macOS, open System Settings → Privacy & Security → Microphone and make sure VoicePaste is authorized
  3. If you previously denied access, reset the permission in Terminal and authorize again:
     tccutil reset Microphone com.yolanda.voicepaste

Accessibility Permission

  1. Open System Settings → Privacy & Security → Accessibility and make sure VoicePaste is authorized
  2. If you deleted and reinstalled VoicePaste, add it to the list again manually

Docs

License

MIT
