parakeet
npx skills add https://github.com/tdimino/claude-code-minoan --skill parakeet
Agent 安装分布
Skill 文档
Parakeet Dictation Skill
Local speech-to-text powered by NVIDIA Parakeet TDT 0.6B V3 (~600MB model, 100% offline).
Two Modes
1. Handy App (Primary â Push-to-Talk into Any Text Field)
Handy is a free, open-source Tauri app (Rust + React) providing push-to-talk dictation with Parakeet V3 built in. Inference via transcribe-rs (ONNX Runtime, int8 quantized).
brew install --cask handy
- Default hotkey: â¥Space (Option-Space) on macOS, Ctrl-Space on Windows/Linux
- Modes: Push-to-talk (hold) or toggle (press to start/stop)
- Select Parakeet V3 in Settings â Models (auto-downloads ~478MB)
- Grant microphone + accessibility permissions
- Includes VAD (Silero), model management UI
- Additional models: Whisper (Small/Medium/Turbo/Large), Moonshine, SenseVoice
- Models stored at
~/Library/Application Support/com.pais.handy/models/
2. CLI Scripts (Claude Code File Transcription & Terminal Dictation)
CLI scripts remain for headless/terminal use within Claude Code. These use NeMo/PyTorch.
Performance
| System | Speed | Engine |
|---|---|---|
| Handy (M4 Max) | ~30x realtime | transcribe-rs / ONNX int8 |
| Handy (Zen 3) | ~20x realtime | transcribe-rs / ONNX int8 |
| Handy (Skylake i5) | ~5x realtime | transcribe-rs / ONNX int8 |
| NeMo CLI (MPS) | Varies | NeMo / PyTorch |
- Accuracy: 6.05% WER (Word Error Rate)
- Languages: 25 European languages with automatic detection (no prompting)
- Privacy: 100% local processing, no cloud API
- License: CC BY 4.0 (model), MIT (Handy app)
Commands
Transcribe Audio File
/parakeet path/to/audio.wav
/parakeet ~/recordings/interview.mp3
/parakeet meeting.m4a
Supported formats: .wav, .mp3, .m4a, .flac, .ogg, .aac
Live Dictation (Terminal)
/parakeet
/parakeet dictate
Record from microphone until Enter is pressed, then transcribe.
Check Installation
/parakeet check
Verify Parakeet is properly installed and model can load.
Setup
Handy (Push-to-Talk UI)
brew install --cask handy
Launch from Applications, select Parakeet V3 model, configure hotkey.
CLI Scripts (Prerequisites)
- Parakeet Dictate repo at
~/Programming/parakeet-dictate/with Python venv - Install dependencies:
cd ~/Programming/parakeet-dictate uv venv && uv pip install -r requirements.txt - (Optional) Set custom path:
export PARAKEET_HOME=/path/to/parakeet-dictate
Implementation
When this skill is invoked:
-
For audio files: Run the transcription script
cd ~/.claude/skills/parakeet/scripts && \ ${PARAKEET_HOME:-~/Programming/parakeet-dictate}/.venv/bin/python transcribe.py "<filepath>" -
For live dictation: Run the dictation script
cd ~/.claude/skills/parakeet/scripts && \ ${PARAKEET_HOME:-~/Programming/parakeet-dictate}/.venv/bin/python dictate.py -
For checking setup: Run the check script
cd ~/.claude/skills/parakeet/scripts && \ ${PARAKEET_HOME:-~/Programming/parakeet-dictate}/.venv/bin/python check_setup.py
Model Caches
| System | Cache Location | Size | Engine |
|---|---|---|---|
| Handy | ~/Library/Application Support/com.pais.handy/models/ |
~478MB | transcribe-rs (ONNX int8) |
| NeMo CLI | ~/.cache/nemo/ |
~1.2GB | NeMo / PyTorch |
Model caches are separate. Handy’s Parakeet V3 int8 model structure:
parakeet-tdt-0.6b-v3-int8/
âââ encoder-model.int8.onnx
âââ decoder_joint-model.int8.onnx
âââ nemo128.onnx (audio preprocessor)
âââ vocab.txt
Troubleshooting
“No module named nemo”
Use the Parakeet virtual environment. Scripts automatically use the correct Python.
“MPS not available”
Apple Silicon Metal acceleration requires PyTorch 2.0+. Falls back to CPU automatically.
“Permission denied: microphone”
Grant microphone access in System Preferences â Privacy & Security â Microphone.
Model download slow
The Parakeet model downloads on first use (~478MB for Handy, ~1.2GB for NeMo). Subsequent runs use cache.
Configuration
| Variable | Default | Description |
|---|---|---|
PARAKEET_HOME |
~/Programming/parakeet-dictate |
Parakeet Dictate installation path |
Dependencies
Handy: brew install --cask handy (standalone, no other deps)
CLI scripts require:
- Parakeet Dictate repo at
$PARAKEET_HOME(default:~/Programming/parakeet-dictate) - Python virtual environment at
$PARAKEET_HOME/.venv - NeMo toolkit with ASR support (
nemo_toolkit[asr]>=2.0.0) - PyTorch 2.0+ (for MPS/CUDA acceleration)
- soundfile and sounddevice for audio handling