speechall-cli

📁 speechall/speechall-cli 📅 6 days ago
Install command
npx skills add https://github.com/speechall/speechall-cli --skill speechall-cli

Skill documentation

CLI for speech-to-text transcription via the Speechall API. Supports multiple providers (OpenAI, Deepgram, AssemblyAI, Google, Gemini, Groq, ElevenLabs, Cloudflare, and more).

Installation

Homebrew (macOS and Linux)

brew install Speechall/tap/speechall

Without Homebrew: See references/manual-install.md for manual download instructions.

Verify

speechall --version

Authentication

An API key is required. Provide it via environment variable (preferred) or flag:

export SPEECHALL_API_KEY="your-key-here"
# or
speechall --api-key "your-key-here" audio.wav
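When the key comes from the environment, a script can check for it before any audio is uploaded. The following is a minimal sketch, not part of speechall itself: require_api_key is a hypothetical helper name, and the only thing taken from the documentation above is the SPEECHALL_API_KEY variable.

```shell
# Hypothetical helper (not provided by speechall): fail early when the
# SPEECHALL_API_KEY environment variable is missing or empty.
require_api_key() {
  if [ -z "${SPEECHALL_API_KEY:-}" ]; then
    echo "SPEECHALL_API_KEY is not set" >&2
    return 1
  fi
}
```

A call such as require_api_key && speechall audio.wav then skips the transcription entirely when the key is absent.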

Commands

transcribe (default)

Transcribe an audio or video file. This is the default subcommand, so speechall audio.wav is equivalent to speechall transcribe audio.wav.

speechall <file> [options]

Options:

| Flag | Description | Default |
| --- | --- | --- |
| --model <provider.model> | STT model identifier | openai.gpt-4o-mini-transcribe |
| --language <code> | Language code (e.g. en, tr, de) | API default (auto-detect) |
| --output-format <format> | Output format (text, json, verbose_json, srt, vtt) | API default |
| --diarization | Enable speaker diarization | off |
| --speakers-expected <n> | Expected number of speakers (use with --diarization) | none |
| --no-punctuation | Disable automatic punctuation | none |
| --temperature <0.0-1.0> | Model temperature | none |
| --initial-prompt <text> | Text prompt to guide model style | none |
| --custom-vocabulary <term> | Terms to boost recognition (repeatable) | none |
| --ruleset-id <uuid> | Replacement ruleset UUID | none |
| --api-key <key> | API key (overrides SPEECHALL_API_KEY env var) | none |

Examples:

# Basic transcription
speechall interview.mp3

# Specific model and language
speechall call.wav --model deepgram.nova-2 --language en

# Speaker diarization with SRT output
speechall meeting.wav --diarization --speakers-expected 3 --output-format srt

# Custom vocabulary for domain-specific terms
speechall medical.wav --custom-vocabulary "myocardial" --custom-vocabulary "infarction"

# Transcribe a video file (macOS extracts audio automatically)
speechall presentation.mp4
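The single-file examples above extend naturally to a whole directory. This is a rough sketch under stated assumptions: transcribe_dir is a hypothetical function name, the .wav extension is only an example, and the code assumes speechall exits with a non-zero status when a transcription fails, which the documentation does not state explicitly.

```shell
# Hypothetical helper: transcribe every .wav file in a directory.
# Each transcript is written next to its source file as <name>.txt.
# Assumption: speechall returns a non-zero exit status on failure.
transcribe_dir() {
  dir="$1"
  for f in "$dir"/*.wav; do
    [ -e "$f" ] || continue        # no matches: the glob stays literal, skip it
    out="${f%.wav}.txt"
    if speechall "$f" --output-format text > "$out"; then
      echo "ok: $f" >&2
    else
      echo "failed: $f" >&2
      rm -f "$out"                 # drop the empty or partial transcript
    fi
  done
}
```

Calling transcribe_dir ./recordings processes the files one at a time. Progress messages go to stderr so that stdout stays free for other uses.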

models

List available speech-to-text models. Outputs JSON to stdout. Filters combine with AND logic.

speechall models [options]

Filter flags:

| Flag | Description |
| --- | --- |
| --provider <name> | Filter by provider (e.g. openai, deepgram) |
| --language <code> | Filter by supported language (tr matches tr, tr-TR, tr-CY) |
| --diarization | Only models supporting speaker diarization |
| --srt | Only models supporting SRT output |
| --vtt | Only models supporting VTT output |
| --punctuation | Only models supporting automatic punctuation |
| --streamable | Only models supporting real-time streaming |
| --vocabulary | Only models supporting custom vocabulary |

Examples:

# List all available models
speechall models

# Models from a specific provider
speechall models --provider deepgram

# Models that support Turkish and diarization
speechall models --language tr --diarization

# Pipe to jq for specific fields
speechall models --provider openai | jq '.[].identifier'

Tips

  • On macOS, video files (.mp4, .mov, etc.) are automatically converted to audio before upload.
  • On Linux, pass audio files directly (.wav, .mp3, .m4a, .flac, etc.).
  • Output goes to stdout. Redirect to save: speechall audio.wav > transcript.txt
  • Errors go to stderr, so piping stdout is safe.
  • Run speechall --help, speechall transcribe --help, or speechall models --help to see all valid enum values for model identifiers, language codes, and output formats.
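Because the transcript and the errors travel on separate streams, a script can store each in its own file. The sketch below is one possible way to do that. The function name transcribe_to and its argument order are hypothetical, and it assumes that speechall signals failure with a non-zero exit status.

```shell
# Hypothetical helper: save the transcript and the error log separately.
# Usage: transcribe_to <input-file> <transcript-file> <log-file>
# Assumption: speechall returns a non-zero exit status on failure.
transcribe_to() {
  if speechall "$1" > "$2" 2> "$3"; then
    echo "transcript saved to $2"
  else
    status=$?                      # exit status of the failed speechall call
    echo "speechall exited with status $status, see $3" >&2
    return "$status"
  fi
}
```

For example, transcribe_to audio.wav transcript.txt errors.log leaves an empty errors.log when nothing was written to stderr.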