transcription expert

📁 willsigmon/sigstack 📅 Jan 1, 1970
4
总安装量
0
周安装量
#48623
全站排名
安装命令
npx skills add https://github.com/willsigmon/sigstack --skill Transcription Expert

Skill 文档

Transcription Expert

Choose the right transcription service for your use case.

Pricing Comparison (2026)

Service Price/min Speed Diarization Real-time
Whisper API $0.006 Slow No (+extra) No
Deepgram $0.0043 20s/hr Yes Yes
AssemblyAI $0.0025 Fast +$0.02/hr Yes

When to Use Each

Whisper

  • One-time batch processing
  • Self-hosting option (free)
  • Privacy-sensitive (local)
  • Best: Podcasts, offline processing

Deepgram

  • Real-time applications
  • Live captioning
  • Speaker identification built-in
  • Best: Meetings, call centers, voice apps

AssemblyAI

  • Cheapest per-minute
  • AI features (sentiment, topics)
  • PII redaction
  • Best: Content analysis, compliance

Quick Implementations

Whisper (OpenAI)

from openai import OpenAI
client = OpenAI()

with open("audio.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(
        model="whisper-1", file=f
    )
print(transcript.text)

Deepgram

from deepgram import DeepgramClient, PrerecordedOptions

dg = DeepgramClient(api_key="...")
options = PrerecordedOptions(model="nova-3", diarize=True)

response = dg.listen.rest.v1.transcribe_file(
    {"buffer": open("audio.mp3", "rb")}, options
)

AssemblyAI

import assemblyai as aai

aai.settings.api_key = "..."
transcriber = aai.Transcriber()

transcript = transcriber.transcribe("audio.mp3")
print(transcript.text)

Speaker Diarization

Deepgram (Built-in)

options = PrerecordedOptions(diarize=True)
# Response includes speaker labels automatically

AssemblyAI

config = aai.TranscriptionConfig(speaker_labels=True)
# +$0.02/hr additional

Whisper (Requires Extra)

# Need separate diarization service like pyannote
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization")

Batch Processing

import asyncio

async def transcribe_batch(files):
    tasks = [transcribe(f) for f in files]
    return await asyncio.gather(*tasks)

Output Formats

  • Plain text
  • SRT/VTT subtitles
  • JSON with timestamps
  • Word-level timing

Use when: Podcast transcription, meeting notes, video subtitles, voice content indexing