speech-build

📁 cnemri/google-genai-skills 📅 12 days ago
Install command
npx skills add https://github.com/cnemri/google-genai-skills --skill speech-build

Skill documentation

Speech Skill (TTS & STT)

Use this skill to implement speech generation (TTS) and transcription (STT) workflows with the google-genai and google-cloud-speech SDKs.

Quick Start Setup

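from google import genai
from google.genai import types
# For STT: from google.cloud import speech_v2

# With no arguments, the client reads credentials from the environment
# (for example the GEMINI_API_KEY or GOOGLE_API_KEY variable).
client = genai.Client()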

Reference Materials

Common Workflows

1. Generate Speech (Gemini-TTS)

response = client.models.generate_content(
    model="gemini-2.5-flash-preview-tts",
    contents="Hello, world!",  # the text to be spoken
    config=types.GenerateContentConfig(
        response_modalities=["AUDIO"],  # request audio output instead of text
        speech_config=types.SpeechConfig(
            voice_config=types.VoiceConfig(
                prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
            )
        )
    )
)
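
The call above returns the audio inside the response rather than writing a file. A minimal sketch of saving it follows, assuming the audio arrives as raw 16-bit PCM, mono, at 24 kHz in the first part's inline_data, which is how the Gemini speech-generation docs describe it at the time of writing. The helper name save_pcm_as_wav and the output file name are hypothetical.

```python
import wave


def save_pcm_as_wav(path, pcm, channels=1, rate=24000, sample_width=2):
    """Wrap raw PCM bytes in a WAV container so ordinary players can open it.

    The defaults (mono, 24 kHz, 2 bytes per sample) are assumptions about
    the Gemini-TTS output format; adjust them if the model returns
    something else.
    """
    with wave.open(path, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(pcm)


# Usage with the response from the workflow above (not executed here):
# pcm = response.candidates[0].content.parts[0].inline_data.data
# save_pcm_as_wav("hello.wav", pcm)
```

Only the standard-library wave module is used, so no extra dependency is needed to turn the response into a playable file.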

2. Transcribe Audio (Chirp 3)

# Requires google-cloud-speech
from google.cloud import speech_v2
# ... (See stt.md for full setup)
response = speech_client.recognize(...)
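
For orientation, here is one way the elided setup might look with the Speech-to-Text V2 client. This is a sketch under assumptions, not the contents of stt.md: the function names, the "us" region, the "en-US" language code, and the use of the default recognizer "_" are all illustrative choices.

```python
def recognizer_path(project_id: str, location: str = "us") -> str:
    """Resource path of the implicit default recognizer ("_") in a region."""
    return f"projects/{project_id}/locations/{location}/recognizers/_"


def transcribe_file(project_id: str, audio_path: str, location: str = "us") -> str:
    """Transcribe a short local audio file with the chirp_3 model (hypothetical helper)."""
    # Imported lazily so recognizer_path() works without google-cloud-speech installed.
    from google.api_core.client_options import ClientOptions
    from google.cloud import speech_v2
    from google.cloud.speech_v2.types import cloud_speech

    # Regional models are served from a regional endpoint (assumed: "us").
    speech_client = speech_v2.SpeechClient(
        client_options=ClientOptions(api_endpoint=f"{location}-speech.googleapis.com")
    )

    with open(audio_path, "rb") as f:
        audio_bytes = f.read()

    config = cloud_speech.RecognitionConfig(
        auto_decoding_config=cloud_speech.AutoDetectDecodingConfig(),
        language_codes=["en-US"],  # assumed language
        model="chirp_3",
    )
    request = cloud_speech.RecognizeRequest(
        recognizer=recognizer_path(project_id, location),
        config=config,
        content=audio_bytes,
    )
    response = speech_client.recognize(request=request)

    # Join the top alternative of each result into a single transcript string.
    return " ".join(
        r.alternatives[0].transcript for r in response.results if r.alternatives
    )
```

Calling transcribe_file requires Google Cloud credentials and a project with the Speech-to-Text API enabled; recognize is meant for short clips, and longer audio would need a batch or streaming call instead.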