yt-transcript

📁 mrecek/ai-skills 📅 3 days ago
Install command
npx skills add https://github.com/mrecek/ai-skills --skill yt-transcript

Skill documentation

You are using the YouTube Transcript (yt-dlp) Formatting skill. This skill transforms YouTube transcripts into structured markdown documents using yt-dlp, with support for author-created chapters.

When to Use This Skill

Invoke this skill when the user requests:

  • /yt-transcript <URL>
  • /yt-transcript <URL> output to <directory>
  • /yt-transcript <URL> --no-chapters
  • Any request to format a YouTube transcript with yt-dlp

Core Principle

Use the video author’s own chapter markers as structure when available. This ensures:

  • Structure reflects the author’s intended organization
  • Chapters are part of the original content
  • Navigation is meaningful and authentic

When the video has no chapters, fall back to a verbatim transcript.

Your Workflow

Step 1: Extract Metadata FIRST

Before doing anything else, call yt-dlp to extract video metadata:

# Dump the video metadata as JSON (no download) and print the fields used for the filename.
yt-dlp --dump-json --skip-download --quiet "URL" | python3 -c "
import json, sys
data = json.load(sys.stdin)
# upload_date is expected as YYYYMMDD; treat a missing or null value as empty.
upload_date = data.get('upload_date') or ''
formatted_date = f\"{upload_date[0:4]}-{upload_date[4:6]}-{upload_date[6:8]}\" if len(upload_date) == 8 else upload_date
print(f\"TITLE={data.get('title', 'N/A')}\")
print(f\"CHANNEL={data.get('uploader', 'N/A')}\")
print(f\"DATE={formatted_date}\")
"

This gives you the metadata needed for intelligent filename generation.

Step 2: Generate Intelligent Filename

Choose the identifier from the channel and speaker context, using the selection logic below.

Format: YYYY-MM-DD_identifier_topic-keywords.md

Identifier Selection Logic:

Use first-and-last name (e.g., john-smith) when:

  • Channel is clearly associated with a single individual
  • The person’s name is the primary brand
  • Video features the channel owner as the main speaker
  • Example: “Simon Sinek” → simon-sinek

Use channel brand name (e.g., techcrunch or ycombinator) when:

  • Channel brand is more recognizable than individual name
  • Multiple people contribute to the channel
  • Corporate or organizational channel
  • Brand name is well-known in its domain
  • Example: “Harvard Business Review” → harvard-business-review or hbr

Use guest/subject name (first-and-last) when:

  • Video is clearly an interview or guest appearance
  • Title features a guest who is more notable than the host
  • Content focuses on a specific person’s perspective
  • Example: “Interview with Satya Nadella” → satya-nadella

Formatting Rules:

  • Lowercase only
  • Hyphenate multi-word names/brands
  • Remove special characters, apostrophes, spaces
  • Keep it concise (prefer shorter recognizable forms)
  • Extract 2-4 topic keywords from video title
  • Max 70 characters total

Examples:

| Channel/Title | Identifier | Full Filename |
|---|---|---|
| “AI News & Strategy Daily \| Nate B Jones”, “Why AI-Native Companies…” | nate-jones | 2025-12-14_nate-jones_ai-native-companies.md |
| “TechCrunch”, “Startup Funding Trends” | techcrunch | 2024-01-15_techcrunch_startup-funding-trends.md |
| “The Podcast”, “Interview with Elon Musk” | elon-musk | 2024-01-15_elon-musk_interview.md |
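
The formatting rules above can be sketched in Python. This is a minimal illustration, not the skill's own code: `slugify` and `build_filename` are hypothetical helper names, the identifier and keywords are assumed to have been chosen already using the selection logic, and trimming trailing keywords is only one possible way to stay within the 70-character limit.

```python
import re

MAX_FILENAME_LENGTH = 70  # "Max 70 characters total" from the formatting rules


def slugify(text: str) -> str:
    """Lowercase the text, drop apostrophes, and hyphenate everything else."""
    text = text.lower().replace("'", "").replace("\u2019", "")
    # Any run of characters outside a-z and 0-9 becomes a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", text).strip("-")


def build_filename(date: str, identifier: str, keywords: list[str]) -> str:
    """Assemble YYYY-MM-DD_identifier_topic-keywords.md within the length cap."""
    words = [w for w in (slugify(k) for k in keywords) if w]
    while True:
        parts = [date, slugify(identifier)]
        if words:
            parts.append("-".join(words))
        name = "_".join(parts) + ".md"
        if len(name) <= MAX_FILENAME_LENGTH or len(words) <= 1:
            return name
        words.pop()  # assumption: trim trailing keywords until the name fits


print(build_filename("2025-12-14", "Nate Jones", ["AI-Native", "Companies"]))
```

Deciding whether the identifier should be a person, a brand, or a guest remains a judgment call made from the metadata; the sketch only covers the mechanical formatting.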

Step 3: Determine Output Directory

Parse user’s arguments for directory preference:

  • If user says “output to <directory>”, use that directory
  • If user provides a relative path, resolve it to absolute path

If not specified, suggest based on context:

  • If current directory is youtube-transcripts/ → use that
  • If in ideation/ or similar → suggest ideation/youtube-transcripts/
  • Otherwise → suggest creating youtube-transcripts/ in current directory

Always confirm with user before proceeding. Show the full output path.
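
The directory rules above might look like this in Python. It is a sketch under stated assumptions: `suggest_output_dir` is a hypothetical helper, the current directory is passed in as an absolute path, and the result is only a proposal that still needs user confirmation.

```python
import os
from pathlib import Path
from typing import Optional

DEFAULT_DIR_NAME = "youtube-transcripts"


def suggest_output_dir(cwd: Path, requested: Optional[str] = None) -> Path:
    """Return the absolute directory to propose to the user for confirmation."""
    if requested:
        # A relative request is resolved against cwd; an absolute one replaces it.
        return Path(os.path.normpath(cwd / requested))
    if cwd.name == DEFAULT_DIR_NAME:
        return cwd  # already inside youtube-transcripts/
    # Covers both "ideation/ or similar" and the generic fallback:
    # propose a youtube-transcripts/ folder under the current directory.
    return cwd / DEFAULT_DIR_NAME


print(suggest_output_dir(Path("/home/user/notes")))
```

The sketch normalizes paths lexically and does not expand `~` or follow symlinks, so a real implementation may need extra handling for those cases.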

Step 4: Detect --no-chapters Flag

Check if user wants to skip chapter organization:

  • Look for --no-chapters flag in user’s command
  • OR natural language: “without chapters”, “skip chapters”, “no organization”, “flat”, “verbatim”, “plain”

If detected, you’ll pass --no-chapters to the script.
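
As a rough illustration of this check, the sketch below looks for the literal flag or one of the listed phrases. `wants_no_chapters` is a hypothetical name, and whole-word matching is an assumption chosen so that words like "explain" do not trigger the "plain" keyword.

```python
import re

# Natural-language cues listed in Step 4.
NO_CHAPTER_PHRASES = (
    "without chapters",
    "skip chapters",
    "no organization",
    "flat",
    "verbatim",
    "plain",
)


def wants_no_chapters(request: str) -> bool:
    """Return True when the request asks for a transcript without chapters."""
    # Remove URLs first so text inside a link cannot produce a false match.
    text = re.sub(r"https?://\S+", " ", request.lower())
    if "--no-chapters" in text:
        return True
    return any(
        re.search(rf"\b{re.escape(phrase)}\b", text)
        for phrase in NO_CHAPTER_PHRASES
    )


print(wants_no_chapters("/yt-transcript https://www.youtube.com/watch?v=VIDEO_ID --no-chapters"))
```

Keyword matching is only a heuristic. Short words such as "flat" or "plain" can appear in a directory name or unrelated phrase, so the confirmation step in Step 5 is the place to catch a wrong guess.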

Step 5: Show Preview and Confirm

Display to user:

  • Full output path: /full/path/to/YYYY-MM-DD_identifier_topic.md
  • URL being processed
  • Whether chapters will be included

Example:

I'll create the formatted transcript at:
  /home/user/git/notebooks/ideation/youtube-transcripts/2025-12-14_nate-jones_ai-native-companies.md

From: https://www.youtube.com/watch?v=4Bg0Q1enwS4
Chapters: Will be included (video has chapters)

Proceed?

Wait for user confirmation.

Step 6: Invoke the Script

The script is located at: .claude/skills/yt-transcript/script

Command format:

/full/path/to/.claude/skills/yt-transcript/script "URL" "/full/output/path.md" [--no-chapters]

Important:

  • Always use absolute paths for the script and output file
  • Quote the URL and output path
  • Add --no-chapters flag if detected in Step 4

Example invocation:

/full/path/to/.claude/skills/yt-transcript/script \
  "https://www.youtube.com/watch?v=4Bg0Q1enwS4" \
  "/path/to/youtube-transcripts/2025-12-14_nate-jones_ai-native-companies.md"

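If the invocation is driven from Python rather than typed into a shell, the same rules can be sketched as follows. `build_command` and `run_script` are hypothetical helpers; passing the arguments as a list is a design choice that sidesteps shell quoting, so the URL and output path need no manual quotes.

```python
import subprocess
from pathlib import Path


def build_command(script: Path, url: str, output: Path, no_chapters: bool = False) -> list[str]:
    """Build the argument list: script, URL, output path, optional flag."""
    if not script.is_absolute() or not output.is_absolute():
        raise ValueError("script and output paths must be absolute")
    cmd = [str(script), url, str(output)]
    if no_chapters:
        cmd.append("--no-chapters")  # only when detected in Step 4
    return cmd


def run_script(cmd: list[str]) -> subprocess.CompletedProcess:
    """Run the script and capture stdout/stderr as text for Step 7."""
    return subprocess.run(cmd, capture_output=True, text=True)


print(build_command(
    Path("/full/path/to/.claude/skills/yt-transcript/script"),
    "https://www.youtube.com/watch?v=4Bg0Q1enwS4",
    Path("/path/to/youtube-transcripts/2025-12-14_nate-jones_ai-native-companies.md"),
))
```
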
Step 7: Display Results

If successful (exit code 0): Show success message with file path:

Success! Formatted transcript saved to:
  /full/path/to/file.md

The transcript includes:
- Video metadata (title, channel, date, duration)
- AI-generated summary
- Topic keywords
- [Chapters organized by author timestamps] OR [Verbatim transcript]

If failed (non-zero exit code): Display the error message from stderr and offer to help troubleshoot.
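
A minimal sketch of this branching, assuming the script was run with Python's `subprocess.run` and its stderr captured as text. `report_result` is a hypothetical helper and the failure wording is illustrative only.

```python
import subprocess


def report_result(result: subprocess.CompletedProcess, output_path: str) -> str:
    """Turn the script's exit status into the message shown to the user."""
    if result.returncode == 0:
        return f"Success! Formatted transcript saved to:\n  {output_path}"
    # Relay the script's own error text rather than guessing at the cause.
    error = (result.stderr or "").strip() or "no error output was captured"
    return (
        f"The script failed (exit code {result.returncode}):\n"
        f"  {error}\n"
        "Would you like help troubleshooting?"
    )


ok = subprocess.CompletedProcess(args=[], returncode=0, stdout="", stderr="")
print(report_result(ok, "/full/path/to/file.md"))
```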

Error Handling

The script handles common errors and provides clear messages:

  • yt-dlp not installed → Install instructions
  • No captions available → Explanation and verification steps
  • Network errors → Retry suggestions
  • Invalid URL → Verification prompts

Your job is to relay these messages to the user and offer assistance.

Important Notes

  • Always extract metadata FIRST (Step 1) before generating filename
  • Always confirm with user before invoking the script
  • Use absolute paths when calling the script
  • The script does all the heavy lifting – you just orchestrate and communicate
  • Don’t try to parse VTT or format markdown yourself – let the script handle it

Example Interactions

Example 1: Basic Usage

User: /yt-transcript https://www.youtube.com/watch?v=4Bg0Q1enwS4 output to ideation/youtube-transcripts

Your workflow:

  1. Extract metadata → Get title, channel, date
  2. Generate filename → 2025-12-14_nate-jones_ai-native-companies.md
  3. Determine directory → User specified ideation/youtube-transcripts
  4. Check for --no-chapters → Not detected
  5. Show preview and confirm with user
  6. Invoke script with full paths
  7. Display success message

Example 2: Skip Chapters

User: /yt-transcript https://www.youtube.com/watch?v=VIDEO_ID --no-chapters

Your workflow:

  1. Extract metadata
  2. Generate filename
  3. Determine directory → Suggest based on context
  4. Detect --no-chapters → YES, pass to script
  5. Show preview (mention chapters will be skipped)
  6. Invoke script with --no-chapters flag
  7. Display results

Example 3: No Directory Specified

User: /yt-transcript https://www.youtube.com/watch?v=VIDEO_ID

Your workflow:

  1. Extract metadata
  2. Generate filename
  3. Determine directory → Current dir is /home/user/notes, suggest creating youtube-transcripts/
  4. Ask user: “I suggest saving to /home/user/notes/youtube-transcripts/. Is that okay?”
  5. User confirms
  6. Invoke script
  7. Display results

Script Capabilities

The Python script handles:

  • ✅ Metadata extraction from yt-dlp
  • ✅ Transcript download (VTT format)
  • ✅ VTT parsing with regex
  • ✅ Chapter timestamp matching
  • ✅ Summary generation
  • ✅ Topic keyword extraction
  • ✅ Markdown formatting
  • ✅ File creation with proper directory handling
  • ✅ Comprehensive error handling
  • ✅ Clean temp file management

You don’t need to do any of this – just invoke the script correctly.