youtube-video-analyzer
npx skills add https://github.com/chenxplorer/youtube-video-analyzer-skill --skill youtube-video-analyzer
# YouTube Video Analyzer
A YouTube video analysis assistant built on scene detection, subtitle alignment, and parallel segment analysis.
## Prerequisites
Before starting, ensure these tools are installed:
```bash
# Check installations
which yt-dlp   # Video/subtitle download
which ffmpeg   # Scene detection and frame extraction
# Install if missing (macOS)
brew install yt-dlp ffmpeg
# Or install yt-dlp via pip (ffmpeg must still be installed separately)
pip install yt-dlp
```
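The checks above can be wrapped in a small helper that fails fast when a tool is missing. This is a minimal sketch; the function name `require_tools` is hypothetical and is not part of the skill's scripts.

```bash
# Hypothetical helper: report every missing command on stderr and
# return non-zero if at least one is absent from PATH.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example: stop early unless both prerequisites are available.
# require_tools yt-dlp ffmpeg || exit 1
```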
## Complete Workflow
### Phase 1: Setup and Download
```bash
# Create working directory
VIDEO_ID="[extract from URL]"
WORK_DIR="youtube_analysis_$VIDEO_ID"
mkdir -p "$WORK_DIR"/{video,subtitles,frames,output}
# Download video + subtitles + metadata in one call (fewer requests)
yt-dlp -f "worst[ext=mp4]/best[ext=mp4]" \
  --write-info-json \
  --write-auto-sub --write-sub \
  --sub-lang zh-Hans,zh,en \
  --convert-subs srt \
  --no-playlist \
  -o "$WORK_DIR/video/source.%(ext)s" \
  "YOUTUBE_URL"
# Move subtitles to subtitles/ and keep a copy of the metadata as metadata.json
mv "$WORK_DIR/video/"*.srt "$WORK_DIR/subtitles/" 2>/dev/null || true
cp "$WORK_DIR/video/source.info.json" "$WORK_DIR/metadata.json" 2>/dev/null || true
```
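The `VIDEO_ID` placeholder in the setup commands has to be filled in from the URL. One possible way to do that is sketched below, assuming the common `watch?v=`, `youtu.be/`, `/shorts/`, and `/embed/` URL shapes and an 11-character ID. The function name `extract_video_id` is hypothetical.

```bash
# Hypothetical helper: print the 11-character video ID found in a YouTube URL.
# Returns non-zero when no ID-like token is found.
extract_video_id() {
  id=$(printf '%s\n' "$1" |
    sed -n -E 's#.*(v=|youtu\.be/|/shorts/|/embed/)([A-Za-z0-9_-]{11}).*#\2#p')
  [ -n "$id" ] && printf '%s\n' "$id"
}

# Example:
# VIDEO_ID=$(extract_video_id "YOUTUBE_URL") || exit 1
```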
### Phase 2: Scene Detection and Frame Extraction
```bash
# Extract scene-change frames + timestamps in a single decode
# (newer ffmpeg releases prefer -fps_mode vfr over -vsync vfr)
ffmpeg -i "$WORK_DIR/video/source.mp4" \
  -vf "select='gt(scene,0.3)',showinfo" \
  -vsync vfr \
  "$WORK_DIR/frames/scene_%04d.jpg" \
  2> "$WORK_DIR/ffmpeg_scene.log"
# Parse timestamps from the showinfo log (no second decode)
grep "pts_time" "$WORK_DIR/ffmpeg_scene.log" | \
  sed 's/.*pts_time:\([0-9.]*\).*/\1/' > "$WORK_DIR/frame_timestamps.txt"
```
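The timestamp file has one line per extracted frame, so line N should correspond to `scene_000N.jpg` (this assumes ffmpeg's default image numbering, which starts at 1). A minimal sketch that turns the timestamps into a frame index follows; the function name `build_frame_index` and the tab-separated output layout are assumptions.

```bash
# Hypothetical helper: pair each timestamp (seconds, one per line) with the
# frame file written in the same order, plus an HH:MM:SS rendering.
# Output columns: frame name, seconds, HH:MM:SS (tab-separated).
build_frame_index() {
  awk '{
    s = int($1)
    printf "scene_%04d.jpg\t%s\t%02d:%02d:%02d\n", NR, $1, int(s / 3600), int((s % 3600) / 60), s % 60
  }' "$1"
}

# Example:
# build_frame_index "$WORK_DIR/frame_timestamps.txt" > "$WORK_DIR/frame_index.tsv"
```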
Scene threshold guidelines:
| Video Type | Threshold | Description |
|---|---|---|
| Lectures/PPT | 0.2-0.3 | Fewer changes, capture slides |
| Technical tutorials | 0.25-0.35 | Code/UI changes |
| Vlogs/interviews | 0.3-0.4 | Moderate changes |
| Fast-paced/edited | 0.4-0.5 | Avoid too many frames |
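The table can be encoded as a lookup so the threshold is chosen once and reused in the ffmpeg command. This is only a sketch: the category keys and the choice of each range's midpoint are assumptions, and `scene_threshold_for` is a hypothetical name.

```bash
# Hypothetical helper: map a video type to a scene threshold taken from the
# middle of the ranges in the table above.
scene_threshold_for() {
  case "$1" in
    lecture)  echo 0.25 ;;  # Lectures/PPT: 0.2-0.3
    tutorial) echo 0.30 ;;  # Technical tutorials: 0.25-0.35
    vlog)     echo 0.35 ;;  # Vlogs/interviews: 0.3-0.4
    fast)     echo 0.45 ;;  # Fast-paced/edited: 0.4-0.5
    *)        echo 0.30 ;;  # Fallback: the value used in the Phase 2 command
  esac
}

# Example:
# THRESHOLD=$(scene_threshold_for lecture)
# ffmpeg ... -vf "select='gt(scene,$THRESHOLD)',showinfo" ...
```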
### Phase 3: Subtitle Parsing and Alignment
Parse the SRT subtitle file and align it with the extracted frames:
- Read the subtitle file from `$WORK_DIR/subtitles/`
- Parse the timestamp format: `00:01:23,456 --> 00:01:25,789`
- Match each frame timestamp to its corresponding subtitle segment
- Create frame-subtitle pairs for analysis
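The steps above could be sketched with `awk` as follows. This is an illustration under assumptions, not the skill's own implementation: the function name `align_frames_to_srt` is hypothetical, the SRT file is assumed to be non-empty, a frame is paired with the first cue whose interval contains it, and frames that fall between cues get an empty text column. Auto-generated subtitles may contain overlapping cues, in which case only the first match is reported.

```bash
# Hypothetical helper.
# $1: SRT file, $2: frame timestamps in seconds (one per line).
# Output: "<timestamp>\t<subtitle text>" for every frame timestamp.
align_frames_to_srt() {
  awk '
    # Convert "HH:MM:SS,mmm" to seconds.
    function to_seconds(ts,   p) {
      sub(/,/, ".", ts)
      split(ts, p, ":")
      return p[1] * 3600 + p[2] * 60 + p[3]
    }
    # First file (SRT): collect the start, end, and text of each cue.
    NR == FNR {
      sub(/\r$/, "")                      # tolerate CRLF line endings
      if ($0 ~ /-->/) {
        n++
        cue_start[n] = to_seconds($1)
        cue_end[n] = to_seconds($3)
        in_text = 1
      } else if ($0 == "") {
        in_text = 0                       # a blank line ends the cue
      } else if (in_text) {
        cue_text[n] = (cue_text[n] == "" ? $0 : cue_text[n] " " $0)
      }
      next
    }
    # Second file (timestamps): find the first cue containing the frame.
    {
      t = $1 + 0
      found = ""
      for (i = 1; i <= n; i++)
        if (t >= cue_start[i] && t <= cue_end[i]) { found = cue_text[i]; break }
      printf "%s\t%s\n", $1, found
    }
  ' "$1" "$2"
}
```

For long videos a linear scan per frame is usually acceptable, since both the cue list and the frame list are small compared with the cost of the analysis itself.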
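### Phase 4: Parallel Segment Analysis
Divide the frames into segments of 10-15 frames each and analyze each segment with this prompt:
```text
Analyze the following video segment:
Time range: {start_time} - {end_time}
Frame images: [Read the frame images]
Subtitle content:
{subtitle_text}
Please analyze:
1. The visual content of each frame (charts, code, flowcharts, UI, etc.)
2. The key points being explained, using the subtitles for context
3. Key concepts and terminology
4. Important visual elements worth flagging
5. An explanation or short summary of the key details
6. If there are steps or code, the reproducible actions
Output format: structured notes, annotated with timestamps
```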
Parallel execution tips:
- Cap concurrency (e.g., 3-5 segments at once) to avoid rate limits
- Retry failed segments and merge results incrementally
- Consider de-duplicating similar frames or combining them into contact sheets to reduce token use
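Segmenting and the concurrency cap can be sketched with standard tools. The example below assumes a frame list with one entry per line and a per-segment command that accepts a segment file as its only argument; both function names are hypothetical, and `xargs -P` is a widely available extension rather than a POSIX feature. Retrying failed segments is not covered here.

```bash
# Hypothetical helper: split a frame list into files of at most $2 lines.
# $1: frame list, $2: frames per segment, $3: output directory.
make_segments() {
  mkdir -p "$3"
  split -l "$2" "$1" "$3/segment_"
}

# Hypothetical helper: run a command once per segment file, with at most
# $2 invocations in flight at any time.
# $1: segment directory, $2: max parallel jobs, $3: executable to run.
run_segments() {
  find "$1" -type f -name 'segment_*' | sort |
    xargs -P "$2" -I {} "$3" {}
}

# Example (analyze_segment.sh is a placeholder for your own analysis step):
# make_segments "$WORK_DIR/frame_list.txt" 12 "$WORK_DIR/segments"
# run_segments "$WORK_DIR/segments" 4 ./analyze_segment.sh
```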
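### Phase 5: Final Summary Generation
Merge all segment analyses and generate the complete summary with this prompt:
```text
Integrate the following video analysis results into a complete learning summary:
{all_segment_analyses}
**Must include the following:**
1. Overview (bilingual, Chinese and English)
2. List of key takeaways
3. Scene timeline table
4. Key visual content (referencing frame images)
5. Detailed notes (organized by chapter)
6. Checklist of action items
**Level-of-detail requirements:**
- At least 3-5 points per chapter (each with an explanation, cause, or impact)
- A short definition or gloss for each key term
- A reproducible description of each key step
- Important conclusions should cite the corresponding frame image (scene_XXXX.jpg) wherever possible
**Must generate the following diagrams (Mermaid format):**
1. **Mind map** (required) - shows the knowledge structure
2. **Timeline** (required) - shows how the content is distributed
3. **Flowchart** (if there are steps or a process)
4. **Concept relationship diagram** (if concepts are related)
```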
### Phase 6: Final Deliverables (cleanup)
Keep only the final artifacts:
- Video file
- Chinese/English subtitles (SRT)
- Summary document
- Frames referenced by the summary
Run:
```bash
./scripts/finalize.sh "$WORK_DIR" /path/to/summary.md
```
Use `--keep-work` to preserve intermediate files for debugging.
When using this skill, always run `finalize.sh` after the summary is generated to remove intermediate artifacts.
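Before cleaning up, it can help to see which frames the summary actually cites. The contents of `finalize.sh` are not shown here, so the following is only an independent sketch of how referenced frames might be listed, assuming they appear in the summary under the `scene_NNNN.jpg` naming used in Phase 2. The function name `referenced_frames` is hypothetical.

```bash
# Hypothetical helper: print each distinct scene_NNNN.jpg name that appears
# in the summary file, one per line, sorted.
# $1: path to the summary Markdown file.
referenced_frames() {
  grep -o 'scene_[0-9][0-9]*\.jpg' "$1" | sort -u
}

# Example:
# referenced_frames /path/to/summary.md
```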
## Output Format Template
````markdown
# [Video Title] Learning Summary
## Overview
[Bilingual Chinese/English introduction]
## Key Takeaways
- Point 1
- Point 2
- Point 3
## Knowledge Mind Map
```mermaid
mindmap
  root((Video Topic))
    Core Concept 1
      Sub-concept A
      Sub-concept B
    Core Concept 2
      Sub-concept C
    Practical Points
      Step 1
      Step 2
```
## Video Timeline
```mermaid
gantt
    title Video Content Timeline
    dateFormat mm:ss
    section Introduction
    Topic introduction :00:00, 02:00
    section Core Content
    Concept explanation :02:00, 15:00
    section Summary
    Key points recap :15:00, 20:00
```
## Content Flowchart (if applicable)
```mermaid
flowchart TD
    A[Start] --> B[Step 1]
    B --> C{Condition}
    C -->|Yes| D[Step 2]
    C -->|No| E[Step 3]
    D --> F[End]
    E --> F
```
## Concept Relationships (if applicable)
```mermaid
graph LR
    A[Concept A] --> B[Concept B]
    A --> C[Concept C]
    B --> D[Concept D]
    C --> D
```
## Scene Timeline
| Time | Scene Description | Key Content |
|---|---|---|
| 00:15 | Title slide | Topic introduction |
| 02:30 | Code demo | Core implementation |
| 05:45 | Architecture diagram | System design |
## Key Visuals
### [00:02:30] - Architecture Diagram
Analysis: [what the image shows and why it matters]
### [00:05:45] - Code Example
Analysis: [what the code does and its key points]
## Detailed Notes
### Chapter 1: Introduction [00:00 - 02:00]
[Detailed content…]
### Chapter 2: Core Concepts [02:00 - 10:00]
[Detailed content…]
### Chapter 3: Hands-on Demo [10:00 - 18:00]
[Detailed content…]
### Chapter 4: Summary [18:00 - 20:00]
[Detailed content…]
## Key Terms
- Term 1: explanation
- Term 2: explanation
## Reproduction Steps
- Step 1
- Step 2
- Step 3
## Common Pitfalls
- Pitfall 1: description
- Pitfall 2: description
## Action Items
- Action 1
- Action 2
- Action 3
## Related Resources
````
## Execution Tips
1. **Long videos (>30min)**: Increase scene threshold to 0.4-0.5 to reduce frame count
2. **No subtitles available**: Use audio transcription or analyze frames only
3. **Too many frames**: Manually select key frames or increase threshold
4. **Token limits**: Process in smaller segments, summarize progressively
5. **Faster downloads**: Use parallel fragments with yt-dlp (e.g., `--concurrent-fragments 4`)
## Quick Start Script
Run the preprocessing script:
```bash
./scripts/preprocess.sh "YOUTUBE_URL"
```
Optionally, speed up the download with parallel fragments and pass extra yt-dlp arguments:
```bash
YTDLP_CONCURRENT_FRAGMENTS=4 \
YTDLP_EXTRA_ARGS="--cookies-from-browser chrome" \
./scripts/preprocess.sh "YOUTUBE_URL"
```
Then analyze the extracted frames and subtitles using the prompts above, generate `summary.md`, and run `finalize.sh` to keep only the deliverables.
Optional auto-finalize (if the summary already exists):
```bash
./scripts/preprocess.sh "YOUTUBE_URL" 0.3 /path/to/summary.md
```