video-copy-analyzer
npx skills add https://github.com/albedo-tabai/video-copy-analyzer --skill video-copy-analyzer
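Video Copy Analysis Tool
One-stop video content extraction and copy analysis, supporting Bilibili, YouTube, Douyin, and other platforms.
First-Time Setup
On first use, ask the user:
“Please set the default working directory (used to save downloaded videos and analysis reports):
A. Use the default directory: ~/video-analysis/
B. Specify a directory manually each time
C. Use a fixed directory: [enter path]”
Save the user's choice for later use.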
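One way to persist that choice is a small JSON file. The following is a minimal sketch, not the skill's actual mechanism: the config path, the mode names (`default`, `ask`, `fixed`), and both function names are hypothetical.

```python
from __future__ import annotations

import json
from pathlib import Path

# Option A in the prompt above.
DEFAULT_WORKDIR = Path("~/video-analysis").expanduser()


def save_workdir_choice(config_path: Path, mode: str, fixed_dir: str | None = None) -> dict:
    """Store the user's answer: 'default' (A), 'ask' (B) or 'fixed' (C)."""
    if mode not in {"default", "ask", "fixed"}:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "fixed" and not fixed_dir:
        raise ValueError("mode 'fixed' needs a directory")
    config = {"mode": mode, "fixed_dir": fixed_dir if mode == "fixed" else None}
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config


def resolve_workdir(config_path: Path) -> Path | None:
    """Return the directory to use, or None when the user must be asked each time."""
    config = json.loads(config_path.read_text(encoding="utf-8"))
    if config["mode"] == "default":
        return DEFAULT_WORKDIR
    if config["mode"] == "fixed":
        return Path(config["fixed_dir"]).expanduser()
    return None
```

A caller would run `save_workdir_choice` once after the first prompt and `resolve_workdir` at the start of each later run, asking the user again only when it returns `None`.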
Dependency Check
Before running, check for the following dependencies and prompt the user to install any that are missing:
# 1. yt-dlp
yt-dlp --version
# 2. FFmpeg
ffmpeg -version
# 3. Python dependencies
python -c "import pysrt; from dotenv import load_dotenv; print('OK')"
# 4. RapidOCR (for burned-in subtitle recognition, lightweight ONNX build)
python -c "from rapidocr_onnxruntime import RapidOCR; print('OK')"
# 5. FunASR (Chinese speech transcription, recommended)
python -c "from funasr import AutoModel; print('OK')"
# 6. requests (for Douyin downloads)
python -c "import requests; print('OK')"
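The same checks can be run in one pass from Python. This is a rough sketch with a hypothetical helper name (`find_missing`); it only tests that each tool is on `PATH` and each module can be located, so a broken installation could still pass.

```python
import importlib.util
import shutil

# Command-line tools checked above.
CLI_TOOLS = ["yt-dlp", "ffmpeg"]

# Import name -> pip package name, as used by the commands in this document.
PY_MODULES = {
    "pysrt": "pysrt",
    "dotenv": "python-dotenv",
    "rapidocr_onnxruntime": "rapidocr-onnxruntime",
    "funasr": "funasr",
    "requests": "requests",
}


def find_missing(cli_tools=CLI_TOOLS, py_modules=PY_MODULES):
    """Return (missing CLI tools, pip packages whose module cannot be found)."""
    missing_cli = [tool for tool in cli_tools if shutil.which(tool) is None]
    missing_pip = [
        package
        for module, package in py_modules.items()
        if importlib.util.find_spec(module) is None  # locates without importing
    ]
    return missing_cli, missing_pip


if __name__ == "__main__":
    cli, pip_packages = find_missing()
    if cli:
        print("Missing tools:", ", ".join(cli))
    if pip_packages:
        print("Suggested fix: pip install " + " ".join(pip_packages))
    if not cli and not pip_packages:
        print("OK")
```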
Install commands (if missing):
# Base dependencies
pip install yt-dlp pysrt python-dotenv requests
# FunASR (Chinese speech transcription, lightweight with good results)
pip install funasr modelscope
# RapidOCR (lightweight ONNX build, for burned-in subtitle recognition)
pip install rapidocr-onnxruntime
# Whisper (fallback option)
pip install openai-whisper
Workflow (4 Stages)
Stage 1: Download the Video
- Get the video URL and output directory from the user
- Determine the video platform:
  - Douyin links (douyin.com or v.douyin.com): download with the dedicated script
  - Other platforms (Bilibili, YouTube, etc.): download with yt-dlp
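The platform decision above can be sketched as a hostname check. The function name `pick_downloader` and its return values are hypothetical; the skill's scripts may route links differently.

```python
from urllib.parse import urlparse


def pick_downloader(url: str) -> str:
    """Return 'douyin' for Douyin hosts and 'yt-dlp' for everything else."""
    host = (urlparse(url).hostname or "").lower()
    # Match douyin.com itself and any subdomain (v., www., m.),
    # but not unrelated hosts that merely end in the same letters.
    if host == "douyin.com" or host.endswith(".douyin.com"):
        return "douyin"
    return "yt-dlp"
```

Comparing the parsed hostname instead of searching the raw string avoids misrouting a URL that only mentions `douyin.com` in its path or query.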
Douyin Video Download
For Douyin links, use scripts/download_douyin.py:
python scripts/download_douyin.py "<douyin_url>" "<output_path>"
Supported Douyin link formats:
- Short link: https://v.douyin.com/xxxxx
- Long link: https://www.douyin.com/video/xxxxx
- Featured page: https://www.douyin.com/jingxuan?modal_id=xxxxx
- Share link: https://m.douyin.com/share/video/xxxxx
Download flow:
Douyin link
↓
[Visit with mobile UA] ──→ Get the page after redirects
↓
[Extract RENDER_DATA] ──→ Parse the video metadata
↓
[Extract play_addr] ──→ Get the watermark-free video URL
↓
[Download video] ──→ Save to the specified path
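The two extraction steps in the flow can be illustrated offline. This sketch makes assumptions that this document does not confirm: that RENDER_DATA sits in a `<script id="RENDER_DATA">` tag as URL-encoded JSON, and that `play_addr` is an object carrying a `url_list` array. The helper names are hypothetical, and scripts/download_douyin.py may work differently.

```python
import json
import re
from urllib.parse import unquote

# Assumption: the payload is the body of a script tag with id="RENDER_DATA".
RENDER_DATA_RE = re.compile(
    r'<script[^>]*id="RENDER_DATA"[^>]*>(.*?)</script>', re.DOTALL
)


def find_key(node, key):
    """Depth-first search for the first non-None value stored under `key`."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def extract_play_url(html: str):
    """Return the first play_addr URL found in the page, or None."""
    match = RENDER_DATA_RE.search(html)
    if match is None:
        return None
    data = json.loads(unquote(match.group(1)))  # assumed URL-encoded JSON
    play_addr = find_key(data, "play_addr")
    if isinstance(play_addr, dict) and play_addr.get("url_list"):
        return play_addr["url_list"][0]
    return None
```

Searching recursively for `play_addr` keeps the sketch independent of the surrounding JSON layout, which is not described here.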
Other Platforms (yt-dlp)
For Bilibili, YouTube, and other platforms:
yt-dlp -f "bestvideo[height<=1080]+bestaudio/best[height<=1080]" \
--merge-output-format mp4 \
-o "<output_dir>/%(id)s.%(ext)s" \
"<video_url>"
- Record the path of the downloaded video file
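When the download is driven from Python, the same command can be built as an argument list, which also gives a way to capture the file path. This is a sketch only: the function names are hypothetical, and it assumes yt-dlp's `--print after_move:filepath` option prints the final path, which should be checked against the installed yt-dlp version.

```python
from __future__ import annotations

import subprocess

FORMAT_SELECTOR = "bestvideo[height<=1080]+bestaudio/best[height<=1080]"


def build_ytdlp_cmd(video_url: str, output_dir: str) -> list[str]:
    """Mirror the shell command above as a list, so no shell quoting is needed."""
    return [
        "yt-dlp",
        "-f", FORMAT_SELECTOR,
        "--merge-output-format", "mp4",
        "-o", f"{output_dir}/%(id)s.%(ext)s",
        # Assumption: prints the final file path after the merged file is moved.
        "--print", "after_move:filepath",
        video_url,
    ]


def download(video_url: str, output_dir: str) -> str:
    """Run yt-dlp and return the path it reports for the downloaded file."""
    result = subprocess.run(
        build_ytdlp_cmd(video_url, output_dir),
        check=True,           # raise CalledProcessError if yt-dlp fails
        capture_output=True,
        text=True,
    )
    lines = result.stdout.strip().splitlines()
    if not lines:
        raise RuntimeError("yt-dlp did not report an output path")
    return lines[-1]
```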
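Stage 2: Smart Subtitle Extraction
Use scripts/extract_subtitle_funasr.py for smart subtitle extraction (it selects the best method automatically):
python scripts/extract_subtitle_funasr.py <video_path> <output_srt_path>
Smart extraction flow (three priority tiers):
Video input
↓
[1️⃣ Embedded subtitle detection] ──→ Subtitle stream found ──→ Extract directly (highest accuracy)
↓ Not found
[2️⃣ Burned-in subtitle detection] ──→ OCR on sampled frames ──→ Text found ──→ OCR extraction over the full video
↓ Not found
[3️⃣ FunASR speech transcription] ──→ Chinese-optimized transcription (reported to outperform Whisper on Chinese)
↓
Output SRT subtitles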
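The three extraction tiers in detail:
| Tier | Method | When to use | Accuracy | Speed |
|---|---|---|---|---|
| L1 | Embedded subtitle extraction | The video carries its own subtitle stream | ⭐⭐⭐⭐⭐ | Fastest |
| L2 | RapidOCR burned-in subtitle recognition | Subtitles are burned into the picture | ⭐⭐⭐⭐ | Fast |
| L3 | FunASR Nano speech transcription | No subtitles, speech only | ⭐⭐⭐ | Medium |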
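The tiered strategy amounts to trying each extractor in priority order and stopping at the first one that produces subtitles. The sketch below shows only that control flow; the extractor callables are placeholders, and scripts/extract_subtitle_funasr.py may be organized differently.

```python
from typing import Callable, Optional, Sequence, Tuple

# An extractor takes a video path and returns SRT text, or None if its tier does not apply.
Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(
    video_path: str, tiers: Sequence[Tuple[str, Extractor]]
) -> Tuple[str, str]:
    """Try each (name, extractor) pair in order; return the first non-empty result."""
    for name, extractor in tiers:
        srt_text = extractor(video_path)
        if srt_text:
            return name, srt_text
    raise RuntimeError("no tier produced subtitles")


if __name__ == "__main__":
    # Placeholder extractors: L1 and L2 find nothing, so L3 is used.
    tiers = [
        ("embedded", lambda path: None),
        ("ocr", lambda path: None),
        ("asr", lambda path: "1\n00:00:00,000 --> 00:00:01,000\nhello\n"),
    ]
    print(extract_with_fallback("demo.mp4", tiers)[0])
```

Keeping the tiers as an ordered list makes the priority explicit and lets a caller drop a tier, for example skipping OCR when RapidOCR is not installed.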
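Tech stack notes:
- RapidOCR (ONNX): detects and extracts subtitles burned into the video frames
  - Lightweight: ONNX Runtime inference, no GPU required
  - Cross-platform: supports Windows, Linux, and Mac
  - Easy to deploy: a single pip install, no complex dependencies
  - High accuracy: built on optimized PaddleOCR models
- FunASR Nano: Alibaba's open-source Chinese speech recognition model
  - Lightweight: ~100MB vs. ~1.5GB for Whisper Large
  - Chinese-optimized: trained specifically on Chinese speech, reported to outperform Whisper on Chinese
  - Timestamps: supports character-level timestamps
  - Fast: runs quickly even on CPU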
Fallback options:
To use Whisper instead (recommended for English content):
python scripts/extract_subtitle.py <video_path> <output_srt_path>
For manual control, use the original transcribe_audio.py:
python scripts/transcribe_audio.py <video_path> <output_srt_path> [model] [language] [device]
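Stage 3: Transcript Correction
- Read the SRT subtitle file
- Merge the subtitles into continuous text
- Correct the text based on contextual meaning:
  - Fix homophone errors
  - Fix technical terms
  - Add punctuation
- Output the corrected transcript (Markdown format)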
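The first two steps, reading the SRT file and merging cues into continuous text, can be sketched with the standard library alone. The skill lists pysrt as a dependency and may use it instead; the function name `srt_to_text` is hypothetical. The semantic correction itself is not covered here.

```python
import re

# A cue timing row such as "00:00:01,500 --> 00:00:03,000".
TIMING_RE = re.compile(
    r"\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_text(srt: str, sep: str = "") -> str:
    """Join all caption lines in order, dropping cue numbers and timings.

    The default sep="" suits Chinese text; pass sep=" " for English.
    """
    captions = []
    normalized = srt.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        rows = [row.strip() for row in block.split("\n") if row.strip()]
        for i, row in enumerate(rows):
            if TIMING_RE.match(row):
                # Everything after the timing row is caption text,
                # even if a caption happens to be digits only.
                captions.extend(rows[i + 1:])
                break
    return sep.join(captions)
```

Locating the timing row, instead of discarding every numeric line, keeps captions such as "2024" intact.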
Corrected output format:
# Video Speech Transcript
**Video source**: [URL]
**Transcription date**: [date]
---
## Full Transcript
[Corrected body text]
---
## Original SRT Subtitles
[Original transcription with timestamps]
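Stage 4: Three-Dimension Analysis
Apply three analysis frameworks for an in-depth analysis:
4.1 TextContent Analysis Perspective
- Narrative structure analysis
- Narrative voice analysis
- Rhetorical device identification
- Vocabulary extraction
4.2 Viral-Abstract-Script Perspective
- Viral-5D framework diagnosis (Hook / Emotion / Viral trigger / CTA / Social currency)
- Style positioning
- Viral potential assessment
- Optimization suggestions
4.3 Brainstorming Perspective
- Core value breakdown
- Exploration of 2-3 creative directions
- Incremental validation points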
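Analysis output format:
# Video Copy Analysis Report (Three Dimensions)
## 1. TextContent Analysis Perspective
[Narrative structure, rhetorical devices, vocabulary]
## 2. Viral-Abstract-Script Perspective
[Viral-5D diagnosis, style positioning, optimization suggestions]
## 3. Brainstorming Perspective
[Value breakdown, creative directions, validation points]
## 4. Overall Assessment and Recommendations
[Score, improvement suggestions, rewrite examples]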
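Final Output
After all stages are complete, report to the user:
Video copy analysis complete!
Output directory: <user-specified directory>
Generated files:
- <video_id>.mp4 (original video)
- <video_id>.srt (original subtitles)
- <video_id>_文字稿.md (corrected transcript)
- <video_id>_分析报告.md (three-dimension analysis report)
Quick open:
[Transcript](<transcript_path>)
[Analysis report](<analysis_report_path>)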
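The four file names in the report follow one pattern built from the video ID. Here is a small sketch of how those paths could be derived; the function name and dictionary keys are hypothetical, while the file suffixes are the ones listed above.

```python
from __future__ import annotations

from pathlib import Path


def output_paths(out_dir: str, video_id: str) -> dict[str, Path]:
    """Build the expected output paths for one analyzed video."""
    base = Path(out_dir).expanduser()
    return {
        "video": base / f"{video_id}.mp4",
        "subtitles": base / f"{video_id}.srt",
        "transcript": base / f"{video_id}_文字稿.md",   # corrected transcript
        "report": base / f"{video_id}_分析报告.md",     # three-dimension analysis report
    }
```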
Reference Files
- download_douyin.py: Douyin video download script
- extract_subtitle_funasr.py: smart subtitle extraction script (FunASR + RapidOCR)
- extract_subtitle.py: subtitle extraction script (Whisper)
- transcribe_audio.py: audio transcription script
- analysis-frameworks.md: detailed guide to the three analysis frameworks