video-podcast-maker
npx skills add https://github.com/agents365-ai/video-podcast-maker --skill video-podcast-maker
⚠️ REQUIRED: Load Design System First
This skill depends on remotion-design-master. You MUST invoke it before proceeding:
Skill tool: skill="remotion-design-master"
The design system provides all Remotion components, layout constraints, and visual guidelines.
Video Podcast Maker
Quick Start
Open Claude Code and say: "Help me make a Bilibili video podcast about [your topic]" (in Chinese: “帮我制作一个关于 [你的主题] 的 B站视频播客”).
Prerequisites (One-time Setup)
0.1 Environment Checklist
| Tool | Check command | Install (macOS) |
|---|---|---|
| Node.js 18+ | node -v | brew install node |
| Python 3.8+ | python3 --version | brew install python3 |
| FFmpeg | ffmpeg -version | brew install ffmpeg |
0.2 API Keys
# Azure Speech (required) - add to ~/.zshrc
export AZURE_SPEECH_KEY="your-azure-speech-key"
export AZURE_SPEECH_REGION="eastasia"
# Verify
echo $AZURE_SPEECH_KEY # should print your key
How to get a key: Azure Portal → create a "Speech service" resource
0.3 Python Dependencies
pip install azure-cognitiveservices-speech requests
0.4 Remotion Project Setup
# Create a Remotion project (skip if you already have one)
npx create-video@latest my-video-project
cd my-video-project
npm i # install dependencies
# Install the design system
mkdir -p src/remotion/design
cp -r ~/.claude/skills/remotion-design-master/src/* src/remotion/design/
# Verify
npx remotion studio # should open a browser preview
0.5 Quick Verification
# Check all dependencies in one go
echo "=== Environment check ==="
node -v
python3 --version
ffmpeg -version 2>&1 | head -1
if [ -n "$AZURE_SPEECH_KEY" ]; then echo "✓ AZURE_SPEECH_KEY is set"; else echo "✗ AZURE_SPEECH_KEY is not set"; fi
if [ -d "src/remotion/design" ]; then echo "✓ Design system installed"; else echo "✗ Design system not installed"; fi
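The same checks can be scripted so that a pipeline step can refuse to start when something is missing. This is a minimal sketch in Python; `check_environment`, its parameters, and the wording of the messages are illustrative and not part of the skill. It also checks AZURE_SPEECH_REGION, which section 0.2 sets alongside the key.

```python
import os
import shutil
from pathlib import Path

REQUIRED_TOOLS = ("node", "python3", "ffmpeg")
REQUIRED_ENV = ("AZURE_SPEECH_KEY", "AZURE_SPEECH_REGION")


def check_environment(env=None, project_root=".", which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    `env` and `which` are injectable so the function can be tested without
    touching the real environment (hypothetical parameters for this sketch).
    """
    env = os.environ if env is None else env
    problems = []
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    for name in REQUIRED_ENV:
        if not env.get(name):
            problems.append(f"{name} is not set")
    design_dir = Path(project_root) / "src" / "remotion" / "design"
    if not design_dir.is_dir():
        problems.append("design system missing: src/remotion/design")
    return problems
```

Calling `check_environment()` from the project root and printing each returned line gives roughly the same report as the shell commands above.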
Overview
Automated pipeline that turns a topic into a professional landscape knowledge video for Bilibili.
Target platform: Bilibili landscape video (16:9)
- Resolution: 3840×2160 (4K) or 1920×1080 (1080p)
- Style: minimalist white (default)
Tech stack: Claude + Azure TTS + Remotion + FFmpeg
Use Cases
| Good fit | Poor fit |
|---|---|
| Popular-science explainers | Vertical short videos |
| Product comparisons and reviews | Livestream recordings |
| Tutorials | On-camera presenters |
| News analysis | Vlogs |
| In-depth technical analysis | Music videos |
Output Specifications
| Parameter | Value |
|---|---|
| Resolution | 3840×2160 (4K) |
| Frame rate | 30 fps |
| Video codec | H.264, 16 Mbps |
| Audio | AAC, 192 kbps |
| Duration | 1-15 minutes |
Design System
Use the design system provided by the remotion-design-master skill.
# Install the design components
TEMP_DIR=$(mktemp -d)
git clone --depth 1 https://github.com/Agents365-ai/remotion-design-master.git "$TEMP_DIR/rdm"
mkdir -p src/remotion/design
cp -r "$TEMP_DIR/rdm/src/"* src/remotion/design/
rm -rf "$TEMP_DIR"
The design system includes:
- Layout components: FullBleed, ContentArea, CoverMedia, DualLayerMedia
- Animation components: FadeIn, SpringPop, SlideIn, Typewriter
- Data display: DataDisplay, AnimatedCounter, ProgressBar
- Navigation components: ChapterProgressBar, SectionIndicator
- Themes: minimalWhite (default), darkTech, gradientVibrant
For hard constraints, component documentation, and visual style, see the remotion-design-master skill.
⚠️ HARD CONSTRAINT: Prefer design system components
When creating video components in Step 8, do not reimplement components that already exist. Check for and use the components provided by remotion-design-master first:
- ChapterProgressBar (chapter progress bar) – used by default (can be turned off)
- FadeIn, SlideIn (animation) – preferred
- FullBleed, ContentArea (layout) – preferred
If the design system components do not meet a need, extend the design system first rather than duplicating the implementation inside a video component.
File Paths and Naming Conventions
Directory Structure
project-root/ # Remotion project root
├── src/remotion/ # Remotion source (follows remotion-design-master conventions)
│   ├── design/ # Design system (copied from remotion-design-master)
│   │   ├── tokens/ # Design tokens
│   │   ├── themes/ # Themes (minimalWhite, darkTech...)
│   │   ├── layout/ # Layout components (FullBleed, ContentArea...)
│   │   ├── animation/ # Animation components (FadeIn, SlideIn...)
│   │   └── components/ # UI components
│   ├── compositions/ # Video Composition definitions
│   ├── Root.tsx # Remotion entry point
│   └── index.ts # Exports
│
├── public/media/{video-name}/ # Asset directory (reachable via Remotion staticFile())
│   ├── {section}_{index}.{ext} # General assets
│   ├── {section}_screenshot.png # Web page screenshots
│   ├── {section}_logo.png # Logo
│   ├── {section}_web_{index}.{ext} # Images from the web
│   └── {section}_ai.png # AI-generated images
│
├── videos/{video-name}/ # Video project assets (not Remotion code)
│   ├── topic_definition.md # Step 0: topic definition
│   ├── topic_research.md # Step 1: research notes
│   ├── podcast.txt # Step 3: narration script
│   ├── media_manifest.json # Step 4: asset manifest
│   ├── publish_info.md # Step 5+12: publish info
│   ├── podcast_audio.wav # Step 7: TTS audio
│   ├── podcast_audio.srt # Step 7: subtitle file
│   ├── timing.json # Step 7: timeline
│   ├── thumbnail_*.png # Step 6: thumbnails
│   ├── output.mp4 # Step 9: Remotion output
│   ├── video_with_bgm.mp4 # Step 10: with BGM
│   ├── final_video.mp4 # Step 11: final output
│   └── bgm.mp3 # Background music
│
└── remotion.config.ts # Remotion config
⚠️ Important: always pass the full output path when rendering with Remotion; otherwise the output goes to out/ by default:
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4
Naming Rules
Video name {video-name}: all-lowercase English, hyphen-separated (e.g. reference-manager-comparison)
Section name {section}: all-lowercase English, underscore-separated, matching [SECTION:xxx]
Thumbnail naming (⚠️ both 16:9 and 4:3 are required; Bilibili uses different aspect ratios in different places):
| Type | 16:9 (player page, wide) | 4:3 (recommendation feed / activity feed, tall) |
|---|---|---|
| Remotion | thumbnail_remotion_16x9.png | thumbnail_remotion_4x3.png |
| AI | thumbnail_ai_16x9.png | thumbnail_ai_4x3.png |
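The naming rules above can be checked mechanically before any files are created. This is a small sketch; the helper names are hypothetical, and allowing digits inside names is an assumption, since the rules only say "lowercase English".

```python
import re

# Assumption: digits are allowed alongside lowercase letters.
VIDEO_NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")    # hyphen-separated
SECTION_NAME_RE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")  # underscore-separated


def is_valid_video_name(name: str) -> bool:
    """True for names like 'reference-manager-comparison'."""
    return VIDEO_NAME_RE.fullmatch(name) is not None


def is_valid_section_name(name: str) -> bool:
    """True for names like 'core_concepts', as used in [SECTION:xxx]."""
    return SECTION_NAME_RE.fullmatch(name) is not None


def thumbnail_names(kinds=("remotion", "ai")):
    """List the thumbnail file names required for each generation method."""
    return [
        f"thumbnail_{kind}_{ratio}.png"
        for kind in kinds
        for ratio in ("16x9", "4x3")
    ]
```

`thumbnail_names(("ai",))` returns the two AI thumbnail names from the table, which makes it easy to assert that both aspect ratios exist before publishing.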
File Operations Before and After Rendering
# Before rendering
cp videos/{name}/podcast_audio.wav videos/{name}/timing.json public/
[ -f videos/{name}/media_manifest.json ] && cp videos/{name}/media_manifest.json public/
# Clean up after rendering
rm -f public/podcast_audio.wav public/timing.json public/media_manifest.json
rm -rf public/media/{name}
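The copy-then-clean pattern above is easy to get wrong when a render fails halfway. One way to make the cleanup unconditional is a context manager, sketched below; `staged_render_inputs` is a hypothetical helper, and unlike the shell commands it deliberately leaves public/media/{name} alone.

```python
import shutil
from contextlib import contextmanager
from pathlib import Path

REQUIRED = ("podcast_audio.wav", "timing.json")
OPTIONAL = ("media_manifest.json",)


@contextmanager
def staged_render_inputs(video_dir, public_dir="public"):
    """Copy render inputs into public/ and remove them again on exit."""
    video_dir, public_dir = Path(video_dir), Path(public_dir)
    public_dir.mkdir(parents=True, exist_ok=True)
    staged = []
    try:
        for name in REQUIRED + OPTIONAL:
            src = video_dir / name
            if not src.is_file():
                if name in REQUIRED:
                    raise FileNotFoundError(src)
                continue  # optional file is simply skipped
            dst = public_dir / name
            shutil.copy2(src, dst)
            staged.append(dst)
        yield staged
    finally:
        # Runs on success and on failure, so public/ never keeps stale files.
        for path in staged:
            path.unlink(missing_ok=True)
```

The render command would run inside the `with staged_render_inputs("videos/my-video"):` block, for example through `subprocess.run`.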
Workflow
| Step | Tool | Output |
|---|---|---|
| 0. Define Direction | brainstorming | topic_definition.md |
| 1. Research | WebSearch, WebFetch | topic_research.md |
| 2. Design Sections | brainstorming | 5-7 sections plan |
| 3. Write Script | Claude | podcast.txt |
| 4. Collect Media | Playwright/WebSearch | media_manifest.json |
| 5. Publish Info (Part 1) | Claude | publish_info.md |
| 6. Thumbnail | Remotion/imagen/imagenty | thumbnail_*.png |
| 7. Generate Audio | generate_tts.py | .wav, .srt, timing.json |
| 7.5. Component Check | remotion-design-master | ✓ Component list confirmed |
| 8. Create Video | Remotion | Composition ready |
| 9. Render | remotion render | output.mp4 |
| 10. Add BGM | FFmpeg | video_with_bgm.mp4 |
| 11. Subtitles | FFmpeg + SRT | final_video.mp4 |
| 12. Publish Info (Part 2) | Claude | Update chapters |
| 13. Verify | Claude | Verification report |
| 14. Cleanup | Claude | Remove temp files |
Validation Checkpoints
After Step 7 (TTS):
- podcast_audio.wav exists and plays correctly
- timing.json has all sections with correct timestamps
- podcast_audio.srt encoding is UTF-8
After Step 9 (Render):
- output.mp4 resolution is 3840×2160
- Audio-video sync verified
- No black frames
After Step 11 (Final):
- final_video.mp4 resolution is 3840×2160
- Subtitles display correctly (if added)
- File size is reasonable
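Step 0: Define Topic Direction
Use brainstorming to confirm:
- Target audience: developers / general users / students / professionals
- Video positioning: introductory explainer / deep dive / news brief / hands-on tutorial
- Content scope: historical background / technical principles / usage / comparative review
- Video style: serious and professional / light and humorous / fast-paced
- Expected length: short (1-3 min) / medium (3-7 min) / long (7-15 min)
Save as videos/{name}/topic_definition.md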
Step 1: Research Topic
Use WebSearch and WebFetch. Save to videos/{name}/topic_research.md.
Step 2: Design Video Sections
Design 5-7 sections:
- Hero/Intro (15-25s)
- Core concepts (30-45s each)
- Demo/Examples (30-60s)
- Comparison/Analysis (30-45s)
- Summary (20-30s)
Content Density Selection
Before designing, assign each section a density tier based on content volume:
| Tier | Items | Title Scale | Best For |
|---|---|---|---|
| Impact | 1 | 1.5x (330px) | Hook, hero, CTA, brand moment |
| Standard | 2-3 | 1.0x (220px) | Features, comparison, demo |
| Compact | 4-6 | 0.8x (176px) | Feature grid, ecosystem |
| Dense | 6+ | 0.65x (143px) | Data tables, detailed comparisons |
Example section plan with tiers:
hero: Impact (1 brand moment)
features: Standard (3 feature cards)
ecosystem: Compact (5 integration icons)
performance: Standard (2 comparison bars)
cta: Impact (1 call-to-action)
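The tier table reduces to a lookup on item count. The sketch below takes 220px as the base title size (the Standard row) and derives the other sizes from the scale column. The table lists 6 under both Compact and Dense; treating exactly 6 items as Compact is an assumption of this sketch.

```python
BASE_TITLE_PX = 220  # Standard tier title size from the table

# (tier name, largest item count it covers, title scale); None = no upper bound
TIERS = (
    ("Impact", 1, 1.5),
    ("Standard", 3, 1.0),
    ("Compact", 6, 0.8),
    ("Dense", None, 0.65),
)


def pick_tier(item_count: int):
    """Return (tier name, title size in px) for a section's item count."""
    if item_count < 1:
        raise ValueError("a section needs at least one item")
    for name, max_items, scale in TIERS:
        if max_items is None or item_count <= max_items:
            return name, round(BASE_TITLE_PX * scale)
```

For the example plan above, `pick_tier(3)` gives the Standard tier for the three feature cards and `pick_tier(5)` gives Compact for the five integration icons.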
Title Position Confirmation
Use AskUserQuestion to ask the user for their title position preference:
| Position | Style | Best for |
|---|---|---|
| Top center | Video style | Most video content (recommended) |
| Top left | Presentation style | Business / formal content |
| Full-screen center | Hero style | Hook/Hero scenes only |
Rule: keep the title position consistent within a single video.
Step 3: Write Narration Script
Create videos/{name}/podcast.txt with section markers:
[SECTION:hero]
大家好，欢迎来到本期视频。今天我们聊一个...
[SECTION:features]
它有以下功能...
[SECTION:demo]
让我演示一下...
[SECTION:summary]
总结一下，xxx是目前最xxx的xxx。
[SECTION:references]
本期视频参考了官方文档和技术博客。
[SECTION:outro]
感谢观看！点赞投币收藏，关注我，下期再见！
Numbers must be written as Chinese readings – spell out every number in Chinese characters so that TTS reads it correctly:
| Type | ✗ Wrong | ✓ Correct |
|---|---|---|
| Integers | 29, 3999, 128 | 二十九，三千九百九十九，一百二十八 |
| Decimals | 1.2, 3.5 | 一点二，三点五 |
| Percentages | 15%, -10% | 百分之十五，负百分之十 |
| Dates | 2025-01-15 | 二零二五年一月十五日 |
| Large numbers | 6144, 234324 | 六千一百四十四，二十三万四千三百二十四 |
| Units | 128GB, 273GB/s | 一百二十八G，二百七十三GB每秒 |
| Scientific units | 1 PFLOPS | 一PFLOPS |
Examples:
✗ Wrong: 定价3999美元，内存128GB，去年10月15日开卖
✓ Correct: 定价三千九百九十九美元，内存一百二十八GB，去年十月十五日开卖
✗ Wrong: DeepSeek R1 14B每秒2074个token
✓ Correct: DeepSeek R1蒸馏版十四B每秒两千零七十四个token
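Rewriting numbers by hand is error-prone, so a helper can produce the readings for integers, decimals, and percentages. This is a sketch under stated assumptions: `to_chinese` and `read_number` are hypothetical names, the range is limited to values below 100,000,000, dates and units are not handled, and the output always uses 二 rather than the colloquial 两 seen in the last example.

```python
DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]


def _four_digits(n: int) -> str:
    """Read 1..9999, inserting a single 零 for skipped interior places."""
    out = ""
    pending_zero = False
    for power in (3, 2, 1, 0):
        digit = n // 10 ** power % 10
        if digit == 0:
            pending_zero = bool(out)  # only counts after a leading digit
            continue
        if pending_zero:
            out += "零"
            pending_zero = False
        out += DIGITS[digit] + UNITS[power]
    return out


def to_chinese(n: int) -> str:
    """Chinese reading of an integer in [0, 100_000_000)."""
    if not 0 <= n < 100_000_000:
        raise ValueError("out of supported range")
    if n == 0:
        return "零"
    high, low = divmod(n, 10_000)
    out = _four_digits(high) + "万" if high else ""
    if low:
        if high and low < 1000:
            out += "零"
        out += _four_digits(low)
    # 10-19 (and 10万-19万) are read 十..., not 一十...
    return out[1:] if out.startswith("一十") else out


def read_number(text: str) -> str:
    """Read strings such as '128', '1.2', '15%' or '-10%'."""
    negative = text.startswith("-")
    body = text.lstrip("-")
    percent = body.endswith("%")
    whole, _, frac = body.rstrip("%").partition(".")
    reading = to_chinese(int(whole))
    if frac:
        reading += "点" + "".join(DIGITS[int(d)] for d in frac)
    if percent:
        reading = "百分之" + reading
    return ("负" if negative else "") + reading
```

A pre-TTS pass could run `read_number` over every numeric token in podcast.txt and flag anything it cannot convert for manual rewriting.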
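Section notes:
- summary: content recap only, no calls to action
- references (optional): one sentence summarizing the sources
- outro: thanks plus a call to like, coin, and favorite (一键三连)
- A [SECTION:xxx] marker with no content produces a silent section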
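The marker format is simple enough to parse in a few lines, which helps when checking a script before it goes to TTS. The sketch below is illustrative only: `parse_script` and `silent_sections` are hypothetical helpers, not functions from generate_tts.py, and any text before the first marker is ignored.

```python
import re

SECTION_RE = re.compile(r"^\[SECTION:([a-z0-9_]+)\]$")


def parse_script(text: str):
    """Return [(section_name, narration)] in script order."""
    sections = []
    for raw in text.splitlines():
        line = raw.strip()
        match = SECTION_RE.match(line)
        if match:
            sections.append((match.group(1), []))
        elif sections and line:
            sections[-1][1].append(line)
    return [(name, "\n".join(lines)) for name, lines in sections]


def silent_sections(sections):
    """Names of sections with no narration, which render as silence."""
    return [name for name, narration in sections if not narration]
```

Comparing the parsed names against the section names in timing.json after Step 7 is a quick way to confirm that no section was dropped.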
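Step 4: Collect Media Assets
First ask the user whether to use the imagen skill to generate AI image assets.
Claude then asks about the asset source section by section:
- Skip – text-only animation
- Local file – specify a path
- Web screenshot – captured with Playwright
- Web search – search and download
- AI generation – use the imagen skill (requires user confirmation)
If the user chooses AI generation, invoke the imagen skill to generate the image:
Use the imagen skill to generate: [image description]
Save assets to public/media/{video-name}/ and generate media_manifest.json.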
Step 5: Generate Publish Info (Part 1)
Generate publish_info.md from podcast.txt:
- Title (number + topic + hook word)
- Tags (10, including product names, domain terms, and trending tags)
- Description (100-200 characters)
Step 6: Generate Video Thumbnail
Ask the user to choose how to generate the thumbnail:
- Remotion – code-controlled, style consistent with the video
- AI text-to-image (imagen skill) – use the imagen skill for a creative thumbnail
- Both – generate both styles to choose from
⚠️ Both aspect ratios are required: 16:9 (player page) and 4:3 (recommendation feed / activity feed).
Render the thumbnail with Remotion:
npx remotion still src/remotion/index.ts Thumbnail16x9 videos/{name}/thumbnail_remotion_16x9.png
npx remotion still src/remotion/index.ts Thumbnail4x3 videos/{name}/thumbnail_remotion_4x3.png
Generate the thumbnail with the imagen skill:
Use the imagen skill to generate a video thumbnail:
- Topic: [video topic]
- Style: tech / minimalist / lively
- Aspect ratios: 16:9 and 4:3
Step 7: Generate TTS Audio
cp ~/.claude/skills/video-podcast-maker/generate_tts.py .
python3 generate_tts.py --input videos/{name}/podcast.txt --output-dir videos/{name}
Polyphonic Characters / Pronunciation Correction (SSML Phoneme)
The TTS script supports three ways to correct pronunciation, in descending priority:
1. Inline annotation (highest priority) – annotate directly in podcast.txt:
每个执行器[zhí xíng qì]都有自己的上下文窗口
如果不合格，就打回重做[chóng zuò]
2. Project dictionary – define entries in videos/{name}/phonemes.json:
{
  "执行器": "zhí xíng qì",
  "重做": "chóng zuò",
  "一行命令": "yì háng mìng lìng"
}
3. Built-in dictionary – common polyphonic characters, applied automatically:
| Word | Pinyin | Note |
|---|---|---|
| 执行/运行/并行 | xíng | 行 as in 执行 (to execute) |
| 一行命令/代码行 | háng | 行 as in 行列 (row, line) |
| 重做/重新/重复 | chóng | 重 as in 重复 (to repeat) |
Pinyin format: use pinyin with tone marks (e.g. zhí xíng qì); the script converts it to Azure SAPI format automatically.
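The tone-mark conversion mentioned above can be illustrated with Unicode decomposition. This sketch is not the code in generate_tts.py, whose exact output is not shown here. The spaced "syllable tone" layout and the use of 5 for the neutral tone are assumptions about the target format.

```python
import unicodedata

# Combining marks produced by NFD decomposition, mapped to tone numbers.
TONE_MARKS = {"\u0304": 1, "\u0301": 2, "\u030c": 3, "\u0300": 4}


def syllable_to_numbered(syllable: str):
    """Split one tone-marked syllable into (plain letters, tone number)."""
    tone = 5  # assumed neutral-tone value when no mark is present
    letters = []
    for ch in unicodedata.normalize("NFD", syllable):
        if ch in TONE_MARKS:
            tone = TONE_MARKS[ch]
        else:
            letters.append(ch)
    # Recompose so that a bare ü stays a single character.
    return unicodedata.normalize("NFC", "".join(letters)), tone


def pinyin_to_numbered(pinyin: str) -> str:
    """Convert e.g. 'zhí xíng qì' to 'zhi 2 xing 2 qi 4'."""
    return " ".join(
        f"{base} {tone}" for base, tone in map(syllable_to_numbered, pinyin.split())
    )
```

The resulting string would then be placed in the `ph` attribute of an SSML `phoneme` element wrapping the annotated word.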
Outputs: podcast_audio.wav, podcast_audio.srt, timing.json
Step 7.5: Design System Component Check (Required)
Before creating video components, you must check which components are available in the remotion-design-master design system:
# List all available components
ls ~/.claude/skills/remotion-design-master/src/components/
Recommended Components
| Component | Purpose | Path | Recommendation |
|---|---|---|---|
| ChapterProgressBar | Bottom chapter progress bar | navigation/ChapterProgressBar.tsx | ✓ Used by default |
| FadeIn | Fade-in animation | animations/FadeIn.tsx | Recommended |
| SlideIn | Slide-in animation | animations/SlideIn.tsx | Recommended |
| FullBleed | Full-screen layout | layouts/FullBleed.tsx | Recommended |
| ContentArea | Content area | layouts/ContentArea.tsx | Recommended |
| Title | Title component | ui/Title.tsx | Recommended |
Note: ChapterProgressBar is enabled by default. To turn it off, tell Claude during Step 8.
Verification Commands
# Check whether the design system is installed
[ -d "src/remotion/design" ] && echo "✓ Design system installed" || echo "✗ Design system needs to be installed"
# Check whether ChapterProgressBar exists
[ -f "src/remotion/design/components/navigation/ChapterProgressBar.tsx" ] && echo "✓ ChapterProgressBar available" || echo "✗ Needs to be copied from the design system"
If the design system is not installed, run:
mkdir -p src/remotion/design
cp -r ~/.claude/skills/remotion-design-master/src/* src/remotion/design/
Step 8: Create Remotion Composition
Copy files to public/:
cp videos/{name}/podcast_audio.wav videos/{name}/timing.json public/
Use timing.json to synchronize the visuals with the audio.
Standard video template (must be followed)
import { AbsoluteFill, Audio, Sequence, staticFile } from 'remotion'
import timingData from '../../public/timing.json'
import { ChapterProgressBar } from './design/components/navigation/ChapterProgressBar' // from the design system
// Chinese display names for each section
const sectionNamesCN: Record<string, string> = {
  hero: '开场', features: '功能', demo: '演示', summary: '总结', outro: '结语',
}
// SectionComponent is the per-section component you write; it is not defined in this template
export const MyVideo = () => (
  <AbsoluteFill style={{ background: '#fff' }}>
    <Audio src={staticFile('podcast_audio.wav')} />
    {/* 4K content area - scale(2) container */}
    <AbsoluteFill style={{ transform: 'scale(2)', transformOrigin: 'top left', width: '50%', height: '50%' }}>
      {timingData.sections.map((section: any) => (
        <Sequence key={section.name} from={section.start_frame} durationInFrames={section.duration_frames}>
          <SectionComponent name={section.name} />
        </Sequence>
      ))}
    </AbsoluteFill>
    {/* ⚠️ The progress bar must sit outside the scale(2) container */}
    <ChapterProgressBar />
  </AbsoluteFill>
)
For the full ChapterProgressBar implementation, see the remotion-design-master design system.
Key Architecture Notes
| Point | Description |
|---|---|
| ChapterProgressBar position | Must sit outside the scale(2) container, otherwise its width is compressed |
| Chapter width allocation | Use flex: ch.duration_frames to allocate width in proportion to duration |
| Progress indication | A white progress bar inside the current chapter, with total progress at the bottom |
| 4K scaling | The content area uses scale(2) to scale 1920×1080 up to 3840×2160 |
ChapterProgressBar is enabled by default and gives viewers navigation and progress feedback. If it is not needed, tell Claude to turn it off when creating the video components.
Step 8.5: Preview & Debug (Optional)
Claude behavior: Ask before skipping: "Preview in Remotion Studio first? It can catch problems before the 4K render and save time."
Use Remotion Studio for real-time preview before final render:
# Start Remotion Studio (opens browser)
npx remotion studio src/remotion/index.ts
Studio features:
- Real-time preview with timeline scrubbing
- Hot reload on code changes
- Visual debugging of animations and layout
Alternative: Quick preview render
# 720p preview (~4x faster than 4K)
npx remotion render src/remotion/index.ts CompositionId videos/{name}/preview.mp4 --scale 0.33 --crf 28
# Preview first 10 seconds only
npx remotion render src/remotion/index.ts CompositionId videos/{name}/preview.mp4 --frames 0-300 --scale 0.5
# Static frame screenshots
npx remotion still src/remotion/index.ts CompositionId videos/{name}/frame_0.png --frame 0
npx remotion still src/remotion/index.ts CompositionId videos/{name}/frame_300.png --frame 300
Recommended workflow:
- Use remotion studio for iterative development
- Quick preview render to check full flow
- Final 4K render when satisfied
Step 8.5: Preview & Pronunciation Check
Before rendering the final video, preview the audio and video in Remotion Studio and check pronunciation accuracy.
1. Start the Remotion Studio preview
npx remotion studio
2. Check pronunciation
Play the audio and listen carefully to every word. If a pronunciation is wrong:
Method A: Inline annotation (recommended, takes effect immediately)
# Annotate the mispronounced word directly in podcast.txt
这个词语[zhèng què de fā yīn]需要校正
Method B: Project dictionary (recommended, reusable)
// Add to videos/{name}/phonemes.json
{
  "词语": "zhèng què de fā yīn"
}
3. Regenerate TTS
# Regenerate after editing
python3 generate_tts.py --input videos/{name}/podcast.txt --output-dir videos/{name}
# Copy to public/ again
cp videos/{name}/podcast_audio.wav videos/{name}/timing.json public/
4. Preview again
npx remotion studio
Repeat steps 2-4 until the pronunciation is satisfactory.
Step 9: Render Video
Use npx remotion studio for preview, then render directly for final output.
npx remotion render src/remotion/index.ts CompositionId videos/{name}/output.mp4 --video-bitrate 16M
Verify 4K:
ffprobe -v quiet -select_streams v:0 -show_entries stream=width,height -of csv=p=0 videos/{name}/output.mp4
# Expected: 3840,2160
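The ffprobe check can be wrapped so that later steps can assert on the resolution programmatically. A minimal sketch follows; the function names are hypothetical, and it assumes ffprobe is on PATH. Parsing is kept separate from the subprocess call so it can be tested without a video file.

```python
import subprocess


def parse_resolution(ffprobe_output: str):
    """Parse 'width,height' CSV output, skipping blank or malformed lines."""
    for line in ffprobe_output.splitlines():
        parts = line.strip().strip(",").split(",")
        if len(parts) == 2 and all(p.isdigit() for p in parts):
            return int(parts[0]), int(parts[1])
    raise ValueError("no video stream resolution found")


def probe_resolution(path: str):
    """Run ffprobe on the first video stream and return (width, height)."""
    result = subprocess.run(
        ["ffprobe", "-v", "quiet", "-select_streams", "v:0",
         "-show_entries", "stream=width,height", "-of", "csv=p=0", path],
        check=True, capture_output=True, text=True,
    )
    return parse_resolution(result.stdout)


def is_4k(path: str) -> bool:
    """True when the file's first video stream is 3840x2160."""
    return probe_resolution(path) == (3840, 2160)
```

`is_4k("videos/my-video/output.mp4")` can then gate the BGM and subtitle steps.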
Step 10: Mix with Background Music
cp ~/.claude/skills/video-podcast-maker/music/perfect-beauty-191271.mp3 videos/{name}/bgm.mp3
ffmpeg -y \
-i videos/{name}/output.mp4 \
-stream_loop -1 -i videos/{name}/bgm.mp3 \
-filter_complex "[0:a]volume=1.0[a1];[1:a]volume=0.05[a2];[a1][a2]amix=inputs=2:duration=first[aout]" \
-map 0:v -map "[aout]" \
-c:v copy -c:a aac -b:a 192k \
videos/{name}/video_with_bgm.mp4
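When the mix is driven from a script, building the argument list in Python avoids shell-quoting problems in the filter graph. The sketch below reproduces the command above; `bgm_mix_command` and its default volumes are illustrative, with the defaults taken from that command.

```python
def bgm_mix_command(video, bgm, output, voice_volume=1.0, bgm_volume=0.05):
    """Build the ffmpeg argument list that mixes looping BGM under the voice."""
    filter_graph = (
        f"[0:a]volume={voice_volume}[a1];"
        f"[1:a]volume={bgm_volume}[a2];"
        # duration=first ends the mix when the narration track ends
        "[a1][a2]amix=inputs=2:duration=first[aout]"
    )
    return [
        "ffmpeg", "-y",
        "-i", video,
        "-stream_loop", "-1", "-i", bgm,  # loop the BGM indefinitely
        "-filter_complex", filter_graph,
        "-map", "0:v", "-map", "[aout]",
        "-c:v", "copy", "-c:a", "aac", "-b:a", "192k",
        output,
    ]
```

The list can be passed directly to `subprocess.run(cmd, check=True)`. Raising `bgm_volume` is the only change needed for louder music.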
Step 11: Add Subtitles (Optional)
Claude behavior: Ask before skipping: "Burn in subtitles? Subtitles make the video more accessible."
If subtitles are not needed:
cp videos/{name}/video_with_bgm.mp4 videos/{name}/final_video.mp4
Add subtitles (dark subtitles on the white background):
ffmpeg -y -i videos/{name}/video_with_bgm.mp4 \
  -vf "subtitles=videos/{name}/podcast_audio.srt:force_style='FontName=PingFang SC,FontSize=14,PrimaryColour=&H00333333,OutlineColour=&H00FFFFFF,Bold=1,Outline=2,Shadow=0,MarginV=20'" \
  -c:v libx264 -crf 18 -preset slow -s 3840x2160 \
  -c:a copy videos/{name}/final_video.mp4
Key parameters:
- -s 3840x2160 – force 4K
- -crf 18 -preset slow – high-quality encoding
Step 12: Complete Publish Info (Part 2)
Generate Bilibili chapters from timing.json:
00:00 开场
00:23 功能介绍
00:55 演示
01:20 总结
Format: MM:SS chapter title, with at least 5 seconds between chapters.
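Chapter lines follow directly from the section start frames. The sketch below assumes each timing.json section carries `name` and `start_frame` (the fields used by the Step 8 template) and a frame rate of 30 fps from the output specification. Skipping a chapter that starts less than 5 seconds after the previous one is one possible reading of the spacing rule, and `format_chapters` is a hypothetical helper.

```python
def format_chapters(sections, titles, fps=30, min_gap_s=5):
    """Build 'MM:SS title' lines, skipping chapters closer than min_gap_s."""
    lines = []
    last_start = None
    for section in sections:
        start_s = section["start_frame"] // fps
        if last_start is not None and start_s - last_start < min_gap_s:
            continue  # too close to the previous chapter marker
        title = titles.get(section["name"], section["name"])
        lines.append(f"{start_s // 60:02d}:{start_s % 60:02d} {title}")
        last_start = start_s
    return lines
```

Loading timing.json with `json.load` and passing its `sections` list together with a name-to-title mapping yields text that can be pasted into publish_info.md.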
Step 13: Verify Output
Once the video is complete, run the following checks:
13.1 File Existence Check
VIDEO_DIR="videos/{name}"
echo "=== File check ==="
for f in podcast.txt podcast_audio.wav podcast_audio.srt timing.json output.mp4 final_video.mp4; do
  [ -f "$VIDEO_DIR/$f" ] && echo "✓ $f" || echo "✗ $f missing"
done
13.2 Technical Metrics Check
echo "=== Technical metrics ==="
# Resolution
RES=$(ffprobe -v quiet -select_streams v:0 -show_entries stream=width,height -of csv=p=0 "$VIDEO_DIR/final_video.mp4")
[ "$RES" = "3840,2160" ] && echo "✓ Resolution: 3840x2160 (4K)" || echo "✗ Resolution: $RES (not 4K)"
# Duration
DUR=$(ffprobe -v quiet -show_entries format=duration -of csv=p=0 "$VIDEO_DIR/final_video.mp4" | cut -d. -f1)
echo "✓ Duration: ${DUR}s"
# Codec
CODEC=$(ffprobe -v quiet -select_streams v:0 -show_entries stream=codec_name -of csv=p=0 "$VIDEO_DIR/final_video.mp4")
echo "✓ Video codec: $CODEC"
# File size
SIZE=$(ls -lh "$VIDEO_DIR/final_video.mp4" | awk '{print $5}')
echo "✓ File size: $SIZE"
13.3 Verification Report Template
After verification, report to the user:
=== Verification complete ===
✓ File integrity: 6/6
✓ Resolution: 3840x2160
✓ Duration: XXs
✓ Codec: h264
✓ Size: XXX MB
Clean up temporary files? (Step 14)
Step 14: Cleanup (Optional)
Claude behavior: Ask before skipping: "Clean up temporary files? This frees disk space but deletes intermediate outputs."
14.1 List Temporary Files
Before running the cleanup, show the user which files will be deleted:
VIDEO_DIR="videos/{name}"
echo "=== Temporary files to delete ==="
ls -lh "$VIDEO_DIR"/part_*.wav 2>/dev/null | awk '{print $9, "(" $5 ")"}'
ls -lh "$VIDEO_DIR"/concat_list.txt 2>/dev/null | awk '{print $9, "(" $5 ")"}'
ls -lh "$VIDEO_DIR"/output.mp4 2>/dev/null | awk '{print $9, "(" $5 ")"}'
ls -lh "$VIDEO_DIR"/video_with_bgm.mp4 2>/dev/null | awk '{print $9, "(" $5 ")"}'
echo ""
echo "=== Files to keep ==="
ls -lh "$VIDEO_DIR"/final_video.mp4 "$VIDEO_DIR"/podcast_audio.wav "$VIDEO_DIR"/podcast_audio.srt "$VIDEO_DIR"/timing.json "$VIDEO_DIR"/podcast.txt 2>/dev/null | awk '{print $9, "(" $5 ")"}'
14.2 User Confirmation
Ask the user:
The temporary files above will be deleted; the final video and source files will be kept. Continue?
14.3 Run Cleanup
After the user confirms, run:
VIDEO_DIR="videos/{name}"
rm -f "$VIDEO_DIR"/part_*.wav
rm -f "$VIDEO_DIR"/concat_list.txt
rm -f "$VIDEO_DIR"/output.mp4
rm -f "$VIDEO_DIR"/video_with_bgm.mp4
echo "✓ Temporary files cleaned up"
14.4 File Structure After Cleanup
videos/{name}/
├── final_video.mp4 # Final video
├── podcast.txt # Original script
├── podcast_audio.wav # Audio
├── podcast_audio.srt # Subtitles
├── timing.json # Timeline
├── topic_research.md # Research notes
├── publish_info.md # Publish info
├── thumbnail_*_16x9.png # Thumbnail 16:9 (required)
└── thumbnail_*_4x3.png # Thumbnail 4:3 (required)
Background Music Options
Available at ~/.claude/skills/video-podcast-maker/music/:
- perfect-beauty-191271.mp3 – Upbeat, positive
- snow-stevekaldes-piano-397491.mp3 – Calm piano
Requirements
System Tools
brew install ffmpeg node # macOS
Python Dependencies
pip install azure-cognitiveservices-speech requests
Node.js Dependencies
npm install remotion @remotion/cli @remotion/player
Environment Variables
# Azure TTS (required)
export AZURE_SPEECH_KEY="your-azure-speech-key"
export AZURE_SPEECH_REGION="eastasia"
# Optional: AI image generation
export GEMINI_API_KEY="..." # imagen (Google)
export DASHSCOPE_API_KEY="..." # imagenty (Alibaba Cloud)
Optional: AI Image Generation
pip install google-genai pillow # imagen
pip install dashscope requests # imagenty
Troubleshooting
TTS: Azure API key error
Symptom: Error: Authentication failed, HTTP 401 Unauthorized
Solution:
# Check environment variables
echo $AZURE_SPEECH_KEY
echo $AZURE_SPEECH_REGION
# Set environment variables
export AZURE_SPEECH_KEY="your-key-here"
export AZURE_SPEECH_REGION="eastasia"
FFmpeg: BGM mixing issues
Symptom: BGM is so loud it drowns out the voice; BGM cuts off abruptly at the end
Solution:
# Basic mix (voice in front, BGM lowered)
ffmpeg -i voice.mp3 -i bgm.mp3 \
  -filter_complex "[0:a]volume=1.0[voice];[1:a]volume=0.15[bgm];[voice][bgm]amix=inputs=2:duration=first" \
  -ac 2 output.mp3
# Mix with fade-in and fade-out (st=58 assumes a 60-second track; use duration minus 2)
ffmpeg -i voice.mp3 -i bgm.mp3 \
  -filter_complex "
    [0:a]volume=1.0[voice];
    [1:a]volume=0.15,afade=t=in:st=0:d=2,afade=t=out:st=58:d=2[bgm];
    [voice][bgm]amix=inputs=2:duration=first
  " output.mp3
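The fade-out start time in the command above is hard-coded for a 60-second track. The start can instead be derived from the narration length, as in this sketch; `bgm_fade_filter` is a hypothetical helper, and the duration would come from ffprobe's `format=duration` as in Step 13.2.

```python
def bgm_fade_filter(duration_s: float, fade_s: float = 2.0, volume: float = 0.15) -> str:
    """Build the BGM filter chain with the fade-out timed to end at duration_s."""
    if duration_s < 2 * fade_s:
        raise ValueError("track is too short for both fades")
    fade_out_start = duration_s - fade_s
    return (
        f"volume={volume},"
        f"afade=t=in:st=0:d={fade_s:g},"
        f"afade=t=out:st={fade_out_start:g}:d={fade_s:g}"
    )
```

For a 60-second narration this reproduces the filter chain used above, and for any other length only `duration_s` changes.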
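Quick Checklist
Before rendering:
- All asset files exist
- timing.json is well-formed
- Audio duration matches timing
- Environment variables are set
- Enough disk space (>20GB for 4K)
After rendering:
- Video duration is correct
- Audio and video are in sync
- Subtitles display correctly
- No black or blank frames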