huashu-slides
npx skills add https://github.com/alchaincyf/huashu-skills --skill huashu-slides
AI Presentation Workflow
Create professional presentations: Content → Design → Build → Assembly → Polish.
Step 0: Choose Workflow Settings
At the start of every presentation task, ask the user to make TWO choices:
0-A. Collaboration Mode
| Mode | Description | Checkpoints |
|---|---|---|
| Full Auto | Minimal interaction. Confirm topic only, deliver final PPTX. | 1 checkpoint |
| Guided (recommended) | Confirm outline, pick design, preview before assembly. | 3 checkpoints |
| Collaborative | Review every slide, approve every illustration, full control. | Per-slide |
If the user doesn’t specify, default to Guided mode.
0-B. Assembly Method
| Method | How it works | Best for |
|---|---|---|
| Editable HTML (Path A) | HTML slides + selective AI illustrations → html2pptx → editable PPTX | Need to edit text later, precise layout, corporate decks |
| Full AI Visual (Path B) | Every slide as a complete AI-generated image → create_slides.py → image PPTX | Maximum visual impact, artistic presentations, quick drafts |
Trade-offs:
| | Path A: Editable HTML | Path B: Full AI Visual |
|---|---|---|
| Text | Editable in PPT | Baked into image (not editable) |
| Visual quality | Good with illustrations | Excellent, cohesive design |
| Layout control | Pixel-precise | AI-interpreted |
| File size | Smaller (~5-25MB) | Larger (~30-80MB) |
| Chinese text | Perfect (font rendering) | Usually good (AI may occasionally misrender) |
| Speed | Faster (HTML creation) | Slower (image generation per slide) |
If the user doesn’t specify, default to Path A (Editable HTML).
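The two Step 0 choices and their defaults can be captured in a small settings object. This is a minimal sketch, not part of the skill: the names `WorkflowSettings`, `resolve_settings`, `CHECKPOINTS`, and `PATHS` are hypothetical, and only the defaults (Guided, Path A) and checkpoint counts come from the tables above.

```python
from dataclasses import dataclass

# Hypothetical identifiers for the two Step 0 choices.
# Checkpoint counts mirror the Collaboration Mode table.
CHECKPOINTS = {"full_auto": "1", "guided": "3", "collaborative": "per-slide"}
PATHS = {"A": "Editable HTML", "B": "Full AI Visual"}


@dataclass(frozen=True)
class WorkflowSettings:
    mode: str = "guided"  # default when the user does not specify a mode
    path: str = "A"       # default when the user does not specify a method

    def __post_init__(self):
        if self.mode not in CHECKPOINTS:
            raise ValueError(f"unknown collaboration mode: {self.mode!r}")
        if self.path not in PATHS:
            raise ValueError(f"unknown assembly method: {self.path!r}")


def resolve_settings(mode=None, path=None):
    """Fill in the documented defaults for any choice the user left open."""
    return WorkflowSettings(mode=mode or "guided", path=path or "A")
```

Calling `resolve_settings()` with no arguments yields Guided mode on Path A, matching the two default rules.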
Step 1: Content Structuring
Turn raw material into a slide-by-slide outline.
Per slide, define:
- Title: a complete assertion sentence (not a topic word)
- Key points: 3-4 maximum
- Visual type: illustration / chart / diagram / icon / quote
- Path A: Illustration needed? Yes/No. If yes, a one-line description.
- Path B: Visual scene description, one paragraph describing the complete slide visual (layout + imagery + mood).
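The per-slide fields above map naturally onto a small record with a few mechanical checks. The sketch below is an assumption about how one might hold an outline in code; `SlideSpec` and `outline_problems` are hypothetical names, and whether a title is a real assertion sentence is left to human review because it cannot be checked reliably by rule.

```python
from dataclasses import dataclass, field
from typing import List, Optional

VISUAL_TYPES = {"illustration", "chart", "diagram", "icon", "quote"}
MAX_KEY_POINTS = 4  # "3-4 maximum"


@dataclass
class SlideSpec:
    title: str
    key_points: List[str] = field(default_factory=list)
    visual_type: str = "illustration"
    illustration: Optional[str] = None  # Path A: one-line description, None means "No"
    scene: Optional[str] = None         # Path B: one-paragraph visual scene description


def outline_problems(slide, path="A"):
    """Return a list of mechanical problems found in one slide of the outline."""
    problems = []
    if len(slide.key_points) > MAX_KEY_POINTS:
        problems.append("more than 4 key points")
    if slide.visual_type not in VISUAL_TYPES:
        problems.append(f"unknown visual type: {slide.visual_type}")
    if path == "B" and not slide.scene:
        problems.append("Path B needs a visual scene description")
    return problems
```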
Assertion-Evidence rule:
| Bad title | Good title |
|---|---|
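| Q3 Sales | Q3销售增长23%，新用户是主要驱动力 (Q3 sales grew 23%, driven mainly by new users) |
| Methodology | 我们通过双盲实验验证了这个结论 (We validated this conclusion with a double-blind experiment) |
Language rule: all slide content is written in Chinese; keep only essential English terms (personal names, brand names, technical proper nouns). Section labels (such as INSIGHT, TAKEAWAY) may be in English as a design element.
Checkpoint 1 (Guided + Collaborative)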
Present the outline as a table:
Path A:
| # | Title (assertion) | Key Points | Visual Type | Illustration? |
|---|-------------------|------------|-------------|---------------|
| 1 | Cover: ... | - | Decorative | Yes: ... |
| 2 | ... | 1. ... 2. ... | Chart | No |
| 3 | ... | 1. ... 2. ... | Illustration | Yes: ... |
Path B:
| # | Title (assertion) | Key Points | Visual Scene Description |
|---|-------------------|------------|--------------------------|
| 1 | Cover: ... | - | Dark gradient bg, large title centered, abstract network nodes |
| 2 | ... | 1. ... 2. ... | Split layout: text left, bar chart right, clean white bg |
| 3 | ... | 1. ... 2. ... | Full illustration: person at crossroads with floating clocks |
Ask the user:
- Approve / adjust slide count
- Path A: Approve / adjust which slides get illustrations
- Path B: Approve / adjust visual scene descriptions
- Any content to add or remove
Step 2: Design System
Present 3 design system options for the user to choose from. Each is a complete visual language, not just a color palette.
CRITICAL: A design system is NOT just colors. It defines visual philosophy, typography ratios, composition rules, and emotional intent. This is the difference between “boring PPT” and “magazine-quality deck.”
🗣️ Style Discussion (Optional, if the user wants to explore)
If the user says things like:
- “I want an XX style” (Neo-Brutalism, Swiss International Style, Bauhaus, Mondrian…)
- “I’m not sure what style I want”
- “Can you show me examples of different styles?”
Then consult the design movements reference:
references/design-movements.md: design movements and style reference library
This file maps classic design movements (Neo-Brutalism, Swiss Style, Bauhaus, etc.) to our AI-ready style presets. Use it to:
- Translate user’s aesthetic language into actionable prompts
- Build shared vocabulary (“this direction feels like Neo-Brutalism” vs “that one feels like Constructivism”)
- Reference when designing new custom styles from scratch
After discussing movements, proceed to recommend 3 concrete presets below.
Design System Presets
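⚠️ CRITICAL INSIGHT: Hand-drawn and comic styles generate far better with AI than “professional minimalist” styles. Comic and hand-drawn styles have a clear visual language (lines, characters, color blocks) that the AI can fully exploit; minimalist styles (dark background + geometric shapes + lots of white space) lack visual elements and come out “empty, ugly, flat.”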
Pick 3 that match the topic/mood. Use the topic recommendation table below, then present each with its full description.
Automatic recommendations by topic (choose from this table first):
| Topic type | First choice | Second choice | Third choice |
|---|---|---|---|
| Brand / product introduction | Snoopy Warm Comic | Neo-Pop | Ukiyo-e / Dunhuang (Eastern brands) |
| Education / training | Neo-Brutalism | Manga Educational | Snoopy Warm Comic |
| Tech talk | xkcd Whiteboard | Neo-Brutalism | Ligne Claire |
| Data report | Pentagram Editorial | Fathom Data | Ligne Claire |
| Young audience | Neo-Pop | Pixel Art | Risograph |
| Creative / art | Dada Collage | Risograph | The Oatmeal |
| Chinese / Eastern style | Dunhuang Mural | Ukiyo-e | Takram Speculative |
| Formal business | Pentagram Editorial | Müller-Brockmann Grid | Build Minimal |
| Product launch / keynote | Soviet Constructivism | Neo-Pop | Pentagram Editorial |
| Internal sharing | Neo-Brutalism | The Oatmeal | xkcd Whiteboard |
| Industry analysis / consulting | Fathom Data | Pentagram Editorial | Müller-Brockmann Grid |
| Training courseware / teaching material | Takram Speculative | Warm Narrative | Manga Educational |
| Investment / fundraising roadshow | Build Minimal | Pentagram Editorial | Soviet Constructivism |
Full reference for all 18 styles: references/proven-styles-gallery.md
Style sample images: assets/style-samples/ directory
Tier 1 (strongly recommended, best results):
1. Warm Comic Strip (Snoopy warm comic style)
- Philosophy: The warmth and philosophical feel of Peanuts comics: simple characters saying profound things, everyday scenes that carry life wisdom
- Visual world: Round-headed kids, a small dog, and a small bird form a warm little world. Minimal backgrounds (grass, sky, doghouse, tree). Tones like a yellowed newspaper comic
- Reference: “Like a Peanuts comic strip: warm, philosophical, charming”
- Style guide: references/proven-styles-snoopy.md
- ⚠️ Key lesson: Do not over-constrain visual details in the prompt (color ratios, composition positions, character poses); doing so severely reduces diversity. Describe only mood and content, and let the AI improvise
2. Manga Educational (educational manga style)
- Philosophy: Japanese educational manga (gakushū manga), where a character GUIDES you through the concept with reactions and drama
- Colors: Bright and warm palette, white bg with selective color panels, screen-tone gray for emphasis areas
- Ratio: 60% illustration / 30% text (in bubbles) / 10% effects
- Typography: Bold manga-style titles with impact, body text in speech/thought bubbles, onomatopoeia as decorative elements. Size contrast 3:1
- Composition: Dynamic manga panel layouts (3-5 panels per slide), character reactions drive emphasis, speed lines for energy, dramatic angles
- Visual language: Expressive anime-style characters, reaction faces (surprise, confusion, eureka!), manga effects (sweat drops, sparkles, speed lines), panel borders with varied thickness
- Reference: “Like a ‘Manga Guide to Statistics’ page: a character walks you through the concept, reacting with surprise and delight”
3. Ligne Claire Comics (clear-line comic style)
- Philosophy: Hergé’s Tintin tradition: maximum information clarity through visual restraint
- Colors: White/cream (#FFFDF7) bg, black (#000000) outlines, flat saturated fills (3-5 solid colors, no gradients)
- Ratio: 70% clean bg / 20% illustration / 10% text
- Typography: Hand-lettered feel for titles, clean sans-serif for body. Speech bubbles for key quotes. Title:body = 2.5:1
- Composition: Panel-based layouts (2-4 panels per slide), sequential left-to-right reading flow, clear gutters between panels
- Visual language: Uniform-weight outlines, flat colors without shading or hatching, no gradients, precise details but zero visual noise
- Reference: “Like a Tintin page explaining a concept: every panel advances understanding, nothing is decorative”
4. Neo-Pop Magazine (Neo-Pop magazine style)
- Philosophy: Youth media / streetwear brand aesthetic, bold and playful
- Colors: Cream (#FFF8E7) bg, black (#000000) text, color-blocking with hot pink (#FF1493) + cyan (#00CED1) + yellow (#FFD700)
- Ratio: 50% bg / 25% color blocks / 25% content
- Typography: Headlines 40-50% of slide area (typography AS the visual), thick black borders around text blocks, 10:1 size ratio vs body
- Composition: Modular color blocks with “controlled chaos”, stacked asymmetric layouts, thick borders
- Visual language: Pixel-art 8-bit icons, cutout photography, speech bubbles, bold graphic surfaces
- Reference: “Like a Supreme lookbook meets a HYPEBEAST article, treating typography as graphic art”
Tier 2 (recommended; works well in specific scenarios):
5. Whiteboard Sketch (xkcd whiteboard hand-drawn style)
- Philosophy: xkcd meets a professor’s whiteboard: extreme minimalism forces focus on the idea itself
- Colors: White (#FFFFFF) bg, black (#000000) ink, ONE accent color for emphasis (red #FF4444 or blue #4488FF)
- Ratio: 85% white space / 10% sketch / 5% accent highlight
- Typography: Hand-drawn/handwritten feel for everything, rough uneven baselines, arrows and annotations everywhere. Key numbers can be large (60pt+)
- Composition: Freeform whiteboard layout, hand-drawn arrows connecting concepts, diagrams and stick figures, informal and alive
- Visual language: Stick figures, hand-drawn charts and graphs, wobbly lines, annotation arrows, circled keywords, equation-style layouts
- Reference: “Like an xkcd ‘What If?’ explanation: simple drawings that make complex ideas instantly click”
6. Soviet Constructivism
- Philosophy: Revolutionary propaganda poster: power through geometry and limited color
- Colors: Revolutionary red (#CC0000) 40% + black (#1A1A1A) 25% + cream white (#F5E6D3) 30%
- Typography: All text rotated 15-30 degrees, NO horizontal lines, bold condensed
- Composition: Diagonal wedge from bottom-left to top-right, geometric shapes growing small to large (visual crescendo)
- Visual language: NO gradients, pure flat fills + sharp edges, three-color limit, propaganda poster energy
- Reference: “Like a 1920s Rodchenko poster: power, urgency, and geometric precision”
7. Warm Narrative (warm storytelling style)
- Philosophy: Friendly storytelling, like a TED talk visual or Airbnb pitch deck
- Colors: Warm cream (#FDF6EC) bg, dark charcoal (#3D3D3D) text, coral (#E17055) accent
- Ratio: 60% warm bg / 25% content / 15% illustration
- Typography: Headlines bold and warm, 3:1 ratio to body. Short sentences, not bullets
- Composition: Illustration occupies 40-50% of slide, text wraps around visuals, rounded shapes
- Visual language: Flat vector illustrations with warm palette, people-centric imagery, storytelling flow
- Reference: “Like a Mailchimp or Notion brand presentation: approachable and human”
More styles (Tier 2/3): see references/proven-styles-gallery.md, which includes The Oatmeal infographic comics, Dunhuang murals, Ukiyo-e, Risograph, Isometric, Bauhaus, Blueprint, Vintage Ad, Dada Collage, and Pixel Art
Category 4: Professional / Editorial design systems (Path A only)
⚠️ Path A (HTML → PPTX) is strongly recommended for the styles below. They depend on precise typography, data visualization, and grid systems, and AI image generation cannot reach the required precision. The dental-industry analysis case has already validated the excellent results of Path A + Pentagram Editorial.
8. Pentagram Editorial (editorial magazine style, information-architecture school)
- Philosophy: Pentagram/Michael Bierut: type is language, the grid is thought. Extremely restrained design lets data and content speak for themselves
- Colors: Cream white (#FFFDF7) bg, near-black (#1A1A1A) text, ONE accent color (e.g. orange-red #D4480B or the brand color)
- Ratio: 60% whitespace / 30% content / 10% accent
- Typography: Heavy bold titles (28pt+) + light body text (10-13pt), English section labels as design elements (INSIGHT / PART 03)
- Composition: Swiss grid system, cards with 2px black borders, precise horizontal dividers, embedded data visualization
- Visual language: Minimal icons, bar charts / pie charts / trend lines, callout boxes, tags
- Reference: “Like a McKinsey insight report meets Monocle magazine: data-rich but editorially elegant”
- Execution path: Path A only (HTML → PPTX)
- Field-tested: 15-page dental-industry analysis deck (_temp/口腔行业分析/slides/)
9. Fathom Data Narrative (data narrative style, scientific-journal school)
- Philosophy: Fathom Information Design: every pixel must carry information. Scientific rigor + design elegance
- Colors: White (#FFFFFF) bg, dark gray (#333) text, navy blue (#1A365D) primary + one highlight color
- Ratio: 50% charts/data / 30% text / 20% whitespace
- Typography: GT America/Graphik-style sans-serif, large numbers (60pt+) as visual anchors, precise footnotes and source citations
- Composition: High information density without crowding, annotation system embedded in the layout, small-multiples chart arrays, precise timelines
- Visual language: Scatter plots, heatmaps, timelines, annotated charts, data labels precise to the decimal
- Reference: “Like a Nature paper’s data supplement meets a Bloomberg data feature”
- Execution path: Path A only (HTML → PPTX)
10. Müller-Brockmann Grid (Swiss grid style, purist school)
- Philosophy: Josef Müller-Brockmann: objectivity is beauty. A mathematically precise grid system brings order to any chaotic information
- Colors: White (#FFFFFF) bg, black (#000) text, at most one accent color
- Ratio: 70% structured grid / 20% text / 10% accent
- Typography: Akzidenz-Grotesk/Helvetica, strict 8pt baseline grid, strict left alignment, weight contrast (300 vs 700)
- Composition: 8-column mathematical grid, all elements aligned to grid lines, decorative elements strictly forbidden, functionalism above all
- Visual language: Pure geometric shapes, black-ruled tables, precisely aligned lists, no icons, no illustrations
- Reference: “Like the original Swiss Style poster: timeless, rational, zero decoration”
- Execution path: Path A only (HTML → PPTX)
11. Build Luxury Minimal (luxury minimal style, contemporary-brand school)
- Philosophy: Build Studio: refined simplicity is harder than complexity. Generous white space and subtle weight changes convey a premium feel
- Colors: Pure white (#FFFFFF) bg, dark gray (#2D2D2D) text, a single accent (brand color) used very sparingly
- Ratio: 75% whitespace / 15% text / 10% accent
- Typography: Very subtle weight variation (200-600), huge (48pt+) but light titles, small and refined body text (12pt), loose letter spacing
- Composition: Golden-ratio composition, very few elements, one message per slide, breathing room first
- Visual language: Premium product imagery (if available), minimal line icons, large solid color blocks, rounded cards
- Reference: “Like an Apple keynote meets a Celine lookbook: confident restraint”
- Execution path: Path A (HTML → PPTX)
12. Takram Speculative (Japanese speculative style, Eastern-philosophy school)
- Philosophy: Takram: technology is a medium for thinking. A soft tech feel and concept prototype diagrams convey deep thought
- Colors: Warm gray (#F5F3EF) bg, dark gray (#3D3D3D) text, sage green (#8B9D77) accent
- Ratio: 55% warm bg / 25% diagrams / 20% text
- Typography: Rounded sans-serif, titles use large size (36pt+) instead of bold, warm body text (14pt), loose line height (1.8)
- Composition: Soft shadows (blur 20px+), rounded corners (16px+), concept diagrams / flowcharts as the core visual, card-style layout
- Visual language: Concept prototype diagrams, soft gradients, flowcharts as art, hand-drawn-feel icons, natural tones
- Reference: “Like a Takram project page, where technology feels thoughtful, not aggressive”
- Execution path: Path A (HTML → PPTX; illustrations can be AI-assisted)
For deeper style details, see references/design-styles.md in the design-philosophy skill, which contains complete prompt DNA for 20 design philosophies
🎨 Custom Character Style (User-Defined)
Users may want to reference specific cartoon/anime aesthetics. When a user says “do it in Doraemon style” or “like Studio Ghibli”, treat this as a style reference, not a request to draw copyrighted characters. Build a custom Design System by extracting the visual DNA of that style.
How to convert a character reference into a Design System:
| User says | Extract these visual traits |
|---|---|
| “Doraemon style” | Round shapes, bright primary blue + white + red, simple backgrounds, cute proportions, magical gadget reveals |
| “Studio Ghibli” | Watercolor textures, natural greens and sky blues, detailed backgrounds with simple characters, warmth and wonder |
| “Calvin and Hobbes” | Dynamic ink brushwork, expressive motion lines, philosophical contrast between fantasy and reality, lush outdoor scenes |
| “One Piece manga” | Bold dynamic lines, exaggerated proportions, dramatic action poses, high energy, thick outlines |
| “Crayon Shin-chan” | Crude crayon-like lines, flat bright colors, comedic proportions, everyday scenarios made absurd |
| “Adventure Time” | Geometric simple shapes, pastel candy colors, thin outlines, whimsical surreal backgrounds |
Template for custom style:
[User Style]: "[reference name]"
→ Shape language: [round/angular/geometric/organic]
→ Line quality: [thin uniform / thick varied / sketchy / brushwork]
→ Color palette: [specific colors extracted from that aesthetic]
→ Character style: [proportions, expressiveness level]
→ Background treatment: [detailed/minimal/abstract]
→ Emotional tone: [warm/energetic/philosophical/surreal]
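Filling the template can be done mechanically once the six traits have been extracted. The sketch below is illustrative only: `render_custom_style` and `STYLE_FIELDS` are hypothetical names, and the arrow prefix on each trait line is an assumption about the template's marker.

```python
# The six trait names come from the template; everything else is a hypothetical helper.
STYLE_FIELDS = [
    "Shape language",
    "Line quality",
    "Color palette",
    "Character style",
    "Background treatment",
    "Emotional tone",
]


def render_custom_style(reference, traits):
    """Render the custom-style template; raise if any of the six traits is missing."""
    missing = [name for name in STYLE_FIELDS if name not in traits]
    if missing:
        raise ValueError(f"missing traits: {', '.join(missing)}")
    lines = [f'[User Style]: "{reference}"']
    lines += [f"→ {name}: {traits[name]}" for name in STYLE_FIELDS]
    return "\n".join(lines)
```

Requiring all six traits keeps a half-extracted style from silently turning into a vague prompt.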
Typography Rules (All Presets)
- Max 2 font families (1 heading + 1 body)
- Heading: bold, with personality; ≥36pt (trend: even larger, as a graphic surface)
- Body: clean, readable; ≥18pt
- Chinese: system default (PingFang SC / Microsoft YaHei)
- Key principle: Typography is a DESIGN ELEMENT, not just an information container
Checkpoint 2 (Guided + Collaborative)
Ask the user to pick one of the 3 proposed design systems, or describe their own preference. Show the full description including philosophy, visual language, and reference.
Step 3: Build Slides
Step 3-A: HTML + Selective Illustrations (Path A)
Generate AI illustrations for key slides, then create HTML slide files.
Which slides need illustrations? Prioritize:
- Cover slide: always. Sets the visual tone.
- Key insight slides: the “aha moment” slides benefit most.
- Closing slide: optional but impactful.
- Data-heavy slides: skip AI art; use charts/diagrams instead.
Illustration Generation â use nano-banana-pro skill:
export $(grep GEMINI_API_KEY ~/.claude/.env) && \
uv run ~/.claude/skills/nano-banana-pro/scripts/generate_image.py \
--prompt "[description]" \
--filename "[timestamp]-slide-[N]-[name].png" \
--resolution 2K
Base Style Prompt â define ONE style suffix, append to every illustration:
[Base Style]: flat vector illustration, [palette background color] background,
[accent color] highlight elements, clean minimalist aesthetic,
professional presentation style, no text in image
Per-slide prompt = [specific content] + [Base Style]
Key rules:
- Always include “no text in image”; text will be added as editable elements
- Use descriptive paragraphs, not keyword lists
- Specify hex colors explicitly
- Use “flat vector” / “flat illustration” for consistency
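The rule “per-slide prompt = specific content + Base Style” is easy to enforce in a helper so that the suffix, explicit hex colors, and the “no text in image” clause are never forgotten. This is a sketch under assumptions: the function names are hypothetical, and the suffix wording is copied from the Base Style template above.

```python
import re

HEX_COLOR = re.compile(r"#[0-9A-Fa-f]{6}")


def base_style(background_hex, accent_hex):
    """Build the Path A style suffix; both colors must be explicit 6-digit hex values."""
    for color in (background_hex, accent_hex):
        if not HEX_COLOR.fullmatch(color):
            raise ValueError(f"not a hex color: {color!r}")
    return (
        f"flat vector illustration, {background_hex} background, "
        f"{accent_hex} highlight elements, clean minimalist aesthetic, "
        "professional presentation style, no text in image"
    )


def illustration_prompt(content, style):
    """Per-slide prompt = specific content + the shared base style."""
    return f"{content.strip()} {style}"
```

Building the suffix once and passing the same string to every call is what keeps the illustrations visually consistent across slides.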
Embedding in HTML slides:
<!-- Side illustration (recommended) -->
<div class="left"><!-- text content --></div>
<div class="right"><img src="illustration.png" style="width: 280pt; height: 280pt;"></div>
<!-- Background illustration -->
<body style="background-image: url('illustration.png'); background-size: cover;">
Checkpoint 3-A (Guided: preview 2-3 key illustrations; Collaborative: every one)
Show generated illustrations. Ask: Approve / regenerate / style consistent?
Step 3-B: Full AI Slide Generation (Path B)
Generate EVERY slide as a complete AI image â layout, text, visuals, all in one.
⚠️ THE #1 MISTAKE: Over-constraining the prompt with layout details and visual restrictions. More constraints = LESS creativity and diversity. The AI generates best when given mood + reference + content, NOT specific positions, color ratios, or character restrictions.
The Golden Rule of AI Image Prompts
SHORT prompts > LONG prompts. A 3-sentence prompt describing mood and content produces better results than a 30-line prompt specifying every visual detail. Specifically:
| DON’T (kills diversity) | DO (enables creativity) |
|---|---|
| Specify color ratios (60%/25%/15%) | Describe the mood (“warm like a Sunday comic page”) |
| Dictate layout positions (“title centered, image on right”) | Reference a specific aesthetic (“Peanuts comic strip”) |
| Restrict characters (“NOT Snoopy, an original character”) | Let AI interpret the style naturally |
| List every visual element to include | Describe what the viewer should FEEL |
| Repeat the base style in every per-slide prompt | Define base style once, keep per-slide prompts short |
Base Style Prompt: Keep it SHORT
Define a base style once, append to every slide. Keep it under 5 lines. The base style sets the mood; per-slide prompts add the content.
[Base Style]:
VISUAL REFERENCE: [Specific art/design aesthetic in one sentence]
CANVAS: 16:9 aspect ratio, 2048x1152 pixels, high quality rendering.
COLOR SYSTEM: [Describe the mood/feel of colors, not exact ratios]
Example (good, concise):
VISUAL REFERENCE: Charles Schulz Peanuts comic strip, warm, philosophical, charming.
Characters include round-headed kids, a lovable beagle dog, and a small yellow bird.
CANVAS: 16:9 aspect ratio, 2048x1152 pixels, high quality rendering.
COLOR SYSTEM: Warm cream/newspaper tone background, soft muted pastels, warm ink lines.
Anti-pattern (bad, over-specified): Do NOT include typography sizes, color ratios, composition percentages, margin specifications, or visual weight distributions in the base style. These constraints reduce diversity without improving quality.
Per-Slide Prompt Structure
Keep per-slide prompts short and focused. Do NOT repeat base style details or over-specify visual layout.
Create a [style] slide about [topic].
[Base Style]
DESIGN INTENT: [1 sentence: what the viewer should FEEL]
TEXT TO RENDER:
- Title: "[exact text]"
- Body: "[exact text]"
[Optional: 1-2 sentences describing mood or scene. Let AI decide composition.]
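The per-slide structure above can be assembled by a short function so every slide prompt has the same shape. A minimal sketch, assuming a hypothetical helper named `slide_prompt`; the section labels are taken from the template, and the optional scene sentence is appended only when given.

```python
def slide_prompt(style, topic, base_style, intent, texts, scene=""):
    """Assemble a Path B per-slide prompt.

    texts maps a label such as "Title" or "Body" to the exact text to render.
    """
    lines = [
        f"Create a {style} slide about {topic}.",
        base_style.strip(),
        f"DESIGN INTENT: {intent}",
        "TEXT TO RENDER:",
    ]
    lines += [f'- {label}: "{text}"' for label, text in texts.items()]
    if scene:
        lines.append(scene)  # optional 1-2 sentences on mood or scene
    return "\n".join(lines)
```

The function deliberately has no parameters for positions, ratios, or font sizes, which keeps callers from drifting back into over-specified prompts.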
Example: GOOD vs BAD
BAD (traditional PPT, boring):
Design a professional presentation slide.
Professional presentation slide, 16:9 aspect ratio, 2048x1152 pixels.
Dark navy background, light gray text, gold accent.
Slide type: content. Layout: Title at top-left, two columns below.
Title: "看涨期权收益结构" (call option payoff structure)
Body: "行权价: 100元, 权利金: 10元" (strike price: 100 yuan, premium: 10 yuan)
Visual: a line chart showing call option payoff
→ Result: Generic PPT that could come from any template
GOOD (magazine-level, stunning):
Create a slide that feels like a Bloomberg terminal data visualization
brought to life as editorial art.
VISUAL REFERENCE: Bloomberg Businessweek data feature meets cinematic lighting.
CANVAS: 16:9, 2048x1152, sharp rendering.
COLOR SYSTEM: Deep black (#0A0A0A) background 75%, white text 15%,
gold (#BF9A4A) accent 10%. The gold represents profit, so it should GLOW.
TYPOGRAPHY: The number "110" rendered at 100pt as the dominant visual anchor
(the break-even point IS the story). Supporting text at 14pt, muted gray.
DESIGN INTENT: The viewer should instantly FEEL the asymmetry of options:
limited downside, unlimited upside. The visual must make this visceral,
not just informational.
TEXT TO RENDER:
- Hero metric: "110" (giant, gold, the break-even price)
- Title: "盈亏平衡点" (break-even point; medium, white, above the number)
- Left data: "行权价 100" "权利金 10" (strike price 100, premium 10; small, gray, understated)
- Insight: "亏损有底 盈利无限" (losses are capped, profits are unlimited; accent color, bottom)
VISUAL NARRATIVE: A single golden curve emerges from the left side of the slide,
flat and muted in gray at -10 (the maximum loss), then suddenly bending upward
at the strike price, transitioning from gray to brilliant gold as it rises
into the profit zone. The curve should feel like a ray of light breaking
through darkness. The profitable area above zero glows with warm gold
atmospheric lighting, like sunrise. The chart has NO grid lines, NO axes labels
cluttering the visual; just the pure, dramatic curve and the giant "110"
floating at the inflection point.
→ Result: An editorial data visualization that tells a story
Key Rules for Path B Prompts
Prompt Quality Checklist (verify before every generation):
- Visual Reference: Does the prompt name a specific art style or publication? (NOT just “professional” or “modern”)
- Mood, not Layout: Does the prompt describe what the viewer should FEEL, not where elements should be PLACED?
- Text Content: Are all texts to render listed clearly and accurately?
- Short Enough: Is the prompt concise? Long prompts with detailed specs REDUCE diversity. Remove anything the AI can decide on its own.
- NO Micro-Management: No hex color ratios, no typography sizes, no composition percentages, no character pose instructions.
Technical Rules:
- Always specify resolution: 2048x1152 (2K, 16:9) for crisp text
- Include ALL text verbatim: the AI must render the exact words
- Chinese first: all text on slides is written in Chinese; keep only essential English terms
- Chinese text tip: Keep titles short (≤8 characters) for best rendering
- Use descriptive paragraphs, not keyword lists
- Generate in parallel: Run 3-5 slide generations concurrently for speed
- Consistency: The Base Style is applied to EVERY slide. It’s a system, not a suggestion
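Parts of the checklist and technical rules are mechanical enough to lint before spending a generation. The sketch below is an assumption about how such a check might look, not part of the skill: `lint_prompt` is a hypothetical name, and quoted render text is stripped first so that a literal such as "23%" inside a title is not mistaken for a layout ratio.

```python
import re


def lint_prompt(prompt, title=""):
    """Return warnings for a Path B prompt; an empty list means no mechanical issues found."""
    warnings = []
    spec = re.sub(r'"[^"]*"', "", prompt)  # ignore the quoted text to render
    if "2048x1152" not in spec:
        warnings.append("missing resolution 2048x1152")
    if re.search(r"\d+\s*%", spec):
        warnings.append("contains a percentage (color or composition ratio)")
    if re.search(r"\b\d+\s*pt\b", spec):
        warnings.append("contains a typography size")
    if len(title) > 8:
        warnings.append("title longer than 8 characters")
    return warnings
```

The questions about visual reference and mood still need a human or model judgment; this only catches the micro-management patterns that can be matched textually.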
Generation command (same tool, but full-slide prompts):
export $(grep GEMINI_API_KEY ~/.claude/.env) && \
uv run ~/.claude/skills/nano-banana-pro/scripts/generate_image.py \
--prompt "[full slide prompt]" \
--filename "slide-[NN]-[name].png" \
--resolution 2K
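Running 3-5 generations concurrently, as the technical rules suggest, can be sketched with a thread pool around the same command. Assumptions to note: `build_command` and `generate_all` are hypothetical helpers, GEMINI_API_KEY is expected to be exported in the environment already, and a non-zero exit code is treated as a failed slide.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor

SCRIPT = os.path.expanduser(
    "~/.claude/skills/nano-banana-pro/scripts/generate_image.py"
)


def build_command(prompt, filename, resolution="2K"):
    """Argument list for one generation, using the flags shown in the command above."""
    return ["uv", "run", SCRIPT, "--prompt", prompt,
            "--filename", filename, "--resolution", resolution]


def generate_all(jobs, max_workers=4, runner=subprocess.run):
    """Run (prompt, filename) jobs concurrently; return filenames whose command failed."""
    def run_one(job):
        prompt, filename = job
        result = runner(build_command(prompt, filename))
        return filename, result.returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(run_one, jobs))  # map keeps the input order
    return [name for name, code in results if code != 0]
```

The returned list is the natural input for the regenerate step in the quality check that follows.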
Quality check after generation:
- Text accuracy: verify all Chinese/English text rendered correctly
- Layout: elements positioned as described
- Style consistency: colors and design language match across slides
- If a slide has text errors → regenerate with an adjusted prompt (simplify or shorten the text)
Checkpoint 3-B (Guided: preview all slides as a set; Collaborative: approve each)
Show ALL generated slide images to the user. Ask:
- Text readable and accurate?
- Visual style consistent across slides?
- Any slides to regenerate?
Step 4: PPTX Assembly
4-A: html2pptx Workflow (Path A)
Create HTML files per slide, convert with html2pptx.js:
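const pptxgen = require('pptxgenjs');
const html2pptx = require(process.env.HOME + '/.agents/skills/pptx/scripts/html2pptx.js');

// Top-level await is not available in a CommonJS script, so wrap the calls in an async function.
async function main() {
  const pptx = new pptxgen();
  pptx.layout = 'LAYOUT_16x9';
  await html2pptx('slide1.html', pptx); // one HTML file per call
  await html2pptx('slide2.html', pptx);
  await pptx.writeFile({ fileName: 'output.pptx' });
}

main().catch((err) => { console.error(err); process.exit(1); });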
HTML rules (from pptx skill):
- Body dimensions: width: 720pt; height: 405pt (16:9)
- ALL text must be in <p>, <h1>–<h6>, <ul>, <ol> tags
- Backgrounds/borders only on <div> elements
- No CSS gradients: pre-render as PNG with Sharp
- Use web-safe fonts only (Arial, Helvetica, Georgia, etc.)
- Images: <img src="illustration.png" style="width: Xpt; height: Ypt;">
Known issue: Chinese characters in file paths can break image loading. Use symlinks to ASCII paths if needed:
ln -sf "/path/with/中文/" /tmp/ascii-path
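The same workaround can be applied automatically before conversion. This is a sketch only: `ascii_safe_path` is a hypothetical helper, and it assumes the system temporary directory itself has an ASCII path and that the platform allows creating symlinks.

```python
import os
import tempfile


def ascii_safe_path(path, link_dir=None):
    """Return path unchanged if it is pure ASCII, else a symlink with an ASCII name pointing at it."""
    if path.isascii():
        return path
    link_dir = link_dir or tempfile.mkdtemp(prefix="slides-")
    link = os.path.join(link_dir, "ascii-path")
    if os.path.islink(link):
        os.remove(link)  # replace a stale link, like `ln -sf`
    os.symlink(os.path.abspath(path), link)
    return link
```

Image paths in the HTML can then be written relative to the returned location instead of the original directory.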
4-B: Image Assembly (Path B)
Assemble generated slide images into PPTX using create_slides.py:
uv run ~/.claude/skills/image-to-slides/scripts/create_slides.py \
slide-01-cover.png slide-02-intro.png slide-03-definition.png ... \
--layout fullscreen \
--bg-color 000000 \
-o output.pptx
Recommended layout for Path B: fullscreen â images fill the entire slide since they already contain all layout, text, and visuals.
| Layout | Use case |
|---|---|
| fullscreen | AI-generated full-page slides (Path B default) |
| title_above | Image + editable title (hybrid approach) |
| title_left | Split: text + visual |
| center | Centered image with padding |
| grid | Multiple images per slide |
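Because Path B files follow the slide-[NN]-[name].png pattern, the assembly command can be built from whatever was generated, ordered by slide number. A minimal sketch with hypothetical helper names; the flags are the ones shown in the command above, and files that do not match the pattern are skipped.

```python
import re
from pathlib import Path

SLIDE_NAME = re.compile(r"slide-(\d+)-.+\.png$")


def ordered_slides(names):
    """Keep only slide-NN-name.png files and sort them by their slide number."""
    numbered = []
    for name in names:
        match = SLIDE_NAME.search(name)
        if match:
            numbered.append((int(match.group(1)), name))
    return [name for _, name in sorted(numbered)]


def assembly_command(names, output="output.pptx"):
    """Argument list for create_slides.py with the fullscreen layout recommended for Path B."""
    script = str(Path.home() / ".claude/skills/image-to-slides/scripts/create_slides.py")
    return ["uv", "run", script, *ordered_slides(names),
            "--layout", "fullscreen", "--bg-color", "000000", "-o", output]
```

Sorting on the parsed number rather than the raw filename keeps the order correct even if a slide number is not zero-padded.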
Step 5: Preview & Polish
Preview
Path A: Screenshot 3-4 key HTML slides with Playwright:
npx playwright screenshot "file:///path/to/slide.html" preview.png \
--viewport-size=960,540 --wait-for-timeout=1000
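The 960x540 viewport appears to correspond to the 720pt x 405pt slide body used by the HTML rules: CSS defines one inch as 96px and as 72pt, so one point is 96/72 of a pixel. A short worked conversion (the function name is hypothetical):

```python
def pt_to_css_px(points):
    """Convert CSS points to CSS pixels: 1in = 96px = 72pt, so 1pt = 96/72 px."""
    return points * 96 / 72


width_px = pt_to_css_px(720)   # slide body width in px
height_px = pt_to_css_px(405)  # slide body height in px
print(width_px, height_px)     # prints: 960.0 540.0
```

A viewport of exactly this size captures the slide without scrollbars or cropped edges.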
Path B: Show the generated slide images directly (they ARE the slides). Use Read tool to display 3-4 key PNGs.
Checkpoint 4 (All modes)
Show preview to the user. The PPTX file is ready; ask:
- Any slides to adjust?
- Ready to open in Keynote/PowerPoint?
Final Polish (in Keynote/PowerPoint)
- Transitions and animations
- Speaker notes
- Brand logo placement
- Path A: Final text adjustments (editable)
- Path B: Text NOT editable; if text errors are found, regenerate the slide image
Design Quick Reference
5/5/5 rule: ≤5 words/line, ≤5 bullets/slide, ≤5 text-heavy slides in a row
Cognitive load: One idea per slide. ~1 min per slide. Slides complement speech, never duplicate it.
Visual hierarchy: F/Z-pattern reading flow. Title:body size ≈ 3:1. Every slide should have a visual element.
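The 5/5/5 rule lends itself to an automated pass over the outline. The sketch below makes two assumptions that the rule itself does not state: a slide counts as “text-heavy” when it has three or more bullets, and words are counted by whitespace, which suits space-delimited text but not Chinese, where a character count would be the better proxy. `check_555` is a hypothetical name.

```python
def check_555(slides, text_heavy_min_bullets=3):
    """slides is a list of slides, each a list of bullet strings. Returns rule violations."""
    problems = []
    run = 0  # consecutive text-heavy slides seen so far
    for number, bullets in enumerate(slides, start=1):
        if len(bullets) > 5:
            problems.append(f"slide {number}: more than 5 bullets")
        if any(len(line.split()) > 5 for line in bullets):
            problems.append(f"slide {number}: a line exceeds 5 words")
        run = run + 1 if len(bullets) >= text_heavy_min_bullets else 0
        if run > 5:
            problems.append(f"slide {number}: more than 5 text-heavy slides in a row")
    return problems
```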
Detailed references:
- references/proven-styles-gallery.md: 17 tested visual styles with tiered recommendations
- references/proven-styles-snoopy.md: Snoopy/Peanuts style detailed per-slide templates
- references/prompt-templates.md: Content generation and image prompts
- references/design-principles.md: Full design framework, color palettes, typography
Related Skills
| Skill | Role |
|---|---|
| pptx | Advanced PPTX creation/editing (html2pptx, templates) |
| nano-banana-pro | AI illustration generation (Gemini 3 Pro Image) |
| multi-model | External AI for content drafting |
| design-philosophy | In-depth reference for 20 design philosophies (style DNA + scenario templates + review criteria). Detailed prompts and review guidelines for the Professional/Editorial styles live here |
Output
- .pptx files compatible with PowerPoint, Keynote, Google Slides
- Web-safe fonts for cross-platform compatibility
- AI illustrations as separate PNG files (reusable)
By Huashu (花叔) | AI Native Coder · independent developer · WeChat official account 「花叔」 | 300K+ followers | AI tools and productivity · Notable works: 小猫补光灯 (Kitten Fill Light, #1 on the App Store paid chart) · 《一本书玩转DeepSeek》