superimage-generator
npx skills add https://github.com/bluebagai/skills --skill superimage-generator
Superprompt Generator
Transform basic prompts into professional-grade superprompts optimized for specific AI image models.
Terminology:
- prompt = user’s basic input (e.g., “a woman taking a selfie”)
- superprompt = the enhanced, structured output with detailed specifications
⚠️ MANDATORY WORKFLOW – DO NOT SKIP ANY STEPS
This skill has a sequential, dependency-based workflow. Each step builds on the previous one. Skipping steps produces inferior results and failed generations.
Critical Enforcement Rules:
- Step 1 MUST complete before Step 2 → No model selection without intent clarification
- Step 3 MUST check `examples/INDEX.md` before generating → Read the index, follow its instructions. Do NOT read more than 2 example files.
- Step 4 VALIDATION is mandatory, no exceptions → Run validation script before proceeding
- Do NOT flatten (Step 5) until validation returns `✅ VALID` → Invalid prompts fail on models
- Always show user the validation report → They need to see quality scores and any warnings
If you are tempted to skip steps: DON’T. The workflow exists for a reason.
Contents
- Quick Start
- Workflow Selection
- Workflow
- Reference Image Workflow
- Superprompt Playbook (see also playbook.md)
- Examples
- Model Templates
- Validation Scripts
- Aspect Ratio Reference
Quick Start
> [!IMPORTANT]
> Do not act after reading the Quick Start section alone. Read Workflow Selection before committing to generating an output.
Input: "a woman taking a selfie"
Output: Structured superprompt with subject, wardrobe, pose, scene, lighting, camera blocks containing anatomical precision, fabric details, camera technicals, and negative constraints.
With Reference Image: "gym selfie" + [uploaded face reference]
→ See Reference Image Workflow below.
Workflow Selection
Does the user provide a reference image for face/identity preservation?
- YES → Follow Reference Image Workflow
- NO → Follow Standard Workflow below
Workflow
Copy this checklist and track progress; each step is mandatory:
Superprompt Generation (SEQUENTIAL):
- [ ] Step 1: Clarify user intent ← START HERE
- [ ] Step 2: Select target model ← DEPENDS ON STEP 1
- [ ] Step 3: Generate structured superprompt ← DEPENDS ON STEP 2
- [ ] Step 4: Validate with script ← MANDATORY, NON-OPTIONAL
- [ ] Step 5: Flatten for model format ← ONLY AFTER VALIDATION PASSES
- [ ] Step 6: Return to user ← FINAL STEP
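The sequential dependency in the checklist can be pictured as a small tracker that refuses to complete a step out of order. This is only an illustrative sketch: `WorkflowTracker` and its methods are hypothetical names, not part of the skill.

```python
# Step names copied from the checklist above.
STEPS = [
    "Clarify user intent",
    "Select target model",
    "Generate structured superprompt",
    "Validate with script",
    "Flatten for model format",
    "Return to user",
]


class WorkflowTracker:
    """Hypothetical helper: tracks the six steps and rejects out-of-order completion."""

    def __init__(self):
        self.completed = 0  # number of steps finished so far

    def complete(self, step_number):
        # A step may only be completed directly after its predecessor.
        if step_number != self.completed + 1:
            raise RuntimeError(
                f"Step {step_number} cannot run yet; next is step {self.completed + 1}"
            )
        self.completed = step_number

    def checklist(self):
        # Render the checklist in the same "- [ ]" style used above.
        lines = []
        for i, name in enumerate(STEPS, start=1):
            mark = "x" if i <= self.completed else " "
            lines.append(f"- [{mark}] Step {i}: {name}")
        return "\n".join(lines)
```

Calling `complete(5)` before `complete(4)` raises, which mirrors the rule that flattening waits for validation.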
Step 1: Clarify Intent
DO NOT skip this step. If unsure about intent, ask the user:
- Main subject? (person, product, landscape, architecture, food, animal)
- Style? (photorealistic, artistic, illustration)
- Aspect ratio preference? (9:16 mobile, 16:9 landscape, 1:1 square)
- Any other specific mood or setting?
Document their answers. You need these for Step 2.
Step 2: Select Model
Model selection depends on Step 1 answers. Use the table below:
Defaults by subject type:
- Person/human subject → `nano-banana` (anatomical precision, complex poses)
- Product/object/landscape → `flux` (general photorealism)
When to use a different model:
- Need text rendering? → `ideogram` or `gpt-image-1.5`
- Need face/identity preservation? → `flux-kontext-max`
- Need artistic/aesthetic style? → `midjourney`
- Need video/motion? → `luma`
- Need iterative editing? → `gpt-image-1.5`
| Model | Best For | Template |
|---|---|---|
| `nano-banana` | Default (people) – Anatomical precision, poses | models/nano-banana.md |
| `flux` | Default (non-people) – General photorealism | models/flux.md |
| `flux-kontext-max` | Image editing, identity preservation | models/flux-kontext-max.md |
| `gpt-image-1.5` | Reasoning-based, UI mockups, logos | models/gpt-image-1.5.md |
| `seedream-4.5` | Natural language, text in images, 4K | models/seedream-4.5.md |
| `midjourney` | Aesthetic quality, artistic | models/midjourney.md |
| `ideogram` | Typography, logos | models/ideogram.md |
| `recraft` | Illustrations, vectors | models/recraft.md |
| `luma` | Video, 3D, motion | models/luma.md |
| `reve` | Artistic styles | models/reve.md |
| `grok-imagine` | Quick iterations | models/grok-imagine.md |
If uncertain which model to use: Ask the user about subject type (person vs. object/scene) and desired style, then apply the defaults above.
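As a rough sketch of the selection rules above, the defaults and overrides can be written as a lookup. The function name `select_model` and the need labels (`"text_rendering"`, `"video_motion"`, and so on) are hypothetical; only the model names come from this document.

```python
from typing import Optional

# Special needs from "When to use a different model".
# The keys are hypothetical labels chosen for this sketch.
NEED_TO_MODEL = {
    "text_rendering": "ideogram",  # gpt-image-1.5 is the listed alternative
    "identity_preservation": "flux-kontext-max",
    "artistic_style": "midjourney",
    "video_motion": "luma",
    "iterative_editing": "gpt-image-1.5",
}


def select_model(subject_type: str, need: Optional[str] = None) -> str:
    """Return a model name: a special need wins, otherwise the subject default."""
    if need is not None:
        if need not in NEED_TO_MODEL:
            raise ValueError(f"unknown need: {need}")
        return NEED_TO_MODEL[need]
    # Defaults by subject type: people go to nano-banana, everything else to flux.
    return "nano-banana" if subject_type == "person" else "flux"
```

For example, `select_model("person")` gives `"nano-banana"`, while `select_model("person", "video_motion")` gives `"luma"`.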
Step 3: Generate Superprompt
Before generating:
- Open `examples/INDEX.md`
- Follow its instructions → read ONE universal example + ONE category match
- Use as structural reference, do NOT copy verbatim
Consider perspective:
- If the user’s request implies mood, tension, scale, or narrative → check `resources/perspectives.md` for a perspective that enhances the image
- If the user specifies a camera angle already, respect it
- Default: do NOT force an unusual perspective on every prompt. Standard eye-level is fine when nothing calls for more
Then use the schema structure matching your selected model.
WARNING: OUTPUT FORMAT. Your generated JSON must contain ONLY the superprompt fields at the top level. Do NOT wrap these in a `superprompt` key. Do NOT include metadata keys like `name`, `description`, `input_prompt`, `target_model`, `model_adaptations`, `flattened_prompt`, `color_palette`, or `playbook_principles_applied`. The example files in the `examples/` directory contain these wrapper fields for documentation purposes only; the validator checks the superprompt content directly at the top level. If you include wrapper keys, validation WILL fail because the validator will not find the required fields at the root of the JSON.
Choose prompt_type based on subject:
- `portrait` → people, characters, subjects with identity/expression
- `product` → objects, items, still life, architecture
- `collage` → multi-panel compositions, editorial layouts
Schema depends on target model:
- Nano-banana → uses its own flat structure (no `prompt_type` field). See nano-banana example below.
- Luma → uses its own video-specific structure. See models/luma.md.
- All other models → use the base schema with `prompt_type` discriminator and `meta.target_model`. See portrait/product examples below.
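The schema choice above amounts to a three-way branch on the target model. The sketch below is one way to express it; `schema_family`, `base_skeleton`, and the family labels are hypothetical names introduced for illustration.

```python
PROMPT_TYPES = {"portrait", "product", "collage"}


def schema_family(target_model: str) -> str:
    """Map a target model to the schema family described above."""
    if target_model == "nano-banana":
        return "nano-banana-flat"  # own flat structure, no prompt_type
    if target_model == "luma":
        return "luma-video"  # own video-specific structure
    return "base"  # all other models


def base_skeleton(prompt_type: str, target_model: str) -> dict:
    """Start a base-schema prompt with its discriminator and meta.target_model."""
    if prompt_type not in PROMPT_TYPES:
        raise ValueError(f"unknown prompt_type: {prompt_type}")
    if schema_family(target_model) != "base":
        raise ValueError(f"{target_model} does not use the base schema")
    return {"prompt_type": prompt_type, "meta": {"target_model": target_model}}
```

The skeleton only carries the two fields the text names explicitly; the remaining required blocks still have to be filled in.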
Example: Portrait (base schema: flux, midjourney, ideogram, recraft, gpt-image-1.5, seedream-4.5, etc.)
Required: prompt_type, meta, subject, wardrobe, pose_action, environment, lighting, camera_technical, realism_anchors, negative_prompt
{
  "prompt_type": "portrait",
  "meta": {
    "quality_tier": "professional",
    "aspect_ratio": "9:16",
    "style": "photorealistic",
    "target_model": "flux"
  },
  "subject": {
    "description": "young woman mid-20s, honey blonde beach waves, confident smile",
    "identity": {
      "age_range": "mid-20s",
      "type": "young woman",
      "distinguishing_features": ["honey blonde beach waves", "confident smile"]
    },
    "physical_details": {
      "skin": { "tone": "light warm", "texture": "visible pores, natural" },
      "hair": {
        "color": "honey blonde",
        "style": "beach waves, past shoulders"
      },
      "face": { "expression": "confident smile", "gaze": "direct at camera" }
    },
    "expression_mood": "confident, relaxed"
  },
  "wardrobe": {
    "clothing": [
      {
        "type": "tank top",
        "material": "cotton",
        "fit": "fitted",
        "color": "white"
      },
      { "type": "denim shorts", "fit": "high-waisted", "details": "frayed hem" }
    ],
    "accessories": ["gold layered necklaces", "small hoop earrings"]
  },
  "pose_action": {
    "position": "mirror selfie, phone at chest level",
    "posture": "slight hip tilt"
  },
  "environment": {
    "location": "modern apartment bedroom",
    "background": ["natural light from window"],
    "atmosphere": "casual, warm"
  },
  "lighting": {
    "type": "natural",
    "quality": "soft diffused",
    "effects": ["window light", "subtle shadows"]
  },
  "camera_technical": {
    "device": "iPhone 15 Pro",
    "lens": "24mm wide",
    "aperture": "f/2.0"
  },
  "realism_anchors": [
    "visible skin pores",
    "fabric texture",
    "natural window light"
  ],
  "negative_prompt": [
    "cartoon",
    "deformed",
    "blurry",
    "airbrushed",
    "watermark",
    "low quality"
  ]
}
Example: Product (base schema)
Required: prompt_type, meta, subject, environment, lighting, camera_technical, realism_anchors, negative_prompt
{
  "prompt_type": "product",
  "meta": {
    "quality_tier": "professional",
    "aspect_ratio": "1:1",
    "style": "photorealistic",
    "target_model": "flux"
  },
  "subject": {
    "description": "vintage 1970s Polaroid camera, cream and brown body, rainbow stripe",
    "physical_details": {
      "materials": ["plastic body", "metal accents"],
      "finish": "matte with age patina"
    }
  },
  "environment": {
    "location": "weathered wooden table surface",
    "background": ["soft out-of-focus bookshelf", "warm afternoon light"],
    "atmosphere": "nostalgic, cozy"
  },
  "lighting": {
    "type": "golden-hour",
    "direction": "side lighting from window",
    "quality": "soft warm",
    "effects": ["warm highlights", "soft shadows"]
  },
  "camera_technical": { "lens": "50mm macro", "aperture": "f/2.8" },
  "realism_anchors": [
    "surface scratches",
    "dust particles",
    "leather texture",
    "metal patina"
  ],
  "negative_prompt": [
    "blurry",
    "low quality",
    "oversaturated",
    "artificial",
    "floating",
    "watermark"
  ]
}
Example: Nano-banana model
Nano-banana uses its own flat structure; no prompt_type field needed.
Required: subject (with body), wardrobe, scene, lighting, camera
{
  "subject": {
    "description": "young woman mid-20s, honey blonde beach waves, confident smile",
    "body": {
      "physique": "natural athletic build, hourglass figure",
      "anatomy": "defined shoulders, narrow waist, curved hips",
      "details": "visible skin pores, natural texture, subtle imperfections"
    }
  },
  "wardrobe": {
    "top": "white cotton tank top, fitted, subtle outline visible",
    "bottom": "high-waisted denim shorts, frayed hem",
    "accessories": ["gold layered necklaces", "small hoop earrings"]
  },
  "pose_action": {
    "description": "mirror selfie, holding smartphone at chest level, slight hip tilt"
  },
  "scene": {
    "environment": "modern apartment bedroom, natural light from window"
  },
  "lighting": {
    "type": "soft natural",
    "effects": ["diffused window light", "subtle shadows"]
  },
  "camera": {
    "technical": "iPhone 15 Pro, 24mm wide, f/2.0",
    "aspect_ratio": "9:16",
    "negative_constraints": "no cartoon, no deformed, no blurry, no airbrushed, no watermark"
  }
}
Step 4: Validate
# Auto-detects model from meta.target_model (defaults to nano-banana if absent)
node /skills/generating-image-superprompts/scripts/validate-prompt.js --input prompt.json
# Or specify model explicitly
node /skills/generating-image-superprompts/scripts/validate-prompt.js --input prompt.json --model flux
Validation checks: required fields per prompt_type and model, conflicts, model-specific requirements, negative coverage.
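To make the required-field check concrete, here is a minimal sketch that mirrors only the "Required:" lists given with the examples in Step 3. It is not the logic of `validate-prompt.js`, whose internals are not shown here; `missing_fields` is a hypothetical helper, and collage and Luma prompts are left out because their required fields are not listed in this document.

```python
from typing import List, Optional

# Required top-level fields, copied from the "Required:" lines in Step 3.
REQUIRED = {
    "portrait": [
        "prompt_type", "meta", "subject", "wardrobe", "pose_action",
        "environment", "lighting", "camera_technical",
        "realism_anchors", "negative_prompt",
    ],
    "product": [
        "prompt_type", "meta", "subject", "environment", "lighting",
        "camera_technical", "realism_anchors", "negative_prompt",
    ],
    "nano-banana": ["subject", "wardrobe", "scene", "lighting", "camera"],
}


def missing_fields(prompt: dict, model: Optional[str] = None) -> List[str]:
    """List required fields that are absent from the root of the prompt."""
    # Same fallback as the CLI comment: meta.target_model, else nano-banana.
    model = model or prompt.get("meta", {}).get("target_model") or "nano-banana"
    if model == "nano-banana":
        missing = [k for k in REQUIRED["nano-banana"] if k not in prompt]
        subject = prompt.get("subject")
        if isinstance(subject, dict) and "body" not in subject:
            missing.append("subject.body")  # subject must include a body block
        return missing
    required = REQUIRED.get(prompt.get("prompt_type"))
    if required is None:
        return ["prompt_type"]  # missing or not covered by this sketch
    return [k for k in required if k not in prompt]
```

A prompt wrapped in a `superprompt` key reports every root field as missing, which is the failure mode described next.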
IMPORTANT: JSON structure for validation. The JSON you pass to the validation script must have the prompt fields (`subject`, `wardrobe`, `scene`, `lighting`, `camera`, etc.) at the top level of the object. The example files in the `examples/` directory wrap these fields inside a `superprompt` key alongside documentation metadata (`name`, `description`, `model_adaptations`, etc.), but the validator does NOT expect that wrapper; it expects fields at root level.
Common failure: If validation fails with “Required” errors for `subject`, `wardrobe`, `scene`, `lighting`, or `camera`, you almost certainly wrapped the prompt fields inside a `superprompt` object. Remove the wrapper and put all fields at the root of the JSON.
If validation fails:
- Review the error messages carefully
- Check whether you accidentally wrapped prompt fields in a `superprompt` key or included metadata keys (`name`, `description`, `model_adaptations`, etc.); if so, remove them and put prompt fields at root level
- Fix the identified issues in your JSON
- Re-run validation
- Only proceed to Step 5 when validation passes
Repeat this loop until you get ✅ VALID status.
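The most common fix, removing the wrapper and the documentation metadata, could look like the sketch below. `unwrap_superprompt` is a hypothetical helper; the metadata key names are the ones listed in the Step 3 warning.

```python
# Documentation-only keys named in the Step 3 output-format warning.
METADATA_KEYS = {
    "name", "description", "input_prompt", "target_model",
    "model_adaptations", "flattened_prompt", "color_palette",
    "playbook_principles_applied",
}


def unwrap_superprompt(doc: dict) -> dict:
    """Return the prompt fields at root level, without wrapper or metadata keys."""
    inner = doc.get("superprompt")
    # If the fields were wrapped, lift them out; otherwise start from the root.
    fields = dict(inner) if isinstance(inner, dict) else dict(doc)
    fields.pop("superprompt", None)
    # Only top-level keys are filtered; nested keys such as
    # subject.description or meta.target_model are left untouched.
    return {k: v for k, v in fields.items() if k not in METADATA_KEYS}
```

After unwrapping, write the result back to `prompt.json` and re-run the validation command.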
Step 5: Flatten for Model
Convert your structured JSON into the model’s expected text format. “Flattening” means joining nested values into a comma-separated string.
Example (Flux):
Structured JSON:
{
  "quality_prefix": "masterpiece, 8k uhd",
  "subject": "young woman mid-20s, honey blonde beach waves",
  "wardrobe": "white tank top, denim mini skirt",
  "pose": "mirror selfie, back arch",
  "lighting": "hard direct sunlight"
}
Flattened prompt (ready for model):
masterpiece, 8k uhd, young woman mid-20s, honey blonde beach waves, white tank top, denim mini skirt, mirror selfie, back arch, hard direct sunlight
See model templates for model-specific flattening rules (some models like GPT Image 1.5 prefer natural language paragraphs instead of comma-separated tags).
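A naive version of comma-separated flattening can be sketched as a depth-first walk that joins every string value in key order. This is an assumption-level illustration, not the skill's flattening code: `flatten_prompt` is a hypothetical name, and real per-model rules live in the model templates.

```python
from typing import Any, List


def _collect(value: Any) -> List[str]:
    """Gather leaf values depth-first, preserving insertion order."""
    if isinstance(value, dict):
        parts: List[str] = []
        for child in value.values():
            parts.extend(_collect(child))
        return parts
    if isinstance(value, (list, tuple)):
        parts = []
        for child in value:
            parts.extend(_collect(child))
        return parts
    # Skip empty values and booleans (for example identity_lock: true),
    # which carry no descriptive text for a tag-style prompt.
    if value is None or isinstance(value, bool):
        return []
    text = str(value).strip()
    return [text] if text else []


def flatten_prompt(prompt: dict) -> str:
    """Join all nested values into one comma-separated string."""
    return ", ".join(_collect(prompt))
```

Applied to the structured Flux example above, this reproduces the flattened prompt shown there. Models that prefer natural-language paragraphs need a different strategy.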
Step 6: Return to User
Always provide:
1. Ready-to-use flattened prompt (copy-paste ready for the model)
2. Recommended model settings (guidance_scale, steps, aspect_ratio)
Optionally provide:
3. Structured JSON: include when the user explicitly requests it OR when they need to iterate/edit the prompt later
Reference Image Workflow
When the user provides a reference image for face/identity preservation, you still generate a complete superprompt; you just add a reference block to preserve identity.
> [!IMPORTANT]
> This is NOT an alternative to the standard workflow. You follow the standard workflow AND add reference settings.
Reference Image Generation:
- [ ] Step 1: Confirm reference image uploaded
- [ ] Step 2: Follow Standard Workflow Steps 1-3 (generate FULL superprompt)
- [ ] Step 3: Add reference block to the superprompt
- [ ] Step 4: Ensure subject.appearance.skin references the image
- [ ] Step 5: Continue with Standard Workflow Steps 4-6 (validate, flatten, return)
What You Generate
You generate a complete superprompt with all the usual blocks (subject, wardrobe, pose, scene, lighting, camera) PLUS a reference block at the top.
Reference Block Structure
Add this block to your full superprompt:
{
  "reference": {
    "face_identity": "use uploaded reference image",
    "identity_lock": true,
    "face_accuracy": "100% identical to reference: same facial structure, proportions, skin texture, expression, and details"
  }
}
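Steps 3 and 4 of the reference checklist (add the block, point the skin tone at the image) can be sketched as a small transformation on an already generated superprompt. `add_reference` is a hypothetical helper; the field values are the ones shown in this section.

```python
import copy

# Values taken from the reference block above.
REFERENCE_BLOCK = {
    "face_identity": "use uploaded reference image",
    "identity_lock": True,
    "face_accuracy": (
        "100% identical to reference: same facial structure, proportions, "
        "skin texture, expression, and details"
    ),
}


def add_reference(superprompt: dict) -> dict:
    """Return a copy with the reference block first and skin tone tied to the image."""
    result = {"reference": dict(REFERENCE_BLOCK)}
    for key, value in copy.deepcopy(superprompt).items():
        if key != "reference":
            result[key] = value
    # Assumes "subject" is an object, as in the examples in this document.
    skin = (
        result.setdefault("subject", {})
        .setdefault("appearance", {})
        .setdefault("skin", {})
    )
    skin["tone"] = "same as reference image"
    return result
```

The input is deep-copied, so the original superprompt stays unchanged and can still be validated on its own.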
Complete Example (Reference + Full Superprompt)
{
  "reference": {
    "face_identity": "use uploaded reference image",
    "identity_lock": true,
    "face_accuracy": "100% identical to reference: same facial structure, proportions, skin texture"
  },
  "subject": {
    "description": "young woman mid-20s, athletic build",
    "body": {
      "physique": "toned athletic build",
      "anatomy": "defined shoulders, narrow waist",
      "details": "visible skin pores, natural texture"
    },
    "appearance": {
      "skin": {
        "tone": "same as reference image",
        "texture": "natural, visible pores"
      },
      "expression": "confident smile",
      "gaze": "direct at camera"
    }
  },
  "wardrobe": {
    "top": "black sports bra, fitted",
    "bottom": "high-waisted grey leggings",
    "accessories": "small stud earrings"
  },
  "pose": "gym mirror selfie, phone at chest level, slight hip tilt",
  "scene": {
    "environment": "modern gym, weight racks in background",
    "lighting": { "type": "overhead fluorescent", "quality": "bright, even" }
  },
  "lighting": {
    "type": "gym fluorescent",
    "effects": "even illumination, subtle shadows"
  },
  "camera": {
    "technical": "iPhone 15 Pro, 24mm wide, f/1.8",
    "aspect_ratio": "9:16",
    "negative_constraints": "no cartoon, no deformed, no blurry, no face change, no identity drift"
  }
}
Key Reference Settings
| Setting | Purpose |
|---|---|
| `identity_lock: true` | Prevents face drift |
| `face_accuracy: "100%"` | Exact facial match |
| `skin.tone: "same as reference"` | Preserves skin characteristics |
Reference-Compatible Models
- Flux IP-Adapter – Best for face preservation
- Flux Kontext Max – Exceptional identity preservation across edits
- GPT Image 1.5 – Robust facial and identity preservation for iterative edits
- Seedream 4.5 – Supports up to 14 reference images for character consistency
- Nano Banana – Strong identity lock support
- Midjourney `--cref` – Character reference flag
See examples/reference-image-gym.json for complete example.
Superprompt Playbook
Core principles for high-quality prompts. See playbook.md for detailed explanations and examples.
The 7 Principles (quick reference):
- Hierarchical Specificity – be specific, not vague (`"fitted burgundy velvet midi dress"` not `"wearing a dress"`)
- Realism Anchors – ground in physical reality (`visible pores`, `fabric weave`, `accurate shadows`)
- Technical Camera Language – use photography terms (`85mm lens`, `f/1.8 aperture`)
- Negative Prompt Hygiene – exclude quality issues, anatomy errors, style leaks
- Spatial Consistency – define camera angle and subject positioning clearly
- Lighting as Storytelling – golden hour = warm; hard midday = bold; rim = dramatic
- Identity Lock – for character consistency, lock age, hair, body type
Examples
Complete examples with input → output transformations. Each includes prompts adapted for multiple models.
Key Examples by Category
Reference Image (Start here for identity preservation):
- examples/reference-image-gym.json – Face/identity preservation workflow with best practices
Mirror Selfies (Most common use case):
- examples/nano-banana-mirror-selfie.json – 4 variations, deep anatomical detail
- examples/elevator-mirror-selfie.json – Multiple reflections, late-night mood
- examples/lifestyle-selfie.json – Street shop window reflection
Outdoor/Lifestyle:
- examples/backyard-bikini.json – iPhone outdoor realism, gravity physics
- examples/jet-ski-lookback.json – Ocean golden hour, torso twist pose
- examples/tropical-beach-selfie.json – Beach selfie POV, straw hat
Indoor/Casual:
- examples/bedroom-prone-selfie.json – Cozy bed selfie, pink loungewear
- examples/kitchen-candid.json – Apartment evening, rainy city, mixed lighting
Artistic/Stylized:
- examples/bw-window-portrait.json – Text-to-structured, B&W cinematic
- examples/streetwear-editorial-collage.json – Multi-panel poster, analog-digital fusion
- examples/fashion-studio-collage.json – Multi-pose collage, same model 6-8 poses
Additional Examples
Browse the examples/ directory for 50+ additional examples covering:
- Athletic/gym poses, complex wardrobe, branded items
- Couples, group shots, dual subjects
- Night photography, flash aesthetics, low-light
- Editorial fashion, collages, split layouts
- Character/cosplay, fantasy elements
Model Templates
Each model has specific prompt formatting requirements:
- models/flux.md – Quality tags first, photography language
- models/flux-kontext-max.md – Natural language editing, preservation clauses, 512 token limit
- models/gpt-image-1.5.md – Natural language, quality parameter, no quality tags needed
- models/seedream-4.5.md – Coherent natural language, concise prompts, text in quotes
- models/nano-banana.md – Deep JSON, anatomical terms, inline negatives
- models/black-forest-labs.md – Detailed BFL/Flux guide
- models/midjourney.md – Parameters (`--ar`, `--v`, `--style`)
- models/ideogram.md – Text in quotes, magic prompt
- models/recraft.md – Style presets, color palettes
- models/luma.md – Motion descriptions, camera movement
- models/reve.md – Art movement references
- models/grok-imagine.md – Natural language, social-ready
Validation Scripts
# Validate a prompt
node /skills/generating-image-superprompts/scripts/validate-prompt.js --input prompt.json --model nano-banana
# Strict mode (warnings as errors)
node /skills/generating-image-superprompts/scripts/validate-prompt.js --input prompt.json --model nano-banana --strict
# Pipe from stdin
cat prompt.json | node /skills/generating-image-superprompts/scripts/validate-prompt.js --model nano-banana
Output includes quality score (0-100) for specificity, realism, technical, and negative coverage.
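If the validator is driven from a script rather than typed by hand, the command lines above can be assembled programmatically. The sketch below uses only the flags shown in this document (`--input`, `--model`, `--strict`); the helper names are hypothetical, and nothing is assumed about the script's exit codes or output format.

```python
import subprocess
from typing import List, Optional

# Script path as written in the commands above.
VALIDATOR = "/skills/generating-image-superprompts/scripts/validate-prompt.js"


def build_validate_command(
    input_path: str, model: Optional[str] = None, strict: bool = False
) -> List[str]:
    """Assemble the node command using only the documented flags."""
    cmd = ["node", VALIDATOR, "--input", input_path]
    if model:
        cmd += ["--model", model]
    if strict:
        cmd.append("--strict")  # warnings as errors
    return cmd


def run_validator(
    input_path: str, model: Optional[str] = None, strict: bool = False
) -> subprocess.CompletedProcess:
    """Run the validator and capture its report so it can be shown to the user."""
    return subprocess.run(
        build_validate_command(input_path, model, strict),
        capture_output=True,
        text=True,
    )
```

The captured `stdout` holds the validation report, which the workflow says must always be shown to the user.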
Aspect Ratio Reference
| Ratio | Use Case |
|---|---|
| 9:16 | Mobile/Stories/TikTok |
| 16:9 | Desktop/YouTube |
| 1:1 | Instagram/Profile |
| 4:5 | Instagram portrait |
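When a model takes pixel dimensions rather than a ratio string, the ratios in the table can be converted with simple arithmetic. In this sketch the 1024-pixel long edge and the rounding to multiples of 8 are assumptions chosen for illustration, not requirements stated by any model template; `dimensions` is a hypothetical helper.

```python
from typing import Tuple


def dimensions(ratio: str, long_edge: int = 1024, multiple: int = 8) -> Tuple[int, int]:
    """Convert a "W:H" ratio into (width, height) with the longer side at long_edge."""
    w_part, h_part = (int(p) for p in ratio.split(":"))
    if w_part >= h_part:
        width = float(long_edge)
        height = long_edge * h_part / w_part
    else:
        height = float(long_edge)
        width = long_edge * w_part / h_part

    def snap(x: float) -> int:
        # Round to the nearest multiple (assumed constraint), never below one multiple.
        return max(multiple, int(round(x / multiple)) * multiple)

    return snap(width), snap(height)
```

With these defaults, `dimensions("9:16")` gives `(576, 1024)` and `dimensions("4:5")` gives `(816, 1024)`.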