nanobanana
npx skills add https://github.com/kenneth-liao/ai-launchpad-marketplace --skill nanobanana
Nano Banana – AI Image Generation
Generate and edit images using Google Gemini models. Supports two models:
- Pro (`gemini-3-pro-image-preview`): High quality, complex prompts, thinking mode
- Flash (`gemini-2.5-flash-image`): Fast, cheap, good for iteration
Prerequisites
Required:
- `GEMINI_API_KEY`: Get from Google AI Studio
- `uv` (recommended) or Python 3.10+ with `google-genai` installed
With uv (recommended, zero setup):
Dependencies are declared inline via PEP 723 and auto-installed on first run. Just use uv run instead of python3.
With pip (fallback):
pip install -r <skill_dir>/requirements.txt
Quick Start
Default output: Images save to ~/Downloads/nanobanana_<timestamp>.png automatically. Do NOT pass -o unless the user specifies where to save. If the user provides a filename without a directory (e.g., “save it as robot.png”), use -o ~/Downloads/robot.png.
Generate an image:
uv run <skill_dir>/scripts/generate.py "a cute robot mascot, pixel art style"
Edit an existing image:
uv run <skill_dir>/scripts/generate.py "make the background blue" -i input.jpg
Use Flash model for fast iteration:
uv run <skill_dir>/scripts/generate.py "quick sketch of a cat" --model flash
Multi-image reference (style + subject):
uv run <skill_dir>/scripts/generate.py "apply the style of the first image to the second" \
-i style_ref.png subject.jpg
Generate with specific aspect ratio and resolution:
uv run <skill_dir>/scripts/generate.py "cinematic landscape" --ratio 21:9 --size 4K
Save to a specific location:
uv run <skill_dir>/scripts/generate.py "logo design" -o ~/Projects/brand/logo.png
Model Selection Guide
| | Pro (default) | Flash |
|---|---|---|
| Speed | Slower | ~2-3x faster |
| Cost | Higher | Lower |
| Text rendering | Good | Unreliable |
| Complex scenes | Excellent | Adequate |
| Thinking mode | Yes | No |
| Best for | Final production images | Exploration, drafts, batch |
Rule of thumb: Use Flash for exploration and batch generation, Pro for final output.
Script Reference
scripts/generate.py
Main image generation script.
Usage: generate.py [OPTIONS] PROMPT
Arguments:
PROMPT Text prompt for image generation
Options:
-o, --output PATH Output file path (default: ~/Downloads/nanobanana_<timestamp>.png)
-i, --input PATH... Input image(s) for editing / reference (up to 14)
-m, --model MODEL Model: 'pro' (default), 'flash', or full model ID
-r, --ratio RATIO Aspect ratio (1:1, 16:9, 9:16, 21:9, etc.)
-s, --size SIZE Image size: 1K, 2K, or 4K (default: standard)
--search Enable Google Search grounding for accuracy
--retries N Max retries on rate limit (default: 3)
-v, --verbose Show detailed output
Supported aspect ratios:
- `1:1`: Square (default)
- `2:3`, `3:2`: Portrait/Landscape
- `3:4`, `4:3`: Standard
- `4:5`, `5:4`: Photo
- `9:16`, `16:9`: Widescreen
- `21:9`: Ultra-wide/Cinematic
Image sizes:
- `1K`: Fast, lower detail
- `2K`: Enhanced detail (2048px)
- `4K`: Maximum quality (3840px), best for text rendering
scripts/batch_generate.py
Generate multiple images with sequential naming.
Usage: batch_generate.py [OPTIONS] PROMPT
Arguments:
PROMPT Text prompt for image generation
Options:
-n, --count N Number of images to generate (default: 10)
-d, --dir PATH Output directory (default: ~/Downloads)
-p, --prefix STR Filename prefix (default: "image")
-m, --model MODEL Model: 'pro' (default), 'flash', or full model ID
-r, --ratio RATIO Aspect ratio
-s, --size SIZE Image size (1K/2K/4K)
--search Enable Google Search grounding
--retries N Max retries per image on rate limit (default: 3)
--delay SECONDS Delay between generations (default: 3)
--parallel N Concurrent requests (default: 1, max recommended: 5)
-q, --quiet Suppress progress output
Example:
uv run <skill_dir>/scripts/batch_generate.py "pixel art logo" -n 20 --model flash -d ./logos -p logo
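The interaction between `--delay` and `--parallel` can be sketched as follows. This is an illustrative model of the pacing, not the actual body of `batch_generate.py`; `generate_fn` is a stand-in for a single generation call:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_batch(prompts, generate_fn, parallel=1, delay=3.0):
    """Sketch of batch pacing: sequential mode sleeps between requests
    (the default 3s gap that avoids rate limits); parallel mode fans out
    over a bounded worker pool and preserves result order."""
    if parallel <= 1:
        results = []
        for i, prompt in enumerate(prompts):
            if i > 0:
                time.sleep(delay)  # pace sequential requests
            results.append(generate_fn(prompt))
        return results
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(generate_fn, prompts))
```

Keeping `parallel` modest (the script recommends at most 5) matters because each worker consumes quota concurrently.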
Python API
Direct import (from another skill’s script):
Note: When importing as a Python module, `google-genai` must be available in the calling script's environment. If using `uv run`, add a PEP 723 `dependencies` block to your own script (see example in Pattern 2 below).
```python
import sys
from pathlib import Path

sys.path.insert(0, str(Path("<skill_dir>/scripts")))
from generate import generate_image, edit_image, batch_generate

# Generate image
result = generate_image(
    prompt="a futuristic city at night",
    output_path="city.png",
    aspect_ratio="16:9",
    image_size="4K",
    model="pro",
)

# Edit existing image
result = edit_image(
    prompt="add flying cars to the sky",
    input_path="city.png",
    output_path="city_edited.png",
)

# Multi-image reference
result = generate_image(
    prompt="combine the color palette of the first with the composition of the second",
    input_paths=["palette_ref.png", "composition_ref.png"],
    output_path="combined.png",
)
```
Return structure (always present):
```python
{
    "success": True,                  # or False
    "path": "/path/to/output.png",    # or None on failure
    "error": None,                    # or error message string
    "metadata": {
        "model": "gemini-3-pro-image-preview",
        "prompt": "...",
        "aspect_ratio": "16:9",
        "image_size": "4K",
        "use_search": False,
        "input_images": None,         # or list of paths
        "text_response": "...",       # optional text from model
        "thinking": "...",            # Pro model reasoning (when available)
        "timestamp": "2025-01-26T...",
    }
}
```
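Because `success`, `path`, and `error` are always present, downstream code can branch on a result without defensive key checks. A minimal sketch (`unwrap` is a hypothetical helper, not part of the skill):

```python
def unwrap(result: dict) -> str:
    """Return the output path from a generation result, raising on failure."""
    if not result["success"]:
        raise RuntimeError(f"generation failed: {result['error']}")
    return result["path"]

# Example with a result shaped like the structure above
ok = {"success": True, "path": "/tmp/out.png", "error": None, "metadata": {}}
print(unwrap(ok))  # /tmp/out.png
```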
Downstream Skill Integration Guide
Pattern 1: CLI wrapper (recommended for simple use)
# In your skill's script:
uv run <nanobanana_dir>/scripts/generate.py "{prompt}" --model flash --ratio 16:9 -o output.png
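If the wrapping skill is itself Python, the same invocation can be assembled for `subprocess`. `build_generate_cmd` below is a hypothetical helper; the flags mirror the Script Reference above:

```python
import subprocess  # used only if you actually invoke the command
from pathlib import Path

def build_generate_cmd(skill_dir, prompt, model="flash", ratio="16:9", output="output.png"):
    """Assemble the generate.py CLI invocation with the flags documented above."""
    script = Path(skill_dir) / "scripts" / "generate.py"
    return ["uv", "run", str(script), prompt,
            "--model", model, "--ratio", ratio, "-o", output]

cmd = build_generate_cmd("/path/to/nanobanana", "logo sketch")
# subprocess.run(cmd, check=True)  # uncomment to run for real
```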
Pattern 2: Python import with custom defaults
```python
# /// script
# requires-python = ">=3.10"
# dependencies = [
#     "google-genai>=1.0.0",
# ]
# ///
import sys
from pathlib import Path

NANOBANANA_DIR = Path("<nanobanana_dir>/scripts")
sys.path.insert(0, str(NANOBANANA_DIR))
from generate import generate_image

def generate_thumbnail(prompt: str, output_path: str) -> dict:
    """Generate a YouTube thumbnail with project defaults."""
    return generate_image(
        prompt=prompt,
        output_path=output_path,
        aspect_ratio="16:9",
        image_size="2K",
        model="flash",
        max_retries=3,
    )
```
Pattern 3: Batch with progress tracking
```python
from batch_generate import batch_generate

def on_progress(completed, total, result):
    print(f"Progress: {completed}/{total}")

results = batch_generate(
    prompt="logo concept",
    count=20,
    output_dir="./logos",
    prefix="logo",
    model="flash",
    aspect_ratio="1:1",
    on_progress=on_progress,
)
successful = [r for r in results if r["success"]]
```
Pattern 4: Sequential generation for series
When a downstream skill needs multiple consistently-styled images (e.g., newsletter visuals, thumbnail A/B variants), use the anchor-and-reference pattern:
```python
from generate import generate_image

# Step 1: Generate the style anchor
anchor = generate_image(
    prompt="warm illustration style, earth tones, soft gradients, clean lines",
    output_path="anchor.png",
    model="pro",
)

# Step 2: Generate each image in the series, referencing the anchor
subjects = ["laptop on desk with coffee", "person reading a book", "sunrise over mountains"]
series_paths = [anchor["path"]]
for i, subject in enumerate(subjects):
    result = generate_image(
        prompt=f"{subject}, matching the visual style and color palette of the reference image exactly",
        input_paths=[anchor["path"]],  # always include the anchor
        output_path=f"series_{i+1:02d}.png",
        model="pro",
    )
    if result["success"]:
        series_paths.append(result["path"])
```
The full sequential generation patterns are documented in the Sequential Generation section below.
Environment Variables
| Variable | Description | Default |
|---|---|---|
| `GEMINI_API_KEY` | Google Gemini API key | Required |
| `IMAGE_OUTPUT_DIR` | Default output directory | `~/Downloads` |
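How these two variables combine with the timestamped default filename can be sketched as follows (the exact timestamp format is an assumption, and `default_output_path` is a hypothetical helper, not the script's code):

```python
import os
from datetime import datetime
from pathlib import Path

def default_output_path(env=None):
    """Resolve where an image lands when no -o is given:
    IMAGE_OUTPUT_DIR if set, else ~/Downloads, with a timestamped name."""
    env = env if env is not None else os.environ
    out_dir = Path(env.get("IMAGE_OUTPUT_DIR", Path.home() / "Downloads"))
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")  # assumed format
    return out_dir / f"nanobanana_{stamp}.png"

print(default_output_path({"IMAGE_OUTPUT_DIR": "/tmp/images"}))
```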
Features
Text-to-Image Generation
Create images from text descriptions. Both models excel at:
- Photorealistic images
- Artistic styles (pixel art, illustration, etc.)
- Product photography
- Landscapes and scenes
Image Editing
Transform existing images with natural language:
- Style transfer
- Object addition/removal
- Background changes
- Color adjustments
Multi-Image Reference
Provide up to 14 reference images for:
- Style consistency across a series
- Subject consistency (same character, different poses)
- Brand-consistent generation
- Style + subject combination
High-Resolution Output
- 1K: Fast generation, good for drafts
- 2K: Enhanced detail (2048px)
- 4K: Maximum quality (3840px), best for text rendering
Google Search Grounding
Enable --search for factually accurate images involving:
- Real people, places, landmarks
- Current events
- Specific products or brands
Automatic Retry
Rate limit errors are automatically retried with exponential backoff (default: 3 retries). No action needed from callers.
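The retry behavior amounts to exponential backoff with jitter. A sketch under the assumption that a rate-limit failure surfaces as an exception (`RuntimeError` stands in for the real error type; this is not the script's exact code):

```python
import random
import time

def with_retries(call, max_retries=3, base_delay=1.0):
    """Retry a rate-limited call with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except RuntimeError:  # stand-in for a rate-limit error
            if attempt == max_retries:
                raise  # out of retries: propagate the failure
            delay = base_delay * (2 ** attempt) + random.uniform(0, 0.5)
            time.sleep(delay)  # waits ~1s, 2s, 4s, ... between attempts
```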
SynthID Watermark Notice
All images generated by Gemini contain an invisible SynthID digital watermark. This is automatic, cannot be disabled, and survives common transformations (resize, crop, compression). Be aware of this for any use case requiring watermark-free output.
Sequential Generation
Use sequential generation to maintain visual consistency across a series of images. The core technique: generate an anchor image first, then pass it as a reference (-i) for every subsequent image in the series.
Pattern 1: Style-Board Anchoring
Generate a single anchor image that establishes the visual identity for a series. Reference it for all subsequent images.
When to use: Newsletter visual series, A/B thumbnail variants, brand-consistent image batches.
Workflow:
- Generate the anchor image with a prompt emphasizing style, palette, and mood:
uv run <skill_dir>/scripts/generate.py \
"modern flat illustration style, warm earth tones, soft gradients, clean lines, \
minimal detail, cozy atmosphere" \
--model pro -o anchor.png
- Generate each subsequent image referencing the anchor:
uv run <skill_dir>/scripts/generate.py \
"a laptop on a desk with coffee, matching the visual style, color palette, \
and lighting of the reference image exactly" \
-i anchor.png --model pro -o image_01.png
- Repeat step 2 for each image in the series, always referencing the same anchor.
Tip: Use Flash to draft the anchor quickly, then regenerate with Pro once you find a style you like.
Pattern 2: Subject Consistency
Keep the same character or subject looking consistent across different scenes and poses.
When to use: Mascot in multiple contexts, product photography series, recurring character.
Workflow:
- Generate the initial subject with clear, detailed appearance description:
uv run <skill_dir>/scripts/generate.py \
"a friendly robot mascot with round blue body, orange antenna, large expressive eyes, \
simple geometric design, standing front-facing on white background" \
--model pro -o subject_front.png
- Generate new scenes referencing the subject:
uv run <skill_dir>/scripts/generate.py \
"the same robot character from the reference image, now sitting at a desk typing, \
same proportions and colors, office background" \
-i subject_front.png --model pro -o subject_office.png
- For stronger consistency, reference 2-3 of the best previous outputs:
uv run <skill_dir>/scripts/generate.py \
"the same robot character from the reference images, now outdoors in a park, \
same proportions and colors, waving at the viewer" \
-i subject_front.png subject_office.png --model pro -o subject_park.png
Pattern 3: Progressive Accumulation
Build a reference pool over a long series, adding each successful output as a reference for the next.
When to use: Series of 5+ images where consistency must compound across the full set.
Workflow:
- Generate the anchor (same as Pattern 1, step 1).
- Generate image 2 referencing the anchor.
- Generate image 3 referencing anchor + image 2.
- Continue, keeping the 3-4 strongest references in the `-i` list. Drop weaker outputs.
Why cap at 3-4 references: More references dilute the style signal. The model averages across all inputs; too many and the result loses coherence. Keep only the images that best represent the target style.
Reference ordering matters: Place the style anchor first in the -i list. The model weights earlier references slightly more.
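The capped, anchor-first reference pool from Pattern 3 can be sketched as a small helper (`update_reference_pool` is hypothetical, not part of the skill):

```python
def update_reference_pool(pool, new_path, anchor, cap=4):
    """Add a new output to the reference pool, keeping the anchor first
    and at most `cap` references total (oldest non-anchor dropped first)."""
    pool = [p for p in pool if p != new_path]
    pool.append(new_path)
    non_anchor = [p for p in pool if p != anchor]
    return [anchor] + non_anchor[-(cap - 1):]

refs = ["anchor.png"]
for out in ["s1.png", "s2.png", "s3.png", "s4.png"]:
    refs = update_reference_pool(refs, out, anchor="anchor.png")
print(refs)  # anchor stays first; the oldest output has been dropped
```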
Best Practices
Prompt Writing
Good prompts include:
- Subject description
- Style/aesthetic
- Lighting and mood
- Composition details
- Color palette
See references/prompts.md for detailed prompt templates by category and model-specific tips.
Batch Generation Tips
- Use `--model flash` for exploration batches (faster, cheaper)
- Generate 10-20 variations to explore options
- Default 3-second delay between sequential requests avoids rate limits
- Review results and iterate on best candidates with Pro model
Rate Limits
- Gemini API has usage quotas (~10 RPM free tier)
- Automatic retry with exponential backoff handles transient rate limits
- For large batches, use `--delay 5` or `--parallel` with modest concurrency
- Check your quota at Google AI Studio
Troubleshooting
“uv: command not found”
- Install uv: `curl -LsSf https://astral.sh/uv/install.sh | sh` or `brew install uv`
“Error: google-genai package not installed”
- Use `uv run` instead of `python3` to auto-install dependencies
- Or install manually: `pip install -r <skill_dir>/requirements.txt`
“GEMINI_API_KEY environment variable not set”
- Set `GEMINI_API_KEY` in your environment before running
“No image in response”
- Prompt may have triggered safety filters
- Try rephrasing to avoid sensitive content
“Rate limit exceeded after N retries”
- Wait 30-60 seconds and try again
- Reduce batch parallelism or add longer delays
- Check your API quota
Import errors in batch_generate.py
- The script handles its own path setup; run from any directory
Future Capabilities
Multi-turn conversational editing: The Gemini API supports stateful chat sessions for iterative image editing (e.g., "make it bluer" → "now add a hat" → "zoom out"). This requires a fundamentally different stateful architecture and is not currently implemented. No downstream skill currently needs it.
References
- references/prompts.md: Prompt examples, model-specific tips, multi-reference patterns
- references/gemini-api.md: Curated API reference for agent context