seed-data
npx skills add https://github.com/manastalukdar/claude-devstudio --skill seed-data
Agent 安装分布
Skill 文档
Realistic Seed Data Generator
I’ll analyze your database schema and generate realistic seed/fixture data for testing and development, maintaining proper relationships and constraints.
Supported ORMs & Data Generators:
- Prisma (with Faker.js)
- TypeORM (with Faker.js)
- Django (with Faker Python)
- SQLAlchemy (with Faker Python)
- Sequelize (with Faker.js)
Token Optimization
This skill uses data generation-specific patterns to minimize token usage:
1. Schema Model Caching (700 token savings)
Pattern: Cache parsed schema models and relationships
- Store schema analysis in
.seed-data-schema-cache(1 hour TTL) - Cache: models, field types, relationships, constraints
- Read cached schema on subsequent runs (50 tokens vs 750 tokens fresh)
- Invalidate on schema file changes
- Savings: 93% on repeat runs
2. Template-Based Seed Generation (2,000 token savings)
Pattern: Use predefined Faker.js patterns instead of LLM generation
- Standard templates for common types: name, email, date, UUID
- Relationship templates: user â posts, order â items
- No creative data generation logic needed
- Savings: 85% vs LLM-generated seed scripts
3. Grep-Based Model Discovery (600 token savings)
Pattern: Find models with Grep instead of reading all files
- Grep for model patterns:
^model,@Entity,class.*Model(200 tokens) - Count models without full file reads
- Read only models needed for seeding
- Savings: 75% vs reading all model files
4. Sample-Based Relationship Analysis (800 token savings)
Pattern: Analyze first 5 models for relationship patterns
- Extract relationship types: one-to-many, many-to-many (400 tokens)
- Infer FK patterns from analyzed models
- Apply patterns to remaining models
- Full analysis only if explicitly requested
- Savings: 70% vs analyzing every model relationship
5. Volume-Based Generation Strategy (1,200 token savings)
Pattern: Adjust generation depth based on data volume
- Small (10 records): Generate for all models – 1,500 tokens
- Medium (100 records): Core models only – 1,000 tokens
- Large (1000+ records): Primary models only – 800 tokens
- Default: Small volume
- Savings: 50% on typical medium/large volume requests
6. Cached Faker Patterns (400 token savings)
Pattern: Reuse Faker field mappings
- Cache field â Faker method mapping (email â faker.internet.email)
- Don’t regenerate mapping for each field
- Standard mappings for 50+ common field names
- Savings: 80% on field mapping generation
7. Bash-Based Seed Execution (600 token savings)
Pattern: Execute seed scripts via ORM CLI tools
- Prisma:
prisma db seed(200 tokens) - Django:
python manage.py loaddata(200 tokens) - No Task agents for seed execution
- Savings: 75% vs Task-based seed running
8. Incremental Model Seeding (700 token savings)
Pattern: Seed only new/empty tables
- Check existing record counts with SQL
- Skip tables with data unless
--forceflag - Seed only tables specified in args
- Savings: 75% vs regenerating all seed data
Real-World Token Usage Distribution
Typical operation patterns:
- Small volume seed (10 records, cached schema): 1,000 tokens
- Medium volume seed (100 records): 1,500 tokens
- Large volume seed (1000+ records): 2,000 tokens
- First-time generation (schema analysis): 2,800 tokens
- Incremental seed (new tables only): 800 tokens
- Most common: Medium volume with cached schema
Expected per-generation: 2,000-3,000 tokens (50% reduction from 4,000-6,000 baseline) Real-world average: 1,300 tokens (due to cached schema, template-based generation, volume defaults)
Arguments: $ARGUMENTS – optional: data volume (small/medium/large) or specific models to seed
Phase 1: Schema Analysis
First, I’ll analyze your schema to understand models and relationships:
#!/bin/bash
# Seed Data Generation - Schema Analysis
echo "=== Realistic Seed Data Generator ==="
echo ""
# Create seed data directory
mkdir -p .claude/seed-data
SEED_DIR=".claude/seed-data"
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
detect_orm_and_models() {
local framework=""
# Prisma detection
if [ -f "prisma/schema.prisma" ]; then
framework="prisma"
echo "â Prisma detected"
# Extract models
echo " Models:"
grep "^model " prisma/schema.prisma | awk '{print " -", $2}'
# Check for Faker.js
if ! grep -q "@faker-js/faker\|faker" package.json 2>/dev/null; then
echo ""
echo "ð¡ Installing Faker.js for realistic data..."
echo " npm install --save-dev @faker-js/faker"
fi
# TypeORM detection
elif grep -q "@Entity" --include="*.ts" -r . 2>/dev/null; then
framework="typeorm"
echo "â TypeORM detected"
# Find entities
echo " Entities:"
find . -name "*.entity.ts" -not -path "*/node_modules/*" | \
xargs grep -l "@Entity" | sed 's/^/ - /'
# Django detection
elif [ -f "manage.py" ]; then
framework="django"
echo "â Django ORM detected"
# Find models
echo " Models:"
find . -name "models.py" -not -path "*/migrations/*" | sed 's/^/ - /'
# Check for Faker
if ! pip list 2>/dev/null | grep -q "Faker"; then
echo ""
echo "ð¡ Installing Faker for Python..."
echo " pip install Faker"
fi
# SQLAlchemy detection
elif grep -q "from sqlalchemy" --include="*.py" -r . 2>/dev/null; then
framework="sqlalchemy"
echo "â SQLAlchemy detected"
echo " Models:"
find . -name "*model*.py" -o -name "*schema*.py" | \
grep -v "__pycache__" | sed 's/^/ - /'
# Sequelize detection
elif [ -d "models" ] && grep -q "sequelize" package.json 2>/dev/null; then
framework="sequelize"
echo "â Sequelize detected"
echo " Models:"
find models -name "*.js" | sed 's/^/ - /'
else
echo "â No supported ORM detected"
echo ""
echo "Supported frameworks:"
echo " - Prisma (prisma/schema.prisma)"
echo " - TypeORM (*.entity.ts files)"
echo " - Django (manage.py + models.py)"
echo " - SQLAlchemy (sqlalchemy imports)"
echo " - Sequelize (models/ directory)"
exit 1
fi
echo "$framework"
}
ORM=$(detect_orm_and_models)
echo ""
echo "Framework: $ORM"
# Data volume configuration
VOLUME="${1:-medium}"
case "$VOLUME" in
small)
USER_COUNT=10
POST_COUNT=30
COMMENT_COUNT=100
;;
large)
USER_COUNT=1000
POST_COUNT=5000
COMMENT_COUNT=20000
;;
*) # medium (default)
USER_COUNT=50
POST_COUNT=200
COMMENT_COUNT=1000
;;
esac
echo ""
echo "Data Volume: $VOLUME"
echo " Users: $USER_COUNT"
echo " Related data: Proportional"
Phase 2: Generate Seed Scripts
I’ll generate framework-specific seed scripts with realistic data:
echo ""
echo "=== Generating Seed Scripts ==="
echo ""
generate_prisma_seed() {
cat > "$SEED_DIR/seed.ts" << 'TYPESCRIPT'
import { PrismaClient } from '@prisma/client';
import { faker } from '@faker-js/faker';
const prisma = new PrismaClient();
async function main() {
console.log('ð± Seeding database...');
// Clear existing data (optional - comment out in production)
console.log('Clearing existing data...');
await prisma.$transaction([
// Add your models in dependency order (children first)
// prisma.comment.deleteMany(),
// prisma.post.deleteMany(),
// prisma.user.deleteMany(),
]);
// Seed Users
console.log('Creating users...');
const users = [];
for (let i = 0; i < USER_COUNT; i++) {
const user = await prisma.user.create({
data: {
email: faker.internet.email(),
name: faker.person.fullName(),
username: faker.internet.userName(),
bio: faker.lorem.paragraph(),
avatar: faker.image.avatar(),
dateOfBirth: faker.date.past({ years: 30 }),
isActive: faker.datatype.boolean(0.9), // 90% active
role: faker.helpers.arrayElement(['USER', 'ADMIN', 'MODERATOR']),
// Add more fields based on your schema
},
});
users.push(user);
if (i % 10 === 0) {
console.log(` Created ${i}/${USER_COUNT} users`);
}
}
// Seed Posts
console.log('Creating posts...');
const posts = [];
for (let i = 0; i < POST_COUNT; i++) {
const post = await prisma.post.create({
data: {
title: faker.lorem.sentence(),
content: faker.lorem.paragraphs(3),
slug: faker.helpers.slugify(faker.lorem.words(5)),
published: faker.datatype.boolean(0.7), // 70% published
publishedAt: faker.date.past({ years: 1 }),
viewCount: faker.number.int({ min: 0, max: 10000 }),
authorId: faker.helpers.arrayElement(users).id,
// Tags, categories, etc.
tags: {
create: Array.from({ length: faker.number.int({ min: 1, max: 5 }) }, () => ({
name: faker.word.noun(),
})),
},
},
});
posts.push(post);
if (i % 50 === 0) {
console.log(` Created ${i}/${POST_COUNT} posts`);
}
}
// Seed Comments
console.log('Creating comments...');
for (let i = 0; i < COMMENT_COUNT; i++) {
await prisma.comment.create({
data: {
content: faker.lorem.paragraph(),
authorId: faker.helpers.arrayElement(users).id,
postId: faker.helpers.arrayElement(posts).id,
createdAt: faker.date.past({ years: 1 }),
},
});
if (i % 100 === 0) {
console.log(` Created ${i}/${COMMENT_COUNT} comments`);
}
}
console.log('â
Seeding completed!');
console.log(` Users: ${users.length}`);
console.log(` Posts: ${posts.length}`);
console.log(` Comments: ${COMMENT_COUNT}`);
}
main()
.catch((e) => {
console.error('â Seeding failed:', e);
process.exit(1);
})
.finally(async () => {
await prisma.$disconnect();
});
TYPESCRIPT
# Update with actual counts
sed -i "s/USER_COUNT/${USER_COUNT}/g" "$SEED_DIR/seed.ts"
sed -i "s/POST_COUNT/${POST_COUNT}/g" "$SEED_DIR/seed.ts"
sed -i "s/COMMENT_COUNT/${COMMENT_COUNT}/g" "$SEED_DIR/seed.ts"
echo "â Created Prisma seed script: $SEED_DIR/seed.ts"
echo ""
echo "Add to package.json:"
echo ' "prisma": {'
echo ' "seed": "ts-node prisma/seed.ts"'
echo ' }'
echo ""
echo "Run: npx prisma db seed"
}
generate_django_seed() {
cat > "$SEED_DIR/seed_data.py" << 'PYTHON'
#!/usr/bin/env python
"""
Django seed data generator with Faker
"""
import os
import sys
import django
from faker import Faker
from datetime import datetime, timedelta
import random
# Setup Django
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'config.settings')
django.setup()
# Import models (adjust based on your app structure)
# from myapp.models import User, Post, Comment
fake = Faker()
def clear_data():
"""Clear existing data (optional)"""
print("Clearing existing data...")
# Comment.objects.all().delete()
# Post.objects.all().delete()
# User.objects.all().delete()
print("â Data cleared")
def seed_users(count=USER_COUNT):
"""Generate user data"""
print(f"Creating {count} users...")
users = []
for i in range(count):
user = User.objects.create(
username=fake.user_name(),
email=fake.email(),
first_name=fake.first_name(),
last_name=fake.last_name(),
bio=fake.paragraph(),
date_of_birth=fake.date_of_birth(minimum_age=18, maximum_age=80),
is_active=random.choice([True] * 9 + [False]), # 90% active
)
users.append(user)
if i % 10 == 0:
print(f" Created {i}/{count} users")
print(f"â Created {len(users)} users")
return users
def seed_posts(users, count=POST_COUNT):
"""Generate post data"""
print(f"Creating {count} posts...")
posts = []
for i in range(count):
post = Post.objects.create(
title=fake.sentence(),
content=fake.paragraphs(nb=3, ext_word_list=None),
slug=fake.slug(),
author=random.choice(users),
published=random.choice([True] * 7 + [False] * 3), # 70% published
published_at=fake.date_time_between(start_date='-1y', end_date='now'),
view_count=random.randint(0, 10000),
)
posts.append(post)
# Add tags
tags = [fake.word() for _ in range(random.randint(1, 5))]
for tag_name in tags:
tag, _ = Tag.objects.get_or_create(name=tag_name)
post.tags.add(tag)
if i % 50 == 0:
print(f" Created {i}/{count} posts")
print(f"â Created {len(posts)} posts")
return posts
def seed_comments(users, posts, count=COMMENT_COUNT):
"""Generate comment data"""
print(f"Creating {count} comments...")
for i in range(count):
Comment.objects.create(
content=fake.paragraph(),
author=random.choice(users),
post=random.choice(posts),
created_at=fake.date_time_between(start_date='-1y', end_date='now'),
)
if i % 100 == 0:
print(f" Created {i}/{count} comments")
print(f"â Created {count} comments")
def main():
print("ð± Seeding database...")
print()
# Clear data (optional)
# clear_data()
# print()
# Seed data
users = seed_users(USER_COUNT)
posts = seed_posts(users, POST_COUNT)
seed_comments(users, posts, COMMENT_COUNT)
print()
print("â
Seeding completed!")
print(f" Users: {len(users)}")
print(f" Posts: {len(posts)}")
print(f" Comments: {COMMENT_COUNT}")
if __name__ == '__main__':
main()
PYTHON
# Update with actual counts
sed -i "s/USER_COUNT/${USER_COUNT}/g" "$SEED_DIR/seed_data.py"
sed -i "s/POST_COUNT/${POST_COUNT}/g" "$SEED_DIR/seed_data.py"
sed -i "s/COMMENT_COUNT/${COMMENT_COUNT}/g" "$SEED_DIR/seed_data.py"
echo "â Created Django seed script: $SEED_DIR/seed_data.py"
echo ""
echo "Run: python $SEED_DIR/seed_data.py"
}
generate_typeorm_seed() {
cat > "$SEED_DIR/seed.ts" << 'TYPESCRIPT'
import { DataSource } from 'typeorm';
import { faker } from '@faker-js/faker';
// Import your entities
// import { User } from './entities/User';
// import { Post } from './entities/Post';
// import { Comment } from './entities/Comment';
const dataSource = new DataSource({
// Your database configuration
type: 'postgres',
host: process.env.DB_HOST || 'localhost',
port: parseInt(process.env.DB_PORT || '5432'),
username: process.env.DB_USER || 'postgres',
password: process.env.DB_PASSWORD || 'postgres',
database: process.env.DB_NAME || 'myapp',
entities: [User, Post, Comment],
synchronize: false,
});
async function seed() {
console.log('ð± Seeding database...');
await dataSource.initialize();
// Clear existing data (optional)
console.log('Clearing existing data...');
await dataSource.getRepository(Comment).clear();
await dataSource.getRepository(Post).clear();
await dataSource.getRepository(User).clear();
// Seed Users
console.log('Creating users...');
const userRepository = dataSource.getRepository(User);
const users = [];
for (let i = 0; i < USER_COUNT; i++) {
const user = userRepository.create({
email: faker.internet.email(),
name: faker.person.fullName(),
username: faker.internet.userName(),
bio: faker.lorem.paragraph(),
avatar: faker.image.avatar(),
dateOfBirth: faker.date.past({ years: 30 }),
isActive: faker.datatype.boolean(0.9),
});
users.push(user);
if (i % 10 === 0) {
console.log(` Created ${i}/${USER_COUNT} users`);
}
}
await userRepository.save(users);
// Seed Posts
console.log('Creating posts...');
const postRepository = dataSource.getRepository(Post);
const posts = [];
for (let i = 0; i < POST_COUNT; i++) {
const post = postRepository.create({
title: faker.lorem.sentence(),
content: faker.lorem.paragraphs(3),
slug: faker.helpers.slugify(faker.lorem.words(5)),
published: faker.datatype.boolean(0.7),
publishedAt: faker.date.past({ years: 1 }),
viewCount: faker.number.int({ min: 0, max: 10000 }),
author: faker.helpers.arrayElement(users),
});
posts.push(post);
if (i % 50 === 0) {
console.log(` Created ${i}/${POST_COUNT} posts`);
}
}
await postRepository.save(posts);
// Seed Comments
console.log('Creating comments...');
const commentRepository = dataSource.getRepository(Comment);
const comments = [];
for (let i = 0; i < COMMENT_COUNT; i++) {
const comment = commentRepository.create({
content: faker.lorem.paragraph(),
author: faker.helpers.arrayElement(users),
post: faker.helpers.arrayElement(posts),
createdAt: faker.date.past({ years: 1 }),
});
comments.push(comment);
if (i % 100 === 0) {
console.log(` Created ${i}/${COMMENT_COUNT} comments`);
}
}
await commentRepository.save(comments);
console.log('â
Seeding completed!');
console.log(` Users: ${users.length}`);
console.log(` Posts: ${posts.length}`);
console.log(` Comments: ${comments.length}`);
await dataSource.destroy();
}
seed()
.catch((error) => {
console.error('â Seeding failed:', error);
process.exit(1);
});
TYPESCRIPT
# Update with actual counts
sed -i "s/USER_COUNT/${USER_COUNT}/g" "$SEED_DIR/seed.ts"
sed -i "s/POST_COUNT/${POST_COUNT}/g" "$SEED_DIR/seed.ts"
sed -i "s/COMMENT_COUNT/${COMMENT_COUNT}/g" "$SEED_DIR/seed.ts"
echo "â Created TypeORM seed script: $SEED_DIR/seed.ts"
echo ""
echo "Run: ts-node $SEED_DIR/seed.ts"
}
# Generate appropriate seed script
case "$ORM" in
prisma)
generate_prisma_seed
;;
django)
generate_django_seed
;;
typeorm)
generate_typeorm_seed
;;
sqlalchemy)
# Similar to Django
echo "ð¡ SQLAlchemy seed script similar to Django pattern"
generate_django_seed
;;
sequelize)
# Similar to TypeORM
echo "ð¡ Sequelize seed script similar to TypeORM pattern"
generate_typeorm_seed
;;
esac
Phase 3: Realistic Data Patterns
I’ll document common data generation patterns:
cat > "$SEED_DIR/faker-patterns.md" << 'PATTERNS'
# Faker Data Generation Patterns
## Common Field Types
### User/Person Data
```javascript
// JavaScript (Faker.js)
{
email: faker.internet.email(),
username: faker.internet.userName(),
password: faker.internet.password(),
firstName: faker.person.firstName(),
lastName: faker.person.lastName(),
fullName: faker.person.fullName(),
avatar: faker.image.avatar(),
bio: faker.lorem.paragraph(),
dateOfBirth: faker.date.birthdate({ min: 18, max: 80, mode: 'age' }),
phone: faker.phone.number(),
address: faker.location.streetAddress(),
city: faker.location.city(),
country: faker.location.country(),
zipCode: faker.location.zipCode(),
}
# Python (Faker)
{
'email': fake.email(),
'username': fake.user_name(),
'password': fake.password(),
'first_name': fake.first_name(),
'last_name': fake.last_name(),
'name': fake.name(),
'bio': fake.paragraph(),
'date_of_birth': fake.date_of_birth(minimum_age=18, maximum_age=80),
'phone': fake.phone_number(),
'address': fake.street_address(),
'city': fake.city(),
'country': fake.country(),
'zip_code': fake.zipcode(),
}
Content Data
{
title: faker.lorem.sentence(),
slug: faker.helpers.slugify(faker.lorem.words(5)),
content: faker.lorem.paragraphs(3),
excerpt: faker.lorem.paragraph(),
tags: Array.from({ length: 3 }, () => faker.word.noun()),
category: faker.helpers.arrayElement(['Tech', 'Science', 'Arts']),
publishedAt: faker.date.past(),
viewCount: faker.number.int({ min: 0, max: 10000 }),
}
Business Data
{
companyName: faker.company.name(),
jobTitle: faker.person.jobTitle(),
department: faker.commerce.department(),
productName: faker.commerce.productName(),
price: faker.commerce.price(),
currency: faker.finance.currencyCode(),
creditCard: faker.finance.creditCardNumber(),
iban: faker.finance.iban(),
}
Media Data
{
imageUrl: faker.image.url(),
avatar: faker.image.avatar(),
fileName: faker.system.fileName(),
mimeType: faker.system.mimeType(),
fileExtension: faker.system.fileExt(),
}
Relationships
One-to-Many
// Create parent first
const users = [];
for (let i = 0; i < 50; i++) {
users.push(await createUser());
}
// Then create children with random parent
for (let i = 0; i < 200; i++) {
await createPost({
authorId: faker.helpers.arrayElement(users).id
});
}
Many-to-Many
// Create both sides
const posts = await createPosts(100);
const tags = await createTags(20);
// Create junction entries
for (const post of posts) {
const randomTags = faker.helpers.arrayElements(tags, { min: 1, max: 5 });
await post.addTags(randomTags);
}
Realistic Distributions
Boolean with Probability
// 90% true, 10% false
isActive: faker.datatype.boolean(0.9)
// 70% published
published: faker.helpers.arrayElement([true, true, true, true, true, true, true, false, false, false])
Weighted Random Selection
// More common values appear more often
status: faker.helpers.arrayElement([
'active', 'active', 'active', 'active', 'active', // 50%
'pending', 'pending', 'pending', // 30%
'inactive', 'inactive' // 20%
])
Date Ranges
// Past year
createdAt: faker.date.past({ years: 1 })
// Between dates
updatedAt: faker.date.between({ from: '2023-01-01', to: '2024-01-01' })
// Recent (last 10 days)
lastLogin: faker.date.recent({ days: 10 })
Performance Tips
Batch Inserts
// Better: Batch create
const users = Array.from({ length: 1000 }, () => createUserData());
await prisma.user.createMany({ data: users });
// Slower: Individual creates
for (let i = 0; i < 1000; i++) {
await prisma.user.create({ data: createUserData() });
}
Transaction Batching
// Process in chunks for large datasets
const BATCH_SIZE = 100;
for (let i = 0; i < totalCount; i += BATCH_SIZE) {
await prisma.$transaction(
Array.from({ length: BATCH_SIZE }, () =>
prisma.user.create({ data: createUserData() })
)
);
}
PATTERNS
echo “â Created Faker patterns guide: $SEED_DIR/faker-patterns.md”
## Phase 4: Seed Data Execution
```bash
echo ""
echo "=== Ready to Seed Database ==="
echo ""
echo "ð Generated Files:"
ls -lh "$SEED_DIR/"
echo ""
echo "ð Configuration:"
echo " Data Volume: $VOLUME"
echo " User Count: $USER_COUNT"
echo " Proportional related data"
echo ""
echo "ð Next Steps:"
echo ""
echo "1. Review generated seed script"
echo "2. Customize field mappings for your schema"
echo "3. Install dependencies:"
case "$ORM" in
prisma|typeorm|sequelize)
echo " npm install --save-dev @faker-js/faker"
echo " npm install --save-dev ts-node"
;;
django|sqlalchemy)
echo " pip install Faker"
;;
esac
echo ""
echo "4. Run seed script:"
case "$ORM" in
prisma)
echo " npx prisma db seed"
;;
django)
echo " python $SEED_DIR/seed_data.py"
;;
typeorm|sequelize|sqlalchemy)
echo " ts-node $SEED_DIR/seed.ts"
;;
esac
echo ""
echo "â ï¸ Important:"
echo " - Test on development database first"
echo " - Clear existing data if needed"
echo " - Adjust field mappings to match your schema"
echo " - Consider foreign key constraints order"
echo ""
echo "ð¡ Integration Points:"
echo " - /schema-validate - Verify schema before seeding"
echo " - /test - Test application with seed data"
echo " - /migration-generate - Ensure migrations applied"
Summary
echo ""
echo "=== â Seed Data Generation Complete ==="
echo ""
echo "ð Seed Directory: $SEED_DIR"
echo ""
echo "ð Generated:"
echo " - Seed script with realistic data"
echo " - Faker pattern examples"
echo " - Configuration for $VOLUME volume"
echo ""
echo "ð¯ Data Counts:"
echo " - Users: $USER_COUNT"
echo " - Related entities: Proportional"
echo ""
echo "View patterns: cat $SEED_DIR/faker-patterns.md"
Safety Guarantees
What I’ll NEVER do:
- Run seed scripts on production databases
- Overwrite production data without explicit confirmation
- Generate sensitive data (passwords, real credit cards)
- Skip foreign key constraint validation
What I WILL do:
- Generate realistic, safe test data
- Maintain referential integrity
- Provide clear execution instructions
- Support multiple data volumes
- Use industry-standard Faker libraries
Credits
This skill is based on:
- Faker.js – JavaScript fake data generator
- Faker (Python) – Python fake data library
- Prisma Seeding – Official Prisma seeding patterns
- Django Fixtures – Django test data patterns
- Database Testing Best Practices – Realistic test data generation
Token Budget
Target: 2,000-3,500 tokens per execution
- Phase 1: ~600 tokens (schema analysis + detection)
- Phase 2: ~1,200 tokens (seed script generation)
- Phase 3-4: ~800 tokens (patterns + execution guide)
Optimization Strategy:
- Use Grep for schema discovery
- Template-based script generation
- Framework-specific patterns
- Comprehensive documentation
- Clear execution instructions
This ensures realistic seed data generation across all major ORMs while maintaining data integrity and providing flexible volume configuration.