skill-creator-v2
npx skills add https://github.com/connorads/dotfiles --skill skill-creator-v2
Agent 安装分布
Skill 文档
Skill Creator v2: Principle-Driven Skill Design
This skill guides you through creating skills that transfer expertise, not just list instructions.
The 4 Core Truths
Every skill you create must embody these principles:
| Truth | What It Means | In Practice |
|---|---|---|
| Expertise Transfer, Not Instructions | Make Claude think like an expert, not follow steps | Teach mental models and decision frameworks, not checklists |
| Flow, Not Friction | Produce output, not intermediate documents | Go straight to deliverables – no “now write a plan” steps |
| Voice Matches Domain | Sound like a practitioner, not documentation | Use domain language naturally, avoid meta-commentary |
| Focused Beats Comprehensive | Constrain ruthlessly | Every section must justify its token cost |
When to Use This Skill
Use this skill when:
- Creating a new skill from scratch
- A skill isn’t performing well and needs rethinking
- You want to understand why certain patterns work
- You need to make tough trade-offs about what to include
Don’t use this for:
- Quick iterations on working skills (use the standard skill-creator)
- Just packaging an existing skill (use packaging scripts directly)
The 10-Step Process
UNDERSTAND the problem
â
EXPLORE Claude's failures
â
RESEARCH domain expertise
â
SYNTHESIZE principles
â
DRAFT the skill
â
SELF-CRITIQUE rigorously
â
ITERATE on feedback
â
TEST on real scenarios
â
FINALIZE structure
â
PACKAGE for distribution
1. UNDERSTAND â What skill? What problem?
Start by crystallizing the core need:
- What specific capability gap exists? (Not “documentation” but “Claude rewrites the same 50-line parsing script every time”)
- What does success look like? (Concrete examples of before/after)
- Who benefits and how? (Time saved? Quality improved? Consistency achieved?)
Good examples:
- “Engineers spend 20 minutes each time formatting API responses to match our schema”
- “Claude generates valid SQL but doesn’t know our table relationships”
- “We need Claude to follow our 47-point brand guidelines without a wall of text”
Bad examples:
- “Make Claude better at X” (too vague)
- “Help with documents” (too broad)
Output: A crisp problem statement you could explain in 30 seconds.
2. EXPLORE â See where Claude fails without guidance
Critical step – don’t skip this. You need to observe the failure mode, not imagine it.
Try the task without the skill:
- Give Claude a representative request
- Note where it struggles, hesitates, or produces suboptimal output
- Try 3-5 variations to see if it’s consistent
Document:
- What did Claude do wrong?
- What knowledge was it missing?
- What did it waste time on?
- When did it ask for clarification vs. guess?
Example observations:
- “Claude generated working code but used pandas when our stack is polars”
- “Claude wrote a 200-line form instead of using our 10-line template”
- “Claude asked what format we wanted – it should know we always use ISO 8601”
Why this matters: You’re designing for actual failure modes, not theoretical ones. This prevents over-specifying (wasting tokens) or under-specifying (skill doesn’t help).
3. RESEARCH â Go deep on the domain
Now that you know what fails, understand why success looks like and what experts know.
For technical domains:
- What mental models do experts use?
- What are the key decision points?
- What patterns repeat across scenarios?
- What’s stable vs. what varies?
For workflow domains:
- What’s the expert’s internal checklist?
- What do they check for quality?
- What shortcuts do they know?
- What mistakes do novices make?
Research sources:
- Interview domain experts
- Review high-quality examples
- Read practitioner documentation (not beginner tutorials)
- Analyze your own expert behavior
Output: A list of insights that would make Claude competent, not just capable.
4. SYNTHESIZE â Extract principles from research
Transform observations into teachable principles.
Pattern recognition:
- What do all good examples have in common?
- What varies based on context?
- What rules have exceptions? (Document both)
- What can be expressed as “if X then Y”?
Compression:
- Can 5 bullet points become 1 principle?
- Can 3 examples show a pattern?
- Can a decision tree replace prose?
Example synthesis:
Raw research:
- Example 1 uses indentation for hierarchy
- Example 2 uses bullet points for parallel items
- Example 3 uses numbered lists for sequences
- Example 4 combines all three appropriately
Synthesized principle:
"Match structure to meaning: indent for hierarchy, bullets for parallelism, numbers for sequence."
Output: Distilled principles that transfer expertise, not just information.
5. DRAFT â Write initial skill
Now you can draft. Structure using progressive disclosure:
skill-name/
âââ SKILL.md # Core workflow + principles (<500 lines)
â âââ YAML frontmatter (name, description)
â âââ Essential instructions
âââ references/ # Deep knowledge (loaded as needed)
â âââ patterns.md
â âââ examples.md
âââ scripts/ # Executable code
â âââ helper.py
âââ assets/ # Templates, not documentation
âââ template.json
Frontmatter:
---
name: skill-name
description: >
What the skill does + when to trigger it. Be specific about use cases.
Good: "Process medical transcripts following HIPAA guidelines, including
de-identification and structured output formatting."
Bad: "Help with medical documents."
---
SKILL.md structure:
# Skill Name
[One paragraph: what problem this solves]
## Core Approach
[The mental model - how an expert thinks about this domain]
## When [Most Common Scenario]
[Direct instructions using imperative form]
[Include only essential decision points]
[Reference detailed guides in references/ as needed]
## When [Second Most Common Scenario]
[...]
## Quality Checks
[What good output looks like - help Claude self-evaluate]
Key drafting principles:
- Imperative mood: “Extract text” not “You should extract text”
- Example over explanation: Show one good example > describe in prose
- Decision points explicit: “If X then Y, otherwise Z”
- Offload detail: “See references/advanced.md” not “here’s 500 words”
- Quality criteria: Help Claude know when it’s done
6. SELF-CRITIQUE â Review against quality criteria
Now be ruthlessly critical. For each section, ask:
Expertise Transfer Test:
- Does this make Claude think like an expert or just act like one?
- Would a domain expert recognize this approach?
- Are we teaching patterns or prescribing steps?
Flow Test:
- Can Claude go straight to output?
- Do we force intermediate artifacts Claude doesn’t need?
- Would an expert work this way?
Voice Test:
- Does this sound like domain documentation or AI instructions?
- Would a practitioner say “execute step 3” or just do it?
- Are we narrating process or enabling work?
Focus Test:
- Can we delete this section and still succeed? (If yes, delete it)
- Is this addressing observed failure modes? (If no, question it)
- Could this be a one-line reference instead?
Token Efficiency Test:
- Would this information surprise Claude? (If no, cut it)
- Is this stable knowledge vs. variable context?
- Should this be in SKILL.md or references/?
Progressive Disclosure Test:
- Do we force-load information Claude might not need?
- Can variations go in separate reference files?
- Are scripts documented or just executable?
Red flags:
- Phrases like “you should”, “make sure to”, “don’t forget”
- Apologetic language: “This might seem complex but…”
- Meta-commentary: “The next step is to…”
- Over-specification: Dictating every detail when heuristics suffice
- Under-specification: Vague guidance on fragile operations
7. ITERATE â Fix gaps, get feedback, improve
Testing approaches:
- Yourself: Use the skill on real scenarios – note friction
- Others: Have someone else try it – watch where they struggle
- Claude: Have Claude use it – monitor for confusion or errors
Feedback loop:
Use skill â Note issue â Hypothesis on why â Update skill â Test again
Common improvements:
- Too much guidance: Claude over-thinks â Trust Claude more, delete text
- Too little guidance: Claude guesses wrong â Add decision framework
- Wrong abstraction: Scenarios don’t map cleanly â Reorganize around real patterns
- Missing context: Claude lacks key info â Move knowledge from your head to references/
Iteration triggers:
- Claude asks questions that the skill should answer
- Output quality varies unexpectedly
- Claude ignores parts of the skill
- You find yourself adding ad-hoc instructions in chat
8. TEST â Use skill on a real scenario
Final validation with a realistic task:
- Pick a task that’s representative but not used during design
- Use the skill as if you’re a new user
- Document:
- Did it work on first try?
- Where did Claude hesitate?
- What did you need to clarify?
- Was anything missing?
- Was anything unused?
Success criteria:
- Claude produces quality output without ad-hoc guidance
- The skill handles expected variations naturally
- Token usage is reasonable (<2000 tokens loaded typically)
- Claude doesn’t ask questions the skill should answer
9. FINALIZE â Codify into optimal structure
Polish for production:
SKILL.md:
- Remove TODOs and draft artifacts
- Verify all references exist and are linked
- Ensure imperative mood throughout
- Trim any remaining cruft
References:
- Organize by use case (not by type)
- Add grep-able patterns for large files
- Include only what Claude might need
Scripts:
- Test thoroughly
- Include docstrings (Claude might read them)
- Consider: should this be a reference instead?
Assets:
- Only include files that are used in output
- Remove examples/samples unless they’re templates
Validation:
# Use the packaging script - it validates automatically
scripts/package_skill.py /path/to/skill-folder
10. PACKAGE â Share the skill
Once validated:
# Creates skill-name.skill file
scripts/package_skill.py /path/to/skill-folder /output/directory
The .skill file is a zip with a .skill extension containing your complete skill structure.
Common Skill Patterns
High-Level Guide with Deep References
When: Complex domain with many variations
SKILL.md:
## Core workflow
[Essential steps + decision points]
For detailed guidance:
- **Pattern library:** See references/patterns.md
- **Full examples:** See references/examples.md
- **API reference:** See references/api.md
Benefit: SKILL.md stays focused; Claude loads depth only when needed
Script-Heavy with Minimal Instructions
When: Fragile operations requiring exact execution
SKILL.md:
## Rotating PDFs
Use scripts/rotate_pdf.py:
\`\`\`bash
python scripts/rotate_pdf.py input.pdf --angle 90 --output rotated.pdf
\`\`\`
The script handles edge cases and validation.
Benefit: Deterministic, token-efficient, no “reinventing the wheel”
Template-Based with Examples
When: Output follows a strict format
SKILL.md:
## Report format
ALWAYS use this structure:
# [Title]
## Executive Summary
[One paragraph]
## Key Findings
- Finding 1 [with data]
- Finding 2 [with data]
## Recommendations
1. Specific action
2. Specific action
Benefit: Claude produces consistent output without guessing
Anti-Patterns to Avoid
| Anti-Pattern | Why It’s Bad | Fix |
|---|---|---|
| Novel-length SKILL.md | Wastes tokens, hard to maintain | Split into references/ |
| Step-by-step recipes | Makes Claude mechanical, not thoughtful | Teach principles |
| Generic advice | Claude already knows this | Only include novel info |
| Assuming incompetence | Over-explains, wastes tokens | Trust Claude’s base knowledge |
| README/CHANGELOG/etc | AI doesn’t need meta-documentation | Delete these files |
| Loading everything upfront | Wastes context | Use progressive disclosure |
| Vague triggering | Skill loaded when not needed | Be specific in description |
Degrees of Freedom Framework
Match instruction specificity to task fragility:
High Freedom (General Guidance)
- Use when: Multiple valid approaches exist
- Format: Principles + examples
- Example: “Write engaging product copy”
Medium Freedom (Preferred Patterns)
- Use when: Best practices exist but context varies
- Format: Decision framework + examples
- Example: “Structure SQL queries for readability”
Low Freedom (Exact Execution)
- Use when: Operations are fragile or compliance-critical
- Format: Scripts or strict templates
- Example: “Fill IRS tax forms”
Working with Existing Skills
To improve a skill:
- Use it on real tasks – note where it fails
- Check: Is this a skill problem or wrong use case?
- Run through EXPLORE phase again
- Apply targeted fixes (resist full rewrites)
- Test that fixes don’t break existing use cases
To merge skills:
Only if they share 80%+ overlap. Otherwise keep separate – Claude can use multiple skills.
To split a skill:
When SKILL.md exceeds 500 lines or covers truly distinct workflows. Split at natural boundaries, update descriptions.
Quick Reference: Skill Creation Checklist
- Problem clearly defined with concrete examples
- Observed Claude’s actual failure modes (not assumed)
- Researched domain to extract expert mental models
- Synthesized principles, not just collected facts
- SKILL.md under 500 lines
- Description is specific about when to trigger
- Imperative mood throughout
- Progressive disclosure: SKILL.md â references â scripts
- Every section justified by observed need
- Tested on realistic scenarios
- No README, CHANGELOG, or meta-docs
- Packaged and validated
References
For detailed patterns:
- Workflow patterns: See the standard skill-creator’s references/workflows.md
- Output patterns: See the standard skill-creator’s references/output-patterns.md
Tools
Use the standard skill-creator scripts:
scripts/init_skill.py– Initialize skill structurescripts/package_skill.py– Validate and packagescripts/quick_validate.py– Check structure only
Final Wisdom
The best skill is the one that disappears. When Claude uses it, it should feel like Claude “just knows” how to do the task – not like it’s following a manual.
If your skill makes Claude sound like it’s reading instructions, you’ve created friction. If Claude sounds like a domain expert who happens to have expertise in this area, you’ve transferred expertise.
That’s the difference.