skill-validator

📁 salmanferozkhan/cloud-and-fast-api 📅 7 days ago
1
总安装量
1
周安装量
#48287
全站排名
安装命令
npx skills add https://github.com/salmanferozkhan/cloud-and-fast-api --skill skill-validator

Agent 安装分布

cline 1
openclaw 1
trae 1
trae-cn 1
opencode 1

Skill 文档

Skill Validator

Validate any skill against production-level quality criteria.

Validation Workflow

Phase 1: Gather Context

  1. Read the skill’s SKILL.md completely
  2. Identify skill type from frontmatter description:
    • Builder skill (creates artifacts)
    • Guide skill (provides instructions)
    • Automation skill (executes workflows)
    • Analyzer skill (extracts insights)
    • Validator skill (enforces quality)
    • Hybrid skill (combination of above)
  3. Read all reference files in references/ directory
  4. Check for assets/scripts directories
  5. Note frontmatter fields (name, description, allowed-tools, model)

Phase 2: Apply Criteria

Evaluate against 9 criteria categories. Each criterion scores 0-3:

  • 0: Missing/Absent
  • 1: Present but inadequate
  • 2: Adequate implementation
  • 3: Excellent implementation

Criteria Categories

1. Structure & Anatomy (Weight: 12%)

Criterion What to Check
SKILL.md exists Root file present
Line count <500 lines (context is precious)
Frontmatter complete name and description present in YAML
Name constraints Lowercase, numbers, hyphens only; ≤64 chars; matches directory
Description format [What] + [When] format; ≤1024 chars
Description style Third-person: “This skill should be used when…”
No extraneous files No README.md, CHANGELOG.md, LICENSE in skill dir
Progressive disclosure Details in references/, not bloated SKILL.md
Asset organization Templates in assets/, scripts in scripts/
Large file guidance If references >10k words, grep patterns in SKILL.md

Fail condition: Missing SKILL.md or >800 lines = automatic fail

2. Content Quality (Weight: 15%)

Criterion What to Check
Conciseness No verbose explanations, context is public good
Imperative form Instructions use “Do X” not “You should do X”
Appropriate freedom Constraints where needed, flexibility where safe
Scope clarity Clear what skill does AND does not do
No hallucination risk No instructions that encourage making up info
Output specification Clear expected outputs defined

3. User Interaction (Weight: 12%)

Criterion What to Check
Clarification triggers Asks questions before acting on ambiguity
Required vs optional Distinguishes must-know from nice-to-know
Graceful handling What to do when user doesn’t answer
No over-asking Doesn’t ask obvious or inferrable questions
Question pacing Avoids too many questions in single message
Context awareness Uses available context before asking

Key pattern to look for:

## Required Clarifications
1. Question about X
2. Question about Y

## Optional Clarifications
3. Question about Z (if relevant)

Note: Avoid asking too many questions in a single message.

4. Documentation & References (Weight: 10%)

Criterion What to Check
Source URLs Official documentation links provided
Reference files Complex details in references/ not main file
Fetch guidance Instructions to fetch docs for unlisted patterns
Version awareness Notes about checking for latest patterns
Example coverage Good/bad examples for key patterns

Key pattern to look for:

| Resource | URL | Use For |
|----------|-----|---------|
| Official Docs | https://... | Complex cases |

5. Domain Standards (Weight: 10%)

Criterion What to Check
Best practices Follows domain conventions (e.g., WCAG, OWASP)
Enforcement mechanism Checklists, validation steps, must-verify items
Anti-patterns Lists what NOT to do
Quality gates Output checklist before delivery

Key pattern to look for:

### Must Follow
- [ ] Requirement 1
- [ ] Requirement 2

### Must Avoid
- Antipattern 1
- Antipattern 2

6. Technical Robustness (Weight: 8%)

Criterion What to Check
Error handling Guidance for failure scenarios
Security considerations Input validation, secrets handling if relevant
Dependencies External tools/APIs documented
Edge cases Common edge cases addressed
Testability Can outputs be verified?

7. Maintainability (Weight: 8%)

Criterion What to Check
Modularity References are self-contained topics
Update path Easy to update when standards change
No hardcoded values Uses placeholders/variables where appropriate
Clear organization Logical section ordering

8. Zero-Shot Implementation (Weight: 12%)

Skills should enable single-interaction implementation with embedded expertise.

Criterion What to Check
Before Implementation section Context gathering guidance present
Codebase context Guidance to scan existing structure/patterns
Conversation context Uses discussed requirements/decisions
Embedded expertise Domain knowledge in references/, not runtime discovery
User-only questions Only asks for USER requirements, not domain knowledge

Key pattern to look for:

## Before Implementation

Gather context to ensure successful implementation:

| Source | Gather |
|--------|--------|
| **Codebase** | Existing structure, patterns, conventions |
| **Conversation** | User's specific requirements |
| **Skill References** | Domain patterns from `references/` |
| **User Guidelines** | Project-specific conventions |

Red flag: Skill instructs to “research” or “discover” domain knowledge at runtime instead of embedding it.

9. Reusability (Weight: 13%)

Skills should handle variations, not single requirements.

Criterion What to Check
Handles variations Not hardcoded to single use case
Variable elements Clarifications capture what VARIES
Constant patterns Domain best practices encoded as constants
Not requirement-specific Avoids hardcoded data, tools, configs
Abstraction level Appropriate generalization for domain

Good example:

"Create visualizations - adaptable to data shape, chart type, library"

Bad example (too specific):

"Create bar chart with sales data using Recharts"

Key check: Does the skill work for multiple use cases within its domain?


Type-Specific Validation

After scoring general criteria, verify type-specific requirements:

Type Must Have
Builder Clarifications, Output Spec, Domain Standards, Output Checklist
Guide Workflow Steps, Examples (Good/Bad), Official Docs links
Automation Scripts in scripts/, Dependencies, Error Handling, I/O Spec
Analyzer Analysis Scope, Evaluation Criteria, Output Format, Synthesis
Validator Quality Criteria, Scoring Rubric, Thresholds, Remediation

Scoring: Deduct 10 points if type-specific requirements missing for identified type.


Scoring Guide

Category Scores

Calculate each category score:

Category Score = (Sum of criterion scores) / (Max possible) * 100

Overall Score

Overall = Σ(Category Score × Weight)

Rating Thresholds

Score Rating Meaning
90-100 Production Ready for wide use
75-89 Good Minor improvements needed
60-74 Adequate Functional but needs work
40-59 Developing Significant gaps
0-39 Incomplete Major rework required

Output Format

Generate validation report:

# Skill Validation Report: [skill-name]

**Rating**: [Production/Good/Adequate/Developing/Incomplete]
**Overall Score**: [X]/100

## Summary
[2-3 sentence assessment]

## Category Scores

| Category | Score | Weight | Weighted |
|----------|-------|--------|----------|
| Structure & Anatomy | X/100 | 12% | X |
| Content Quality | X/100 | 15% | X |
| User Interaction | X/100 | 12% | X |
| Documentation | X/100 | 10% | X |
| Domain Standards | X/100 | 10% | X |
| Technical Robustness | X/100 | 8% | X |
| Maintainability | X/100 | 8% | X |
| Zero-Shot Implementation | X/100 | 12% | X |
| Reusability | X/100 | 13% | X |
| **Type-Specific Deduction** | -X | - | -X |

## Critical Issues (if any)
- [Issue requiring immediate fix]

## Improvement Recommendations
1. **High Priority**: [Specific action]
2. **Medium Priority**: [Specific action]
3. **Low Priority**: [Specific action]

## Strengths
- [What skill does well]

Quick Validation Checklist

For rapid assessment, check these critical items:

Structure & Frontmatter

  • SKILL.md <500 lines
  • Frontmatter: name (≤64 chars, lowercase, hyphens) + description (≤1024 chars)
  • Description uses third-person style (“This skill should be used when…”)
  • No README.md/CHANGELOG.md in skill directory

Content & Interaction

  • Has clarification questions (Required vs Optional)
  • Has output specification
  • Has official documentation links

Zero-Shot & Reusability

  • Has “Before Implementation” section (context gathering)
  • Domain expertise embedded in references/ (not runtime discovery)
  • Handles variations (not requirement-specific)

Type-Specific (check based on skill type)

  • Builder: Clarifications + Output Spec + Standards + Checklist
  • Guide: Workflow + Examples + Docs
  • Automation: Scripts + Dependencies + Error Handling
  • Analyzer: Scope + Criteria + Output Format
  • Validator: Criteria + Scoring + Thresholds + Remediation

If 10+ checked: Likely Production (90+) If 7-9 checked: Likely Good (75-89) If 5-6 checked: Likely Adequate (60-74) If <5 checked: Needs significant work


Reference Files

File When to Read
references/detailed-criteria.md Deep evaluation of specific criterion
references/scoring-examples.md Example validations for calibration
references/improvement-patterns.md Common fixes for common issues

Usage Examples

Validate a skill

Validate the chatgpt-widget-creator skill against production criteria

Quick audit

Quick validation check on mcp-builder skill

Focused review

Check if skill-creator skill has proper user interaction patterns