code-surgeon
npx skills add https://github.com/baagad-ai/code-surgeon --skill code-surgeon
Overview
code-surgeon is an orchestrator skill that transforms GitHub issues or plain text requirements into comprehensive implementation plans with surgical prompts: precise, file-by-file instructions that tell AI agents exactly what code to change, where, why, and how.
Core principle: Turn ambiguous requirements into unambiguous implementation guidance by deeply understanding the codebase, team conventions, and architectural constraints.
When to Use
Use code-surgeon when you:
- Have a GitHub issue or requirement you want to implement
- Need a step-by-step implementation plan before coding
- Want surgical prompts (precise, targeted changes) rather than vague guidance
- Are implementing across multiple files with dependencies
- Have a large or unfamiliar codebase you need to understand first
- Want to hand off well-structured work to another AI agent
When NOT to use:
- Single-file, obviously-scoped changes (e.g., “fix typo in README”)
- Emergency hotfixes requiring immediate coding
- Repositories with no structure (random files, no patterns)
- Highly proprietary code you can’t share with analysis
Before Running code-surgeon: Assess Your Situation
Depth Mode Decision Framework
Ask yourself these questions to choose the right depth mode:
1. Scope Clarity
Question: Can you articulate the change in one sentence?
YES → requirement is clear → proceed
NO → requirement is vague → define it first, then return
2. File Impact Assessment
Question: How many files will this change affect?
1-3 files → QUICK mode is appropriate (5 min, $0.04)
5-8 files → STANDARD mode is appropriate (15 min, $0.10) ← default
8+ files or uncertain → STANDARD mode (let codebase analysis guide you)
3. Risk Assessment
Question: What’s the risk level if something breaks?
LOW RISK (bug fix in isolated module):
└─ QUICK mode (5 min, 85% accuracy, $0.04)
MEDIUM RISK (new feature affecting multiple areas):
└─ STANDARD mode (15 min, 95% accuracy, $0.10) ← default
HIGH RISK (architectural change, security, payment flow):
└─ DEEP mode (30 min, 99% accuracy, $0.17)
MAXIMUM UNCERTAINTY (unfamiliar codebase):
└─ DEEP mode (get comprehensive understanding first)
4. Breaking Change Impact
Question: Could this change break existing behavior for users?
NO → QUICK mode acceptable
MAYBE → STANDARD mode
YES → DEEP mode (comprehensive breaking change analysis)
5. Time vs. Accuracy Tradeoff
Question: How much time can you invest?
<10 minutes available → QUICK
15-20 minutes available → STANDARD ← default
30+ minutes available → DEEP (for complex changes)
Recommendation Logic:
IF requirement is unclear:
└─ Stop and define it first
ELSE IF isolated, low-risk bug fix:
└─ QUICK mode
ELSE IF you're uncertain about scope or risk:
└─ STANDARD mode (this is the safe default)
ELSE IF architectural/security/broad-impact change:
└─ DEEP mode
How It Works
Entry Point
/code-surgeon <requirement> [--depth=mode] [--resume=session-id]
Arguments:
- <requirement> – GitHub issue URL OR plain text description
- --depth=mode – QUICK (5min), STANDARD (15min), or DEEP (30min) [default: STANDARD]
- --resume=session-id – Resume interrupted session
Example:
/code-surgeon "Add JWT token refresh to authentication flow"
/code-surgeon "https://github.com/myorg/myrepo/issues/234" --depth=DEEP
/code-surgeon-resume surgeon-20250212-abc123xyz
Complete Options Reference (For Claude)
| Option | Type | Required | Default | Purpose |
|---|---|---|---|---|
| requirement | string | ✅ Yes | – | GitHub issue URL or plain text description |
| --depth | QUICK\|STANDARD\|DEEP | No | STANDARD | Controls analysis depth: tradeoff between speed and accuracy |
| --resume | session-id | No | – | Resume interrupted session (loads prior state) |
| --format | markdown\|json\|interactive | No | markdown | Output format: markdown for humans, json for tools, interactive for step-through |
When Claude Receives These Options
Parse logic:
- If --resume provided: Load session from .claude/planning/sessions/<session-id>/state.json, ignore requirement
- If --resume NOT provided: Create new session, proceed with requirement
- If --depth not specified: Default to STANDARD (15 min, 95% accuracy, ~$0.10)
- If --format not specified: Default to markdown (human-readable PLAN.md)
Option conflicts to handle:
- ⚠️ If both requirement AND --resume provided: Use --resume (resume mode takes precedence)
- ❌ If no requirement AND no --resume: Error – "REQUIREMENT is required if not resuming"
- ✅ Can combine --depth=QUICK with --resume (resume from that depth mode)
- ✅ Can combine any depth with any format (orthogonal options)
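To make the parse rules above concrete, here is a minimal TypeScript sketch (illustrative only; the option names come from the table above, everything else is an assumption, not the skill's real implementation):

```ts
// Hypothetical sketch of the option-parsing rules described above.
type Depth = "QUICK" | "STANDARD" | "DEEP";
type Format = "markdown" | "json" | "interactive";

interface SurgeonOptions {
  requirement?: string;
  resume?: string;      // session-id
  depth: Depth;         // defaults to STANDARD
  format: Format;       // defaults to markdown
}

function parseOptions(
  requirement: string | undefined,
  flags: Record<string, string | undefined>,
): SurgeonOptions {
  const resume = flags["resume"];
  if (!resume && !requirement?.trim()) {
    throw new Error("REQUIREMENT is required if not resuming");
  }
  return {
    // --resume takes precedence over a requirement passed alongside it
    requirement: resume ? undefined : requirement,
    resume,
    depth: (flags["depth"] as Depth) ?? "STANDARD",
    format: (flags["format"] as Format) ?? "markdown",
  };
}
```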
The Orchestration Pipeline
code-surgeon is NOT ONE skill. It’s an orchestrator that:
- Receives your requirement
- Validates it and checks for PII/secrets
- Dispatches 5 specialized subagents in sequence:
- Phase 1 (PARALLEL): Issue Analyzer + Framework Detector
- Phase 2: Context Researcher
- Phase 3: Implementation Planner
- Phase 4: Surgical Prompt Generator + Validator
- Phase 5: Output Formatter
- Manages state across all phases with full resumption support
- Returns your outputs (Markdown, JSON, Interactive CLI)
User Input
   ↓
[ORCHESTRATOR] – validates, manages state, coordinates phases
   ↓
[PHASE 1: Analysis – PARALLEL] (2 minutes)
   ├─ Subagent 1A: Issue Analyzer → Output: {type, requirements, scope}
   └─ Subagent 1B: Framework Detector → Output: {frameworks, versions}
   ↓
[PHASE 2: Context Research] (5 minutes)
   ├─ Analyze repo structure
   ├─ Build dependency graph
   ├─ Extract patterns
   └─ Find team guidelines
   ↓
[PHASE 3: Implementation Planning] (3 minutes)
   ├─ Generate 6-section plan
   ├─ Analyze breaking changes
   └─ Order tasks logically
   ↓
[PHASE 4: Surgical Prompts + Validation] (2 minutes)
   ├─ Create 9-section prompts per task
   ├─ Validate against team guidelines
   └─ Scan for PII/secrets
   ↓
[PHASE 5: Output Formatting] (1 minute)
   ├─ Markdown (human-readable)
   ├─ JSON (machine-readable)
   └─ Interactive CLI (step-through)
   ↓
Outputs: PLAN.md, plan.json, interactive.json
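A compressed sketch of how an orchestrator of this shape might sequence phases and persist state between them (illustrative; the helper names are assumptions, not the skill's real API):

```ts
// Hypothetical orchestration loop: run phases in order, saving state after each.
interface SessionState {
  sessionId: string;
  completedPhases: number[];
  outputs: Record<number, unknown>;
}

async function runPipeline(
  state: SessionState,
  phases: Array<(s: SessionState) => Promise<unknown>>,
): Promise<void> {
  for (let i = 0; i < phases.length; i++) {
    const phaseNumber = i + 1;
    if (state.completedPhases.includes(phaseNumber)) continue; // resume support
    state.outputs[phaseNumber] = await phases[i](state);
    state.completedPhases.push(phaseNumber);
    await saveState(state); // persist after every phase
  }
}

async function saveState(state: SessionState): Promise<void> {
  // placeholder: write to .claude/planning/sessions/<id>/state.json
}
```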
Session State Management
code-surgeon persists everything to .claude/planning/sessions/<session-id>/:
.claude/planning/
├─ sessions/
│  └─ surgeon-20250212-abc123xyz/
│     ├─ state.json         ← Complete session state
│     ├─ PLAN.md            ← Human-readable plan
│     ├─ plan.json          ← Machine-readable plan
│     ├─ interactive.json   ← CLI mode data
│     └─ logs/
│        └─ execution.log   ← Detailed execution log
├─ cache/                   ← Shared caches
│  ├─ file-structure-<hash>.json
│  ├─ dependency-graph-<hash>.json
│  └─ patterns-<hash>.json
└─ frameworks/              ← Framework configs
   ├─ react.yml
   ├─ django.yml
   └─ ...
Why JSON + Markdown?
- state.json: Complete truth for resumption and debugging
- PLAN.md: What you’ll read + copy-paste surgical prompts
- plan.json: For tooling integration and CI/CD pipelines
- interactive.json: Powers the step-through CLI mode
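For orientation, a state.json of this kind plausibly carries fields like the following (a TypeScript sketch inferred from the behaviour described in this document, not a documented schema):

```ts
// Assumed shape of state.json, inferred from the resumption behaviour described above.
interface SurgeonState {
  session_id: string;                     // e.g. "surgeon-20250212-abc123xyz"
  requirement: string;
  depth_mode: "QUICK" | "STANDARD" | "DEEP";
  completed_phases: number[];             // highest completed phase drives resume
  phase_outputs: Record<string, unknown>; // issue analysis, framework info, context, plan...
  tokens_used: number;
  token_budget: number;
  created_at: string;                     // ISO timestamp
  updated_at: string;
}
```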
Error Handling & Recovery
code-surgeon is designed to never lose work.
What Claude Should Do in Each Scenario
1. Requirement Validation Fails
Triggers: Empty requirement, only whitespace, too short
Claude should:
Show error: "Requirement cannot be empty"
Suggest: 'Try: /code-surgeon "describe what you want to change"'
Action: Stop, don't proceed
2. Sub-Skill Invocation Fails
Triggers: Sub-skill returns status: "error" or times out
Claude should:
First attempt:
1. Log error: "[Phase N] [Subagent] failed: [error message]"
2. Retry once (wait 5 seconds)
3. If still fails:
- Save state.json immediately
- Show: "Phase [N] failed after retry. Session saved."
- Show session ID: surgeon-20250212-abc123xyz
- Suggest: "/code-surgeon-resume surgeon-20250212-abc123xyz"
- Stop execution
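The retry-then-pause behaviour could be wrapped roughly like this (a hedged sketch of the listed steps, not the skill's actual code):

```ts
// Hypothetical "retry once, then save and pause" wrapper for a sub-skill call.
async function invokeWithRetry<T>(
  phase: number,
  name: string,
  call: () => Promise<T>,
  save: () => Promise<void>,
  sessionId: string,
): Promise<T> {
  try {
    return await call();
  } catch (firstError) {
    console.error(`[Phase ${phase}] ${name} failed: ${firstError}`);
    await new Promise((r) => setTimeout(r, 5000)); // wait 5 seconds before retrying
    try {
      return await call();
    } catch {
      await save(); // persist state.json before stopping
      throw new Error(
        `Phase ${phase} failed after retry. Session saved.\n` +
        `Resume with: /code-surgeon-resume ${sessionId}`,
      );
    }
  }
}
```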
3. Token Budget Exceeded
Triggers: Tokens used > budget for depth mode
Claude should:
During Phase 2-3:
1. Monitor token usage against budget
2. When approaching 85% of budget:
- Log warning: "Approaching token limit (51,000/60,000)"
3. If exceed 100% of budget:
- Stop loading files
- Save state.json
- Show: "Exceeded token budget for STANDARD mode"
- Offer options:
a) Continue analysis with reduced depth
b) Resume with QUICK mode (fewer files)
c) Restart with DEEP mode if needed
- Action: Don't proceed without user choice
4. PII/Secrets Detected in Code
Triggers: Phase 4 validation detects API keys, emails, SSNs
Claude should:
During Phase 4 validation:
1. If validation_report.errors includes PII/secrets:
- BLOCK generation
- Show: "Cannot generate prompts: found [TYPE] in code"
- Show examples: "Found API keys in src/config.ts line 45"
- Suggest: "Please sanitize code and retry"
- Action: Stop, don't output plan.json/PLAN.md
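A minimal sketch of the kind of scan that can trigger this block (the patterns below are illustrative examples, not the detector the skill actually ships):

```ts
// Illustrative PII/secret scan: a handful of regexes over file contents.
const SECRET_PATTERNS: Array<{ type: string; pattern: RegExp }> = [
  { type: "API key", pattern: /\b(?:api[_-]?key|secret)["']?\s*[:=]\s*["'][A-Za-z0-9_\-]{16,}["']/i },
  { type: "AWS key", pattern: /\bAKIA[0-9A-Z]{16}\b/ },
  { type: "Email",   pattern: /\b[\w.+-]+@[\w-]+\.[\w.-]+\b/ },
  { type: "SSN",     pattern: /\b\d{3}-\d{2}-\d{4}\b/ },
];

function scanForSecrets(path: string, content: string): string[] {
  const findings: string[] = [];
  content.split("\n").forEach((line, i) => {
    for (const { type, pattern } of SECRET_PATTERNS) {
      if (pattern.test(line)) findings.push(`Found ${type} in ${path} line ${i + 1}`);
    }
  });
  return findings; // any finding blocks prompt generation
}
```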
5. Sub-Skill Output Invalid
Triggers: Output doesn’t match expected schema
Claude should:
For each sub-skill:
1. Validate output against contract (see Sub-Skill Invocation Patterns)
2. If validation fails:
- Log error: "[Subagent] output validation failed"
- Show missing/invalid fields: "Missing: 'type' in Issue Analyzer output"
- Action: Retry the sub-skill once
- If retry fails: Pause and suggest resume
6. Missing Repository or File
Triggers: repo_root doesn’t exist, required files not found
Claude should:
When initializing Phase 2:
1. Check if repo_root is accessible
2. If not found:
- Show: "Repository not found at [path]"
- Show actual paths tried: [list]
- Suggest: "Ensure you're running from correct directory"
- Action: Stop, don't proceed
7. User Interrupts (Ctrl+C)
Triggers: User cancels while execution running
Claude should:
On interrupt signal:
1. Immediately save state.json with:
- Current phase number
- Completed phase outputs
- Current progress status
2. Show: "Session paused and saved"
3. Show resume command: "/code-surgeon-resume surgeon-20250212-abc123xyz"
4. Exit cleanly (no partial outputs)
The Resume Protocol
When a failure occurs or user interrupts:
- State is saved atomically after each phase completes
- Session ID is generated: surgeon-<YYYYMMDD>-<random>
- State file location: .claude/planning/sessions/<id>/state.json
- Resume behavior: Load state, find highest completed phase, continue from next phase
Resume example:
# Initial execution
/code-surgeon "Add JWT auth" --depth=STANDARD
# ... Phase 1 done, Phase 2 done, Phase 3 running...
# User: Ctrl+C
# System saves state.json with Phase 1-2 complete, Phase 3 incomplete
# Later: Resume execution
/code-surgeon-resume surgeon-20250212-abc123xyz
# System: Loading state... Phase 1-2 already done, restarting Phase 3
# ... continues from Phase 3, reuses Phase 1-2 outputs
# ... completes Phases 3-5
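The "load state, find highest completed phase, continue" step might be sketched like this (hypothetical helper, assuming the state fields shown earlier):

```ts
import { promises as fs } from "node:fs";

// Hypothetical resume loader: read state.json and report where to pick up.
async function loadSession(sessionId: string): Promise<{ state: any; nextPhase: number }> {
  const path = `.claude/planning/sessions/${sessionId}/state.json`;
  const state = JSON.parse(await fs.readFile(path, "utf8"));
  const highestDone = Math.max(0, ...state.completed_phases);
  return { state, nextPhase: highestDone + 1 }; // continue from the next phase
}
```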
Failure Scenarios Reference Table
| Scenario | When | What Claude Does | Next Step |
|---|---|---|---|
| Requirement empty | Validation | Show error, stop | Ask user to provide requirement |
| Sub-skill timeout | Phase 1-4 | Retry once, then pause | Suggest resume |
| Token budget exceeded | Phase 2-3 | Save state, offer options | User chooses: continue, retry, or restart |
| PII detected | Phase 4 validation | BLOCK, show error | Ask user to sanitize code |
| Output invalid | Any phase | Log error, retry once | Pause if retry fails |
| Repo not found | Phase 2 start | Show error, stop | User fixes path and retries |
| User interrupt | Any time | Save state immediately | Suggest resume command |
What Each Phase Does
Phase 1: Analysis (Parallel, 2 min)
BEFORE PHASE 1 EXECUTES – MANDATORY READING: Read these sub-skill files in parallel with this section. They define the exact parsing and detection algorithms:
- [code-surgeon-issue-analyzer-SKILL.md] – Complete issue parsing logic
- [code-surgeon-framework-detector-SKILL.md] – Framework detection algorithm
Issue Analyzer (Subagent 1A)
- Parse GitHub URL or plain text requirements
- Extract: requirements, scope, issue type (feature/bug/refactor/perf)
- Return: {type, requirements[], deadline, file_hints}
Framework Detector (Subagent 1B)
- Scan package.json, pyproject.toml, go.mod, Gemfile, Cargo.toml, etc.
- Detect: frameworks, versions, language(s), monorepo status
- Return: {frameworks[], primary_language, versions, is_monorepo}
Why parallel? These don’t depend on each other. Run simultaneously to save time.
Do NOT load (not needed for Phase 1):
- context-researcher-SKILL.md (Phase 2 only)
- implementation-planner-SKILL.md (Phase 3 only)
- surgical-prompt-generator-SKILL.md (Phase 4 only)
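As a rough idea of what Subagent 1B's manifest scan involves, here is a sketch limited to package.json (the real detector covers 35+ frameworks and other manifests such as pyproject.toml and go.mod; the function below is an assumption for illustration):

```ts
import { promises as fs } from "node:fs";
import * as path from "node:path";

// Illustrative framework detection from package.json only.
async function detectFromPackageJson(repoRoot: string) {
  const raw = await fs.readFile(path.join(repoRoot, "package.json"), "utf8");
  const pkg = JSON.parse(raw);
  const deps: Record<string, string> = { ...pkg.dependencies, ...pkg.devDependencies };
  const known = ["react", "vue", "next", "express", "angular"];
  const frameworks = known.filter((name) => name in deps);
  return {
    frameworks,
    versions: Object.fromEntries(frameworks.map((f) => [f, deps[f]])),
    is_monorepo: Boolean(pkg.workspaces),
    status: "success",
  };
}
```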
Phase 2: Context Research (5 min)
MANDATORY – READ ENTIRE FILE: Before Phase 2 executes, you MUST read [code-surgeon-context-researcher-SKILL.md] completely from start to finish. This file contains the complex 3-tier file selection algorithm, dependency mapping logic, and caching strategy. Do NOT skip or skim this file.
Context Researcher (Subagent 3)
- Analyze codebase structure using regex patterns (90%+ accuracy without AST parsing)
- Build lightweight dependency graph
- Extract 3-5 architectural patterns
- Find team conventions from .claude/team-guidelines.md
- Smart file selection: Tier 1 (direct) + Tier 2 (dependencies) + Tier 3 (patterns)
- Respect token budget (30K quick, 60K standard, 90K deep)
- Output: {files[], dependency_graph, patterns[], team_conventions, cache_updated}
Do NOT load (not needed for Phase 2):
- issue-analyzer-SKILL.md (Phase 1 complete)
- framework-detector-SKILL.md (Phase 1 complete)
- implementation-planner-SKILL.md (Phase 3 only)
- surgical-prompt-generator-SKILL.md (Phase 4 only)
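To make "regex patterns, no AST parsing" concrete, import edges in a TypeScript codebase could be collected roughly like this (illustrative only; the actual researcher also handles other languages and the tiering logic):

```ts
// Illustrative regex-based dependency extraction for JS/TS files.
const IMPORT_RE = /^\s*import\s+(?:[\s\S]*?\s+from\s+)?["'](\.{1,2}\/[^"']+)["']/gm;

function extractLocalImports(fileContent: string): string[] {
  const edges: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = IMPORT_RE.exec(fileContent)) !== null) {
    edges.push(match[1]); // relative path such as "./utils" or "../types"
  }
  return edges;
}

// Example: extractLocalImports('import { login } from "./auth";') -> ["./auth"]
```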
Phase 3: Planning (3 min)
MANDATORY – READ ENTIRE FILE: Before Phase 3 executes, read [code-surgeon-implementation-planner-SKILL.md] to understand the 6-section plan format, task decomposition algorithm, and breaking change detection logic.
Implementation Planner (Subagent 4)
- Synthesize Issue + Framework + Context
- Generate 6-section plan:
- Summary (strategy overview)
- Research (findings)
- Design Choices (decisions with rationale)
- Phases (logical work chunks)
- Tasks (granular work items with dependencies)
- Verification (testing checklist)
- Analyze breaking changes (4 categories: API/data/behavior/dependency)
- Estimate effort per task using 3-point estimates
- Output: {plan_6_sections, breaking_changes[], tasks_with_deps}
Do NOT load (not needed for Phase 3):
- issue-analyzer-SKILL.md, framework-detector-SKILL.md, context-researcher-SKILL.md (prior phases complete)
- surgical-prompt-generator-SKILL.md (Phase 4 only)
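The 3-point estimates mentioned above are not spelled out in this document; if they follow the common PERT weighting, a per-task estimate would look like this (an assumption, shown only to clarify the term):

```ts
// Classic PERT three-point estimate: weighted toward the most likely value.
function threePointEstimate(optimisticHrs: number, likelyHrs: number, pessimisticHrs: number): number {
  return (optimisticHrs + 4 * likelyHrs + pessimisticHrs) / 6;
}

// e.g. threePointEstimate(2, 3, 6) ≈ 3.3 -> reported as "3 hours"
```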
Phase 4: Surgical Prompts (2 min)
MANDATORY – READ ENTIRE FILE: Before Phase 4 executes, read [code-surgeon-surgical-prompt-generator-SKILL.md] to understand the 9-section prompt structure, framework-specific templates, and validation rules.
Surgical Prompt Generator (Subagent 5)
- Create 9-section surgical prompts per task (Objective, Context, Scope, Approach, Patterns, Constraints, Breaking Changes, Success Criteria, Common Mistakes)
- Apply framework-specific templates (React/Django/Express/etc with pattern examples)
- Include: file paths, line numbers, code examples, verification steps
- Scan for PII/secrets (ERROR if found, blocks generation)
Validator (Subagent 6):
- Validate each prompt:
  - ✅ File paths absolute + exist in repo
  - ✅ No PII/secrets in prompt text
  - ✅ Token count within budget (prevents hallucination from over-context)
  - ✅ Syntax valid for target framework
- Return: {prompts[], validation_passed: true/false, errors[]}
- Output: {surgical_prompts[], validation_report}
Do NOT load (not needed for Phase 4):
- issue-analyzer-SKILL.md, framework-detector-SKILL.md, context-researcher-SKILL.md, implementation-planner-SKILL.md (prior phases complete)
Phase 5: Output Formatting (1 min)
Generate 3 complementary outputs: Markdown (PLAN.md), JSON (plan.json), and an interactive CLI file (interactive.json). Templates for each appear in the Phase 5: Output Formatter section below.
For Claude: Sub-Skill Invocation Patterns
When executing code-surgeon, use these patterns to invoke sub-skills. Each sub-skill expects specific inputs and outputs.
Phase 1A: Issue Analyzer Invocation
Invoke: /code-surgeon-issue-analyzer
Input Contract:
{
"requirement": "GitHub issue URL or plain text description",
"depth_mode": "QUICK" | "STANDARD" | "DEEP"
}
Output Contract (success):
{
"type": "feature" | "bug" | "refactor" | "perf",
"requirements": ["requirement 1", "requirement 2", ...],
"deadline": "2025-02-20" (optional),
"file_hints": ["src/auth.ts", "src/utils.ts", ...],
"status": "success"
}
What Claude should check:
- ✅ type is one of: feature, bug, refactor, perf
- ✅ requirements is non-empty array
- ✅ status === "success"
- ❌ If any check fails: Show error, don't continue to Phase 1B
Phase 1B: Framework Detector Invocation (Parallel with 1A)
Invoke: /code-surgeon-framework-detector
Input Contract:
{
"repo_root": "/absolute/path/to/repo",
"depth_mode": "QUICK" | "STANDARD" | "DEEP"
}
Output Contract (success):
{
"frameworks": ["react", "express", ...],
"primary_language": "typescript" | "python" | "go" | "java" | "ruby",
"versions": {
"react": "18.2.0",
"express": "4.18.2"
},
"is_monorepo": false | true,
"monorepo_info": {
"type": "yarn" | "npm" | "lerna" | "turborepo",
"root_packages": ["packages/ui", "packages/api"]
},
"status": "success"
}
What Claude should check:
- ✅ frameworks is non-empty array
- ✅ primary_language is a recognized language
- ✅ status === "success"
- ⚠️ If is_monorepo === true, ensure monorepo_info is present
- ❌ If framework detection fails: Continue anyway (framework can be inferred in Phase 3)
Phase 2: Context Researcher Invocation
Invoke: /code-surgeon-context-researcher
Input Contract (requires Phase 1 outputs):
{
"requirement": "original requirement",
"issue_type": "feature" | "bug" | "refactor" | "perf",
"file_hints": ["src/auth.ts"],
"frameworks": ["react"],
"primary_language": "typescript",
"is_monorepo": false,
"depth_mode": "QUICK" | "STANDARD" | "DEEP",
"repo_root": "/path/to/repo"
}
Output Contract (success):
{
"files": {
"tier_1": ["src/auth.ts", "src/utils.ts"],
"tier_2": ["src/hooks/useAuth.ts", "src/types.ts"],
"tier_3": ["src/middleware.ts"]
},
"dependency_graph": {
"src/auth.ts": ["src/utils.ts", "src/types.ts"],
"src/hooks/useAuth.ts": ["src/auth.ts"]
},
"patterns": [
{
"name": "Custom Hook Pattern",
"files": ["src/hooks/useAuth.ts"],
"description": "React custom hooks for state management"
}
],
"team_conventions": {
"naming": "camelCase for functions, PascalCase for components",
"error_handling": "Always use try-catch in async functions"
},
"cache_updated": true,
"status": "success"
}
What Claude should check:
- ✅ files has tier_1 (required); tier_2 and tier_3 optional but expected in STANDARD/DEEP
- ✅ dependency_graph is non-empty object
- ✅ patterns is array (can be empty but should have 1-5 items)
- ✅ status === "success"
- ❌ If status is error: Retry once, then show error and suggest resume
Phase 3: Implementation Planner Invocation
Invoke: /code-surgeon-implementation-planner
Input Contract (requires Phase 1-2 outputs):
{
"requirement": "original requirement",
"issue_type": "feature",
"frameworks": ["react"],
"primary_language": "typescript",
"files": {
"tier_1": ["src/auth.ts"],
"tier_2": ["src/hooks/useAuth.ts"]
},
"patterns": [...],
"team_conventions": {...},
"depth_mode": "STANDARD"
}
Output Contract (success):
{
"plan": {
"summary": "Strategy overview",
"research": "Key findings",
"design_choices": [
{
"decision": "Use JWT tokens",
"rationale": "Industry standard",
"alternatives": "Session cookies"
}
],
"phases": [
{
"phase": 1,
"name": "Core implementation",
"tasks": ["task 1", "task 2"]
}
],
"tasks": [
{
"id": "1.1",
"name": "Create OAuth2Service",
"files_affected": ["src/services/oauth.ts"],
"effort_estimate": "3 hours",
"dependencies": [],
"success_criteria": ["Tests pass", "Integration works"]
}
],
"verification": ["Run tests", "Check integration"]
},
"breaking_changes": [
{
"type": "api",
"description": "Auth endpoint signature changed",
"impact": "Client code needs update",
"migration": "Update calls from POST /auth to POST /auth/v2"
}
],
"status": "success"
}
What Claude should check:
- ✅ plan.summary exists and is non-empty
- ✅ plan.tasks is non-empty array with task IDs
- ✅ All task dependencies reference other task IDs (validate chain)
- ✅ breaking_changes is array (can be empty)
- ✅ status === "success"
- ⚠️ If task count > 20: Warn user "Large implementation, consider breaking into sub-tasks"
Phase 4: Surgical Prompt Generator Invocation
Invoke: /code-surgeon-surgical-prompt-generator
Input Contract (requires Phase 1-3 outputs):
{
"tasks": [
{
"id": "1.1",
"name": "Create OAuth2Service",
"files_affected": ["src/services/oauth.ts"],
"success_criteria": ["Tests pass"]
}
],
"files": {
"tier_1": ["src/auth.ts"]
},
"patterns": [...],
"team_conventions": {...},
"frameworks": ["react"],
"primary_language": "typescript",
"depth_mode": "STANDARD"
}
Output Contract (success):
{
"surgical_prompts": [
{
"task_id": "1.1",
"prompt": "9-section surgical prompt starting with Objective, Context, Scope...",
"token_count": 450,
"framework": "react",
"scope": {
"files_to_modify": ["src/services/oauth.ts"],
"files_to_reference": ["src/auth.ts"],
"files_to_avoid": ["src/ui/"]
}
}
],
"validation_report": {
"total_prompts": 1,
"valid": 1,
"invalid": 0,
"errors": []
},
"status": "success"
}
What Claude should check:
- ✅ surgical_prompts count matches input task count
- ✅ Each prompt has task_id, prompt, token_count
- ✅ validation_report.valid === surgical_prompts.length
- ✅ No items in validation_report.errors
- ✅ status === "success"
- ❌ If validation fails: Show errors, offer to regenerate with reduced depth
Phase 5: Output Formatter
Generate 3 complementary outputs:
Markdown (PLAN.md):
# Implementation Plan: [Task]
## Summary
[Strategy overview]
## Research
[Codebase findings]
## Design Choices
[Decisions with rationale]
## Surgical Prompts
[Per-task prompts, ready to copy-paste]
## Breaking Changes
[Impact analysis]
## Verification
[Testing checklist]
JSON (plan.json):
{
"plan_id": "surgeon-...",
"summary": "...",
"research": {...},
"design_choices": [...],
"phases": [...],
"tasks": [...],
"surgical_prompts": [...],
"breaking_changes": [...],
"verification": [...]
}
Interactive CLI (interactive.json):
{
"mode": "step-through",
"phases": [
{
"phase": 1,
"name": "Core Authentication Service",
"tasks": [
{
"task_id": "1.1",
"name": "Create OAuth2Service",
"surgical_prompt": "...",
"status": "not_started"
}
]
}
]
}
Depth Modes
Choose how deeply to analyze your codebase. Each mode represents a tradeoff between speed and accuracy.
QUICK Mode (5 minutes, ~30K tokens, 85% accuracy)
When user should request this:
- Bug fixes with clear scope (“Fix off-by-one in utils.js”)
- Small features in isolated areas
- When user says “I’m in a hurry”
- When requirement clearly maps to 1-2 files
What Claude should do differently in QUICK mode:
| Phase | QUICK behavior | Difference from STANDARD |
|---|---|---|
| Phase 1 | Normal | No change |
| Phase 2 | Skip Tier 3 pattern extraction | Load only Tier 1 (direct) + Tier 2 (dependencies), don’t look for architectural patterns |
| Phase 3 | Reduce task count | Create 2-4 tasks instead of 5-8, skip non-critical optimizations |
| Phase 4 | Shorter prompts | Generate 5-section prompts (Objective, Scope, Approach, Success, Common Mistakes) instead of 9-section |
| Phase 5 | Markdown only | Only generate PLAN.md, skip JSON and interactive formats to save time |
Token budget: 30K max · Cost: ~$0.04-0.06 · Success rate: 85% (might miss some dependencies, but good for scoped changes)
STANDARD Mode (15 minutes, ~60K tokens, 95% accuracy) – DEFAULT
When to use (user doesn’t specify):
- Normal features (“Add JWT token refresh”)
- Bug fixes with complex impact
- Moderate refactoring
- Most real-world changes
What Claude executes in STANDARD mode:
| Phase | STANDARD behavior |
|---|---|
| Phase 1 | Full issue analysis + framework detection |
| Phase 2 | Load Tier 1 (direct) + Tier 2 (dependencies) + Tier 3 (patterns), extract 3-5 architectural patterns |
| Phase 3 | Full 6-section plan with 5-8 tasks, breaking change analysis, effort estimates |
| Phase 4 | Full 9-section surgical prompts with framework-specific guidance |
| Phase 5 | All 3 output formats: Markdown, JSON, Interactive |
Token budget: 60K max · Cost: ~$0.09-0.12 · Success rate: 95% (captures most dependencies and patterns) · Default: Always use this unless the user specifies --depth
DEEP Mode (30 minutes, ~90K tokens, 99% accuracy)
When user should request this:
- Major architectural changes (“Refactor authentication system”)
- Risky changes with broad impact
- When user says “I need high confidence”
- When requirement affects multiple subsystems
What Claude should do differently in DEEP mode:
| Phase | DEEP behavior | Extra details vs STANDARD |
|---|---|---|
| Phase 1 | Normal | No change |
| Phase 2 | Include file history | For each file: commits that touched it, blame info, last change date |
| Phase 2 | Full dependency graph | Not just files, include: imports, exports, function calls, type references |
| Phase 2 | Extended pattern analysis | Find 5-10 patterns instead of 3-5, include historical patterns |
| Phase 3 | Detailed design choices | For each choice: 2-3 alternatives with tradeoffs, risk assessment |
| Phase 3 | Comprehensive breaking changes | 4-category analysis: API, data, behavior, dependency |
| Phase 4 | Extended prompt context | 9-section prompts with more code examples (10-15 lines per example) |
Token budget: 90K max · Cost: ~$0.15-0.20 · Success rate: 99% (captures almost everything, good for critical changes)
For Claude: How to Manage Depth Mode
At Invocation (Parse)
IF --depth not specified:
depth = STANDARD
ELSE IF --depth is one of [QUICK, STANDARD, DEEP]:
depth = requested mode
ELSE:
SHOW ERROR: "Invalid depth: [value]. Must be QUICK, STANDARD, or DEEP"
During Phase 2 (Implementation)
Phase 2: Context Research
SET token_budget = {
QUICK: 30000,
STANDARD: 60000,
DEEP: 90000
}[depth]
IF depth === QUICK:
- Load Tier 1 files only
- Skip pattern extraction
ELSE IF depth === STANDARD:
- Load Tier 1 + Tier 2 files
- Extract 3-5 patterns
ELSE IF depth === DEEP:
- Load Tier 1 + Tier 2 + Tier 3 files
- Include file history and full dependency graph
- Extract 5-10 patterns
During Phase 3 (Planning)
Phase 3: Implementation Planning
IF depth === QUICK:
- Generate 2-4 tasks (minimal decomposition)
- Skip optimization discussion
ELSE IF depth === STANDARD:
- Generate 5-8 tasks (normal decomposition)
- Include design choices
ELSE IF depth === DEEP:
- Generate 8-12 tasks (fine-grained decomposition)
- Include 2-3 alternatives for each decision
- Comprehensive breaking change analysis
During Phase 4 (Prompts)
Phase 4: Surgical Prompt Generation
sections = {
QUICK: [Objective, Scope, Approach, Success, CommonMistakes],
STANDARD: [Objective, Context, Scope, Approach, Patterns, Constraints,
BreakingChanges, SuccessCriteria, CommonMistakes],
DEEP: Same as STANDARD but with extended examples (2x more context)
}
token_limit_per_prompt = {
QUICK: 350 tokens,
STANDARD: 650 tokens,
DEEP: 1000 tokens
}[depth]
During Phase 5 (Output)
Phase 5: Output Formatting
output_formats = {
QUICK: [markdown], // Only PLAN.md
STANDARD: [markdown, json], // PLAN.md + plan.json
DEEP: [markdown, json, interactive] // All 3 formats
}[depth]
Token Transparency (Show User)
After Phase 2 completes, show real-time feedback:
Phase 2 complete: Researching codebase...
├─ Tokens used: 18,400 / 60,000 (31%)
├─ Files analyzed: 23
├─ Patterns found: 4
├─ Estimated final cost: ~$0.08
└─ Status: On track for STANDARD mode (15 min total)
Depth Mode Recovery
If token budget is exceeded during analysis:
IF tokens_used > token_budget * 1.1: // 10% buffer exceeded
1. Save state immediately
2. Show: "Exceeded token budget for [DEPTH] mode"
3. Offer recovery options:
a) Continue with reduced depth (QUICK or lower)
b) Restart with DEEP mode
c) Analyze specific files only
Team Guidelines Integration
Create .claude/team-guidelines.md to enforce your team’s conventions:
# Team Guidelines
## Code Style
- [language-specific rules]
## Architecture Patterns
- [patterns your team uses]
## Security & Compliance
- [requirements]
code-surgeon automatically:
- Loads the guidelines file
- Incorporates rules into surgical prompts
- Validates generated code against guidelines
- Flags violations in breaking changes analysis
Framework Support
35+ frameworks auto-detected from package.json, pyproject.toml, go.mod, Gemfile, Cargo.toml, etc.
Includes: React, Vue, Angular, Next.js, Django, FastAPI, Express, Rails, Spring Boot, Go, Rust, Python, and more.
Each framework has specific pattern templates (React hooks, Django models, Express middleware, etc.) and common mistakes unique to that framework.
Caching & Performance
code-surgeon caches file structure and dependency graphs, saving 25-30% of tokens on repeated analyses. Automatically invalidates cache when files change. Clear with /code-surgeon-clear-cache.
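The <hash> in the cache file names suggests content-addressed keys; one plausible scheme (an assumption, not the documented one) hashes file paths plus modification times so any change invalidates the entry:

```ts
import { createHash } from "node:crypto";

// Plausible cache key: hash of file paths + mtimes, so edits invalidate the cache.
function cacheKey(entries: Array<{ path: string; mtimeMs: number }>): string {
  const h = createHash("sha256");
  for (const { path, mtimeMs } of [...entries].sort((a, b) => a.path.localeCompare(b.path))) {
    h.update(`${path}:${mtimeMs};`);
  }
  return h.digest("hex").slice(0, 16); // e.g. file-structure-<hash>.json
}
```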
Security & Privacy
code-surgeon is completely local and secure:
- ✅ All analysis local (no external API calls)
- ✅ Code never leaves your machine
- ✅ State stored in .claude/planning/ (git-ignorable)
- ✅ PII/secret detection with automatic blocking
- ✅ Path traversal and input validation built-in
Common Mistakes
❌ Using on single-file changes
Don't use code-surgeon for "fix typo in README" or obvious one-liners. It's overkill.
✅ Instead: Make simple changes directly; use code-surgeon for multi-file or complex work.
❌ Trusting the plan blindly
The plan is a guide, not gospel. Requirements and codebases change.
✅ Instead: Review the plan, edit if needed, then hand to an AI agent.
❌ Ignoring breaking changes warnings
code-surgeon highlights what might break. Don't ignore these.
✅ Instead: Read the breaking changes section and plan testing accordingly.
❌ Not reading team guidelines
If your team has .claude/team-guidelines.md, it's loaded automatically.
✅ Instead: Create the file! code-surgeon respects your team's rules.
❌ Assuming prompts are perfect
Surgical prompts are very good, but not perfect. Review them.
✅ Instead: Read the prompts, edit if needed, then hand to an AI agent.
❌ Using DEEP mode for simple bug fixes
DEEP mode (30 min, $0.15-0.20) analyzes everything in the codebase. For "fix off-by-one error in utils.js", this is massive overkill.
✅ Instead: Use QUICK mode (5 min, $0.04) for scoped bug fixes. Use DEEP only for architectural changes or risky refactoring.
❌ Ignoring dependency conflicts in the breaking changes section
code-surgeon lists breaking changes. If a package dependency breaks, that affects downstream code.
✅ Instead: When planning, check both the files you're modifying AND their dependent files. Test with npm test or equivalent.
❌ Not creating .claude/team-guidelines.md
Without team guidelines, code-surgeon can't enforce your team's conventions (naming, error handling, security rules).
✅ Instead: Create .claude/team-guidelines.md with your team's architectural rules, code style, and security requirements. It takes 30 minutes and vastly improves plan quality.
❌ Running analysis on private code without review
code-surgeon saves state locally, but if you later share the PLAN.md output, it contains file paths and code references.
✅ Instead: Review PLAN.md and surgical prompts before sharing. Sanitize file paths or code examples if needed for external teams.
Next Steps After Generation
Once code-surgeon completes:
1. Review the Markdown plan (PLAN.md)
   - Read summary, research, design choices
   - Check if it matches your intent
2. Review surgical prompts
   - Check file paths and line numbers
   - Read the code changes proposed
   - Edit if needed (they're just text)
3. Choose a task from the plan
   - Pick the first task from the first phase
   - Copy the surgical prompt for that task
   - Hand it to your preferred AI agent (Claude, Cursor, etc.)
4. Repeat for each task
   - Each task has its own prompt
   - Tasks are ordered by dependencies
   - All context is already provided
5. Resumable anytime
   - Interrupted? /code-surgeon-resume <id>
   - Want a different depth? Run again with --depth=DEEP
   - Need to edit the plan? Edit PLAN.md and continue
Quick Reference
| Command | Purpose |
|---|---|
| /code-surgeon "requirement" | Start new analysis (STANDARD depth) |
| /code-surgeon URL --depth=QUICK | Quick 5-minute analysis (85% accuracy) |
| /code-surgeon URL --depth=DEEP | Thorough 30-minute analysis (99% accuracy) |
| /code-surgeon-resume <id> | Resume interrupted session |
| /code-surgeon-clear-cache | Clear analysis cache |
| /code-surgeon-list-sessions | List all sessions |
| /code-surgeon-view <id> | View plan from session |
How to Get Started
1. Create team guidelines (optional but recommended):
   cat > .claude/team-guidelines.md << 'EOF'
   # Team Guidelines
   [Add your team's rules, patterns, security requirements]
   EOF
2. Run code-surgeon on an issue:
   /code-surgeon "Add feature: JWT token refresh"
3. Review the output (PLAN.md):
   - Check if the analysis is correct
   - Edit if needed
   - Save as a reference
4. Copy a surgical prompt:
   - Pick Task 1 from the plan
   - Copy its surgical prompt
   - Paste it to Claude, Cursor, or your AI agent
5. Continue with the next tasks:
   - Each task has its own prompt
   - All context is provided
   - Follow dependencies (earlier tasks first)
Technical Details
State Management:
- Session ID format: surgeon-<date>-<random>
- State file: .claude/planning/sessions/<id>/state.json
- Atomic writes (each phase commits atomically)
- Resumption: Load state, find highest completed phase, continue
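"Atomic writes" usually means write-to-temp-then-rename, roughly like this (a generic sketch, not the skill's code):

```ts
import { promises as fs } from "node:fs";

// Generic atomic write: write to a temp file, then rename over the target.
async function writeStateAtomically(statePath: string, state: unknown): Promise<void> {
  const tmpPath = `${statePath}.tmp`;
  await fs.writeFile(tmpPath, JSON.stringify(state, null, 2), "utf8");
  await fs.rename(tmpPath, statePath); // rename is atomic on the same filesystem
}
```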
Subagent Communication:
- Input: Invocation contract with task, inputs, expected schema, timeout
- Output: Result object with success/error, data, metrics
- Validation: JSON schema validation on all outputs
Error Levels:
- CRITICAL: Block, pause, require resume
- HIGH: Retry once, then pause
- MEDIUM: Log warning, continue
- LOW: Log info, continue
Performance:
- Phase 1: 2 min (parallel)
- Phase 2: 5 min (context research)
- Phase 3: 3 min (planning)
- Phase 4: 2 min (prompts + validation)
- Phase 5: 1 min (formatting)
- Total: ~13 minutes of phase time for STANDARD depth (≈15 minutes end to end)
For Implementation
This skill requires 5 coordinated sub-skills:
- issue-analyzer – Parse requirements, detect type
- framework-detector – Identify tech stack
- context-researcher – Analyze codebase
- implementation-planner – Create 6-section plan
- surgical-prompt-generator – Generate surgical prompts
Each sub-skill is tested independently, then integrated.
code-surgeon orchestrates them all.