pr-review

📁 klohto/agent-skills 📅 7 days ago
Install command
npx skills add https://github.com/klohto/agent-skills --skill pr-review

Skill Documentation

PR Review Skill

Systematic, thorough pull request review process using parallel analysis and expert validation.

Announce each phase as you enter it, including what you MUST do:

  • “## Phase 1: Gather Context” → checkout, get metadata, get diff, read all changed files
  • “## Phase 2: Note Concerns” → write inline observations, fill execution-context table (including test code)
  • “## Phase 3: Parallel Deep Analysis” → spawn 3+ Task subagents, document findings, verify factual claims
  • “## Phase 4: Expert Validation” → Oracle broad sweep, then focused follow-ups on subagent findings
  • “## Phase 5: Verify Every Claim” → verify each claim before including
  • “## Phase 6: Programmatic Verification” → run scripts to produce reproducible evidence for final findings

Phase 1: Gather Context

  1. Checkout the PR branch

    git fetch origin pull/<PR_NUMBER>/head:pr-<PR_NUMBER> && git checkout pr-<PR_NUMBER>
    

    All files are now available locally. (This works even when gh pr checkout fails due to deleted remote branches.)

  2. Get PR metadata (title, body, author – not the diff)

    gh pr view <PR_NUMBER> --json title,body,author
    
  3. Get the diff

    gh pr diff <PR_NUMBER>
    
  4. Work locally – Read files, grep patterns. Everything is local after checkout.

⛔ DO NOT RUN TESTS. No pytest, pnpm test, cargo test, go test, etc. CI handles tests. Running them wastes context and tokens.

Important: Read EVERY file in the diff thoroughly, not just skim:

  • Production code: trace data flow, check error handling
  • Test code: examine fixtures, check how test data is created/modified, look for shared state
  • Config files: check for downstream impacts

This deep reading generates the observations for Phase 2. Rushed reading = missed bugs.

Phase 2: Note Concerns as You Go

As you read through changes, write observations directly in your response for anything needing deeper investigation:

  • “I notice this uses a union type with Temporal – need to verify serialization works correctly.”
  • “This changes the URL format – need to check downstream consumers.”

This creates a visible trail of concerns to address in Phase 3, preventing lost context in large diffs.

Red Flags

  • Config file changes (eslint, tsconfig, vite, package.json, etc.) – Check .github/workflows and other dependents that may be affected.
  • New patterns – Before reviewing new code, Grep for how similar things are done elsewhere. Review new code against existing conventions.
  • Deleted code – Check nothing else depends on it. Grep for usages.
  • Renamed/moved files – Check all imports are updated.
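The “Deleted code” flag above lends itself to a small read-only script. A minimal sketch — the symbol name, root, and extensions are hypothetical placeholders to adapt per PR:

```python
import pathlib


def remaining_usages(symbol: str, root: str = ".",
                     exts: tuple[str, ...] = (".py", ".ts")) -> list[str]:
    """List files under `root` that still reference a symbol the PR deletes."""
    return sorted(
        str(p)
        for p in pathlib.Path(root).rglob("*")
        if p.is_file() and p.suffix in exts
        and symbol in p.read_text(errors="ignore")
    )
```

An empty result supports accepting the deletion; any hit becomes a Phase 2 observation to investigate.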

Phase 3: Parallel Deep Analysis (REQUIRED)

For each observation from Phase 2, spawn a Task subagent to investigate in parallel. Do NOT proceed to Phase 4 until subagents complete.

Why subagents: They avoid polluting main context, run in parallel, and go deep on specific concerns.

Minimum expectations:

  • 3+ subagents for any non-trivial PR
  • Each subagent investigates ONE specific concern
  • Run them in parallel (single assistant message with multiple Task calls)

Spawn a Task For

  • Tracing data flow / ownership (where does X come from, who mutates it)
  • Checking if pattern is consistent with rest of codebase
  • Verifying test fixtures match production behavior
  • Checking exception handling paths
  • Any “I wonder if…” observation from Phase 2
  • If test files changed: One subagent MUST analyze test data isolation – grep for ALL copy/fixture patterns, not just spot-check one file

Task Prompt Template

Investigate: [specific concern from Phase 2]

Questions:
1. [Specific question]
2. [Specific question]

Files: [relevant files]

Return: [concrete answer, not "might be an issue"]

After subagents return:

  • Document each finding with reasoning
  • Verify factual claims – If subagent says “X provides Y format” or “DB stores Z”, confirm by reading actual code/fixtures/schemas. Don’t accept unverified assertions about system behavior.
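One way to make such verification concrete is a tiny check against the actual fixture. A hedged sketch, assuming the claim concerns a JSON webhook fixture with a `created_at` field (both names are hypothetical):

```python
import json


def timestamp_is_iso_string(fixture_text: str, key: str = "created_at") -> bool:
    """Confirm a fixture payload stores `key` as an ISO-8601-style string,
    rather than, say, the numeric epoch a subagent merely assumed."""
    value = json.loads(fixture_text).get(key)
    return isinstance(value, str) and "T" in value
```

Run it against the real fixture file's contents; if it returns False, the subagent's claim about the format was wrong.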

Phase 4: Expert Validation (REQUIRED)

Oracle – Two-Pass Minimum

You MUST call Oracle at least twice:

Pass 1: Broad sweep (before subagents complete is fine)

Review PR for bugs, edge cases, architectural issues.
Files: [all significant changed files]

Pass 2+: Focused follow-ups (REQUIRED after subagents return)

For each subagent finding that isn’t trivially resolved, call Oracle:

Subagent found: [summary of finding]
Is this correct? What edge cases could break this?
Files: [relevant files]

Do NOT skip focused follow-ups. Subagent findings need expert validation before being accepted or dismissed.

Librarian – For External APIs

Use when the PR uses external library APIs or integrates with third-party services.

Phase 5: Verify Every Claim (MANDATORY)

STOP. Before writing ANY claim – issue OR “what’s good” – you MUST verify it first.

For EACH claim you want to include (bugs, risks, AND positive statements):

  1. Identify the claim type
  2. Run the required verification (not optional)
  3. Only include if verification confirms the issue

| Claim Type | Required Verification | If You Skip This |
|------------|-----------------------|------------------|
| “X is missing” | Grep for X, check defaults, confirm actually absent | FALSE POSITIVE |
| “X is wrong/buggy” | Oracle review OR trace the logic yourself step-by-step | FALSE POSITIVE |
| “X is inconsistent” | Grep for similar patterns, confirm actual difference | FALSE POSITIVE |
| “X should be Y” | Librarian or official docs to confirm Y is correct | FALSE POSITIVE |
| “Default might be wrong” | Read the default value, check if it’s actually inadequate | FALSE POSITIVE |
| “May fail under repeat/parallel/reorder” | Trace what state crosses invocation boundaries, confirm it’s actually shared/mutable | FALSE POSITIVE |
| “System X provides/stores Y format” | Read actual webhook fixtures, DB schemas, or trace the data flow to confirm | UNVERIFIED ASSUMPTION |
| “This change is correct/good” | Verify the before/after behavior matches expectations by reading actual data | UNVERIFIED PRAISE |
| “Pattern X is used consistently” | Grep for ALL instances, not just one example | SAMPLING BIAS |
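For example, the “X is missing” check can be scripted before the claim is ever written. A hedged sketch — the key name and file paths are hypothetical:

```python
import pathlib
import re


def key_defined(key: str, paths: list[str]) -> bool:
    """Return True if `key` appears in any of the given config/defaults
    files — i.e. a 'missing X' claim would be a false positive."""
    pattern = re.compile(rf"\b{re.escape(key)}\b")
    return any(
        pattern.search(pathlib.Path(p).read_text())
        for p in paths
        if pathlib.Path(p).exists()
    )
```

Only if this returns False for the changed config *and* the defaults module does the “missing” claim survive into the review.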

DO NOT:

  • Include findings from Oracle/Librarian without verifying them yourself
  • Assume something is missing without grepping for defaults
  • Flag “missing X” when X has a sensible default
  • Include hunches or “might be an issue” without verification

If you cannot verify a claim, DROP IT. Do not include it.

Phase 6: Programmatic Verification (REQUIRED)

Before including any finding in the FINAL review, verify it with deterministic, reproducible evidence.

Why: LLM analysis can hallucinate. Scripts execute against actual code and produce objective, reproducible evidence that cannot be faked.

Allowed verification methods:

  • Fast local commands (rg, grep, jq, etc.)
  • Small read-only Python scripts via uv run python (no network, no writes, bounded output)

Process (for Python scripts):

  1. Create temp script file, iterate via edit_file (avoids re-reading, saves tokens)
  2. Run with uv run python <script>
  3. Keep output minimal (counts + first N examples). Avoid tracebacks and large dumps.
  4. Output is the source of truth — if it contradicts the claim, DROP or correct it
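A minimal sketch of such a script — read-only, no network, output bounded to a count plus the first few matches. The pattern and file glob here are placeholders:

```python
import pathlib
import re


def verify_pattern(pattern: str, root: str = ".", glob: str = "*.py",
                   limit: int = 5) -> list[str]:
    """Count matches of `pattern` under `root`, printing only a short summary."""
    rx = re.compile(pattern)
    hits = [
        f"{path}:{lineno}: {line.strip()}"
        for path in sorted(pathlib.Path(root).rglob(glob))
        if path.is_file()
        for lineno, line in enumerate(
            path.read_text(errors="ignore").splitlines(), start=1)
        if rx.search(line)
    ]
    print(f"{len(hits)} match(es); showing first {min(limit, len(hits))}")
    for hit in hits[:limit]:
        print(hit)
    return hits
```

The printed count and examples are what go into the Evidence block; the full hit list stays out of the review.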

Each final finding must include evidence:

**Evidence:**
- Command: `uv run python /tmp/verify_x.py`
- Output: (≤10 lines)
- Interpretation: [what this proves]

This is NOT running tests. These are small, fast, read-only verification scripts — not pytest, pnpm test, etc.

Organize findings by severity:

| Priority | Category | Criteria |
|----------|----------|----------|
| 🔴 P0 | Critical | Runtime failures, data loss, security |
| 🟡 P1 | Medium | Semantic changes, inconsistencies, tech debt |
| 🟢 P2 | Minor | Style, naming, documentation |

Output Format

# PR #<NUMBER> Review: `<title>`

## Summary
Brief description of what the PR does.

## 🔴 Critical Issues
[Issues that will cause runtime failures]

## 🟡 Medium Issues
[Semantic changes, inconsistencies]

## 🟢 Minor Issues / Suggestions
[Style, improvements]

## ✅ What's Good
[Acknowledge well-implemented parts]

## Action Items Summary
| Priority | Issue | Action |
|----------|-------|--------|
| 🔴 P0 | Issue name | Required fix |
| 🟡 P1 | Issue name | Recommended change |

Key Principles

  1. Checkout first – git checkout before anything, then work locally
  2. No local tests – DO NOT run test suites; CI handles that
  3. Note concerns inline – Write observations in your response as you review (Phase 2)
  4. Subagents for depth – Spawn 3+ Task subagents in parallel for non-trivial PRs (Phase 3)
  5. Oracle twice minimum – Broad sweep + focused follow-ups for each unresolved concern (Phase 4)
  6. VERIFY BEFORE REPORTING – Every claim must be verified. No exceptions.
  7. Programmatic evidence – Final findings require reproducible script output (Phase 6)
  8. Execution-context check – Fill the table for repeat/parallel/reorder analysis