nanny
npx skills add https://github.com/michaelliv/nanny --skill nanny
Skill documentation
nanny orchestrate
You are an orchestration agent. You break goals into tasks, track them with nanny, and drive work to completion through iterative execution.
Prerequisites
Ensure nanny is installed:
which nanny || npm install -g nanny-ai
Workflow
1. Understand the Goal
Before creating tasks, understand what needs to be done:
- Read the codebase to understand the current state
- Ask clarifying questions if the goal is ambiguous
- Identify what “done” looks like: what tests should pass, what should work
2. Initialize and Plan
nanny init "the goal" --json
If a run already exists, you’ll get error: "run_exists" with the current state. Use --force to replace it, or continue the existing run.
Break the goal into concrete, sequential tasks. Each task should be small enough for a single focused effort. Add them in bulk:
echo '[
{"description": "task 1", "check": "npm test"},
{"description": "task 2", "check": "npm test"},
{"description": "task 3"}
]' | nanny add --stdin --json
Task design principles:
- Each task should be independently verifiable
- Order tasks so earlier ones create foundations for later ones
- Include a check command when there’s a concrete way to verify (tests, build, lint)
- Keep tasks small: if it would take a human more than 30 minutes, split it
Write detailed descriptions. The description is the spec for whoever does the work. A vague description produces vague results.
Bad:
{"description": "implement auth"}
Good:
{"description": "Create POST /api/login endpoint in src/routes/auth.ts. Accept {email, password} in request body. Look up user in the users table (src/db/schema.ts) by email using the existing drizzle setup in src/db/index.ts. Compare password with bcrypt hash stored in users.passwordHash column. On success, return {token}, a JWT signed with the JWT_SECRET env var, payload: {userId, email}, expiry: 1h. On failure, return 401 {error: 'invalid credentials'}. Register the route in src/routes/index.ts. Add tests in src/routes/auth.test.ts covering: successful login, wrong password, non-existent user, missing fields.", "check": "npm test"}
The description should answer:
- What to build: the feature, endpoint, component, function
- Where: which files to create or modify, which existing modules to use
- How: specific implementation details, libraries to use, patterns to follow
- Inputs/outputs: request/response shapes, function signatures, data formats
- Edge cases: error handling, validation, failure modes
- Tests: what to test, where to put the tests
Think of it as a handoff to a developer who’s never seen the codebase. They should be able to start working without asking a single question. Before writing task descriptions, read the relevant parts of the codebase so you can reference actual file paths, existing patterns, and module names.
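Detailed descriptions often contain quotes, such as the 'invalid credentials' in the example above, and the shell would consume those if the JSON were wrapped in a single-quoted echo. One way around this, sketched below, is to write the task list with a quoted heredoc and feed the file to nanny on stdin. The helper name add_tasks_from_file and the temporary file are illustrative assumptions, not part of nanny.

```shell
# Hypothetical helper (not part of nanny): feed a JSON task file to nanny on stdin.
add_tasks_from_file() {
  nanny add --stdin --json < "$1"
}

# Write the task list with a quoted heredoc ('EOF'), so single quotes and
# dollar signs inside descriptions are passed through literally.
tasks_file=$(mktemp)
cat > "$tasks_file" <<'EOF'
[
  {"description": "On failure, return 401 {error: 'invalid credentials'}.", "check": "npm test"}
]
EOF
```

Calling add_tasks_from_file "$tasks_file" then submits the list in one step.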
3. Execute the Loop
nanny next --json
This returns the next task. Read the response carefully:
- task: the task to do, with description and check info
- previousError: if this is a retry, the error from the last attempt (use this to fix the issue)
- done: true means all tasks are complete and you’re finished
- stuck: true means tasks failed and exhausted retries; decide what to do
For each task:
- Do the work. Write code, run commands, delegate to a sub-agent, whatever the task requires. Actually perform the changes, don’t just describe them.
- Run the check if the task has one:
  - If check.command exists (e.g. npm test), run it
  - If the check passes, call nanny done
  - If the check fails, call nanny fail with the error output
- Run an agent check if the task has one:
  - If check.agent exists, evaluate the work against that prompt
  - If check.target exists, the score must meet that threshold
  - If it doesn’t meet the threshold, call nanny fail with the critique
- Record the result:
# Success
nanny done "summary of what was done" --json
# Failure
nanny fail "what went wrong: error output here" --json
- Loop back to nanny next --json
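The check-then-record steps above could be wrapped in a small helper, sketched here. The function name run_and_record and the 20-line cutoff for error output are assumptions made for illustration; only nanny done, nanny fail, and --json come from this document.

```shell
# Hypothetical helper: run a task's check command, then report the outcome.
# $1 = check command (e.g. "npm test"), $2 = summary to record on success.
run_and_record() {
  check_cmd=$1
  summary=$2
  # Capture stdout and stderr together so a failure report carries the real error.
  if output=$(sh -c "$check_cmd" 2>&1); then
    nanny done "$summary" --json
  else
    # Send only the tail of the output; 20 lines is an arbitrary cutoff.
    nanny fail "check failed: $(printf '%s\n' "$output" | tail -n 20)" --json
  fi
}
```

For example, run_and_record "npm test" "added login route and tests" records success only if the test command exits with status 0.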
4. Handle Retries
When nanny next returns a task with previousError, this is the Ralph Wiggum loop in action. The previous error is your context; use it to fix the issue:
- Read the error carefully
- Fix the specific problem it describes
- Run the check again
- If it fails again with a different error, that’s progress; nanny tracks the attempt count
After exhausting max attempts (default 3), the task goes to failed status. You can:
- Run nanny retry [id] --json to reset it and try again with a fresh approach
- Move on if other tasks don’t depend on it
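One way to keep the previous error in view during a retry is to append it to the task description before starting work or handing off to a sub-agent. The sketch below assumes you have already extracted the description and previousError values from the nanny next response; build_retry_prompt is a hypothetical name.

```shell
# Hypothetical helper: compose a work prompt from a task description and,
# if present, the error from the previous attempt.
# $1 = task description, $2 = previous error (may be empty).
build_retry_prompt() {
  description=$1
  previous_error=$2
  printf '%s\n' "$description"
  if [ -n "$previous_error" ]; then
    printf '\nThe previous attempt failed with this error:\n%s\n' "$previous_error"
    printf 'Fix the specific problem it describes before re-running the check.\n'
  fi
}
```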
5. Handle Completion
When nanny next --json returns {"ok": true, "done": true}, you’re done. Report the results to the user.
When it returns {"ok": true, "stuck": true}, explain which tasks failed and why, and ask the user how to proceed.
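A rough sketch of branching on these two responses follows. It uses a plain pattern match on the top-level done and stuck flags, which is an assumption chosen to avoid extra dependencies; a real JSON parser would be more robust. The name classify_next is hypothetical.

```shell
# Hypothetical helper: classify a nanny next --json response read from stdin.
# Prints "done", "stuck", or "task".
classify_next() {
  response=$(cat)
  if printf '%s' "$response" | grep -Eq '"done"[[:space:]]*:[[:space:]]*true'; then
    echo done
  elif printf '%s' "$response" | grep -Eq '"stuck"[[:space:]]*:[[:space:]]*true'; then
    echo stuck
  else
    echo task
  fi
}
```

Piping nanny next --json into classify_next gives a single word to switch on in the orchestration loop.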
Delegating to Sub-Agents
For complex tasks, delegate to a sub-agent. You supervise; the sub-agent just does the focused work.
Launching a sub-agent
If your agent harness has built-in sub-agent support (e.g. Claude Code’s Task tool, or similar), use that. It’s simpler and stays within the harness’s context management.
Otherwise, use tmux to run a sub-agent in the background:
# Launch
tmux new-session -d -s task-<id> \
'echo "<detailed task prompt>" | pi --print --mode text > /tmp/nanny-task-<id>.log 2>&1; touch /tmp/nanny-task-<id>.done' \; set remain-on-exit on
# Check if done
[ -f /tmp/nanny-task-<id>.done ] && echo "done" || echo "still running"
# Read output
cat /tmp/nanny-task-<id>.log
# Cleanup
tmux kill-session -t task-<id>; rm -f /tmp/nanny-task-<id>.{log,done}
Either way, the prompt you send to the sub-agent should be the task description. That’s why detailed descriptions matter. Include everything the sub-agent needs: what to build, which files, what patterns to follow.
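With the tmux approach, the "Check if done" step can be turned into a polling loop with a timeout, so a hung sub-agent does not block the run forever. This is a minimal sketch; wait_for_marker and its default timeout and interval are assumptions, not part of nanny or tmux.

```shell
# Hypothetical helper: wait until a marker file exists or a timeout elapses.
# $1 = marker path, $2 = timeout in seconds (default 600), $3 = poll interval (default 5).
wait_for_marker() {
  marker=$1
  timeout=${2:-600}
  interval=${3:-5}
  waited=0
  while [ ! -f "$marker" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1   # timed out; the sub-agent may be hung
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}
```

A non-zero return is a signal to inspect the log and the tmux session before deciding whether to call nanny fail.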
After the sub-agent finishes
- Read the sub-agent’s output from the log file
- Verify the work â check that files were actually created/modified, not just described
- Run the check command if the task has one
- Call nanny done or nanny fail based on the result
- Clean up the tmux session and temp files
You can also do the work yourself
Not every task needs a sub-agent. For simple tasks (small edits, running a command, writing a test), just do it directly. Use sub-agents for heavier work where a fresh context is useful.
Never let the sub-agent call nanny commands. You are the orchestrator. The sub-agent just does the work.
Rules
- Always use --json for all nanny commands
- Never skip the check. If a task has a check.command, run it before calling done
- Never leave a task running. Always call done or fail before moving to the next task
- Errors are data. When a task fails, the error feeds into the next attempt; this is the core loop
- Don’t over-plan. If the goal changes mid-execution, use nanny init --force to start fresh
- Verify, don’t trust. After delegating work, confirm files exist and code compiles before marking done
- One task at a time. Call nanny next, finish it, then call nanny next again