evolve

📁 boshu2/agentops 📅 1 day ago

177

总安装量

周安装量

#2804

全站排名

安装命令

npx skills add https://github.com/boshu2/agentops --skill evolve

Agent 安装分布

openclaw 2

gemini-cli 2

antigravity 2

claude-code 2

cursor 2

Skill 文档

/evolve â Autonomous Compounding Loop

Purpose: Measure what’s wrong. Fix the worst thing. Measure again. Compound.

Thin fitness-scored loop over /rpi. The knowledge flywheel provides compounding â each cycle loads learnings from all prior cycles.

Quick Start

/evolve                      # Run until all goals met or --max-cycles hit
/evolve --max-cycles=5       # Cap at 5 improvement cycles
/evolve --dry-run            # Measure fitness, show what would be worked on, don't execute

Execution Steps

YOU MUST EXECUTE THIS WORKFLOW. Do not just describe it.

Step 0: Setup

mkdir -p .agents/evolve

Load accumulated learnings (COMPOUNDING):

ao inject 2>/dev/null || true

Parse flags:

--max-cycles=N (default: 10) â hard cap on improvement cycles
--dry-run â measure and report only, no execution

Initialize state:

evolve_state = {
  cycle: 0,
  max_cycles: <from flag, default 10>,
  dry_run: <from flag, default false>,
  history: []
}

Step 1: Kill Switch Check

Check at the TOP of every cycle iteration:

# External kill (outside repo â can't be accidentally deleted by agents)
if [ -f ~/.config/evolve/KILL ]; then
  echo "KILL SWITCH ACTIVE: $(cat ~/.config/evolve/KILL)"
  # Write acknowledgment
  echo "{\"killed_at\": \"$(date -Iseconds)\", \"cycle\": $CYCLE}" > .agents/evolve/KILLED.json
  exit 0
fi

# Local convenience stop
if [ -f .agents/evolve/STOP ]; then
  echo "STOP file detected: $(cat .agents/evolve/STOP 2>/dev/null)"
  exit 0
fi

If either file exists, log reason and stop immediately. Do not proceed to measurement.

Step 2: Measure Fitness (MEASURE_FITNESS)

Read GOALS.yaml from repo root. For each goal:

# Run the check command
if eval "$goal_check" > /dev/null 2>&1; then
  # Exit code 0 = PASS
  result = "pass"
else
  # Non-zero = FAIL
  result = "fail"
fi

Record results:

# Write fitness snapshot
cat > .agents/evolve/fitness-${CYCLE}.json << EOF
{
  "cycle": $CYCLE,
  "timestamp": "$(date -Iseconds)",
  "goals": [
    {"id": "$goal_id", "result": "$result", "weight": $weight},
    ...
  ]
}
EOF

Bootstrap mode: If a check command fails to execute (command not found, permission denied), mark that goal as "result": "skip" with a warning. Do NOT block the entire loop because one check is broken.

Step 3: Select Work

failing_goals = [g for g in goals if g.result == "fail"]

if not failing_goals:
  # Before stopping, check harvested work from prior /rpi cycles
  if [ -f .agents/rpi/next-work.jsonl ]; then
    items = read_unconsumed(next-work.jsonl)  # entries with consumed: false
    if items:
      selected_item = max(items, by=severity)  # highest severity first
      log "All goals met. Picking harvested work: {selected_item.title}"
      # Execute as an /rpi cycle (Step 4), then mark consumed
      /rpi "{selected_item.title}" --auto --max-cycles=1
      mark_consumed(selected_item)  # set consumed: true, consumed_by, consumed_at
      # Skip Steps 4-5 (already executed above), go to Step 6 (log cycle)
      log_cycle(cycle, goal_id="next-work:{selected_item.title}", result="harvested")
      continue loop  # â Step 1 (kill switch check)

  log "All goals met, no harvested work. Done."
  STOP â go to Teardown

# Sort by weight (highest priority first)
failing_goals.sort(by=weight, descending)

# Simple strike check: skip goals that failed the last 3 consecutive cycles
for goal in failing_goals:
  recent = last_3_cycles_for(goal.id)
  if all(r.result == "regressed" for r in recent):
    log "Skipping {goal.id}: regressed 3 consecutive cycles. Needs human attention."
    continue
  selected = goal
  break

if no goal selected:
  log "All failing goals have regressed 3+ times. Human intervention needed."
  STOP â go to Teardown

Step 4: Execute

If --dry-run: Report the selected goal (or harvested item) and stop.

log "Dry run: would work on '{selected.id}' (weight: {selected.weight})"
log "Description: {selected.description}"
log "Check command: {selected.check}"

# Also show queued harvested work
if [ -f .agents/rpi/next-work.jsonl ]; then
  items = read_unconsumed(next-work.jsonl)
  if items:
    log "Harvested work queue ({len(items)} items):"
    for item in items:
      log "  - [{item.severity}] {item.title} ({item.type})"

STOP â go to Teardown

Otherwise: Run a full /rpi cycle on the selected goal.

/rpi "Improve {selected.id}: {selected.description}" --auto --max-cycles=1

This internally runs the full lifecycle:

/research â understand the problem
/plan â decompose into issues
/pre-mortem â validate the plan
/crank â implement (spawns workers)
/vibe â validate the code
/post-mortem â extract learnings + ao forge (COMPOUNDING)

Wait for /rpi to complete before proceeding.

Step 5: Re-Measure and Check Regression

After /rpi completes, re-run MEASURE_FITNESS (same as Step 2).

Compare before/after:

# Check the target goal
if selected_goal.result == "pass":
  outcome = "improved"
elif selected_goal.result == "fail":
  # Check if OTHER goals regressed
  newly_failing = [g for g in goals if g.was_passing_before and g.result == "fail"]
  if newly_failing:
    outcome = "regressed"
    log "REGRESSION: {newly_failing} started failing after fixing {selected.id}"
    # Revert: find the most recent merge commit and revert it
    git log --oneline -5  # Find the merge
    git revert HEAD --no-edit  # Revert the last commit
    log "Reverted regression. Moving to next goal."
  else:
    outcome = "unchanged"

Step 6: Log Cycle

Append to .agents/evolve/cycle-history.jsonl:

{"cycle": 1, "goal_id": "test-pass-rate", "result": "improved", "timestamp": "2026-02-11T21:00:00Z"}
{"cycle": 2, "goal_id": "doc-coverage", "result": "regressed", "timestamp": "2026-02-11T21:30:00Z"}

Step 7: Loop or Stop

evolve_state.cycle += 1

if evolve_state.cycle >= evolve_state.max_cycles:
  log "Max cycles ({max_cycles}) reached."
  STOP â go to Teardown

# Otherwise: loop back to Step 1 (kill switch check)

Teardown

# Write session summary
cat > .agents/evolve/session-summary.md << EOF
# /evolve Session Summary

**Date:** $(date -Iseconds)
**Cycles:** $CYCLE of $MAX_CYCLES
**Goals measured:** $(wc -l < GOALS.yaml goals)

## Cycle History
$(cat .agents/evolve/cycle-history.jsonl)

## Final Fitness
$(cat .agents/evolve/fitness-${CYCLE}.json)

## Next Steps
- Run \`/evolve\` again to continue improving
- Run \`/evolve --dry-run\` to check current fitness without executing
- Create \`~/.config/evolve/KILL\` to prevent future runs
- Create \`.agents/evolve/STOP\` for a one-time local stop
EOF

Report to user:

## /evolve Complete

Cycles: N of M
Goals improved: X
Goals regressed: Y (reverted)
Goals unchanged: Z

Run `/evolve` again to continue, or `/post-mortem` to wrap up.

How Compounding Works

Two mechanisms feed the loop:

1. Knowledge flywheel (each cycle is smarter):

Session 1:
  ao inject (nothing yet)         â cycle runs blind
  /rpi fixes test-pass-rate       â post-mortem runs ao forge
  Learnings extracted: "tests/skills/run-all.sh validates frontmatter"

Session 2:
  ao inject (loads Session 1 learnings)  â cycle knows about frontmatter validation
  /rpi fixes doc-coverage                â approach informed by prior learning
  Learnings extracted: "references/ dirs need at least one .md file"

2. Work harvesting (each cycle discovers the next):

Cycle 1: /rpi fixes test-pass-rate
  â post-mortem harvests: "add missing smoke test for /evolve" â next-work.jsonl

Cycle 2: all GOALS.yaml goals pass
  â /evolve reads next-work.jsonl â picks "add missing smoke test"
  â /rpi fixes it â post-mortem harvests: "update SKILL-TIERS count"

Cycle 3: reads next-work.jsonl â picks "update SKILL-TIERS count" â ...

The loop keeps running as long as post-mortem keeps finding follow-up work. Each /rpi cycle generates next-work items from its own post-mortem. The system feeds itself.

Priority cascade:

GOALS.yaml goals (explicit, human-authored)  â fix these first
next-work.jsonl (harvested from post-mortem) â work on these when goals pass
nothing left                                 â stop (human re-evaluates)

Kill Switch

Two paths, checked at every cycle boundary:

File	Purpose	Who Creates It
`~/.config/evolve/KILL`	Permanent stop (outside repo)	Human
`.agents/evolve/STOP`	One-time local stop	Human or automation

To stop /evolve:

echo "Taking a break" > ~/.config/evolve/KILL    # Permanent
echo "done for today" > .agents/evolve/STOP       # Local, one-time

To re-enable:

rm ~/.config/evolve/KILL
rm .agents/evolve/STOP

Flags

Flag	Default	Description
`--max-cycles=N`	10	Hard cap on improvement cycles per session
`--dry-run`	off	Measure fitness and show plan, don’t execute

GOALS.yaml Schema

version: 1
mission: "What this repo does"

goals:
  - id: unique-identifier
    description: "Human-readable description"
    check: "shell command â exit 0 = pass, non-zero = fail"
    weight: 1-10  # Higher = fix first

Goals are checked in weight order (highest first). The first failing goal with the highest weight is selected for improvement.

Artifacts

File	Purpose
`GOALS.yaml`	Fitness goals (repo root)
`.agents/evolve/fitness-{N}.json`	Fitness snapshot per cycle
`.agents/evolve/cycle-history.jsonl`	Cycle outcomes log
`.agents/evolve/session-summary.md`	Session wrap-up
`.agents/evolve/STOP`	Local kill switch
`.agents/evolve/KILLED.json`	Kill acknowledgment
`~/.config/evolve/KILL`	External kill switch