error-coordinator
29
Total installs
29
Weekly installs
#7246
Site-wide rank
Install command
npx skills add https://github.com/404kidwiz/claude-supercode-skills --skill error-coordinator
Installs by agent
claude-code
20
opencode
20
gemini-cli
18
cursor
16
windsurf
14
Skill documentation
Error Coordinator
Purpose
Provides expertise in building resilient multi-agent systems with robust error handling, failure detection, and recovery mechanisms. Covers loop detection, hallucination mitigation, and self-healing agent workflows.
When to Use
- Designing error handling for agent systems
- Implementing retry and recovery strategies
- Building self-healing AI workflows
- Detecting agent loops and infinite recursion
- Mitigating hallucinations in agent outputs
- Implementing circuit breakers for agents
- Coordinating failure recovery across agents
Quick Start
Invoke this skill for any of the cases listed under When to Use.
Do NOT invoke when:
- Organizing agent teams (use agent-organizer)
- Debugging application errors (use debugger)
- Handling production incidents (use incident-responder)
- Detecting code error patterns (use error-detective)
Decision Framework
Error Type Handling:
├── Transient failure → Retry with backoff
├── Rate limiting → Backoff + queue
├── Invalid output → Validation + retry with feedback
├── Loop detected → Break + escalate
├── Hallucination → Ground with context, retry
├── Agent timeout → Cancel + fallback
└── Cascading failure → Circuit breaker
Recovery Strategy:
├── Idempotent operation → Simple retry
├── Stateful operation → Checkpoint + resume
├── Critical path → Fallback agent
└── Best effort → Log + continue
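One way to make the two decision trees above executable is a pair of lookup tables. The sketch below is illustrative only: the enum names and the action labels (`retry_with_backoff`, `checkpoint_and_resume`, and so on) are hypothetical identifiers chosen for this example, not part of the skill.

```python
from enum import Enum, auto


class ErrorType(Enum):
    TRANSIENT = auto()
    RATE_LIMIT = auto()
    INVALID_OUTPUT = auto()
    LOOP = auto()
    HALLUCINATION = auto()
    TIMEOUT = auto()
    CASCADING = auto()


class OperationKind(Enum):
    IDEMPOTENT = auto()
    STATEFUL = auto()
    CRITICAL_PATH = auto()
    BEST_EFFORT = auto()


# Mirrors "Error Type Handling"; the values are labels, not real handlers.
ERROR_HANDLING = {
    ErrorType.TRANSIENT: "retry_with_backoff",
    ErrorType.RATE_LIMIT: "backoff_and_queue",
    ErrorType.INVALID_OUTPUT: "validate_and_retry_with_feedback",
    ErrorType.LOOP: "break_and_escalate",
    ErrorType.HALLUCINATION: "ground_with_context_and_retry",
    ErrorType.TIMEOUT: "cancel_and_fallback",
    ErrorType.CASCADING: "circuit_breaker",
}

# Mirrors "Recovery Strategy".
RECOVERY_STRATEGY = {
    OperationKind.IDEMPOTENT: "simple_retry",
    OperationKind.STATEFUL: "checkpoint_and_resume",
    OperationKind.CRITICAL_PATH: "fallback_agent",
    OperationKind.BEST_EFFORT: "log_and_continue",
}


def plan(error_type: ErrorType, operation: OperationKind) -> tuple[str, str]:
    """Return (handling action, recovery strategy) for a classified failure."""
    return ERROR_HANDLING[error_type], RECOVERY_STRATEGY[operation]
```

Keeping the mapping in data rather than in nested conditionals makes it easy to review against the trees and to extend with new error types.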
Core Workflows
1. Loop Detection System
- Track agent invocation history
- Detect repeated state patterns
- Set maximum iteration limits
- Implement escape hatch triggers
- Log loop occurrences for analysis
- Escalate to supervisor or human
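The steps above could be sketched roughly as follows. This is a minimal illustration, assuming agent state can be serialized to JSON; `LoopDetector`, `LoopDetected`, and the default limits are hypothetical names and values, and escalation is left to whoever catches the exception.

```python
import hashlib
import json
import logging
from collections import Counter

logger = logging.getLogger("loop_detector")


class LoopDetected(Exception):
    """Escape hatch: the caller should stop the agent and escalate."""


class LoopDetector:
    def __init__(self, max_iterations: int = 25, max_repeats: int = 3):
        self.max_iterations = max_iterations  # hard cap on recorded steps
        self.max_repeats = max_repeats        # identical states tolerated
        self.history: list[str] = []          # ordered invocation fingerprints
        self.seen: Counter = Counter()

    def record(self, agent: str, state: dict) -> None:
        """Record one invocation; raise LoopDetected if a limit is hit."""
        payload = json.dumps({"agent": agent, "state": state},
                             sort_keys=True, default=str)
        fingerprint = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        self.history.append(fingerprint)
        self.seen[fingerprint] += 1

        if self.seen[fingerprint] >= self.max_repeats:
            reason = f"{agent} reached the same state {self.seen[fingerprint]} times"
        elif len(self.history) > self.max_iterations:
            reason = f"iteration limit of {self.max_iterations} exceeded"
        else:
            return
        logger.warning("loop detected: %s", reason)  # kept for later analysis
        raise LoopDetected(reason)
```

A supervisor would call `record()` before each agent step and, on `LoopDetected`, hand the task to a human or a higher-level agent. Hashing the full state only catches exact repeats; near-duplicate states would need a fuzzier fingerprint.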
2. Hallucination Mitigation
- Ground responses with source data
- Implement output validation
- Cross-check with retrieval
- Add confidence scoring
- Flag low-confidence outputs
- Provide feedback for retry
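As a rough sketch of this workflow, the function below validates an answer against source text and retries with feedback. It rests on several assumptions that are not part of the skill: `generate` is a hypothetical callable returning a dict with `answer` and `quotes` keys, grounding is checked by naive verbatim substring matching, and confidence is simply the fraction of quotes found in the source.

```python
from typing import Callable


def grounded_answer(
    generate: Callable[[str], dict],  # hypothetical model wrapper
    question: str,
    source_text: str,
    min_confidence: float = 0.8,
    max_attempts: int = 3,
) -> dict:
    """Return the last attempt, flagged if it never met min_confidence."""
    prompt = question
    result: dict = {}
    for attempt in range(1, max_attempts + 1):
        output = generate(prompt)
        quotes = output.get("quotes", [])
        # Cross-check: every supporting quote must appear in the source.
        missing = [q for q in quotes if q not in source_text]
        confidence = (len(quotes) - len(missing)) / len(quotes) if quotes else 0.0
        result = {
            "answer": output.get("answer"),
            "confidence": confidence,
            "attempts": attempt,
            "flagged": confidence < min_confidence,
        }
        if not result["flagged"]:
            return result
        # Feed the validation failure back into the next attempt.
        problem = (f"these quotes were not found in the source: {missing}"
                   if quotes else "no supporting quotes were given")
        prompt = (f"{question}\n\nThe previous attempt was rejected because "
                  f"{problem}. Quote the source verbatim.")
    return result
```

A flagged result is returned rather than raised so the caller can decide whether to escalate, fall back, or show the answer with a warning. Real systems would likely replace the substring check with retrieval-based verification.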
3. Circuit Breaker Implementation
- Track failure rates per agent
- Define failure threshold
- Open circuit on threshold breach
- Provide fallback behavior
- Implement half-open state for testing
- Close circuit on recovery
- Monitor and alert on breaker state
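A minimal circuit breaker along these lines might look like the sketch below. It simplifies "failure rate" to a count of consecutive failures, and the class name, state strings, and defaults are assumptions made for illustration. One instance per agent would give per-agent tracking.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when the circuit is open and no fallback was supplied."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout  # seconds before a trial call
        self.clock = clock                  # injectable for testing
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, func: Callable[[], Any],
             fallback: Optional[Callable[[], Any]] = None) -> Any:
        if self.state == "open":
            if self.clock() - self.opened_at >= self.reset_timeout:
                self.state = "half_open"    # let one trial call through
            elif fallback is not None:
                return fallback()
            else:
                raise CircuitOpenError("circuit is open")
        try:
            result = func()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, opens the circuit.
            if self.state == "half_open" or self.failures >= self.failure_threshold:
                self.state = "open"
                self.opened_at = self.clock()
            if fallback is not None:
                return fallback()
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.state = "closed"
        return result
```

Monitoring and alerting are omitted here; a natural hook would be to emit a log line or metric wherever `self.state` changes.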
Best Practices
- Implement timeouts for all agent calls
- Use exponential backoff with jitter
- Log all failures with full context
- Design for graceful degradation
- Test failure scenarios explicitly
- Monitor error rates and patterns
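The first three practices can be combined in one small helper. The sketch below assumes agent calls are asyncio coroutines and that `ConnectionError` and timeouts are the retryable failures; `call_with_retry` and its parameters are hypothetical names, and the delay uses "full jitter" (a uniform random wait up to the exponential cap).

```python
import asyncio
import logging
import random
from typing import Any, Awaitable, Callable

logger = logging.getLogger("agent_retry")


async def call_with_retry(
    make_call: Callable[[], Awaitable[Any]],  # returns a fresh coroutine per attempt
    *,
    timeout: float = 30.0,
    max_attempts: int = 4,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Any:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Exception = RuntimeError("no attempt made")
    for attempt in range(max_attempts):
        try:
            # Every call is bounded by a timeout.
            return await asyncio.wait_for(make_call(), timeout=timeout)
        except (asyncio.TimeoutError, ConnectionError) as exc:
            last_error = exc
            logger.warning("attempt %d/%d failed: %r",
                           attempt + 1, max_attempts, exc)
            if attempt == max_attempts - 1:
                break  # retry budget exhausted
            cap = min(max_delay, base_delay * 2 ** attempt)
            await sleep(random.uniform(0, cap))  # exponential backoff, full jitter
    raise last_error
```

Usage would look like `await call_with_retry(lambda: agent.run(task), timeout=20)`, where `agent.run` stands in for whatever coroutine invokes the agent. Non-retryable exceptions propagate immediately, which keeps permanent errors from consuming the retry budget.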
Anti-Patterns
| Anti-Pattern | Problem | Correct Approach |
|---|---|---|
| Infinite retries | Resource exhaustion | Max retry limits |
| Silent failures | Hidden problems | Log and alert |
| No timeouts | Hung processes | Always set timeouts |
| Same retry interval | Thundering herd | Exponential backoff with jitter |
| No fallbacks | Complete failure | Graceful degradation |