incident-management

📁 bagelhole/devops-security-agent-skills 📅 9 days ago
Install command
npx skills add https://github.com/bagelhole/devops-security-agent-skills --skill incident-management

Skill documentation

Incident Management

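Implement an incident management process: classify incidents by severity, follow a defined response workflow, assign an incident commander, and run a post-incident review.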

Incident Severity

| Severity | Impact | Response | Example |
|----------|--------|----------|---------|
| SEV1 | Total outage | Immediate, all-hands | Site down |
| SEV2 | Major degradation | Urgent, on-call | Feature broken |
| SEV3 | Minor impact | Standard | Slow performance |
| SEV4 | Minimal | Next business day | Cosmetic issue |
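
The severity levels above can be encoded as a lookup table so that tooling applies the same response to the same level every time. The following is a minimal sketch, not part of the original skill: the names `Severity`, `ResponsePolicy`, `POLICIES`, and `policy_for` are hypothetical, and the `page_on_call` flag is an assumption (that SEV1 and SEV2 page someone immediately while SEV3 and SEV4 do not).

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """Lower number means higher urgency, matching the table above."""
    SEV1 = 1  # total outage
    SEV2 = 2  # major degradation
    SEV3 = 3  # minor impact
    SEV4 = 4  # minimal impact


@dataclass(frozen=True)
class ResponsePolicy:
    impact: str
    response: str
    page_on_call: bool  # assumption: not stated in the severity table


# Impact and response strings are copied from the severity table.
POLICIES = {
    Severity.SEV1: ResponsePolicy("Total outage", "Immediate, all-hands", True),
    Severity.SEV2: ResponsePolicy("Major degradation", "Urgent, on-call", True),
    Severity.SEV3: ResponsePolicy("Minor impact", "Standard", False),
    Severity.SEV4: ResponsePolicy("Minimal", "Next business day", False),
}


def policy_for(severity: Severity) -> ResponsePolicy:
    """Return the response policy for a severity level."""
    return POLICIES[severity]
```

Using an `IntEnum` keeps comparisons natural: `Severity.SEV1 < Severity.SEV3` is true, so "SEV2 or worse" can be written as `severity <= Severity.SEV2`.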

Incident Process

incident_workflow:
  1_detect:
    - Alerting triggers
    - Customer reports
    - Monitoring anomalies
    
  2_triage:
    - Severity assessment
    - Impact determination
    - Team notification
    
  3_respond:
    - Incident commander assigned
    - Communication established
    - Mitigation started
    
  4_resolve:
    - Root cause addressed
    - Service restored
    - Customer notified
    
  5_review:
    - Timeline documented
    - Root cause analysis
    - Action items created
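
The five workflow phases above run in a fixed order, which can be enforced with a small state machine. This is a sketch under assumptions: the `Phase` and `Incident` names are hypothetical, and it assumes phases are never skipped or revisited, which the workflow implies but does not state.

```python
from enum import Enum


class Phase(Enum):
    """Workflow phases in the order listed in incident_workflow."""
    DETECT = 1
    TRIAGE = 2
    RESPOND = 3
    RESOLVE = 4
    REVIEW = 5


class Incident:
    """Tracks which workflow phase an incident is in."""

    def __init__(self, title: str) -> None:
        self.title = title
        self.phase = Phase.DETECT
        self.history = [Phase.DETECT]

    def advance(self) -> Phase:
        """Move to the next phase; refuse to go past the review phase."""
        if self.phase is Phase.REVIEW:
            raise ValueError("incident is already in the final phase")
        self.phase = Phase(self.phase.value + 1)
        self.history.append(self.phase)
        return self.phase
```

Recording `history` gives the review phase a starting point for the timeline; a fuller version would likely store a timestamp with each transition.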

Incident Commander

ic_responsibilities:
  - Own incident resolution
  - Coordinate response teams
  - Manage communication
  - Make escalation decisions
  - Schedule post-mortem

Post-Incident Review

## Incident Summary
- Duration:
- Impact:
- Severity:

## Timeline

## Root Cause

## What Went Well

## What Could Be Improved

## Action Items
| Item | Owner | Due Date |
|------|-------|----------|
|      |       |          |
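
Parts of the review template above can be filled in mechanically, for example the duration and the action item table. The sketch below is one possible approach rather than the skill's own tooling: `render_review` and its parameters are hypothetical names, it renders only the summary and action item sections, and it assumes the duration is reported in whole minutes.

```python
from datetime import datetime


def render_review(started: datetime, resolved: datetime, impact: str,
                  severity: str,
                  action_items: list[tuple[str, str, str]]) -> str:
    """Render the summary and action item sections of the review template.

    Each action item is an (item, owner, due_date) tuple of strings.
    """
    # Assumption: duration is rounded down to whole minutes.
    minutes = int((resolved - started).total_seconds() // 60)
    lines = [
        "## Incident Summary",
        f"- Duration: {minutes} minutes",
        f"- Impact: {impact}",
        f"- Severity: {severity}",
        "",
        "## Action Items",
        "| Item | Owner | Due Date |",
        "|------|-------|----------|",
    ]
    lines += [f"| {item} | {owner} | {due} |" for item, owner, due in action_items]
    return "\n".join(lines)
```

The timeline, root cause, and the "what went well" and "what could be improved" sections are left for the responders to write, since they depend on judgment rather than recorded data.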

Best Practices

  • Clear severity definitions
  • Defined escalation paths
  • Blameless post-mortems
  • Action item tracking
  • Regular training