terraform-drift-detection

📁 lgbarn/devops-skills 📅 9 days ago
1
总安装量
1
周安装量
#49757
全站排名
安装命令
npx skills add https://github.com/lgbarn/devops-skills --skill terraform-drift-detection

Agent 安装分布

mcpjam 1
claude-code 1
replit 1
junie 1
windsurf 1
zencoder 1

Skill 文档

Terraform Drift Detection

Overview

Detect and categorize drift between Terraform-managed state and actual infrastructure. Drift indicates out-of-band changes that can cause problems during the next apply.

Announce at start: “I’m using the terraform-drift-detection skill to check for infrastructure drift.”

The Process

Step 1: Verify Environment

# Verify AWS credentials and account
aws sts get-caller-identity

# Confirm we're in the right directory/environment
pwd
ls -la *.tf 2>/dev/null | head -5

Step 2: Refresh State

# Initialize if needed
terraform init

# Refresh state to detect drift
terraform plan -refresh-only -out=drift.out

# Convert to JSON for analysis
terraform show -json drift.out > drift.json

Step 3: Analyze Drift

Parse drift.json and categorize changes:

Drift Categories

Category Severity Examples
Security Drift CRITICAL Security groups, IAM, encryption
Configuration Drift HIGH Instance settings, networking
Tag Drift LOW Tags modified outside Terraform
Metadata Drift INFO AWS-managed fields that change

Step 4: Dispatch Analysis Agent

Task(drift-detector) → Categorize and assess drift impact

Agent should:

  • Categorize each drifted resource
  • Assess impact of accepting vs rejecting drift
  • Identify potential causes (manual changes, AWS updates, etc.)

Step 5: Present Findings

## Drift Detection Report

### Summary
- Total drifted resources: X
- Critical drift: Y
- High drift: Z
- Low/Info drift: W

### Critical Drift (Requires Immediate Attention)
| Resource | Attribute | State Value | Actual Value |
|----------|-----------|-------------|--------------|
| ... | ... | ... | ... |

### Potential Causes
- Manual console changes: [list if detected]
- AWS service updates: [list if detected]
- Unknown origin: [list if detected]

### Recommended Actions
1. [Action for each drifted resource]

Step 6: Resolution Options

Present user with options:

  1. Accept Drift: Run terraform apply -refresh-only to update state to match actual
  2. Reject Drift: Run terraform apply to revert actual infrastructure to match code
  3. Investigate: Manual review needed before deciding
  4. Hybrid: Accept some drift, reject other drift

Never auto-resolve drift. Always get user approval.

Common Drift Sources

Intentional (Usually Accept)

  • AWS auto-scaling adjustments
  • Managed service updates
  • Emergency manual fixes (document these!)

Unintentional (Usually Reject)

  • Console click-ops mistakes
  • Unapproved manual changes
  • Conflicting automation

Systemic (Fix Root Cause)

  • Multiple tools managing same resources
  • Missing Terraform coverage
  • CI/CD race conditions

Integration with Memory

Store detected drift patterns:

  • Which resources commonly drift
  • Common causes in this environment
  • Resolution preferences

Query memory before analysis:

  • Has this resource drifted before?
  • What was the cause last time?
  • What resolution was chosen?

Verification Checklist

Before presenting:

  • Refresh completed successfully
  • All drift categorized by severity
  • Root causes identified where possible
  • Resolution options are clear
  • No sensitive values exposed in output