docker-production

📁 pluginagentmarketplace/custom-plugin-docker 📅 Today
1
总安装量
1
周安装量
#45282
全站排名
安装命令
npx skills add https://github.com/pluginagentmarketplace/custom-plugin-docker --skill docker-production

Agent 安装分布

opencode 1
github-copilot 1
claude-code 1

Skill 文档

Docker Production Skill

Master production-grade Docker deployments with monitoring, logging, health checks, and resource management.

Purpose

Configure containers for production with proper observability, resource limits, and deployment strategies.

Parameters

Parameter Type Required Default Description
monitoring enum No prometheus prometheus/datadog
logging enum No json-file json-file/loki/elk
replicas number No 1 Number of replicas

Production Configuration

Health Checks

HEALTHCHECK --interval=30s --timeout=3s --retries=3 --start-period=60s \
  CMD curl -f http://localhost:3000/health || exit 1
# Compose health check
services:
  app:
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:3000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

Resource Limits

services:
  app:
    deploy:
      resources:
        limits:
          cpus: '1'
          memory: 1G
        reservations:
          cpus: '0.5'
          memory: 512M
      restart_policy:
        condition: on-failure
        delay: 5s
        max_attempts: 3

Logging Configuration

services:
  app:
    logging:
      driver: json-file
      options:
        max-size: "10m"
        max-file: "3"
        labels: "app,environment"

Monitoring Stack

Prometheus + Grafana

services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090"

  grafana:
    image: grafana/grafana:latest
    ports:
      - "3001:3000"
    environment:
      GF_SECURITY_ADMIN_PASSWORD: ${GRAFANA_PASSWORD}

  cadvisor:
    image: gcr.io/cadvisor/cadvisor:latest
    volumes:
      - /:/rootfs:ro
      - /var/run:/var/run:ro
      - /sys:/sys:ro
      - /var/lib/docker/:/var/lib/docker:ro
    ports:
      - "8080:8080"

Prometheus Config

# prometheus.yml
global:
  scrape_interval: 15s

scrape_configs:
  - job_name: 'docker-containers'
    docker_sd_configs:
      - host: unix:///var/run/docker.sock

Deployment Strategies

Rolling Update (Zero Downtime)

deploy:
  update_config:
    parallelism: 1
    delay: 10s
    failure_action: rollback
    order: start-first
  rollback_config:
    parallelism: 1
    delay: 10s

Blue-Green

# Deploy new version
docker compose -p myapp-green up -d

# Switch traffic (update nginx/load balancer)
# Remove old version
docker compose -p myapp-blue down

Error Handling

Common Errors

Error Cause Solution
unhealthy Health check failing Check endpoint, increase start_period
OOMKilled Memory exceeded Increase limit or optimize
restart loop App crash Check logs, fix application

Recovery

  1. Check logs: docker logs --tail 100 <container>
  2. Verify health: docker inspect --format='{{.State.Health.Status}}'
  3. Rollback if needed

Troubleshooting

Debug Checklist

  • Health check passing?
  • Resources sufficient? docker stats
  • Logs showing errors?
  • Metrics collecting?

Diagnostics

# Resource usage
docker stats --format "table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}"

# Restart count
docker inspect --format='{{.RestartCount}}' <container>

# Recent events
docker events --filter 'container=<name>' --since 1h

Usage

Skill("docker-production")

Related Skills

  • docker-debugging
  • docker-ci-cd
  • docker-security