docker-containerization-expert

📁 webdev70/hosting-google 📅 11 days ago
3
总安装量
3
周安装量
#57383
全站排名
安装命令
npx skills add https://github.com/webdev70/hosting-google --skill docker-containerization-expert

Agent 安装分布

augment 3
gemini-cli 3
antigravity 3
claude-code 3
github-copilot 3
codex 3

Skill 文档

Docker Containerization Expert

This skill provides comprehensive expert knowledge of Docker containerization for Node.js applications, with emphasis on production-ready configurations, security best practices, and cloud platform deployment.

Dockerfile Best Practices

Multi-Stage Builds

Purpose: Reduce final image size by separating build dependencies from runtime dependencies.

Basic Pattern:

# Build stage
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production

# Production stage
FROM node:18-alpine
WORKDIR /app
COPY --from=builder /app/node_modules ./node_modules
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]

Advanced Pattern with Build Dependencies:

# Build stage with dev dependencies
FROM node:18-alpine AS builder
WORKDIR /app
COPY package*.json ./
RUN npm ci
COPY . .
RUN npm run build

# Production stage
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY --from=builder /app/dist ./dist
EXPOSE 3000
USER node
CMD ["node", "dist/server.js"]

Layer Caching Optimization

Order matters: Place commands that change least frequently at the top.

# Good - dependencies cached separately from code
FROM node:18-alpine
WORKDIR /app

# Copy package files first (changes infrequently)
COPY package*.json ./
RUN npm ci --only=production

# Copy application code (changes frequently)
COPY . .

# This ordering means code changes don't invalidate npm install cache

Bad ordering:

# Bad - code changes invalidate entire cache
FROM node:18-alpine
WORKDIR /app
COPY . .
RUN npm ci --only=production

Alpine Linux Specifics

Why Alpine: Minimal footprint (~5MB base vs ~100MB+ for full images)

Base Image Selection:

# Recommended for Node.js apps
FROM node:18-alpine

# For specific Alpine version
FROM node:18-alpine3.19

# For LTS versions
FROM node:20-alpine

Package Management in Alpine:

# Use apk (not apt-get)
RUN apk add --no-cache \
    python3 \
    make \
    g++

Common Alpine Issues:

Missing native dependencies:

# If you need native modules (bcrypt, sharp, etc.)
RUN apk add --no-cache \
    python3 \
    make \
    g++ \
    libc6-compat

Missing shell utilities:

# Alpine uses ash shell, not bash
# For bash compatibility
RUN apk add --no-cache bash

# Or use ash-compatible syntax in scripts

Missing timezone data:

# Add timezone support
RUN apk add --no-cache tzdata
ENV TZ=America/New_York

Security Best Practices

Non-Root User

Why: Limit damage if container is compromised.

Pattern 1: Use built-in node user:

FROM node:18-alpine
WORKDIR /app

# Install dependencies as root
COPY package*.json ./
RUN npm ci --only=production

# Copy application files
COPY . .

# Change ownership to node user
RUN chown -R node:node /app

# Switch to non-root user
USER node

EXPOSE 3000
CMD ["node", "server.js"]

Pattern 2: Create custom user:

FROM node:18-alpine

# Create app user and group
RUN addgroup -g 1001 -S appuser && \
    adduser -S -u 1001 -G appuser appuser

WORKDIR /app
COPY --chown=appuser:appuser package*.json ./
RUN npm ci --only=production

COPY --chown=appuser:appuser . .

USER appuser
EXPOSE 3000
CMD ["node", "server.js"]

Minimal Image Content

Use .dockerignore:

node_modules
npm-debug.log
.git
.gitignore
.env
.env.*
!.env.example
.vscode
.idea
.DS_Store
Thumbs.db
*.md
!README.md
docs/
tests/
__tests__/
coverage/
.github/
Dockerfile
docker-compose.yml
.dockerignore

Benefits:

  • Faster builds (less context to send)
  • Smaller images
  • Prevents accidentally copying secrets

Read-Only Filesystem

# Make filesystem read-only (advanced)
FROM node:18-alpine
WORKDIR /app

COPY package*.json ./
RUN npm ci --only=production

COPY . .

# Create temp directory with write permissions
RUN mkdir -p /tmp/app-cache && \
    chown node:node /tmp/app-cache

USER node
EXPOSE 3000

# Run with read-only root filesystem
# (requires docker run --read-only --tmpfs /tmp/app-cache)
CMD ["node", "server.js"]

npm Install Optimization

Use npm ci instead of npm install:

# Good - deterministic, faster, requires package-lock.json
RUN npm ci --only=production

# Bad - slower, may have version drift
RUN npm install --production

Cache npm packages:

# Use BuildKit cache mounts (requires Docker BuildKit)
RUN --mount=type=cache,target=/root/.npm \
    npm ci --only=production

Clean npm cache:

RUN npm ci --only=production && \
    npm cache clean --force

EXPOSE and CMD/ENTRYPOINT

EXPOSE: Documents port, doesn’t publish it

EXPOSE 3000
# Actual port binding happens at runtime: docker run -p 3000:3000

CMD vs ENTRYPOINT:

CMD (recommended for apps):

# Can be overridden at runtime
CMD ["node", "server.js"]

# Docker run: docker run myimage
# Override: docker run myimage node debug.js

ENTRYPOINT (for tools/scripts):

# Always runs, arguments appended
ENTRYPOINT ["node"]
CMD ["server.js"]

# Docker run: docker run myimage
# With args: docker run myimage debug.js

Combined pattern:

ENTRYPOINT ["node"]
CMD ["server.js"]
# Default: node server.js
# Override: docker run myimage debug.js → node debug.js

Environment Variables

Build-time (ARG):

ARG NODE_VERSION=18
FROM node:${NODE_VERSION}-alpine

ARG BUILD_DATE
LABEL build.date=${BUILD_DATE}

Runtime (ENV):

ENV NODE_ENV=production
ENV PORT=3000

# Reference in CMD
CMD ["sh", "-c", "node server.js"]

Best practice – don’t set sensitive defaults:

# Good - require at runtime
# (set via docker-compose.yml or docker run -e)

# Bad - hardcoded secrets
ENV API_KEY=secret123  # NEVER DO THIS

docker-compose.yml Configuration

Basic Service Definition

version: '3.8'

services:
  app:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: my-app
    ports:
      - "3000:3000"
    environment:
      - NODE_ENV=production
      - PORT=3000
    restart: unless-stopped

Health Checks

Purpose: Allow orchestration platforms to detect if container is actually working.

HTTP health check:

services:
  app:
    build: .
    healthcheck:
      test: ["CMD", "wget", "--quiet", "--tries=1", "--spider", "http://localhost:3000"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
    restart: unless-stopped

Alternative using curl:

healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:3000"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 40s

TCP check (if no HTTP endpoint):

healthcheck:
  test: ["CMD-SHELL", "nc -z localhost 3000 || exit 1"]
  interval: 30s
  timeout: 10s
  retries: 3

Node.js script health check:

healthcheck:
  test: ["CMD", "node", "healthcheck.js"]
  interval: 30s
  timeout: 10s
  retries: 3

Restart Policies

services:
  app:
    # Never restart automatically
    restart: "no"

    # Always restart (even after system reboot)
    restart: always

    # Restart on failure only
    restart: on-failure

    # Restart unless explicitly stopped (recommended)
    restart: unless-stopped

Volumes and Bind Mounts

Named volumes (persist data):

services:
  app:
    volumes:
      - app-data:/app/data
      - logs:/var/log

volumes:
  app-data:
  logs:

Bind mounts (development):

services:
  app:
    volumes:
      # Mount current directory into container
      - .:/app
      # Exclude node_modules
      - /app/node_modules

Read-only mounts:

volumes:
  - ./config:/app/config:ro  # Read-only

Environment Variables

Inline:

services:
  app:
    environment:
      - NODE_ENV=production
      - PORT=3000
      - DEBUG=app:*

From .env file:

services:
  app:
    env_file:
      - .env
      - .env.production

Variable substitution:

services:
  app:
    image: myapp:${TAG:-latest}
    ports:
      - "${HOST_PORT:-3000}:3000"

Networks

Default network:

# All services can communicate via service names
services:
  app:
    # Can connect to: http://db:5432
  db:
    # Can connect to: http://app:3000

Custom networks:

services:
  app:
    networks:
      - frontend
      - backend

  nginx:
    networks:
      - frontend

  db:
    networks:
      - backend

networks:
  frontend:
  backend:

Dependencies

depends_on (start order only):

services:
  app:
    depends_on:
      - db
    # Starts after db, but doesn't wait for db to be ready

  db:
    image: postgres:15-alpine

Wait for service to be ready:

services:
  app:
    depends_on:
      db:
        condition: service_healthy

  db:
    image: postgres:15-alpine
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "postgres"]
      interval: 10s
      timeout: 5s
      retries: 5

Resource Limits

services:
  app:
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 512M
        reservations:
          cpus: '0.5'
          memory: 256M

Logging

services:
  app:
    logging:
      driver: "json-file"
      options:
        max-size: "10m"
        max-file: "3"

Container Security

Image Scanning

Scan for vulnerabilities:

# Using Docker Scout
docker scout cves myimage:latest

# Using Trivy
trivy image myimage:latest

# Using Snyk
snyk container test myimage:latest

In Dockerfile:

# Use specific, patched versions
FROM node:18.19.0-alpine3.19

# Not latest (unpredictable)
FROM node:alpine

Security Best Practices Checklist

  • Use specific image versions, not latest
  • Run as non-root user
  • Use Alpine or distroless base images
  • Scan images for vulnerabilities
  • Use multi-stage builds to minimize attack surface
  • Don’t include secrets in image
  • Use .dockerignore to exclude unnecessary files
  • Set resource limits
  • Implement health checks
  • Use read-only root filesystem where possible
  • Minimize installed packages
  • Keep base images updated

Runtime Security

Run with security options:

docker run \
  --read-only \
  --tmpfs /tmp \
  --security-opt=no-new-privileges:true \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  myimage

In docker-compose.yml:

services:
  app:
    read_only: true
    tmpfs:
      - /tmp
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE

Container Registry

Google Container Registry (GCR) – Legacy

Push to GCR:

docker tag myapp gcr.io/PROJECT_ID/myapp:latest
docker push gcr.io/PROJECT_ID/myapp:latest

Dockerfile reference:

FROM gcr.io/PROJECT_ID/base-image:v1.0

Google Artifact Registry (Modern)

Push to Artifact Registry:

# Configure Docker auth
gcloud auth configure-docker us-central1-docker.pkg.dev

# Tag and push
docker tag myapp us-central1-docker.pkg.dev/PROJECT_ID/my-repo/myapp:v1.0
docker push us-central1-docker.pkg.dev/PROJECT_ID/my-repo/myapp:v1.0

Multi-region replication:

# Create multi-region repository
gcloud artifacts repositories create my-repo \
  --repository-format=docker \
  --location=us \
  --description="Multi-region Docker repository"

Docker Hub

Push to Docker Hub:

docker login
docker tag myapp username/myapp:v1.0
docker push username/myapp:v1.0

Private Registry

Authenticate:

docker login registry.example.com

Push:

docker tag myapp registry.example.com/myapp:v1.0
docker push registry.example.com/myapp:v1.0

Cloud Platform Deployment

Google Cloud Run

PORT environment variable:

# Cloud Run sets PORT dynamically (usually 8080)
# Application MUST read from process.env.PORT
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production
COPY . .

# Don't hardcode port
EXPOSE 8080
USER node

# Application reads PORT from environment
CMD ["node", "server.js"]

Deployment:

# Build and push
docker build -t gcr.io/PROJECT_ID/myapp .
docker push gcr.io/PROJECT_ID/myapp

# Deploy to Cloud Run
gcloud run deploy myapp \
  --image gcr.io/PROJECT_ID/myapp \
  --region us-central1 \
  --platform managed \
  --allow-unauthenticated

Google Kubernetes Engine (GKE)

Deployment manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: myapp
        image: gcr.io/PROJECT_ID/myapp:v1.0
        ports:
        - containerPort: 3000
        env:
        - name: NODE_ENV
          value: production
        resources:
          requests:
            memory: "256Mi"
            cpu: "250m"
          limits:
            memory: "512Mi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /
            port: 3000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5

AWS Elastic Container Service (ECS)

Task definition:

{
  "family": "myapp",
  "containerDefinitions": [
    {
      "name": "myapp",
      "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.0",
      "memory": 512,
      "cpu": 256,
      "essential": true,
      "portMappings": [
        {
          "containerPort": 3000,
          "protocol": "tcp"
        }
      ],
      "environment": [
        {"name": "NODE_ENV", "value": "production"},
        {"name": "PORT", "value": "3000"}
      ],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/myapp",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ],
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512"
}

Debugging and Troubleshooting

Common Issues

Container Exits Immediately

Check logs:

docker logs container_name
docker logs --tail 50 container_name
docker logs --follow container_name

Common causes:

  • CMD/ENTRYPOINT incorrect
  • Application crashes on startup
  • Missing environment variables
  • File permissions

Port Not Accessible

Verify port binding:

docker ps
# Look for PORT column: 0.0.0.0:3000->3000/tcp

docker port container_name

Test from inside container:

docker exec container_name wget -O- http://localhost:3000

Permission Denied Errors

Check file ownership:

docker exec container_name ls -la /app

Fix in Dockerfile:

COPY --chown=node:node . .
# Or
RUN chown -R node:node /app

Health Check Failing

Check health status:

docker ps
# Look for STATUS column: healthy/unhealthy

docker inspect container_name | grep -A 10 Health

Debug health check:

# Run health check command manually
docker exec container_name wget --quiet --tries=1 --spider http://localhost:3000

Out of Memory

Check memory usage:

docker stats container_name

Increase memory:

services:
  app:
    deploy:
      resources:
        limits:
          memory: 1G

Interactive Debugging

Shell into running container:

# Alpine (uses ash shell)
docker exec -it container_name sh

# If bash installed
docker exec -it container_name bash

Run one-off commands:

docker exec container_name node -v
docker exec container_name npm list
docker exec container_name cat /app/package.json

Inspect environment variables:

docker exec container_name env
docker exec container_name printenv PORT

Build Debugging

Build with no cache:

docker build --no-cache -t myapp .

Build specific stage:

docker build --target builder -t myapp-builder .

View build history:

docker history myapp

Check image size:

docker images myapp

Performance Optimization

Image Size Reduction

Before optimization:

FROM node:18
WORKDIR /app
COPY . .
RUN npm install
CMD ["node", "server.js"]
# Result: ~1GB

After optimization:

FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --only=production && npm cache clean --force
COPY . .
USER node
CMD ["node", "server.js"]
# Result: ~150MB

Build Speed Optimization

Use BuildKit:

DOCKER_BUILDKIT=1 docker build -t myapp .

Cache mounts:

RUN --mount=type=cache,target=/root/.npm \
    npm ci --only=production

Parallel builds:

docker compose build --parallel

Runtime Performance

Health check interval tuning:

healthcheck:
  interval: 60s  # Less frequent checks
  timeout: 5s    # Shorter timeout
  retries: 2     # Fewer retries

Resource allocation:

deploy:
  resources:
    limits:
      cpus: '2.0'      # More CPU
      memory: 1G       # More memory

Best Practices Summary

Dockerfile

  1. Use Alpine-based images for smaller footprint
  2. Implement multi-stage builds
  3. Order layers from least to most frequently changing
  4. Use npm ci --only=production not npm install
  5. Run as non-root user
  6. Use specific version tags, not latest
  7. Leverage .dockerignore
  8. Clean up after installs (npm cache, apt cache)

docker-compose.yml

  1. Define health checks for all services
  2. Use restart: unless-stopped for resilience
  3. Set resource limits
  4. Use named volumes for persistent data
  5. Implement proper networking
  6. Never commit secrets (use env files)
  7. Configure logging with rotation

Security

  1. Scan images regularly
  2. Use minimal base images
  3. Don’t run as root
  4. Keep images updated
  5. Use read-only filesystems where possible
  6. Implement least privilege
  7. Never embed secrets in images

Cloud Deployment

  1. Read PORT from environment (Cloud Run requirement)
  2. Implement health checks
  3. Use managed container registries
  4. Tag images with commit SHA or version
  5. Set appropriate resource limits
  6. Configure logging for observability

Common Commands Reference

Note: Modern Docker uses docker compose (with space) instead of legacy docker-compose (with hyphen). Docker Compose V2 is integrated as a Docker CLI plugin.

# Build
docker build -t myapp .
docker build --no-cache -t myapp .
docker compose build
docker compose build --no-cache

# Run
docker run -p 3000:3000 myapp
docker run -d -p 3000:3000 --name myapp-container myapp
docker compose up
docker compose up -d

# Stop
docker stop container_name
docker compose down

# Logs
docker logs container_name
docker logs -f container_name
docker compose logs
docker compose logs -f app

# Shell access
docker exec -it container_name sh
docker compose exec app sh

# Inspect
docker ps
docker ps -a
docker inspect container_name
docker stats
docker compose ps

# Clean up
docker rm container_name
docker rmi image_name
docker system prune
docker volume prune

# Registry
docker tag myapp gcr.io/PROJECT_ID/myapp:v1.0
docker push gcr.io/PROJECT_ID/myapp:v1.0
docker pull gcr.io/PROJECT_ID/myapp:v1.0

Resources