ci-cd-best-practices

📁 mindrally/skills 📅 Jan 25, 2026
63
总安装量
63
周安装量
#3472
全站排名
安装命令
npx skills add https://github.com/mindrally/skills --skill ci-cd-best-practices

Agent 安装分布

claude-code 54
opencode 50
gemini-cli 44
antigravity 41
codex 41
cursor 39

Skill 文档

CI/CD Best Practices

You are an expert in Continuous Integration and Continuous Deployment, following industry best practices for automated pipelines, testing strategies, deployment patterns, and DevOps workflows.

Core Principles

  • Automate everything that can be automated
  • Fail fast with quick feedback loops
  • Build once, deploy many times
  • Implement infrastructure as code
  • Practice continuous improvement
  • Maintain security at every stage

Pipeline Design

Pipeline Stages

A typical CI/CD pipeline includes these stages:

Build -> Test -> Security -> Deploy (Staging) -> Deploy (Production)

1. Build Stage

build:
  stage: build
  script:
    - npm ci --prefer-offline
    - npm run build
  artifacts:
    paths:
      - dist/
    expire_in: 1 day
  cache:
    key: ${CI_COMMIT_REF_SLUG}
    paths:
      - node_modules/

Best practices:

  • Use dependency caching to speed up builds
  • Generate build artifacts for downstream stages
  • Pin dependency versions for reproducibility
  • Use multi-stage Docker builds for smaller images

2. Test Stage

test:
  stage: test
  parallel:
    matrix:
      - TEST_TYPE: [unit, integration, e2e]
  script:
    - npm run test:${TEST_TYPE}
  coverage: '/Coverage: \d+\.\d+%/'
  artifacts:
    reports:
      junit: test-results.xml
      coverage_report:
        coverage_format: cobertura
        path: coverage/cobertura-coverage.xml

Testing layers:

  • Unit tests: Fast, isolated, run on every commit
  • Integration tests: Test component interactions
  • End-to-end tests: Validate user workflows
  • Performance tests: Check for regressions

3. Security Stage

security:
  stage: security
  parallel:
    matrix:
      - SCAN_TYPE: [sast, dependency, secrets]
  script:
    - ./security-scan.sh ${SCAN_TYPE}
  allow_failure: false

Security scanning types:

  • SAST: Static Application Security Testing
  • DAST: Dynamic Application Security Testing
  • Dependency scanning: Check for vulnerable packages
  • Secret detection: Find leaked credentials
  • Container scanning: Analyze Docker images

4. Deploy Stage

deploy:staging:
  stage: deploy
  environment:
    name: staging
    url: https://staging.example.com
  script:
    - ./deploy.sh staging
  rules:
    - if: $CI_COMMIT_BRANCH == "develop"

deploy:production:
  stage: deploy
  environment:
    name: production
    url: https://example.com
  script:
    - ./deploy.sh production
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual

Deployment Strategies

Blue-Green Deployment

Maintain two identical environments:

deploy:blue-green:
  script:
    - ./deploy-to-inactive.sh
    - ./run-smoke-tests.sh
    - ./switch-traffic.sh
    - ./cleanup-old-environment.sh

Benefits:

  • Zero-downtime deployments
  • Easy rollback by switching traffic back
  • Full testing in production-like environment

Canary Deployment

Gradually roll out to subset of users:

deploy:canary:
  script:
    - ./deploy-canary.sh --percentage=5
    - ./monitor-metrics.sh --duration=30m
    - ./deploy-canary.sh --percentage=25
    - ./monitor-metrics.sh --duration=30m
    - ./deploy-canary.sh --percentage=100

Canary stages:

  1. Deploy to 5% of traffic
  2. Monitor error rates and latency
  3. Gradually increase if metrics are healthy
  4. Full rollout or rollback based on data

Rolling Deployment

Update instances incrementally:

deploy:rolling:
  script:
    - kubectl rollout restart deployment/app
    - kubectl rollout status deployment/app --timeout=5m

Configuration:

  • Set maxUnavailable and maxSurge
  • Health checks determine rollout pace
  • Automatic rollback on failure

Feature Flags

Decouple deployment from release:

// Feature flag implementation
if (featureFlags.isEnabled('new-checkout')) {
  return <NewCheckout />;
} else {
  return <LegacyCheckout />;
}

Benefits:

  • Deploy disabled features to production
  • Gradual feature rollout
  • A/B testing capabilities
  • Quick feature disable without deployment

Environment Management

Environment Hierarchy

Development -> Testing -> Staging -> Production

Each environment should:

  • Mirror production as closely as possible
  • Have isolated data and secrets
  • Use infrastructure as code

Environment Variables

variables:
  # Global variables
  APP_NAME: my-app

# Environment-specific
.staging:
  variables:
    ENV: staging
    API_URL: https://api.staging.example.com

.production:
  variables:
    ENV: production
    API_URL: https://api.example.com

Best practices:

  • Never hardcode secrets
  • Use secret management (Vault, AWS Secrets Manager)
  • Separate configuration from code
  • Document all required variables

Infrastructure as Code

# Terraform example
resource "aws_ecs_service" "app" {
  name            = var.app_name
  cluster         = aws_ecs_cluster.main.id
  task_definition = aws_ecs_task_definition.app.arn
  desired_count   = var.environment == "production" ? 3 : 1

  deployment_configuration {
    maximum_percent         = 200
    minimum_healthy_percent = 100
  }
}

Testing Strategies

Test Pyramid

        /\
       /  \      E2E Tests (Few)
      /----\
     /      \    Integration Tests (Some)
    /--------\
   /          \  Unit Tests (Many)
  --------------

Test Parallelization

test:
  parallel: 4
  script:
    - npm test -- --shard=$CI_NODE_INDEX/$CI_NODE_TOTAL

Test Data Management

  • Use fixtures for consistent test data
  • Reset database state between tests
  • Use factories for dynamic test data
  • Avoid production data in tests

Flaky Test Handling

test:
  retry:
    max: 2
    when:
      - runner_system_failure
      - stuck_or_timeout_failure

Strategies:

  • Quarantine flaky tests
  • Add retry logic for known issues
  • Investigate and fix root causes
  • Track flaky test metrics

Monitoring and Observability

Pipeline Metrics

Track these metrics:

  • Lead time: Commit to production duration
  • Deployment frequency: How often you deploy
  • Change failure rate: Percentage of failed deployments
  • Mean time to recovery: Time to fix failures

Health Checks

deploy:
  script:
    - ./deploy.sh
    - ./wait-for-healthy.sh --timeout=300
    - ./run-smoke-tests.sh

Implement:

  • Readiness probes
  • Liveness probes
  • Startup probes
  • Smoke tests post-deployment

Alerting

notify:failure:
  stage: notify
  script:
    - ./send-alert.sh --channel=deployments --status=failed
  when: on_failure

notify:success:
  stage: notify
  script:
    - ./send-notification.sh --channel=deployments --status=success
  when: on_success

Security in CI/CD

Secrets Management

# Use CI/CD secret variables
deploy:
  script:
    - echo "$DEPLOY_KEY" | base64 -d > deploy_key
    - chmod 600 deploy_key
    - ./deploy.sh
  after_script:
    - rm -f deploy_key

Best practices:

  • Rotate secrets regularly
  • Use short-lived credentials
  • Audit secret access
  • Never log secrets

Pipeline Security

# Restrict who can run production deploys
deploy:production:
  rules:
    - if: $CI_COMMIT_BRANCH == "main"
      when: manual
      allow_failure: false
  environment:
    name: production
    deployment_tier: production

Controls:

  • Branch protection rules
  • Required approvals
  • Audit logging
  • Signed commits

Dependency Security

dependency_check:
  script:
    - npm audit --audit-level=high
    - ./check-licenses.sh
  allow_failure: false

Optimization Techniques

Caching

cache:
  key:
    files:
      - package-lock.json
  paths:
    - node_modules/
  policy: pull-push

Cache strategies:

  • Cache dependencies between runs
  • Use content-based cache keys
  • Separate cache per branch
  • Clean stale caches periodically

Parallelization

stages:
  - build
  - test
  - deploy

# Run tests in parallel
test:unit:
  stage: test
  script: npm run test:unit

test:integration:
  stage: test
  script: npm run test:integration

test:e2e:
  stage: test
  script: npm run test:e2e

Artifact Management

build:
  artifacts:
    paths:
      - dist/
    expire_in: 1 week
    when: on_success

Best practices:

  • Set appropriate expiration
  • Only store necessary artifacts
  • Use artifact compression
  • Clean up old artifacts

Rollback Strategies

Automatic Rollback

deploy:
  script:
    - ./deploy.sh
    - ./health-check.sh || ./rollback.sh

Manual Rollback

rollback:
  stage: deploy
  when: manual
  script:
    - ./get-previous-version.sh
    - ./deploy.sh --version=$PREVIOUS_VERSION

Database Rollbacks

  • Use reversible migrations
  • Test rollback procedures
  • Consider data compatibility
  • Have backup restoration process

Documentation

Pipeline Documentation

Document in your repository:

  • Pipeline stages and their purpose
  • Required environment variables
  • Deployment procedures
  • Troubleshooting guides
  • Rollback procedures

Runbooks

Create runbooks for:

  • Deployment failures
  • Rollback procedures
  • Environment setup
  • Incident response

Continuous Improvement

Metrics to Track

  • Build success rate
  • Average build time
  • Test coverage trends
  • Deployment frequency
  • Incident frequency

Regular Reviews

  • Weekly pipeline performance review
  • Monthly security assessment
  • Quarterly process improvement
  • Annual tooling evaluation