ci-cd
Total installs: 32
Weekly installs: 32
Site rank: #6283
Install command:
npx skills add https://github.com/cosmix/loom --skill ci-cd
Install distribution by agent:
claude-code: 27
opencode: 23
gemini-cli: 22
codex: 21
github-copilot: 19
Skill Documentation
CI/CD
Overview
This skill covers the complete lifecycle of CI/CD pipeline design, implementation, and optimization across platforms including GitHub Actions, GitLab CI, Jenkins, CircleCI, and cloud-native solutions. It encompasses automated testing integration, security scanning, artifact management, deployment strategies, and specialized pipelines for ML workloads.
When to Use
- Implementing or migrating CI/CD pipelines
- Optimizing build and test execution times
- Integrating security scanning (SAST, DAST, dependency checks)
- Setting up deployment automation with rollback strategies
- Configuring test suites in CI environments
- Managing artifacts and container registries
- Implementing ML model training and deployment pipelines
- Troubleshooting pipeline failures and flakiness
Instructions
1. Analyze Requirements
- Identify build and test requirements
- Determine deployment targets and environments
- Assess security scanning needs (SAST, DAST, secrets, dependencies)
- Plan environment promotion strategy (dev → staging → production)
- Define quality gates and approval workflows
- Identify test suite composition (unit, integration, E2E)
- Determine artifact storage and retention policies
2. Design Pipeline Architecture
- Structure stages logically with clear dependencies
- Optimize for speed through parallelization and caching
- Design fail-fast strategy (lint → unit tests → integration tests → build)
- Plan secret management and secure credential handling
- Define deployment strategies (rolling, blue-green, canary)
- Architect for rollback and recovery procedures
- Design matrix builds for multi-platform support
- Plan monorepo CI strategies if applicable
3. Implement Testing Integration
- Configure unit test execution with coverage reporting
- Set up integration tests with service dependencies (databases, APIs)
- Implement E2E/smoke tests for critical user journeys
- Configure test parallelization and sharding
- Integrate test result reporting (JUnit, TAP, JSON)
- Set up flaky test detection and quarantine
- Configure performance/load testing stages
- Implement visual regression testing if applicable
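Flaky-test quarantine from the list above can be sketched as a separate job that retries only an isolated suite, so deterministic tests still fail fast elsewhere. The nick-fields/retry action and the test:flaky script are assumptions, not part of this skill:

```yaml
# Hypothetical quarantine job: bounded retries for known-flaky tests only.
flaky-quarantine:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - uses: actions/setup-node@v4
      with:
        node-version: 20
        cache: "npm"
    - run: npm ci
    - name: Run quarantined tests with retries
      uses: nick-fields/retry@v3
      with:
        timeout_minutes: 10
        max_attempts: 3              # retry transient failures up to 3 times
        command: npm run test:flaky  # assumed script for quarantined tests
```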
4. Implement Security Scanning
- Integrate SAST (static analysis) tools (SonarQube, CodeQL, Semgrep)
- Configure DAST (dynamic analysis) for deployed environments
- Set up dependency/vulnerability scanning (Dependabot, Snyk, Trivy)
- Implement container image scanning
- Configure secrets detection (GitGuardian, TruffleHog)
- Set up license compliance checking
- Define security gate thresholds and failure policies
5. Implement Build and Artifact Management
- Configure dependency caching strategies
- Implement build output caching and layer caching (Docker)
- Set up artifact versioning and tagging
- Configure container registry integration
- Implement multi-stage builds for optimization
- Set up artifact signing and attestation
- Configure artifact retention and cleanup policies
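Artifact signing and attestation can be sketched with Sigstore cosign in keyless mode; the build step id and image path below are placeholders:

```yaml
# Keyless signing sketch: id-token: write lets cosign request a short-lived
# OIDC token from GitHub instead of managing a long-lived signing key.
permissions:
  id-token: write
  packages: write
steps:
  - uses: sigstore/cosign-installer@v3
  - name: Sign pushed image by digest
    run: cosign sign --yes "ghcr.io/${{ github.repository }}@${DIGEST}"
    env:
      DIGEST: ${{ steps.build.outputs.digest }}  # assumed id of the build-push step
```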
6. Implement Deployment Automation
- Configure environment-specific deployments
- Implement deployment strategies (rolling, blue-green, canary)
- Set up health checks and readiness probes
- Configure smoke tests post-deployment
- Implement automated rollback on failure
- Set up deployment notifications (Slack, email, PagerDuty)
- Configure manual approval gates for production
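Automated rollback on failure can be sketched as a smoke-test step followed by a conditional rollout undo; the deployment name, namespace, and smoke-test script are assumptions:

```yaml
# Rollback sketch: capture the smoke-test outcome instead of failing
# immediately, undo the rollout if it failed, then fail the job anyway
# so the bad release stays visible.
- name: Post-deploy smoke test
  id: smoke
  continue-on-error: true
  run: ./scripts/smoke_test.sh https://staging.example.com  # assumed script
- name: Roll back on failed smoke test
  if: steps.smoke.outcome == 'failure'
  run: |
    kubectl rollout undo deployment/app -n staging
    kubectl rollout status deployment/app -n staging --timeout=300s
    exit 1
```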
7. Optimize Pipeline Performance
- Analyze pipeline execution times and bottlenecks
- Implement job parallelization for independent tasks
- Configure aggressive caching (dependencies, build outputs, Docker layers)
- Optimize test execution (parallel runners, test sharding)
- Use matrix builds efficiently
- Consider self-hosted runners for performance-critical workloads
- Implement conditional job execution (path filters, change detection)
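One further optimization worth a sketch: cancel superseded runs with a workflow-level concurrency group so stale commits stop consuming runners. This is standard GitHub Actions syntax:

```yaml
# Runs in the same group (workflow + ref) cancel any in-progress run
# when a newer commit triggers the workflow.
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
```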
8. Ensure Reliability and Observability
- Add retry logic for transient failures
- Implement comprehensive error handling
- Configure alerts for pipeline failures
- Set up metrics and dashboards for pipeline health
- Document runbooks and troubleshooting procedures
- Implement audit logging for deployments
- Configure SLO tracking for pipeline performance
Best Practices
Core Principles
- Fail Fast: Run cheap, fast checks first (lint, type check, unit tests)
- Parallelize Aggressively: Run independent jobs concurrently
- Cache Everything: Dependencies, build outputs, Docker layers
- Secure by Default: Secrets in vaults, least privilege, audit logs
- Environment Parity: Keep dev/staging/prod as similar as possible
- Immutable Artifacts: Build once, promote everywhere
- Automated Rollback: Every deployment must be reversible
- Idempotent Operations: Pipelines should be safely re-runnable
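"Build once, promote everywhere" can be sketched as retagging the already-tested image rather than rebuilding per environment; the registry path and tags are placeholders:

```yaml
# Promotion sketch: the exact bits verified in staging are what reach
# production; there is no second build that could drift.
- name: Promote tested image to production tag
  run: |
    docker buildx imagetools create \
      --tag ghcr.io/${{ github.repository }}:prod \
      ghcr.io/${{ github.repository }}:${{ github.sha }}
```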
Testing in CI/CD
- Test Pyramid: More unit tests, fewer integration tests, minimal E2E
- Isolation: Tests should not depend on execution order
- Determinism: Eliminate flaky tests or quarantine them
- Fast Feedback: Unit tests < 5min, full suite < 15min target
- Coverage Gates: Enforce minimum coverage thresholds
- Service Mocking: Use test doubles for external dependencies
Security
- Shift Left: Run security scans early in the pipeline
- Dependency Scanning: Check for CVEs in all dependencies
- Secrets Management: Never hardcode secrets, use secure vaults
- Least Privilege: Minimal permissions for pipeline runners
- Supply Chain Security: Verify and sign artifacts
- Audit Trail: Log all deployments and access
Performance
- Incremental Builds: Only rebuild changed components
- Layer Caching: Optimize Dockerfile layer order
- Dependency Locking: Pin versions for reproducibility
- Resource Limits: Prevent resource exhaustion
- Path Filtering: Skip jobs when irrelevant files change
Examples
Example 1: GitHub Actions Workflow
name: CI/CD Pipeline
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
env:
NODE_VERSION: "20"
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
lint:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
cache: "npm"
- name: Install dependencies
run: npm ci
- name: Run linter
run: npm run lint
test:
runs-on: ubuntu-latest
needs: lint
services:
postgres:
image: postgres:16
env:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: test
ports:
- 5432:5432
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: ${{ env.NODE_VERSION }}
cache: "npm"
- name: Install dependencies
run: npm ci
- name: Run tests
run: npm test -- --coverage
env:
DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test
- name: Upload coverage
uses: codecov/codecov-action@v3
with:
files: ./coverage/lcov.info
build:
runs-on: ubuntu-latest
needs: test
permissions:
contents: read
packages: write
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Log in to Container Registry
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Extract metadata
id: meta
uses: docker/metadata-action@v5
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=branch
type=ref,event=pr
type=sha,prefix=
type=raw,value=latest,enable=${{ github.ref == 'refs/heads/main' }}
- name: Build and push
uses: docker/build-push-action@v5
with:
context: .
push: true
tags: ${{ steps.meta.outputs.tags }}
labels: ${{ steps.meta.outputs.labels }}
cache-from: type=gha
cache-to: type=gha,mode=max
deploy-staging:
runs-on: ubuntu-latest
needs: build
if: github.ref == 'refs/heads/develop'
environment: staging
steps:
- uses: actions/checkout@v4
- name: Deploy to staging
uses: azure/k8s-deploy@v4
with:
namespace: staging
manifests: |
k8s/deployment.yaml
k8s/service.yaml
images: |
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
deploy-production:
runs-on: ubuntu-latest
needs: build
if: github.ref == 'refs/heads/main'
environment: production
steps:
- uses: actions/checkout@v4
- name: Deploy to production
uses: azure/k8s-deploy@v4
with:
namespace: production
manifests: |
k8s/deployment.yaml
k8s/service.yaml
images: |
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ github.sha }}
strategy: canary
percentage: 20
Example 2: GitLab CI Pipeline
stages:
- validate
- test
- build
- deploy
variables:
DOCKER_TLS_CERTDIR: "/certs"
IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
.node-base:
image: node:20-alpine
cache:
key: ${CI_COMMIT_REF_SLUG}
paths:
- node_modules/
lint:
stage: validate
extends: .node-base
script:
- npm ci
- npm run lint
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
- if: $CI_COMMIT_BRANCH == "main"
test:
stage: test
extends: .node-base
services:
- postgres:16
variables:
POSTGRES_DB: test
POSTGRES_USER: runner
POSTGRES_PASSWORD: runner
DATABASE_URL: postgresql://runner:runner@postgres:5432/test
script:
- npm ci
- npm test -- --coverage
coverage: '/Lines\s*:\s*(\d+\.?\d*)%/'
artifacts:
reports:
coverage_report:
coverage_format: cobertura
path: coverage/cobertura-coverage.xml
junit: junit.xml
build:
stage: build
image: docker:24
services:
- docker:24-dind
script:
- docker login -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD $CI_REGISTRY
- docker build -t $IMAGE_TAG .
- docker push $IMAGE_TAG
rules:
- if: $CI_COMMIT_BRANCH == "main"
- if: $CI_COMMIT_BRANCH == "develop"
deploy-staging:
stage: deploy
image: bitnami/kubectl:latest
script:
- kubectl set image deployment/app app=$IMAGE_TAG -n staging
- kubectl rollout status deployment/app -n staging --timeout=300s
environment:
name: staging
url: https://staging.example.com
rules:
- if: $CI_COMMIT_BRANCH == "develop"
deploy-production:
stage: deploy
image: bitnami/kubectl:latest
script:
- kubectl set image deployment/app app=$IMAGE_TAG -n production
- kubectl rollout status deployment/app -n production --timeout=300s
environment:
name: production
url: https://example.com
when: manual
rules:
- if: $CI_COMMIT_BRANCH == "main"
Example 3: Reusable Workflow (GitHub Actions)
# .github/workflows/reusable-deploy.yml
name: Reusable Deploy Workflow
on:
workflow_call:
inputs:
environment:
required: true
type: string
image-tag:
required: true
type: string
secrets:
KUBE_CONFIG:
required: true
jobs:
deploy:
runs-on: ubuntu-latest
environment: ${{ inputs.environment }}
steps:
- uses: actions/checkout@v4
- name: Set up kubectl
uses: azure/setup-kubectl@v3
- name: Configure kubeconfig
run: |
mkdir -p ~/.kube
echo "${{ secrets.KUBE_CONFIG }}" | base64 -d > ~/.kube/config
- name: Deploy
run: |
kubectl set image deployment/app \
app=${{ inputs.image-tag }} \
-n ${{ inputs.environment }}
kubectl rollout status deployment/app \
-n ${{ inputs.environment }} \
--timeout=300s
- name: Verify deployment
run: |
kubectl get pods -n ${{ inputs.environment }} -l app=app
kubectl logs -n ${{ inputs.environment }} -l app=app --tail=50
Example 4: Security Scanning Pipeline
name: Security Scanning
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
schedule:
- cron: "0 0 * * 0" # Weekly scan
jobs:
sast:
name: Static Analysis (SAST)
runs-on: ubuntu-latest
permissions:
security-events: write
contents: read
steps:
- uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
with:
languages: javascript, python
- name: Autobuild
uses: github/codeql-action/autobuild@v3
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
- name: SonarCloud Scan
uses: SonarSource/sonarcloud-github-action@master
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
SONAR_TOKEN: ${{ secrets.SONAR_TOKEN }}
with:
args: >
-Dsonar.organization=myorg
-Dsonar.projectKey=myproject
-Dsonar.qualitygate.wait=true
dependency-scan:
name: Dependency Vulnerability Scan
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Run Trivy vulnerability scanner
uses: aquasecurity/trivy-action@master
with:
scan-type: "fs"
scan-ref: "."
format: "sarif"
output: "trivy-results.sarif"
severity: "CRITICAL,HIGH"
- name: Upload Trivy results to GitHub Security
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: "trivy-results.sarif"
- name: Snyk Security Scan
uses: snyk/actions/node@master
env:
SNYK_TOKEN: ${{ secrets.SNYK_TOKEN }}
with:
args: --severity-threshold=high
secrets-scan:
name: Secrets Detection
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history for secret detection
- name: TruffleHog Scan
uses: trufflesecurity/trufflehog@main
with:
path: ./
base: ${{ github.event.repository.default_branch }}
head: HEAD
- name: GitGuardian Scan
uses: GitGuardian/ggshield-action@v1
env:
GITHUB_PUSH_BEFORE_SHA: ${{ github.event.before }}
GITHUB_PUSH_BASE_SHA: ${{ github.event.base }}
GITHUB_DEFAULT_BRANCH: ${{ github.event.repository.default_branch }}
GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}
container-scan:
name: Container Image Scan
runs-on: ubuntu-latest
needs: [sast, dependency-scan]
steps:
- uses: actions/checkout@v4
- name: Build image
run: docker build -t myapp:${{ github.sha }} .
- name: Scan image with Trivy
uses: aquasecurity/trivy-action@master
with:
image-ref: "myapp:${{ github.sha }}"
format: "sarif"
output: "trivy-image-results.sarif"
- name: Scan image with Grype
uses: anchore/scan-action@v3
with:
image: "myapp:${{ github.sha }}"
fail-build: true
severity-cutoff: high
Example 5: Test Integration with Parallelization
name: Test Suite
on: [push, pull_request]
jobs:
unit-tests:
name: Unit Tests
runs-on: ubuntu-latest
strategy:
matrix:
node-version: [18, 20, 22]
os: [ubuntu-latest, macos-latest, windows-latest]
steps:
- uses: actions/checkout@v4
- name: Setup Node.js ${{ matrix.node-version }}
uses: actions/setup-node@v4
with:
node-version: ${{ matrix.node-version }}
cache: "npm"
- name: Install dependencies
run: npm ci
- name: Run unit tests
run: npm run test:unit -- --coverage --maxWorkers=4
- name: Upload coverage
uses: codecov/codecov-action@v3
with:
files: ./coverage/coverage-final.json
flags: unit-${{ matrix.os }}-node${{ matrix.node-version }}
integration-tests:
name: Integration Tests
runs-on: ubuntu-latest
strategy:
matrix:
shard: [1, 2, 3, 4]
services:
postgres:
image: postgres:16
env:
POSTGRES_PASSWORD: postgres
POSTGRES_DB: test
options: >-
--health-cmd pg_isready
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 5432:5432
redis:
image: redis:7
options: >-
--health-cmd "redis-cli ping"
--health-interval 10s
--health-timeout 5s
--health-retries 5
ports:
- 6379:6379
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 20
cache: "npm"
- name: Install dependencies
run: npm ci
- name: Run integration tests (shard ${{ matrix.shard }}/4)
run: npm run test:integration -- --shard=${{ matrix.shard }}/4
env:
DATABASE_URL: postgresql://postgres:postgres@localhost:5432/test
REDIS_URL: redis://localhost:6379
- name: Upload test results
if: always()
uses: actions/upload-artifact@v4
with:
name: integration-test-results-${{ matrix.shard }}
path: test-results/
e2e-tests:
name: E2E Tests
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 20
cache: "npm"
- name: Install dependencies
run: npm ci
- name: Install Playwright
run: npx playwright install --with-deps
- name: Build application
run: npm run build
- name: Run E2E tests
run: npm run test:e2e
- name: Upload Playwright report
if: always()
uses: actions/upload-artifact@v4
with:
name: playwright-report
path: playwright-report/
test-report:
name: Generate Test Report
runs-on: ubuntu-latest
needs: [unit-tests, integration-tests, e2e-tests]
if: always()
steps:
- uses: actions/checkout@v4
- name: Download all test results
uses: actions/download-artifact@v4
with:
path: test-results/
- name: Generate combined report
run: |
npm install -g junit-viewer
junit-viewer --results=test-results/ --save=test-report.html
- name: Upload combined report
uses: actions/upload-artifact@v4
with:
name: combined-test-report
path: test-report.html
Example 6: ML Pipeline (Model Training & Deployment)
name: ML Pipeline
on:
push:
branches: [main]
paths:
- "models/**"
- "training/**"
- "data/**"
workflow_dispatch:
inputs:
model-version:
description: "Model version to train"
required: true
type: string
jobs:
data-validation:
name: Validate Training Data
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: "pip"
- name: Install dependencies
run: |
pip install pandas great-expectations dvc
- name: Pull data with DVC
run: |
dvc remote modify origin --local auth basic
dvc remote modify origin --local user ${{ secrets.DVC_USER }}
dvc remote modify origin --local password ${{ secrets.DVC_PASSWORD }}
dvc pull
- name: Validate data schema
run: python scripts/validate_data.py
- name: Run Great Expectations
run: great_expectations checkpoint run training_data_checkpoint
train-model:
name: Train ML Model
runs-on: ubuntu-latest
needs: data-validation
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: "pip"
- name: Install dependencies
run: |
pip install -r requirements.txt
pip install mlflow wandb
- name: Configure MLflow
run: |
echo "MLFLOW_TRACKING_URI=${{ secrets.MLFLOW_TRACKING_URI }}" >> $GITHUB_ENV
echo "MLFLOW_TRACKING_USERNAME=${{ secrets.MLFLOW_USERNAME }}" >> $GITHUB_ENV
echo "MLFLOW_TRACKING_PASSWORD=${{ secrets.MLFLOW_PASSWORD }}" >> $GITHUB_ENV
- name: Train model
run: |
python training/train.py \
--experiment-name "prod-training" \
--model-version ${{ inputs.model-version || github.sha }} \
--config training/config.yaml
env:
WANDB_API_KEY: ${{ secrets.WANDB_API_KEY }}
- name: Upload model artifact
uses: actions/upload-artifact@v4
with:
name: trained-model
path: models/output/
evaluate-model:
name: Evaluate Model Performance
runs-on: ubuntu-latest
needs: train-model
steps:
- uses: actions/checkout@v4
- name: Setup Python
uses: actions/setup-python@v5
with:
python-version: "3.11"
cache: "pip"
- name: Install dependencies
run: pip install -r requirements.txt
- name: Download model
uses: actions/download-artifact@v4
with:
name: trained-model
path: models/output/
- name: Run model evaluation
run: python evaluation/evaluate.py --model-path models/output/
- name: Check performance thresholds
run: |
python evaluation/check_metrics.py \
--min-accuracy 0.85 \
--min-f1 0.80
- name: Generate model card
run: python scripts/generate_model_card.py
deploy-model:
name: Deploy Model to Production
runs-on: ubuntu-latest
needs: evaluate-model
if: github.ref == 'refs/heads/main'
environment: production
steps:
- uses: actions/checkout@v4
- name: Download model
uses: actions/download-artifact@v4
with:
name: trained-model
path: models/output/
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: us-east-1
- name: Upload model to S3
run: |
aws s3 cp models/output/model.pkl \
s3://my-ml-models/prod/${{ github.sha }}/model.pkl
- name: Deploy to SageMaker
run: |
python deployment/deploy_sagemaker.py \
--model-uri s3://my-ml-models/prod/${{ github.sha }}/model.pkl \
--endpoint-name prod-ml-endpoint \
--instance-type ml.m5.large
- name: Run smoke tests
run: python deployment/smoke_test.py --endpoint prod-ml-endpoint
- name: Update model registry
run: |
python scripts/register_model.py \
--version ${{ github.sha }} \
--stage production \
--metadata models/output/metadata.json
Example 7: Monorepo CI with Path Filtering
name: Monorepo CI
on:
push:
branches: [main, develop]
pull_request:
branches: [main]
jobs:
detect-changes:
name: Detect Changed Services
runs-on: ubuntu-latest
outputs:
api: ${{ steps.filter.outputs.api }}
web: ${{ steps.filter.outputs.web }}
worker: ${{ steps.filter.outputs.worker }}
shared: ${{ steps.filter.outputs.shared }}
steps:
- uses: actions/checkout@v4
- uses: dorny/paths-filter@v3
id: filter
with:
filters: |
api:
- 'services/api/**'
- 'packages/shared/**'
web:
- 'services/web/**'
- 'packages/shared/**'
worker:
- 'services/worker/**'
- 'packages/shared/**'
shared:
- 'packages/shared/**'
test-api:
name: Test API Service
needs: detect-changes
if: needs.detect-changes.outputs.api == 'true'
runs-on: ubuntu-latest
defaults:
run:
working-directory: services/api
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 20
cache: "npm"
cache-dependency-path: services/api/package-lock.json
- name: Install dependencies
run: npm ci
- name: Run tests
run: npm test
test-web:
name: Test Web Service
needs: detect-changes
if: needs.detect-changes.outputs.web == 'true'
runs-on: ubuntu-latest
defaults:
run:
working-directory: services/web
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 20
cache: "npm"
cache-dependency-path: services/web/package-lock.json
- name: Install dependencies
run: npm ci
- name: Run tests
run: npm test
- name: Build
run: npm run build
test-worker:
name: Test Worker Service
needs: detect-changes
if: needs.detect-changes.outputs.worker == 'true'
runs-on: ubuntu-latest
defaults:
run:
working-directory: services/worker
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 20
cache: "npm"
cache-dependency-path: services/worker/package-lock.json
- name: Install dependencies
run: npm ci
- name: Run tests
run: npm test
build-and-deploy:
name: Build and Deploy Changed Services
needs: [detect-changes, test-api, test-web, test-worker]
if: |
always() &&
(needs.test-api.result == 'success' || needs.test-api.result == 'skipped') &&
(needs.test-web.result == 'success' || needs.test-web.result == 'skipped') &&
(needs.test-worker.result == 'success' || needs.test-worker.result == 'skipped')
runs-on: ubuntu-latest
strategy:
matrix:
service:
- name: api
changed: ${{ needs.detect-changes.outputs.api == 'true' }}
- name: web
changed: ${{ needs.detect-changes.outputs.web == 'true' }}
- name: worker
changed: ${{ needs.detect-changes.outputs.worker == 'true' }}
steps:
- uses: actions/checkout@v4
if: matrix.service.changed == 'true'
- name: Build and push ${{ matrix.service.name }}
if: matrix.service.changed == 'true'
run: |
docker build -t myapp-${{ matrix.service.name }}:${{ github.sha }} \
services/${{ matrix.service.name }}
docker push myapp-${{ matrix.service.name }}:${{ github.sha }}
Example 8: Performance Optimization Pipeline
name: Optimized CI Pipeline
on: [push, pull_request]
jobs:
# Fast feedback jobs run first
quick-checks:
name: Quick Checks (< 2min)
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 20
cache: "npm"
- name: Cache node_modules
uses: actions/cache@v4
with:
path: node_modules
key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
restore-keys: |
${{ runner.os }}-node-
- name: Install dependencies
run: npm ci --prefer-offline --no-audit
      - name: Parallel lint and type check
        run: |
          npm run lint &
          LINT_PID=$!
          npm run type-check &
          TYPECHECK_PID=$!
          # plain "wait" always returns 0 in bash, which would swallow
          # lint/type-check failures; wait on each PID so they propagate
          wait "$LINT_PID"
          wait "$TYPECHECK_PID"
unit-tests-fast:
name: Unit Tests (Changed Files Only)
runs-on: ubuntu-latest
if: github.event_name == 'pull_request'
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0 # Need full history for changed files
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 20
cache: "npm"
- name: Install dependencies
run: npm ci --prefer-offline
- name: Get changed files
id: changed-files
run: |
echo "files=$(git diff --name-only origin/main...HEAD | \
grep -E '\.(ts|tsx|js|jsx)$' | \
xargs -I {} echo '--findRelatedTests {}' | \
tr '\n' ' ')" >> $GITHUB_OUTPUT
- name: Run tests for changed files only
if: steps.changed-files.outputs.files != ''
run: npm test -- ${{ steps.changed-files.outputs.files }}
build-with-cache:
name: Build with Aggressive Caching
runs-on: ubuntu-latest
needs: quick-checks
steps:
- uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 20
cache: "npm"
- name: Cache build output
uses: actions/cache@v4
with:
path: |
.next/cache
dist/
build/
key: ${{ runner.os }}-build-${{ hashFiles('**/*.ts', '**/*.tsx', '**/*.js') }}
restore-keys: |
${{ runner.os }}-build-
- name: Install dependencies
run: npm ci --prefer-offline
- name: Build
run: npm run build
- name: Upload build artifacts
uses: actions/upload-artifact@v4
with:
name: build-output
path: dist/
retention-days: 7
docker-build-optimized:
name: Docker Build with Layer Caching
runs-on: ubuntu-latest
needs: quick-checks
steps:
- uses: actions/checkout@v4
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build with cache
uses: docker/build-push-action@v5
with:
context: .
push: false
tags: myapp:${{ github.sha }}
cache-from: type=gha
cache-to: type=gha,mode=max
build-args: |
BUILDKIT_INLINE_CACHE=1
Pipeline Optimization Patterns
Caching Strategy
- Dependency Caching: Cache node_modules/, vendor/, .m2/, etc.
- Build Output Caching: Cache compiled artifacts between runs
- Docker Layer Caching: Use BuildKit cache mounts and GitHub Actions cache
- Incremental Builds: Only rebuild changed modules (Nx, Turborepo)
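An incremental-build step with Turborepo can be sketched as follows, assuming a turbo.json pipeline already exists in the repo:

```yaml
# Runs tests only for packages changed since origin/main, plus their
# dependents (the "..." prefix); unchanged packages hit the turbo cache.
- name: Test changed packages only
  run: npx turbo run test --filter="...[origin/main]"
```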
Parallelization Strategies
- Job-Level Parallelization: Run independent jobs concurrently
- Test Sharding: Split test suite across multiple runners
- Matrix Builds: Test multiple versions/platforms simultaneously
- Monorepo Path Filtering: Only test changed services
Conditional Execution
- Path Filters: Skip jobs when irrelevant files change
- Changed Files Detection: Test only affected code
- Branch-Specific Jobs: Different pipelines for different branches
- Manual Triggers: Allow on-demand pipeline execution
ML-Specific Patterns
Model Training Pipeline
- Data Validation: Validate schema and quality before training
- Experiment Tracking: Log metrics to MLflow/W&B
- Model Versioning: Tag models with git SHA or semantic version
- Performance Gates: Enforce minimum accuracy/F1 thresholds
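A performance-gate step can be sketched as below, assuming the evaluation stage wrote accuracy and f1 into a metrics.json (the file name and keys are assumptions):

```yaml
# Quality gate: fail the job when either metric is below its threshold.
# awk exits non-zero when the condition is false, which fails the step.
- name: Enforce model quality gate
  run: |
    ACC=$(jq -r '.accuracy' models/output/metrics.json)
    F1=$(jq -r '.f1' models/output/metrics.json)
    awk -v a="$ACC" -v f="$F1" 'BEGIN { exit !(a >= 0.85 && f >= 0.80) }'
```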
Model Deployment
- A/B Testing: Deploy new model alongside existing
- Shadow Mode: Run new model without affecting production
- Canary Rollout: Gradually increase traffic to new model
- Automated Rollback: Revert on performance degradation
Troubleshooting Guide
Common Issues
- Flaky Tests: Implement retry logic, increase timeouts, fix race conditions
- Slow Pipelines: Profile execution times, add caching, parallelize
- Secrets Exposure: Use secret scanning, audit logs, rotate credentials
- Resource Exhaustion: Set resource limits, use cleanup actions
- Network Timeouts: Add retries, use artifact caching, increase timeouts
Debugging Commands
# GitHub Actions local testing
act -j test --secret-file .env.secrets
# GitLab CI local testing
gitlab-runner exec docker test
# Jenkins pipeline validation
java -jar jenkins-cli.jar declarative-linter < Jenkinsfile
# Docker build debugging
DOCKER_BUILDKIT=1 docker build --progress=plain .
# Test pipeline with dry-run
kubectl apply --dry-run=client -f k8s/
# Validate workflow syntax
actionlint .github/workflows/*.yml