devops-automation
Install command
npx skills add https://github.com/rohitg00/awesome-claude-code-toolkit --skill devops-automation
Skill Documentation
DevOps Automation
GitHub Actions Workflow Structure
name: CI/CD
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: 'npm'
      - run: npm ci
      - run: npm run lint
  test:
    runs-on: ubuntu-latest
    needs: lint
    strategy:
      matrix:
        node-version: [20, 22]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node-version }}
          cache: 'npm'
      - run: npm ci
      - run: npm test -- --coverage
      - uses: actions/upload-artifact@v4
        with:
          name: coverage-${{ matrix.node-version }}
          path: coverage/
  deploy:
    runs-on: ubuntu-latest
    needs: test
    if: github.ref == 'refs/heads/main'
    environment: production
    steps:
      - uses: actions/checkout@v4
      - run: ./deploy.sh
Key patterns:
- Use `concurrency` to cancel outdated runs
- Cache dependencies with the setup action's `cache` option
- Use `needs` for job dependencies
- Gate deploys with `environment` protection rules
- Use a matrix for cross-version testing
Docker Multi-Stage Builds
# Stage 1: production dependencies only
FROM node:22-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Stage 2: full install and build
FROM node:22-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 3: minimal runtime image, running as a non-root user
FROM node:22-alpine AS runner
WORKDIR /app
RUN addgroup -g 1001 appgroup && adduser -u 1001 -G appgroup -S appuser
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./
USER appuser
EXPOSE 3000
HEALTHCHECK CMD wget -qO- http://localhost:3000/health || exit 1
CMD ["node", "dist/server.js"]
Rules:
- Use specific image tags, never `latest`
- Run as non-root user
- Copy only necessary files into final stage
- Add `HEALTHCHECK` for orchestrator integration
- Use `.dockerignore` to exclude `node_modules`, `.git`, tests
Kubernetes Deployment Manifest
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api-server
  labels:
    app: api-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api-server
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: api-server
    spec:
      containers:
        - name: api
          image: registry.example.com/api:v1.2.3
          ports:
            - containerPort: 3000
          resources:
            requests:
              cpu: 100m
              memory: 128Mi
            limits:
              cpu: 500m
              memory: 512Mi
          readinessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:
            httpGet:
              path: /health
              port: 3000
            initialDelaySeconds: 15
            periodSeconds: 20
          env:
            - name: DATABASE_URL
              valueFrom:
                secretKeyRef:
                  name: api-secrets
                  key: database-url
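Always set resource requests and limits. Always define readiness and liveness probes. Set `maxUnavailable: 0` (with `maxSurge` of at least 1) so a rolling update never drops below the desired replica count, which is what makes the deploy zero-downtime.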
Helm Chart Structure
chart/
  Chart.yaml
  values.yaml
  values-staging.yaml
  values-production.yaml
  templates/
    deployment.yaml
    service.yaml
    ingress.yaml
    hpa.yaml
    _helpers.tpl
# values.yaml
replicaCount: 2
image:
  repository: registry.example.com/api
  tag: v1.2.3  # pin a specific tag rather than latest
  pullPolicy: IfNotPresent
resources:
  requests:
    cpu: 100m
    memory: 128Mi
  limits:
    cpu: 500m
    memory: 512Mi
ingress:
  enabled: true
  host: api.example.com
autoscaling:
  enabled: true
  minReplicas: 2
  maxReplicas: 10
  targetCPUUtilization: 70
Use `values-{env}.yaml` overrides per environment. Lint charts with `helm lint`, and render them with `helm template` to inspect the generated manifests before deploying.
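As a rough illustration of a per-environment override, the sketch below shows what `values-production.yaml` might contain. The specific numbers are hypothetical and not taken from the chart; the point is that an override file lists only the keys that differ from `values.yaml`.

```yaml
# values-production.yaml (hypothetical contents, for illustration only)
# Only keys that differ from values.yaml are listed; everything else is inherited.
replicaCount: 3

resources:
  requests:
    cpu: 250m        # assumed production sizing
    memory: 256Mi

autoscaling:
  minReplicas: 3     # keep at least replicaCount pods under autoscaling
  maxReplicas: 20
```

Assuming a release named `api`, the chart could then be checked with `helm lint chart/ -f chart/values-production.yaml` and rendered with `helm template api chart/ -f chart/values.yaml -f chart/values-production.yaml`. When several `-f` files are given, the right-most file takes precedence.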
ArgoCD GitOps Pattern
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: api-server
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/k8s-manifests
    targetRevision: main
    path: apps/api-server
  destination:
    server: https://kubernetes.default.svc
    namespace: production
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    syncOptions:
      - CreateNamespace=true
GitOps principles:
- Git is the single source of truth for cluster state
- All changes go through PRs (no `kubectl apply` in production)
- ArgoCD auto-syncs from Git to cluster
- Enable `selfHeal` to revert manual cluster changes
- Separate app code repos from deployment manifest repos
Monitoring Stack
# Prometheus ServiceMonitor
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: api-server
spec:
  selector:
    matchLabels:
      app: api-server
  endpoints:
    - port: metrics
      interval: 15s
      path: /metrics
Key metrics to expose:
- `http_request_duration_seconds` (histogram) – request latency by route and status
- `http_requests_total` (counter) – request count by route and status
- `process_resident_memory_bytes` (gauge) – memory usage
- `db_query_duration_seconds` (histogram) – database query latency
Alert on: error rate >1%, P99 latency >2s, memory >80% of limit, pod restarts >3 in 10 minutes.
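The thresholds above could be expressed as Prometheus Operator alert rules along these lines. This is a sketch under several assumptions: "error rate" is taken to mean the share of 5xx responses, the `status` label holds the HTTP status code, the memory rule hard-codes the 512Mi limit from the Deployment manifest, and the restart rule relies on kube-state-metrics being installed. The rule names, `for` durations, and severity labels are illustrative.

```yaml
# Sketch only: rule names, "for" windows, and severities are assumptions.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: api-server-alerts
spec:
  groups:
    - name: api-server
      rules:
        # Error rate > 1% (assumes a `status` label carrying the HTTP code)
        - alert: HighErrorRate
          expr: |
            sum(rate(http_requests_total{status=~"5.."}[5m]))
              / sum(rate(http_requests_total[5m])) > 0.01
          for: 5m
          labels:
            severity: critical
        # P99 latency > 2s, computed from the histogram buckets
        - alert: HighP99Latency
          expr: |
            histogram_quantile(0.99,
              sum by (le) (rate(http_request_duration_seconds_bucket[5m]))) > 2
          for: 5m
          labels:
            severity: warning
        # Memory > 80% of the 512Mi container limit (limit hard-coded here)
        - alert: MemoryNearLimit
          expr: process_resident_memory_bytes > 0.8 * 512 * 1024 * 1024
          for: 10m
          labels:
            severity: warning
        # More than 3 restarts in 10 minutes (metric from kube-state-metrics)
        - alert: PodRestarting
          expr: increase(kube_pod_container_status_restarts_total{container="api"}[10m]) > 3
          labels:
            severity: critical
```

Hard-coding the memory limit keeps the rule simple but means it must be updated whenever the limit changes; dividing by a limit metric from kube-state-metrics is a common alternative.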
Pipeline Best Practices
- Keep CI under 10 minutes (parallelize jobs, cache aggressively)
- Run linting and type checking before tests
- Use ephemeral environments for PR previews
- Pin all action versions to SHA, not tags
- Store secrets in GitHub Secrets, never in workflow files
- Use OIDC for cloud provider authentication (no long-lived keys)
- Tag images with git SHA, not `latest`
- Run security scans (Trivy, Snyk) on container images in CI
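Several of these practices can be combined in one job. The sketch below assumes AWS as the cloud provider and uses a hypothetical role ARN, region, and workflow name; none of these come from the original skill. Action versions are shown as tags or branches for readability and would be replaced with full commit SHAs when pinning.

```yaml
# Sketch only: the role ARN, region, and workflow name are hypothetical placeholders.
name: build-and-scan
on:
  push:
    branches: [main]

permissions:
  id-token: write   # allows the job to request an OIDC token from GitHub
  contents: read    # needed by actions/checkout

jobs:
  build:
    runs-on: ubuntu-latest
    env:
      # Tag the image with the git SHA instead of latest
      IMAGE: registry.example.com/api:${{ github.sha }}
    steps:
      - uses: actions/checkout@v4

      # Exchange the OIDC token for short-lived credentials (no stored access keys)
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy  # hypothetical
          aws-region: us-east-1                                     # assumed

      - run: docker build -t "$IMAGE" .

      # Fail the job when Trivy reports high or critical vulnerabilities
      - uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ env.IMAGE }}
          severity: HIGH,CRITICAL
          exit-code: '1'
```

Scanning before any push step means a vulnerable image never reaches the registry; the push and deploy steps are omitted here.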