gcp-gke-cost-optimization
npx skills add https://github.com/dawiddutoit/custom-claude --skill gcp-gke-cost-optimization
Agent 安装分布
Skill 文档
GKE Cost Optimization
Purpose
Reduce GKE spending while maintaining performance and reliability. This skill covers cost analysis, resource right-sizing, cluster mode comparison, and budget monitoring strategies.
When to Use
Use this skill when you need to:
- Analyze GKE spending and identify cost optimization opportunities
- Compare Autopilot vs Standard billing models for your workload
- Right-size pod resource requests and limits
- Configure Horizontal/Vertical Pod Autoscaling to reduce waste
- Set up Spot VMs for batch or non-critical workloads
- Create budget alerts and track cost per service
- Optimize resource utilization
Trigger phrases: “optimize GKE costs”, “reduce Kubernetes spending”, “right-size resources”, “Autopilot vs Standard pricing”, “GKE budget alerts”
Table of Contents
Quick Start
Analyze and optimize costs in three steps:
# 1. Check current resource usage
kubectl top pods -n wtr-supplier-charges
# 2. Analyze resource requests vs actual usage
kubectl describe deployment supplier-charges-hub -n wtr-supplier-charges | grep -A 10 "resources:"
# 3. Compare Autopilot vs Standard pricing for your workload
# Autopilot: Pay per pod per second for requests
# Standard: Pay per provisioned node per hour (regardless of usage)
Instructions
Step 1: Understand Your Billing Model
GKE Autopilot (Per-Pod Billing – Recommended)
Cost Calculation:
Monthly Cost = (vCPU requests * $0.04 + Memory requests GB * $0.004 + Disk GB * $0.0001) * seconds per month
Example for Supplier Charges Hub:
- 2 replicas, each requesting: 1 vCPU + 2 GB memory
- Monthly cost â (2 * 1 * $0.04 + 2 * 2 * $0.004) * 2.592M seconds â $260
Advantages:
- Pay only for what pods request (not what entire cluster capacity is)
- No idle resource costs
- Perfect for variable workloads (scales up/down automatically)
- Up to 60% cheaper than Standard for typical workloads
GKE Standard (Node Billing)
Cost Calculation:
Monthly Cost = Number of nodes * Machine type hourly rate * 730 hours per month
Example for Supplier Charges Hub:
- 3
n2-standard-4nodes (standard pool) = ~$400/month - Even if pods use only 30% of capacity, you pay for 100%
Advantages:
- Predictable costs (great for committed use discounts)
- Full control over infrastructure
- Better for stable, high-utilization workloads
Step 2: Analyze Current Resource Usage
Check if pods are over-provisioned:
# View actual vs requested resources
kubectl top pods -n wtr-supplier-charges -o wide
# Compare to requests
kubectl get pods -n wtr-supplier-charges -o jsonpath='{.items[*].spec.containers[*].resources.requests}'
Analysis Questions:
- Are actual values significantly lower than requests?
- Is memory usage consistently below 75% of limits?
- Is CPU usage consistently below 70% of requests?
If Yes â Right-size resources (reduce requests)
Step 3: Right-Size Pod Resources
Use Vertical Pod Autoscaler (VPA) recommendations:
# Apply VPA to deployment
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: supplier-charges-hub-vpa
namespace: wtr-supplier-charges
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment
name: supplier-charges-hub
updatePolicy:
updateMode: "Off" # Only provide recommendations, don't auto-update
resourcePolicy:
containerPolicies:
- containerName: supplier-charges-hub-container
minAllowed:
cpu: 500m
memory: 1Gi
maxAllowed:
cpu: 2
memory: 4Gi
EOF
# Wait 1 week for data collection, then view recommendations
kubectl describe vpa supplier-charges-hub-vpa -n wtr-supplier-charges | grep -A 20 "Recommendation"
Recommended VPA Values:
resources:
requests:
cpu: 500m # Reduced from 1000m if usage averages 300m
memory: 1.5Gi # Reduced from 2Gi if usage averages 1Gi
limits:
cpu: 500m # Match requests for Guaranteed QoS
memory: 1.5Gi
Step 4: Configure Horizontal Pod Autoscaling (HPA)
Let Kubernetes scale replicas based on demand:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: supplier-charges-hub-hpa
namespace: wtr-supplier-charges
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: supplier-charges-hub
minReplicas: 1 # Scale down to 1 during low traffic
maxReplicas: 5 # Scale up to 5 during peak
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
Cost Savings: Reduces off-peak replicas from 2 to 1 = ~30% savings if traffic varies.
Step 5: Use Spot VMs (GKE Standard Only)
For non-critical or fault-tolerant workloads, use Spot VMs (91% discount):
# Create Spot node pool
gcloud container node-pools create spot-pool \
--cluster=shared-gke-standard-01-euw2 \
--region=europe-west2 \
--spot \
--machine-type=n2-standard-4 \
--num-nodes=2 \
--enable-autoscaling \
--min-nodes=0 \
--max-nodes=10
# Add toleration to workloads that can run on Spot
tolerations:
- key: cloud.google.com/gke-spot
operator: Equal
value: "true"
effect: NoSchedule
Use Cases: Batch processing, CI/CD build workers, non-critical background jobs
Not for: Production APIs (Supplier Charges Hub should NOT use Spot)
Step 6: Set Up Cost Monitoring
Create budget alerts:
# Create budget with 50%, 90%, 100% thresholds
gcloud billing budgets create \
--billing-account=BILLING_ACCOUNT_ID \
--display-name="GKE Labs Environment" \
--budget-amount=500USD \
--threshold-rule=percent=50 \
--threshold-rule=percent=90 \
--threshold-rule=percent=100 \
--notification-rule-name=email-alert
Step 7: Analyze Cost by Namespace/Service
Export billing data for detailed analysis:
# Tag namespace for cost tracking
kubectl label namespace wtr-supplier-charges \
cost-center=supplier-charges \
environment=labs
# Later, filter GCP Billing reports by labels
# In Cloud Console: Billing â Reports â Filter by labels
Examples
Example 1: Optimize Supplier Charges Hub Deployment
#!/bin/bash
# Step-by-step optimization
DEPLOYMENT="supplier-charges-hub"
NAMESPACE="wtr-supplier-charges"
echo "=== GKE Cost Optimization ==="
echo ""
echo "1. Current Resource Usage:"
kubectl top pods -l app=$DEPLOYMENT -n $NAMESPACE
echo ""
echo "2. Current Resource Requests:"
kubectl get deployment $DEPLOYMENT -n $NAMESPACE \
-o jsonpath='{.spec.template.spec.containers[0].resources}'
echo ""
echo "3. Applying VPA for recommendations:"
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: ${DEPLOYMENT}-vpa
namespace: $NAMESPACE
spec:
targetRef:
apiVersion: "apps/v1"
kind: Deployment
name: $DEPLOYMENT
updatePolicy:
updateMode: "Off"
EOF
echo "VPA created. Wait 1 week for recommendations."
echo ""
echo "4. Applying HPA for automatic scaling:"
cat <<EOF | kubectl apply -f -
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: ${DEPLOYMENT}-hpa
namespace: $NAMESPACE
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: $DEPLOYMENT
minReplicas: 1
maxReplicas: 5
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
EOF
echo "HPA created. Replicas will auto-scale based on demand."
echo ""
echo "5. Estimated Savings:"
echo " - Autopilot: ~$260/month (current)"
echo " - With HPA (avg 1.5 replicas): ~$195/month"
echo " - Savings: ~$65/month (25%)"
Example 2: Autopilot vs Standard Cost Comparison
#!/bin/bash
# Compare pricing for your workload
# Supplier Charges Hub specifications
REPLICAS=2
CPU_PER_POD="1000m"
MEMORY_PER_POD="2Gi"
DISK_PER_POD="10Gi"
# Autopilot pricing (europe-west2)
CPU_PRICE=0.04
MEMORY_PRICE=0.004
DISK_PRICE=0.0001
SECONDS_PER_MONTH=2592000
TOTAL_CPU=$(echo "$REPLICAS * 1" | bc)
TOTAL_MEMORY=$(echo "$REPLICAS * 2" | bc)
TOTAL_DISK=$(echo "$REPLICAS * 10" | bc)
AUTOPILOT_COST=$(echo "($TOTAL_CPU * $CPU_PRICE + $TOTAL_MEMORY * $MEMORY_PRICE + $TOTAL_DISK * $DISK_PRICE) * $SECONDS_PER_MONTH / 3600 / 24 / 30" | bc)
echo "=== Cost Comparison: Autopilot vs Standard ==="
echo ""
echo "Workload: $REPLICAS replicas, ${CPU_PER_POD}m CPU, ${MEMORY_PER_POD} memory each"
echo ""
echo "AUTOPILOT (Per-Pod Billing):"
echo " Monthly cost: ~\$${AUTOPILOT_COST}"
echo ""
echo "STANDARD (Node-Based Billing):"
echo " 3x n2-standard-4 nodes @ \$400/month"
echo " 3x cloud-sql-proxy sidecars @ \$50/month"
echo " Total: ~\$450/month"
echo ""
echo "SAVINGS with Autopilot: ~\$190/month (42%)"
Example 3: Set Up Cost Tracking and Alerts
#!/bin/bash
# Set up cost monitoring
BILLING_ACCOUNT="000000-A1B2C3-D4E5F6" # Replace with your billing account
NAMESPACE="wtr-supplier-charges"
PROJECT_ID="ecp-wtr-supplier-charges-labs"
echo "=== Setting Up Cost Monitoring ==="
echo ""
echo "1. Tag resources for cost tracking"
kubectl label namespace $NAMESPACE \
cost-center=supplier-charges \
environment=labs \
--overwrite
echo ""
echo "2. Create billing budget"
gcloud billing budgets create \
--billing-account=$BILLING_ACCOUNT \
--display-name="GKE Labs - Supplier Charges Hub" \
--budget-amount=500USD \
--threshold-rule=percent=50 \
--threshold-rule=percent=90 \
--threshold-rule=percent=100 \
--notification-rule-name="Budget Alert"
echo ""
echo "3. Enable Cost Analysis"
echo " Go to: https://console.cloud.google.com/billing/reports"
echo " Filter by: labels.cost-center = supplier-charges"
echo ""
echo "4. Track GKE Costs"
echo " Go to: https://console.cloud.google.com/monitoring/dashboards"
echo " Create dashboard filtering: resource.type=k8s_cluster AND resource.labels.cluster_name"
Requirements
- GKE cluster running (Autopilot or Standard)
kubectlaccess to view pod metrics and deploymentsgcloudCLI access for billing and monitoring setup- Metrics server enabled (usually on by default)
- For billing: access to GCP billing account
See Also
- gcp-gke-cluster-setup – Understand cluster modes and their costs
- gcp-gke-deployment-strategies – Optimize deployment replicas
- gcp-gke-monitoring-observability – Monitor resource utilization