kubernetes-health-check

📁 asadullah48/hackathon-superpowers 📅 12 days ago
1
总安装量
1
周安装量
#51600
全站排名
安装命令
npx skills add https://github.com/asadullah48/hackathon-superpowers --skill kubernetes-health-check

Agent 安装分布

openclaw 1
opencode 1
claude-code 1

Skill 文档

Kubernetes Health Check

Purpose

Perform comprehensive health checks on Kubernetes clusters to ensure readiness for application deployments. Validates cluster connectivity, resource availability, and essential components.

Prerequisites

  • kubectl installed and configured
  • Access to Kubernetes cluster (Minikube, cloud cluster, etc.)
  • Sufficient permissions to query cluster resources

Instructions

Step 1: Verify Cluster Connectivity

Check if kubectl can connect to the cluster:

# Check cluster info
kubectl cluster-info

# Verify current context
kubectl config current-context

# List all nodes
kubectl get nodes

Expected Results:

  • Cluster-info shows running control plane
  • At least one node in Ready state
  • No connection errors

Step 2: Check Node Resources

Verify sufficient resources are available:

# Get detailed node information
kubectl describe nodes

# Check resource usage (requires metrics-server)
kubectl top nodes

Minimum Requirements:

  • CPU: >2 cores available
  • Memory: >4GB available
  • Disk: >20GB available
  • Status: All nodes Ready

Step 3: Verify System Components

Ensure essential Kubernetes components are running:

# Check system pods
kubectl get pods -n kube-system

# Check critical components
kubectl get pods -n kube-system | grep -E 'coredns|etcd|kube-apiserver|kube-controller|kube-scheduler'

All pods should be:

  • Status: Running
  • Restarts: <3 (some restarts normal during startup)
  • Ready: All containers ready (e.g., 1/1, 2/2)

Step 4: Validate Storage

Verify storage classes are available:

# List storage classes
kubectl get storageclass

# Check for default storage class
kubectl get storageclass -o json | jq -r '.items[] | select(.metadata.annotations."storageclass.kubernetes.io/is-default-class"=="true") | .metadata.name'

Requirements:

  • At least one storage class exists
  • One marked as (default)

Step 5: Test Networking

Validate cluster networking:

# Check kube-proxy
kubectl get pods -n kube-system | grep kube-proxy

# Verify DNS resolution
kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup kubernetes.default

# Check CNI plugin
kubectl get pods -n kube-system | grep -E 'calico|flannel|weave|cilium'

Networking is healthy when:

  • kube-proxy pods running
  • DNS lookups succeed
  • CNI plugin pods running

Validation

Pre-deployment checks passed when:

  • kubectl connects successfully
  • All nodes are Ready
  • System pods running (coredns, etcd, kube-apiserver)
  • Default storage class exists
  • DNS resolution works
  • Resources meet requirements (2 CPU, 4GB RAM)

Troubleshooting

Problem: Cannot connect to cluster

kubectl config get-contexts
kubectl config use-context <correct-context>

# For Minikube
minikube status
minikube start

Problem: Nodes NotReady

kubectl describe node <node-name>

# Restart Minikube with more resources
minikube stop
minikube start --cpus=4 --memory=8192

Best Practices

  1. Always Check First: Run health checks before any deployment
  2. Save Reports: Keep health reports for troubleshooting
  3. Automate: Add to CI/CD pipelines
  4. Monitor Trends: Track resource usage over time
  5. Document Baseline: Know what “healthy” looks like