cluster-api
npx skills add https://github.com/sigridjineth/kubespray-skills --skill cluster-api
Cluster API (CAPI)
Overview
Cluster API (CAPI) manages Kubernetes clusters as Kubernetes resources. Instead of SSH-ing into nodes and running imperative commands, you define clusters declaratively using Custom Resources, and CAPI controllers reconcile the desired state into reality.
Core principle: CAPI turns cluster lifecycle management into a Kubernetes-native workflow — define a Cluster CR, apply it, and controllers handle provisioning, scaling, and upgrades automatically.
When to Use
- Provisioning Kubernetes clusters declaratively from a management cluster
- Managing fleets of clusters at scale (tens to hundreds)
- Performing rolling upgrades by patching a version field
- Using ClusterClass blueprints for standardized cluster templates
- Setting up dev/test environments with the Docker infrastructure provider
Not for:
- Single-cluster installation with security hardening (use rke2-deployment or kubespray-deployment)
- Air-gapped environments without CAPI controller images
- Bare-metal provisioning without an infrastructure provider
Key Concepts
| Concept | Description |
|---|---|
| Management Cluster | Runs CAPI controllers, stores cluster definitions — the “brain” |
| Workload Cluster | Provisioned BY CAPI, runs actual applications |
| Cluster CR | Defines desired cluster state (version, topology, networking) |
| Machine | Represents a single node (VM or container) |
| MachineDeployment | Manages a set of Machines (like Deployment manages Pods) |
| MachineSet | Intermediate resource between MachineDeployment and Machine |
| ClusterClass | Reusable blueprint for creating clusters with consistent topology |
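The Deployment analogy in the table can be made concrete. Below is a minimal MachineDeployment sketch; the names and template references are illustrative assumptions, not taken from a generated manifest:

```yaml
# Hypothetical MachineDeployment: scales worker Machines the way a
# Deployment scales Pods (via an intermediate MachineSet).
# Selector/labels are trimmed here; CAPI webhooks default them in practice.
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: md-0                        # illustrative name
spec:
  clusterName: capi-quickstart
  replicas: 3                       # desired number of worker nodes
  template:
    spec:
      clusterName: capi-quickstart
      version: v1.34.3
      bootstrap:
        configRef:                  # Bootstrap provider renders the kubeadm join config
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: capi-quickstart-md-0
      infrastructureRef:            # Infrastructure provider creates the actual node
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: capi-quickstart-md-0
```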
CAPI vs Kubespray/RKE2
| Aspect | Kubespray / RKE2 | Cluster API |
|---|---|---|
| Model | Imperative (Ansible playbooks / CLI) | Declarative (Kubernetes CRs) |
| Scope | Single cluster installation | Fleet management of many clusters |
| Upgrades | Manual per-node playbook runs | Patch version field, controllers handle it |
| Scale | One cluster at a time | Manage hundreds of clusters from one place |
| Security focus | RKE2: FIPS, SELinux, CIS hardened | Depends on infrastructure provider |
| Air-gap support | Strong (Kubespray offline, RKE2 embedded) | Requires CAPI controller images available |
Provider Architecture
CAPI uses a modular provider architecture. Four provider types work together to provision and manage clusters.
Management Cluster
+------------------------------------------------------------------+
| |
| +-------------------+ +--------------------------------+ |
| | Core Provider | | Bootstrap Provider | |
| | (capi-system) | | (capi-kubeadm-bootstrap-system)| |
| | | | | |
| | Manages: | | Generates: | |
| | - Cluster | | - cloud-init / user-data | |
| | - Machine | | - kubeadm init config | |
| | - MachineSet | | - kubeadm join config | |
| | - MachineDeploymt| +--------------------------------+ |
| +-------------------+ |
| |
| +---------------------------+ +----------------------------+ |
| | Control Plane Provider | | Infrastructure Provider | |
| | (capi-kubeadm-control- | | (e.g., capd-system for | |
| | plane-system) | | Docker, capv for vSphere, | |
| | | | capa for AWS, etc.) | |
| | Manages: | | | |
| | - KubeadmControlPlane | | Creates: | |
| | - CP scaling/upgrades | | - Actual VMs / containers | |
| | - etcd membership | | - Load balancers | |
| +---------------------------+ | - Networks / disks | |
| +----------------------------+ |
+------------------------------------------------------------------+
Provider Types
| Provider | Namespace | Responsibility |
|---|---|---|
| Core | capi-system | Manages Cluster, Machine, MachineSet, MachineDeployment CRDs |
| Bootstrap | capi-kubeadm-bootstrap-system | Generates cloud-init/user-data for kubeadm init/join |
| Control Plane | capi-kubeadm-control-plane-system | Manages KubeadmControlPlane, handles CP scaling/upgrades/etcd |
| Infrastructure | Provider-specific (e.g., capd-system) | Creates actual VMs/containers, LBs, networks |
Infrastructure Provider Examples
| Provider | Namespace | Creates |
|---|---|---|
| Docker (CAPD) | capd-system | Docker containers simulating nodes (dev/test) |
| vSphere (CAPV) | capv-system | vSphere VMs |
| AWS (CAPA) | capa-system | EC2 instances, ELBs, VPCs |
| Azure (CAPZ) | capz-system | Azure VMs, LBs, VNets |
| Metal3 (CAPM3) | capm3-system | Bare-metal servers via Ironic |
Management Cluster vs Workload Cluster
+-------------------------------------------+
| Management Cluster (KinD) |
| |
| CAPI Controllers: |
| - Core Provider |
| - Bootstrap Provider |
| - Control Plane Provider |
| - Infrastructure Provider |
| - cert-manager |
| |
| Stores CRs: |
| - Cluster/capi-quickstart |
| - KubeadmControlPlane/... |
| - MachineDeployment/... |
| - Machine/... (one per node) |
+-------------------------------------------+
| | |
| reconcile | reconcile | reconcile
v v v
+-------------------------------------------+
| Workload Cluster (capi-quickstart) |
| |
| +----------+ +---------+ +---------+ |
| | CP-1 | | CP-2 | | CP-3 | |
| | (node) | | (node) | | (node) | |
| +----------+ +---------+ +---------+ |
| |
| +----------+ +---------+ +---------+ |
| | Worker-1 | | Worker-2| | Worker-3| |
| | (node) | | (node) | | (node) | |
| +----------+ +---------+ +---------+ |
| |
| HAProxy LB (for API server access) |
| |
| Runs actual application workloads |
+-------------------------------------------+
Key distinction:
- Management cluster is the “brain” — it runs CAPI controllers and stores cluster definitions
- Workload clusters are the “muscle” — they run actual application workloads
- The management cluster does NOT run application pods
- One management cluster can manage many workload clusters
Hands-On Setup (Docker Provider)
The Docker provider (CAPD) simulates nodes as Docker containers. Ideal for learning, development, and CI/CD testing.
Step 1: Create Management Cluster with KinD
kind create cluster --name myk8s --image kindest/node:v1.35.0 --config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraMounts:
  - hostPath: /var/run/docker.sock
    containerPath: /var/run/docker.sock
  extraPortMappings:
  - containerPort: 30000
    hostPort: 30000
  - containerPort: 30001
    hostPort: 30001
EOF
Why these settings:
- extraMounts for the Docker socket — CAPD needs access to the host Docker daemon to create containers that simulate workload cluster nodes
- extraPortMappings — maps NodePort ranges so you can access workload cluster services from your host machine
Step 2: Install clusterctl
# macOS
brew install clusterctl
# Linux (amd64)
curl -L https://github.com/kubernetes-sigs/cluster-api/releases/latest/download/clusterctl-linux-amd64 -o clusterctl
chmod +x clusterctl
sudo mv clusterctl /usr/local/bin/
Verify installation:
clusterctl version
Step 3: Initialize Management Cluster
# Enable ClusterClass/Topology support
export CLUSTER_TOPOLOGY=true
# Initialize with Docker infrastructure provider
clusterctl init --infrastructure docker
What this installs:
- cert-manager — manages TLS certificates for CAPI webhooks
- Core provider (capi-system) — Cluster, Machine, MachineSet, MachineDeployment CRDs
- Bootstrap provider (capi-kubeadm-bootstrap-system) — generates kubeadm configs
- Control Plane provider (capi-kubeadm-control-plane-system) — manages KubeadmControlPlane
- Infrastructure provider (capd-system) — Docker-specific controllers
Step 4: Verify Initialization
# Check all providers are installed and ready
kubectl get providers.clusterctl.cluster.x-k8s.io -A
Expected output:
NAMESPACE NAME TYPE VERSION WATCH NAMESPACE
capd-system infrastructure-docker InfrastructureProvider v1.x.x
capi-kubeadm-bootstrap-system bootstrap-kubeadm BootstrapProvider v1.x.x
capi-kubeadm-control-plane-system control-plane-kubeadm ControlPlaneProvider v1.x.x
capi-system cluster-api CoreProvider v1.x.x
# Check all controller pods are running
kubectl get pods -A | grep -E 'capi|capd|cert-manager'
All pods should be in Running state with 1/1 or 2/2 ready.
Provisioning a Workload Cluster
Generate Cluster Manifest
# Set networking variables
export SERVICE_CIDR='["10.20.0.0/16"]'
export POD_CIDR='["10.10.0.0/16"]'
export SERVICE_DOMAIN="myk8s-1.local"
export POD_SECURITY_STANDARD_ENABLED="false"
# Generate cluster manifest
clusterctl generate cluster capi-quickstart --flavor development \
--kubernetes-version v1.34.3 \
--control-plane-machine-count=3 \
--worker-machine-count=3 \
> capi-quickstart.yaml
What --flavor development does: Selects a pre-built template designed for local Docker-based development. The template includes ClusterClass definitions, DockerCluster/DockerMachine templates, and kubeadm configurations.
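For orientation, the heart of the generated manifest is a Cluster object using spec.topology. Trimmed to its essentials, it looks roughly like this (the quick-start class name ships with the development flavor; exact fields may vary by CAPI version):

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: capi-quickstart
spec:
  clusterNetwork:
    services:
      cidrBlocks: ["10.20.0.0/16"]   # from SERVICE_CIDR
    pods:
      cidrBlocks: ["10.10.0.0/16"]   # from POD_CIDR
    serviceDomain: myk8s-1.local     # from SERVICE_DOMAIN
  topology:
    class: quick-start               # ClusterClass included by --flavor development
    version: v1.34.3
    controlPlane:
      replicas: 3                    # --control-plane-machine-count
    workers:
      machineDeployments:
      - class: default-worker
        name: md-0
        replicas: 3                  # --worker-machine-count
```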
Apply Cluster Manifest
kubectl apply -f capi-quickstart.yaml
Monitor Provisioning
# Watch cluster provisioning status
clusterctl describe cluster capi-quickstart
# Watch in real-time
clusterctl describe cluster capi-quickstart --show-conditions all
What happens during provisioning (Docker provider):
- CAPI creates Docker containers using the kindest/node image to simulate cluster nodes
- An HAProxy container is created as the load balancer for the workload cluster API server
- The Bootstrap provider generates kubeadm init/join configurations
- The Control Plane provider runs kubeadm init on the first CP node, then joins the remaining CP nodes
- Worker machines are bootstrapped with kubeadm join
Provisioning Flow (Docker Provider)
+--------------------------------------------------------------------+
| |
| kubectl apply -f capi-quickstart.yaml |
| | |
| v |
| Core Provider: creates Cluster object |
| | |
| v |
| Infrastructure Provider (CAPD): |
| - Creates DockerCluster (networking setup) |
| - Creates HAProxy container (LB for API server) |
| | |
| v |
| Control Plane Provider: |
| - Creates first Machine (Docker container) |
| - Bootstrap Provider generates kubeadm init config |
| - Runs kubeadm init inside container |
| - Repeats for CP nodes 2 and 3 (kubeadm join --control-plane) |
| | |
| v |
| MachineDeployment controller: |
| - Creates MachineSet -> Machines for workers |
| - Bootstrap Provider generates kubeadm join config |
| - Runs kubeadm join inside each worker container |
| | |
| v |
| Cluster status: Provisioned |
| (Nodes are NotReady until CNI is installed) |
+--------------------------------------------------------------------+
Access the Workload Cluster
# Get kubeconfig for the workload cluster
clusterctl get kubeconfig capi-quickstart > capi-quickstart.kubeconfig
Fix kubeconfig for local access (Docker provider only):
The generated kubeconfig contains the internal Docker network IP for the API server. To access from your host, you must replace it with 127.0.0.1 and the mapped port.
# Find the HAProxy container's mapped port
docker ps | grep capi-quickstart-lb
# Example output:
# ... 0.0.0.0:43210->6443/tcp ... capi-quickstart-lb
# The mapped port is 43210
# Update kubeconfig to use localhost and mapped port
sed -i 's|server: https://.*:6443|server: https://127.0.0.1:43210|' capi-quickstart.kubeconfig
# macOS:
sed -i '' 's|server: https://.*:6443|server: https://127.0.0.1:43210|' capi-quickstart.kubeconfig
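If you repeat this often, the port lookup and rewrite can be wrapped in a small helper. This is a sketch; the docker port command shown in the comment is one way to discover the mapped port:

```shell
# Sketch: rewrite a CAPD kubeconfig's server address to localhost + mapped port.
patch_capd_kubeconfig() {
  kubeconfig="$1"   # path to the kubeconfig file
  lb_port="$2"      # host port mapped to the LB container's 6443
  # Portable in-place edit: write to a temp file, then move it over
  # (avoids the GNU vs BSD `sed -i` incompatibility).
  tmp="${kubeconfig}.tmp"
  sed "s|server: https://.*:6443|server: https://127.0.0.1:${lb_port}|" \
    "$kubeconfig" > "$tmp" && mv "$tmp" "$kubeconfig"
}

# Real usage would discover the port first, e.g.:
#   lb_port=$(docker port capi-quickstart-lb 6443/tcp | head -n1 | sed 's/.*://')
#   patch_capd_kubeconfig capi-quickstart.kubeconfig "$lb_port"
```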
Install CNI (Required)
Nodes remain in NotReady state until a CNI plugin is installed. CAPI does not install CNI automatically — this is by design, giving you flexibility to choose your CNI.
# Install Calico CNI
kubectl --kubeconfig=capi-quickstart.kubeconfig \
apply -f https://raw.githubusercontent.com/projectcalico/calico/v3.27.0/manifests/calico.yaml
Verify nodes transition to Ready:
# Watch nodes go from NotReady -> Ready
kubectl --kubeconfig=capi-quickstart.kubeconfig get nodes -w
All nodes should show Ready status within 1-2 minutes after CNI installation.
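If you script this step instead of watching interactively, a generic polling helper works; this is a sketch, not part of clusterctl or kubectl:

```shell
# Sketch: retry a command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <sleep_seconds> <command...>
wait_until() {
  attempts="$1"; delay="$2"; shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if "$@"; then return 0; fi   # command succeeded
    i=$((i + 1))
    sleep "$delay"
  done
  return 1                       # gave up
}

# Real usage (assumes the kubeconfig was already patched for local access):
#   wait_until 30 10 sh -c \
#     "! kubectl --kubeconfig=capi-quickstart.kubeconfig get nodes | grep -q NotReady"
```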
Full Verification
# All nodes Ready
kubectl --kubeconfig=capi-quickstart.kubeconfig get nodes -o wide
# System pods Running
kubectl --kubeconfig=capi-quickstart.kubeconfig get pods -A
# From management cluster: cluster status
clusterctl describe cluster capi-quickstart
# Check Machine objects
kubectl get machines -o wide
# Check MachineDeployments
kubectl get machinedeployments -o wide
ClusterClass (Topology)
ClusterClass is a reusable blueprint for creating clusters. Define the cluster shape once (templates for infrastructure, control plane, workers), then reference it when creating clusters.
Enable ClusterClass
# Must be set before clusterctl init
export CLUSTER_TOPOLOGY=true
clusterctl init --infrastructure docker
ClusterClass Resource Hierarchy
ClusterClass (blueprint)
+------------------------------------------------------------+
| name: my-cluster-class |
| |
| Infrastructure: |
| ref: DockerClusterTemplate/my-cluster-class |
| |
| Control Plane: |
| ref: KubeadmControlPlaneTemplate/my-cluster-class |
| Infrastructure: |
| ref: DockerMachineTemplate/my-cluster-class-cp |
| |
| Workers: |
| MachineDeployments: |
| - class: default-worker |
| template: |
| bootstrap: |
| ref: KubeadmConfigTemplate/my-cluster-class |
| infrastructure: |
| ref: DockerMachineTemplate/my-cluster-class-wkr |
+------------------------------------------------------------+
Cluster (instance using ClusterClass)
+------------------------------------------------------------+
| name: capi-quickstart |
| spec: |
| topology: |
| class: my-cluster-class <-- references ClusterClass|
| version: v1.34.3 |
| controlPlane: |
| replicas: 3 |
| workers: |
| machineDeployments: |
| - class: default-worker |
| name: md-0 |
| replicas: 3 |
+------------------------------------------------------------+
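The blueprint diagram maps onto YAML roughly as follows. This is a sketch of a ClusterClass with names following the diagram; it is not a complete, applyable manifest:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: my-cluster-class
spec:
  infrastructure:
    ref:                               # cluster-level infra (network, LB)
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: DockerClusterTemplate
      name: my-cluster-class
  controlPlane:
    ref:                               # control plane behavior
      apiVersion: controlplane.cluster.x-k8s.io/v1beta1
      kind: KubeadmControlPlaneTemplate
      name: my-cluster-class
    machineInfrastructure:
      ref:                             # machine specs for CP nodes
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: DockerMachineTemplate
        name: my-cluster-class-cp
  workers:
    machineDeployments:
    - class: default-worker            # referenced from Cluster.spec.topology
      template:
        bootstrap:
          ref:
            apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
            kind: KubeadmConfigTemplate
            name: my-cluster-class
        infrastructure:
          ref:
            apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
            kind: DockerMachineTemplate
            name: my-cluster-class-wkr
```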
Resource Types in ClusterClass
| Resource | Purpose |
|---|---|
| ClusterClass | Top-level blueprint tying all templates together |
| DockerClusterTemplate | Template for infrastructure-level cluster config (network, LB) |
| KubeadmControlPlaneTemplate | Template for control plane behavior (kubeadm config, rollout) |
| DockerMachineTemplate | Template for individual machine specs (CPU, memory, image) |
| KubeadmConfigTemplate | Template for worker node bootstrap configuration |
| Cluster | Instance that references a ClusterClass via spec.topology.class |
Benefits of ClusterClass
- Consistency: All clusters from the same class share identical infrastructure and bootstrap configuration
- Simplicity: Creating a new cluster requires only a small Cluster CR referencing the class
- Upgrades: Update the ClusterClass and all clusters using it can be rolled forward
- Governance: Platform teams define ClusterClass, application teams consume them
Declarative Cluster Upgrade
One of CAPI’s most powerful features: upgrade an entire cluster by patching a single field.
Upgrade from v1.34 to v1.35
kubectl patch cluster capi-quickstart --type merge \
-p '{"spec":{"topology":{"version":"v1.35.0"}}}'
That is the entire upgrade command. One line. No SSH, no manual kubeadm commands, no node-by-node playbook runs.
What Happens During Upgrade
Declarative Upgrade Flow
+--------------------------------------------------------------------+
| |
| kubectl patch cluster ... version: "v1.35.0" |
| | |
| v |
| Control Plane Provider detects version drift |
| | |
| v |
| Phase 1: Control Plane Rolling Upgrade |
| For each CP node (one at a time): |
| 1. Create new Machine with v1.35.0 image |
| 2. Wait for new node to join and become Ready |
| 3. Cordon old node |
| 4. Drain old node (evict workloads) |
| 5. Remove old Machine (delete container/VM) |
| 6. Update etcd membership |
| Repeat until all CP nodes are on v1.35.0 |
| | |
| v |
| Phase 2: Worker Rolling Upgrade |
| MachineDeployment controller: |
| 1. Creates new MachineSet with v1.35.0 |
| 2. Scales up new MachineSet (new workers join) |
| 3. Cordons old workers |
| 4. Drains old workers |
| 5. Scales down old MachineSet (removes old workers) |
| Proceeds according to rollout strategy (default: rolling) |
| | |
| v |
| Upgrade complete: all nodes running v1.35.0 |
+--------------------------------------------------------------------+
Monitor Upgrade Progress
# Watch the upgrade in real-time
clusterctl describe cluster capi-quickstart --show-conditions all
# Watch Machine replacements
kubectl get machines -w
# Check rollout status of MachineDeployments
kubectl get machinedeployments -o wide
# Verify versions on workload cluster
kubectl --kubeconfig=capi-quickstart.kubeconfig get nodes -o wide
Upgrade Behavior Details
| Aspect | Behavior |
|---|---|
| CP upgrade order | One node at a time (serial), new created before old removed |
| Worker upgrade order | Rolling update via MachineDeployment (configurable surge/unavailable) |
| etcd handling | Automatic — membership updated as CP nodes are replaced |
| Workload disruption | Pods drained gracefully before node removal |
| Rollback | Not automatic — patch version back to previous if needed |
| Failure handling | Upgrade pauses if a new node fails to become Ready |
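The configurable worker surge/unavailable behavior mentioned above lives on the MachineDeployment spec, mirroring the Kubernetes Deployment rollout fields. A sketch of the relevant fragment (values are illustrative):

```yaml
# Fragment of a MachineDeployment spec controlling worker rollout behavior.
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # create at most 1 extra Machine during upgrade
      maxUnavailable: 0    # never drop below the desired replica count
```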
Cleanup
Delete Workload Cluster
# Delete the workload cluster (from management cluster context)
kubectl delete cluster capi-quickstart
This single command removes ALL resources associated with the workload cluster:
- All Machine objects (Docker containers or VMs)
- The load balancer container
- MachineDeployments, MachineSets
- Bootstrap configs, kubeadm configs
- Infrastructure resources (DockerCluster, DockerMachines)
CAPI uses owner references — deleting the Cluster object cascades to all child resources automatically.
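Concretely, each child object points at its parent via metadata.ownerReferences; for example, a Machine created by a MachineSet carries a stanza roughly like this (names are illustrative; uid omitted for brevity):

```yaml
metadata:
  name: capi-quickstart-md-0-abcde     # illustrative Machine name
  ownerReferences:
  - apiVersion: cluster.x-k8s.io/v1beta1
    kind: MachineSet
    name: capi-quickstart-md-0-xyz     # illustrative owner name
    controller: true
    blockOwnerDeletion: true           # owner cannot be deleted out from under the child
```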
Delete Management Cluster
# Remove the KinD management cluster
kind delete cluster --name myk8s
Custom Resource Quick Reference
| Resource | API Group | Managed By | Purpose |
|---|---|---|---|
| Cluster | cluster.x-k8s.io | Core Provider | Top-level cluster definition |
| Machine | cluster.x-k8s.io | Core Provider | Single node representation |
| MachineSet | cluster.x-k8s.io | Core Provider | Manages a group of identical Machines |
| MachineDeployment | cluster.x-k8s.io | Core Provider | Declarative updates for MachineSets |
| KubeadmControlPlane | controlplane.cluster.x-k8s.io | CP Provider | Control plane scaling and upgrades |
| KubeadmConfig | bootstrap.cluster.x-k8s.io | Bootstrap Provider | kubeadm bootstrap configuration |
| KubeadmConfigTemplate | bootstrap.cluster.x-k8s.io | Bootstrap Provider | Template for worker bootstrap configs |
| DockerCluster | infrastructure.cluster.x-k8s.io | Infra Provider | Docker-specific cluster infrastructure |
| DockerMachine | infrastructure.cluster.x-k8s.io | Infra Provider | Docker-specific machine (container) |
| DockerMachineTemplate | infrastructure.cluster.x-k8s.io | Infra Provider | Template for DockerMachine specs |
| ClusterClass | cluster.x-k8s.io | Core Provider | Reusable cluster blueprint |
Useful Commands
| Command | Purpose |
|---|---|
| clusterctl init --infrastructure docker | Initialize management cluster with Docker provider |
| clusterctl generate cluster <name> ... | Generate cluster manifest from templates |
| clusterctl describe cluster <name> | Show cluster topology and status |
| clusterctl get kubeconfig <name> | Retrieve workload cluster kubeconfig |
| clusterctl move --to-kubeconfig <path> | Move CAPI objects to a different management cluster |
| kubectl get clusters | List all CAPI-managed clusters |
| kubectl get machines -o wide | List all machines with status and versions |
| kubectl get machinedeployments | List all MachineDeployments |
| kubectl get kubeadmcontrolplane | List control plane objects with replica counts |
Common Errors (Searchable)
error: failed to connect to the management cluster
Cause: kubectl context not pointing to the management cluster. Fix: Run kubectl config use-context kind-myk8s to switch to the management cluster context.
Error: action failed after X attempts: failed to get provider components for the "docker" provider
Cause: clusterctl cannot download provider components (network issue or wrong version). Fix: Check internet connectivity. For air-gapped setups, use clusterctl init with local provider repository configuration in ~/.cluster-api/clusterctl.yaml.
Cluster "capi-quickstart" is not ready: WaitForControlPlaneAvailable
Cause: Control plane machines are still provisioning or failing to bootstrap. Fix: Check Machine objects (kubectl get machines), inspect Machine conditions, check Docker container logs (docker logs <container-name>).
Unable to connect to the server: x509: certificate signed by unknown authority
Cause: Kubeconfig for workload cluster has incorrect CA or server address. Fix: Re-retrieve kubeconfig with clusterctl get kubeconfig and fix the server address for local Docker access.
nodes are not ready: node "capi-quickstart-xxxx" not Ready
Cause: CNI not installed on workload cluster. Fix: Install a CNI plugin (Calico, Cilium, Flannel) using the workload cluster’s kubeconfig. Nodes remain NotReady until CNI is deployed.
Error from server: error when creating "capi-quickstart.yaml": admission webhook "validation.cluster.cluster.x-k8s.io" denied the request: Cluster.spec.topology.class "quick-start" not found
Cause: ClusterClass referenced in the manifest does not exist. Fix: Ensure CLUSTER_TOPOLOGY=true was set before clusterctl init. The --flavor development flag includes ClusterClass definitions in the generated manifest — make sure to apply the full file.
docker: Error response from daemon: Conflict. The container name "/capi-quickstart-lb" is already in use
Cause: Leftover Docker containers from a previous CAPI cluster with the same name. Fix: Delete the old cluster first (kubectl delete cluster capi-quickstart), or remove orphaned containers manually (docker rm -f capi-quickstart-lb).
Error: timed out waiting for the condition on machines
Cause: Machine provisioning is stuck — could be Docker resource limits, image pull failures, or kubeadm init/join failures. Fix: Check Docker daemon resources (docker system df), inspect the Machine’s bootstrap data, and check container logs for kubeadm errors.
failed to create kind cluster: failed to init node: kubeadm init failed
Cause: KinD management cluster creation failed, often due to insufficient Docker resources or conflicting port mappings. Fix: Ensure Docker has sufficient memory (at least 4 GB for management + workload clusters), check port availability for extraPortMappings.
Common Mistakes
| Mistake | Consequence |
|---|---|
| Forgetting CLUSTER_TOPOLOGY=true before clusterctl init | ClusterClass CRDs not installed; re-run clusterctl init with the env var set |
| Not mounting Docker socket in KinD config | CAPD cannot create workload cluster containers; recreate KinD with extraMounts for /var/run/docker.sock |
| Skipping CNI installation on workload cluster | All nodes stuck in NotReady, pods cannot be scheduled |
| Using generated kubeconfig directly (Docker provider) | Connection refused; replace API server address with 127.0.0.1:<mapped-port> |
| Deleting Docker containers manually instead of kubectl delete cluster | CAPI state becomes inconsistent, orphaned resources in management cluster |
| Running kubectl delete machine directly | CAPI will reconcile and recreate the Machine; use MachineDeployment replicas or kubectl delete cluster |
| Trying to SSH into CAPD Docker containers | Containers are not VMs; use docker exec -it <container-name> bash instead |
| Not checking CAPI provider versions match | Version mismatch causes reconciliation failures; use clusterctl upgrade plan to check |
| Insufficient Docker resources for management + workload clusters | Containers OOM-killed; allocate at least 4 GB memory, 8 GB recommended for 3+3 clusters |
| Ignoring clusterctl describe output during provisioning | Missing early error signals; always monitor with clusterctl describe cluster <name> --show-conditions all |