
Installation

Follow this guide to install the Node Readiness Controller in your Kubernetes cluster.

Prerequisites

If you plan to use the install-full.yaml option (which includes secure metrics and the validating admission webhook), you must first have cert-manager installed in your cluster.
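If cert-manager is not already present, it can be installed from its official release manifest. A minimal sketch follows; the helper names are ours, and the version tag in the usage line is an example rather than a pinned recommendation:

```shell
#!/bin/sh
# Sketch: install cert-manager before applying install-full.yaml.
# The release tag you pass in should be one you have validated.

cert_manager_url() {
  # Build the manifest URL for a given cert-manager release tag.
  echo "https://github.com/cert-manager/cert-manager/releases/download/$1/cert-manager.yaml"
}

install_cert_manager() {
  kubectl apply -f "$(cert_manager_url "$1")"
  # Wait for the webhook to become Available before installing the
  # controller's full manifest, which depends on it for certificates.
  kubectl wait --for=condition=Available --timeout=120s \
    -n cert-manager deployment/cert-manager-webhook
}

# Usage: install_cert_manager v1.14.4
```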

Deployment Options

Option 1: Standard Deployment (Release Manifests)

1. Install the CRDs

First, apply the crds.yaml manifest:

# Replace with the desired version
VERSION=v0.1.1
kubectl apply -f https://github.com/kubernetes-sigs/node-readiness-controller/releases/download/${VERSION}/crds.yaml
kubectl wait --for condition=established --timeout=30s crd/nodereadinessrules.readiness.node.x-k8s.io

2. Install the Controller

Choose one of the two following manifests based on your requirements:

| Manifest | Contents | Prerequisites |
|---|---|---|
| install.yaml | Core Controller | None |
| install-full.yaml | Core Controller + Metrics (Secure) + Validation Webhook | cert-manager |

Standard Installation (Minimal): The simplest way to deploy the controller with no external dependencies.

kubectl apply -f https://github.com/kubernetes-sigs/node-readiness-controller/releases/download/${VERSION}/install.yaml

Full Installation (Production Ready): Includes secure metrics (TLS-protected) and validating webhooks for rule conflict prevention. Requires cert-manager to be installed in your cluster.

kubectl apply -f https://github.com/kubernetes-sigs/node-readiness-controller/releases/download/${VERSION}/install-full.yaml

This will deploy the controller into the nrr-system namespace on any available node in your cluster.

Controller priority

The controller is deployed with system-cluster-critical priority to prevent eviction during node resource pressure.

If it gets evicted during resource pressure, nodes can’t transition to Ready state, blocking all workload scheduling cluster-wide.

This is the same priority class used by other critical cluster components (e.g., CoreDNS).
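In a pod spec, this priority class appears as a single field. The excerpt below is an illustrative sketch, not the controller's actual manifest; the pod name and image tag are examples:

```yaml
# Sketch: where system-cluster-critical appears in a pod spec.
apiVersion: v1
kind: Pod
metadata:
  name: nrr-controller-manager   # illustrative name
  namespace: nrr-system
spec:
  priorityClassName: system-cluster-critical
  containers:
    - name: manager
      image: registry.k8s.io/node-readiness-controller/node-readiness-controller:v0.1.1
```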

Images

The official releases use multi-arch images (AMD64, ARM64) and are available at registry.k8s.io/node-readiness-controller/node-readiness-controller.

REPO="registry.k8s.io/node-readiness-controller/node-readiness-controller"
TAG=$(skopeo list-tags docker://$REPO | jq -r '.Tags[-1]')
docker pull $REPO:$TAG
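For reproducible deployments you may prefer to pin the image by digest rather than by a mutable tag. A sketch, reusing REPO and TAG from above; the helper function is ours:

```shell
#!/bin/sh
# Sketch: pin the controller image by digest instead of tag.

image_by_digest() {
  # $1 = repository, $2 = digest (sha256:...)
  echo "$1@$2"
}

# Resolve the digest for the tag, then pull the pinned reference:
# DIGEST=$(skopeo inspect --format '{{.Digest}}' "docker://${REPO}:${TAG}")
# docker pull "$(image_by_digest "$REPO" "$DIGEST")"
```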

Option 2: Advanced Deployment (Kustomize)

If you need deeper customization, you can use Kustomize directly from the source.

# 1. Install CRDs
kubectl apply -k config/crd

# 2. Deploy Controller with default configuration
kubectl apply -k config/default

You can enable optional components (Metrics, TLS, Webhook) by creating a kustomization.yaml that includes the relevant components from the config/ directory. For reference on how these components can be combined, see the deploy-with-metrics, deploy-with-tls, deploy-with-webhook, and deploy-full targets in the project's Makefile.
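A kustomization.yaml combining the base with optional components might look like the sketch below. The component paths are assumptions; verify them against the deploy-* targets in the Makefile before use:

```yaml
# Sketch: enable optional components on top of the default base.
# Component paths are assumptions -- check the Makefile targets.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../default
components:
  - ../components/metrics   # path is an assumption
  - ../components/webhook   # path is an assumption
```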

Option 3: Deploy as a Static Pod (Control Plane)

Running the controller as a Static Pod on control-plane nodes is useful for self-managed clusters (e.g., kubeadm) where you want the controller to be available alongside core components like the API server.

See examples/static-pod/node-readiness-controller.yaml for a detailed example of deploying the controller as a static pod in a kind cluster.

Deployment Steps

  1. Prepare the Manifest: Refer to the example manifest in examples/static-pod/node-readiness-controller.yaml. This manifest handles the kubeconfig with an initContainer and sets the necessary flags for leader election.

  2. Deploy to Nodes: Use a tool such as Ansible or Terraform to copy the manifest into the /etc/kubernetes/manifests/ directory on each control-plane node. The kubelet will automatically detect and start the pod.

  3. Install CRDs:

    kubectl apply -k config/crd
    

    This is typically handled via a bootstrap script or post-install job in a kubeadm setup.
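Step 2 above can be automated with configuration management. A minimal Ansible sketch; the control_plane inventory group name is an assumption:

```yaml
# Sketch: push the static pod manifest to every control-plane node.
# The "control_plane" group name is an assumption about your inventory.
- name: Deploy node-readiness-controller static pod
  hosts: control_plane
  become: true
  tasks:
    - name: Copy static pod manifest
      ansible.builtin.copy:
        src: examples/static-pod/node-readiness-controller.yaml
        dest: /etc/kubernetes/manifests/node-readiness-controller.yaml
        owner: root
        group: root
        mode: "0600"
```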


Verification

After installation, verify that the controller is running successfully.

Note: Replace ${NAMESPACE} with the namespace where the controller is deployed (typically nrr-system for standard deployments, or kube-system for static pods).

  1. Check Pod Status:

    kubectl get pods -n ${NAMESPACE} -l component=node-readiness-controller
    

    You should see the controller pods in Running status.

  2. Check Logs:

    kubectl logs -n ${NAMESPACE} -l component=node-readiness-controller
    

    Look for “Starting EventSource” or “Starting Controller” messages indicating the manager is active.

  3. Verify CRDs:

    kubectl get crd nodereadinessrules.readiness.node.x-k8s.io
    
  4. Verify High Availability: In an HA cluster, verify that one instance has acquired the leader lease:

    # The lease namespace should match the controller's namespace (configured via --leader-election-namespace)
    kubectl get lease -n ${NAMESPACE} ba65f13e.readiness.node.x-k8s.io
    
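The pod-status check in step 1 can be scripted for CI or bootstrap pipelines. A small sketch; the helper function is ours, and it consumes the standard kubectl jsonpath output shown in the comment:

```shell
#!/bin/sh
# Sketch: fail fast if any controller pod is not Running.
# is_running takes a space-separated list of pod phases, as produced by:
#   kubectl get pods -n "${NAMESPACE}" -l component=node-readiness-controller \
#     -o jsonpath='{.items[*].status.phase}'

is_running() {
  for phase in $1; do
    [ "$phase" = "Running" ] || return 1
  done
}

# Example:
# PHASES=$(kubectl get pods -n "${NAMESPACE}" \
#   -l component=node-readiness-controller \
#   -o jsonpath='{.items[*].status.phase}')
# is_running "$PHASES" && echo "controller is up"
```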

Uninstallation

IMPORTANT: Follow this order to avoid “stuck” resources.

The controller uses a finalizer (readiness.node.x-k8s.io/cleanup-taints) on NodeReadinessRule resources to ensure taints are safely removed from nodes before a rule is deleted.

You must delete all rule objects before deleting the controller.

  1. Delete all Rules:

    kubectl delete nodereadinessrules --all
    

    Wait for this command to complete. This ensures the running controller removes its taints from your nodes.

  2. Uninstall Controller:

    # If installed via release manifest
    kubectl delete -f https://github.com/kubernetes-sigs/node-readiness-controller/releases/download/${VERSION}/install.yaml
    
    # Or if using the full manifest
    kubectl delete -f https://github.com/kubernetes-sigs/node-readiness-controller/releases/download/${VERSION}/install-full.yaml
    
    # OR if using Kustomize
    kubectl delete -k config/default
    
    # OR if using Static Pods
    # Remove the manifest from /etc/kubernetes/manifests/ on all control-plane nodes
    
  3. Uninstall CRDs (Optional):

    kubectl delete -k config/crd
    

Recovering from Stuck Resources

If you accidentally deleted the controller before the rules, the NodeReadinessRule objects will get stuck in a Terminating state, because the controller is needed to clean up the taints and finalizers.

To force-delete them (you will then need to manually remove any remaining managed taints from your nodes):

# Patch the finalizer to remove it
kubectl patch nodereadinessrule <rule-name> -p '{"metadata":{"finalizers":[]}}' --type=merge
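If several rules are stuck, the same patch can be applied to all of them. A sketch; the helper functions are ours, and the patch body is the same empty-finalizers merge patch used in the single-rule command above:

```shell
#!/bin/sh
# Sketch: clear finalizers on every stuck NodeReadinessRule.

finalizer_patch() {
  # Merge patch that empties metadata.finalizers.
  echo '{"metadata":{"finalizers":[]}}'
}

clear_all_finalizers() {
  for rule in $(kubectl get nodereadinessrules -o name); do
    kubectl patch "$rule" --type=merge -p "$(finalizer_patch)"
  done
}

# Usage: clear_all_finalizers
```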

Troubleshooting Deployment

RBAC Permissions

If the controller logs show “Forbidden” errors, verify the ClusterRole bindings:

kubectl describe clusterrole nrr-manager-role

It requires nodes (update/patch) and nodereadinessrules (all) access.

Debug Logging

To enable verbose logging for deeper investigation:

kubectl patch deployment -n nrr-system nrr-controller-manager \
  -p '{"spec":{"template":{"spec":{"containers":[{"name":"manager","args":["--zap-log-level=debug"]}]}}}}'