Ensure that the Auto-Repair feature is enabled for all your Google Kubernetes Engine (GKE) cluster nodes to help keep them healthy. GKE uses each node's health status to determine whether the node needs to be repaired, and triggers a repair action if a node reports an unhealthy status on consecutive checks over a given time threshold. A node is considered unhealthy when:
A cluster node broadcasts a "NotReady" status on consecutive checks over the given time threshold.
A cluster node does not broadcast any status at all over the given time threshold.
A cluster node's boot disk is out of disk space for an extended period of time.
GKE Auto-Repair helps you keep the nodes in your cluster in a healthy, running state. When the feature is enabled, GKE periodically checks the health of each node in your cluster. If a node fails consecutive health checks over a given time threshold, the GKE service initiates a repair process for that node.
To determine if your Google Kubernetes Engine (GKE) clusters have node auto-repair enabled, perform the following actions:
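The audit steps are not included on this page. As a minimal sketch of one way to run the check, the Python below lists a cluster's node pools through the gcloud CLI and reports those whose `management.autoRepair` setting is not true. The function names and the cluster, zone, and project values are hypothetical, and the sketch assumes a zonal cluster with gcloud installed and authenticated.

```python
import json
import subprocess


def pools_without_auto_repair(node_pools):
    """Return the names of node pools whose management.autoRepair is not true.

    `node_pools` is the parsed JSON list printed by
    `gcloud container node-pools list --format json`.
    """
    return [
        pool.get("name", "<unnamed>")
        for pool in node_pools
        # A missing or null "management" block is treated as auto-repair disabled.
        if not (pool.get("management") or {}).get("autoRepair", False)
    ]


def list_node_pools(cluster, zone, project):
    """Fetch the node pools of one zonal cluster as parsed JSON (hypothetical helper)."""
    cmd = [
        "gcloud", "container", "node-pools", "list",
        "--cluster", cluster,
        "--zone", zone,
        "--project", project,
        "--format", "json",
    ]
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return json.loads(result.stdout)


# Example usage (placeholder values, requires gcloud):
#   pools = list_node_pools("my-cluster", "us-central1-a", "my-project")
#   print(pools_without_auto_repair(pools))
```

An empty result means every node pool in the cluster has auto-repair enabled. Any pool name that is returned needs the remediation described in the next section.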
Remediation / Resolution
To enable the Auto-Repair feature for all the Google Kubernetes Engine (GKE) cluster nodes, perform the following actions:
Note: GKE cluster node auto-repair can be enabled on a per-node pool basis only.
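The remediation steps are not included on this page. Because auto-repair is set per node pool, one possible approach is to run `gcloud container node-pools update` with the `--enable-autorepair` flag once for each affected pool. The sketch below builds that command in Python. The function names and the pool, cluster, zone, and project values are hypothetical, and a zonal cluster is assumed. By default it only returns the command so that it can be reviewed before anything is changed.

```python
import subprocess


def enable_auto_repair_command(pool, cluster, zone, project):
    """Build the gcloud command that enables auto-repair on one node pool."""
    return [
        "gcloud", "container", "node-pools", "update", pool,
        "--cluster", cluster,
        "--zone", zone,
        "--project", project,
        "--enable-autorepair",
    ]


def enable_auto_repair(pool, cluster, zone, project, dry_run=True):
    """Return the command; execute it only when dry_run is False."""
    cmd = enable_auto_repair_command(pool, cluster, zone, project)
    if not dry_run:
        # Raises CalledProcessError if gcloud exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


# Example usage (placeholder values):
#   print(" ".join(enable_auto_repair("default-pool", "my-cluster",
#                                     "us-central1-a", "my-project")))
```

Repeat the call for every node pool that lacks auto-repair in each cluster, passing `dry_run=False` once the printed command looks right.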
You are auditing:
Enable Auto-Repair for GKE Cluster Nodes
Risk level: Medium