Cluster and node status overview
This guide describes the cluster status values, which define the current state of the cluster's connection to Cast AI, and the node status values, which indicate the health and readiness of nodes to accept pods.
Overview of cluster status values
The cluster's status in the Cast AI console defines the current state of its connection to Cast AI and indicates whether the platform can perform automated optimization actions on it.
Status | Explanation | Action |
---|---|---|
Connecting | Cluster is being connected to Cast AI in Read only mode, OR the cluster is transitioning from Read only mode to Cast AI managed mode (where the customer can set up automation). | |
Read only | Cluster is connected to Cast AI in read-only mode. Reporting features are enabled. | |
Connected | Cluster is connected to Cast AI in managed mode. Reporting features are enabled, and automation can be set up. | |
Warning | The Cast AI-managed cluster has encountered a transient error and is attempting to recover from it automatically. Autoscaling is not working. | |
Not responding (Read only) | Cast AI has recently lost connectivity to a cluster that was previously connected in Read only mode. If the connection is not restored within 5 minutes, the status changes to Disconnected (Read only). | Check the status of the castai-agent pod in the castai-agent namespace. |
Not responding | Cast AI has recently lost connectivity to a cluster. Autoscaling is not working. | Check the status of the castai-agent pod in the castai-agent namespace. |
Failed | Cast AI has encountered an error and can't recover from it automatically. Autoscaling is not working. | Hover over the Status to view error details. Check the status of Cast AI components in the castai-agent namespace. |
Disconnecting | The cluster is being disconnected from Cast AI. | |
Disconnected | The cluster was previously connected to Cast AI and is now disconnected. | Hover over the Status to see when the cluster was disconnected. |
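
Several of the statuses above share the same first troubleshooting step: looking at the pods in the castai-agent namespace. The following is a minimal Python sketch of that lookup, not part of Cast AI's tooling. The function names and the set of statuses are illustrative assumptions derived from the Action column of the table; the only external command it refers to is the standard `kubectl get pods`, and the sketch only builds the command string rather than running it.

```python
import shlex
from typing import List, Optional

# Namespace named in the Action column of the table above.
AGENT_NAMESPACE = "castai-agent"

# Statuses whose Action column asks you to inspect the agent or other
# Cast AI components in that namespace (assumption: copied from the table).
STATUSES_NEEDING_AGENT_CHECK = {
    "Not responding (Read only)",
    "Not responding",
    "Failed",
}


def agent_check_command(namespace: str = AGENT_NAMESPACE) -> List[str]:
    """Return the kubectl argv that lists pods in the given namespace."""
    return ["kubectl", "get", "pods", "--namespace", namespace, "-o", "wide"]


def suggested_command(status: str) -> Optional[str]:
    """Return a shell-ready command for statuses that call for an agent check.

    Returns None for statuses whose Action column is empty or asks for
    something other than inspecting the namespace.
    """
    if status in STATUSES_NEEDING_AGENT_CHECK:
        return shlex.join(agent_check_command())
    return None


if __name__ == "__main__":
    for status in ("Connected", "Not responding", "Failed"):
        print(f"{status}: {suggested_command(status)}")
```

Running the printed command against the cluster shows whether the pods in the namespace are running, restarting, or pending, which is usually the quickest way to tell a connectivity problem from a crashed agent.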
Overview of node status values
The node's status in the console indicates its health and readiness to accept pods.
Status | Explanation | Action |
---|---|---|
Cordoned | When a Kubernetes node is in the Cordoned state, scheduling new pods onto that node is temporarily disabled. A user or system might have cordoned a node in preparation for node deletion. Cast AI also cordons and leaves a node in the cluster if pods were not evicted during rebalancing (with the Graceful Rebalancing option turned on). | Inspect the node to understand the reason behind cordoning. If a node was cordoned during rebalancing, adjust the pod disruption budget, and un-cordon the node. |
Creating | Cast AI is in the process of creating a node. | |
Deleted | A short-lived status indicating that the node was deleted. | |
Deleting | Cast AI is in the process of deleting the node. | |
Detached | A node that is still present in the cloud but has been detached from the Kubernetes cluster. | Inspect the node and delete it manually from the cloud. |
Draining | The node is being drained; Kubernetes gracefully evicts existing pods from the node. | |
Interrupted | Several scenarios can trigger this spot node status: (1) an interruption event is received from the cloud provider; (2) a rebalancing recommendation is received from the cloud provider, indicating a possible interruption; (3) Cast AI predicts a node interruption. In all cases, Cast AI manages the interruption and prepares the necessary capacity. | Cast AI is handling the interruption and is preparing replacement capacity. |
Lost | A node is no longer part of the Kubernetes cluster; however, Cast AI has not yet deleted it. | If a node is in this state for a prolonged period, contact Cast AI support to troubleshoot the issue. |
Not ready | A node is temporarily unable to accept new workloads either because Cast AI is still preparing it as part of the provisioning process or it is experiencing issues, such as network problems or insufficient resources, that prevent it from properly communicating with the control plane. | If a node is in this state for a prolonged period, contact Cast AI support to troubleshoot the issue. |
Ready | Node is fully operational and ready to accept pods. | |
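
When inspecting a node as suggested in the Action column, it can help to see how three of these statuses relate to fields on the Kubernetes Node object. The sketch below is an approximation under stated assumptions, not how the Cast AI console computes its statuses: it reads `spec.unschedulable` (set when a node is cordoned) and the `Ready` condition from the parsed output of `kubectl get node <name> -o json`. The function name is hypothetical, and statuses such as Creating, Detached, Interrupted, or Lost depend on Cast AI and cloud-provider information that is not present on the Node object.

```python
from typing import Any, Dict


def kubernetes_level_status(node: Dict[str, Any]) -> str:
    """Approximate 'Cordoned', 'Ready', or 'Not ready' from a Node object.

    `node` is the parsed JSON of `kubectl get node <name> -o json`.
    """
    # Kubernetes sets spec.unschedulable to true when a node is cordoned.
    if node.get("spec", {}).get("unschedulable", False):
        return "Cordoned"

    # The Ready condition has status "True", "False", or "Unknown".
    for condition in node.get("status", {}).get("conditions", []):
        if condition.get("type") == "Ready":
            return "Ready" if condition.get("status") == "True" else "Not ready"

    # No Ready condition reported yet: treat the node as not ready.
    return "Not ready"


if __name__ == "__main__":
    sample = {
        "spec": {"unschedulable": True},
        "status": {"conditions": [{"type": "Ready", "status": "True"}]},
    }
    print(kubernetes_level_status(sample))
```

A cordoned node can still report a healthy `Ready` condition, which is why the sketch checks `spec.unschedulable` first. Once the reason for cordoning is resolved, the standard `kubectl uncordon <node-name>` command makes the node schedulable again.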