Autoscaling
CAST AI Autoscaling Engine
Introduction
The CAST AI Autoscaler is a tool designed to scale Kubernetes clusters with cost efficiency as the primary objective. Its goal is to dynamically adjust the number of nodes by adding new right-sized nodes and removing underutilized nodes when:
- There are pods that are unschedulable due to insufficient resources in the cluster.
- There are empty nodes that have not been utilized for a certain period.
The Autoscaler works through the following process:
- Watching your cluster state using the CAST AI Kubernetes Agent, which synchronizes cluster state changes to CAST AI.
- Calculating the required extra cluster capacity and detecting empty nodes.
- Evaluating the extra capacity and converting it into optimal node instance types.
- Provisioning new optimal nodes while removing underutilized empty nodes.
Key Concepts
To fully understand the Autoscaler's functionality, it's important to be familiar with the following concepts:
- Unschedulable pods: Kubernetes pods in the Pending phase with their PodScheduled condition set to False and reason Unschedulable.
- Pod constraints: Kubernetes pods can be declared with node selectors, node affinity, and pod anti/affinity properties. These reduce the number of node types and topologies (i.e., the hardware and zonal placement) that could satisfy the pod's requirements.
- Kubernetes scheduler: The control plane process responsible for assigning pods to nodes.
- Empty node: A node without any running pods. The Autoscaler ignores exited, static, or DaemonSet pods when determining if a node is empty.
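The two definitions of "unschedulable pod" and "empty node" can be expressed as simple predicates. The following is a minimal sketch, not CAST AI's implementation. It assumes pods are plain dictionaries shaped like Kubernetes API objects, that "exited" means the Succeeded or Failed phase, and that static pods are recognized by the kubernetes.io/config.mirror annotation the kubelet sets on their mirror pods. The function names are hypothetical.

```python
# Assumption: "exited" pods are those in a terminal phase.
EXITED_PHASES = {"Succeeded", "Failed"}
# Assumption: static pods are detected through their mirror-pod annotation.
MIRROR_ANNOTATION = "kubernetes.io/config.mirror"


def is_unschedulable(pod: dict) -> bool:
    """True if the pod is Pending with PodScheduled=False and reason Unschedulable."""
    status = pod.get("status") or {}
    if status.get("phase") != "Pending":
        return False
    return any(
        cond.get("type") == "PodScheduled"
        and cond.get("status") == "False"
        and cond.get("reason") == "Unschedulable"
        for cond in status.get("conditions") or []
    )


def is_ignored_for_emptiness(pod: dict) -> bool:
    """True if the pod is exited, static, or owned by a DaemonSet."""
    if (pod.get("status") or {}).get("phase") in EXITED_PHASES:
        return True
    metadata = pod.get("metadata") or {}
    if MIRROR_ANNOTATION in (metadata.get("annotations") or {}):
        return True
    owners = metadata.get("ownerReferences") or []
    return any(owner.get("kind") == "DaemonSet" for owner in owners)


def is_empty_node(pods_on_node: list) -> bool:
    """A node is empty when every pod bound to it is ignored (or there are none)."""
    return all(is_ignored_for_emptiness(pod) for pod in pods_on_node)
```

A node that hosts only a DaemonSet pod and a completed Job pod would count as empty under this sketch, while a single running Deployment pod would keep it non-empty.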
Cluster State Monitoring
The Autoscaler relies on accurate cluster state information to optimize the cluster effectively. The CAST AI Kubernetes Agent collects all cluster state changes and sends them to the Autoscaler. It uses Kubernetes informers to watch Kubernetes resources and synchronizes all changes to CAST AI at 15-second intervals.
The Kubernetes Agent monitors the following resources to enable correct autoscaling decisions:
Node
Pod
PersistentVolume
PersistentVolumeClaim
ReplicationController
Service
Deployment
ReplicaSet
DaemonSet
StatefulSet
StorageClass
Job
CSINode
HorizontalPodAutoscaler
Cluster Upscaling
The Autoscaler adds new nodes when it detects unschedulable pods. Pods become unschedulable when the Kubernetes scheduler fails to place them on any existing node. This can occur for various reasons, such as insufficient resources on existing nodes or pods not matching node constraints.
Bin-packing Algorithm
To determine the required extra capacity, the Autoscaler employs bin-packing algorithms to find efficient pod groupings. This process takes into account that some pods cannot be placed on the same node due to their declared constraints.
Bin-packing is an NP-hard problem: no known algorithm finds an optimal packing in time polynomial in the number of items, so the cost of an exact solution grows rapidly as the number of items increases. As the number of unschedulable pods grows, the Autoscaler applies approximation strategies to keep the algorithm's running time consistent.
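To illustrate what an approximation strategy for bin-packing can look like, the sketch below uses first-fit decreasing, a classic heuristic. It is an illustration only and is not stated to be the strategy the Autoscaler uses. The Pod and NodeBin classes, the fixed node size, and the exclusive_group field (a stand-in for an anti-affinity constraint) are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pod:
    name: str
    cpu_m: int    # requested CPU in millicores
    mem_mib: int  # requested memory in MiB
    # Hypothetical constraint: pods sharing a group must not share a node.
    exclusive_group: Optional[str] = None


@dataclass
class NodeBin:
    cpu_free: int
    mem_free: int
    pods: List[Pod] = field(default_factory=list)

    def fits(self, pod: Pod) -> bool:
        if pod.cpu_m > self.cpu_free or pod.mem_mib > self.mem_free:
            return False
        if pod.exclusive_group is None:
            return True
        return all(p.exclusive_group != pod.exclusive_group for p in self.pods)

    def add(self, pod: Pod) -> None:
        self.cpu_free -= pod.cpu_m
        self.mem_free -= pod.mem_mib
        self.pods.append(pod)


def first_fit_decreasing(pods: List[Pod], node_cpu_m: int, node_mem_mib: int) -> List[NodeBin]:
    """Place the largest pods first, each into the first bin that accepts it."""
    bins: List[NodeBin] = []
    for pod in sorted(pods, key=lambda p: (p.cpu_m, p.mem_mib), reverse=True):
        if pod.cpu_m > node_cpu_m or pod.mem_mib > node_mem_mib:
            raise ValueError(f"pod {pod.name} does not fit on an empty node")
        target = next((b for b in bins if b.fits(pod)), None)
        if target is None:
            # No existing bin accepts the pod, so open a new one.
            target = NodeBin(node_cpu_m, node_mem_mib)
            bins.append(target)
        target.add(pod)
    return bins
```

The heuristic runs in polynomial time and gives a good but not necessarily optimal packing. Note how a constraint can force an extra node even when raw capacity would allow fewer.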
Instance Type Selection
After bin-packing pod groups, the Autoscaler converts them into appropriate instance types. The goal is to find the most cost-effective instance type that meets all constraints. This process involves:
- Filtering the list of all available instance types to identify viable options.
- Sorting the viable instance types by price.
- Selecting the cheapest option that satisfies all requirements.
This approach yields the lowest-priced instance type among those that satisfy every constraint.
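The filter, sort, and select steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the InstanceType fields, the pick_cheapest function, and the idea of expressing constraints as CPU, memory, and allowed zones are hypothetical, and real selection would consider many more attributes.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class InstanceType:
    name: str
    cpu: int             # vCPUs
    mem_gib: int         # memory in GiB
    zone: str
    hourly_price: float  # price per hour, in any consistent currency


def pick_cheapest(
    candidates: Iterable[InstanceType],
    cpu_needed: int,
    mem_gib_needed: int,
    allowed_zones: Optional[Set[str]] = None,
) -> Optional[InstanceType]:
    """Return the cheapest viable instance type, or None if nothing is viable."""
    # Step 1: filter to instance types that satisfy all requirements.
    viable = [
        it for it in candidates
        if it.cpu >= cpu_needed
        and it.mem_gib >= mem_gib_needed
        and (allowed_zones is None or it.zone in allowed_zones)
    ]
    # Step 2: sort the viable options by price.
    viable.sort(key=lambda it: it.hourly_price)
    # Step 3: select the cheapest one.
    return viable[0] if viable else None
```

Tighter constraints shrink the viable list, which is why a zone restriction can push the selection to a more expensive instance type.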
CAST AI regularly updates instance type pricing and availability information by querying cloud service provider APIs. This eliminates the need for manual pricing tracking and analysis of viable instance-type offerings.
When pods have no specific constraints, the Autoscaler can choose from a wide range of instance types. However, pods often need to define constraints for high availability or specific hardware requirements. For more information on controlling pod placement, refer to the Pod placement section.
Spot Instances
CAST AI also fully supports spot instance scaling. To learn how to leverage spot instances for cost savings, visit the Spot instances section.
Cluster Downscaling
The Autoscaler removes nodes that have remained unutilized for a specified period. This process aims to reduce costs and minimize waste. It's important to note that the Autoscaler only removes completely empty nodes.
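The removal rule described above (a node must be completely empty and must have stayed empty for a configured period) can be sketched like this. The function name, the empty_since mapping, and the empty_ttl parameter are assumptions for illustration and do not correspond to actual CAST AI settings.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def nodes_to_remove(
    empty_since: Dict[str, Optional[datetime]],
    now: datetime,
    empty_ttl: timedelta,
) -> List[str]:
    """Return names of nodes that have been empty for at least empty_ttl.

    empty_since maps a node name to the time it was first observed empty,
    or to None if the node still hosts pods that count toward utilization.
    """
    return sorted(
        name
        for name, since in empty_since.items()
        if since is not None and now - since >= empty_ttl
    )
```

Nodes that still run workloads, or that became empty only recently, are left untouched.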
Nodes typically become empty when all running pods on the node have been deleted. This can occur due to:
- ReplicaSet scale-down operations
- Job completions
To further optimize cluster resources, CAST AI provides an additional component called the Evictor. The Evictor helps consolidate running pods onto fewer nodes through intelligent bin-packing strategies.
By combining upscaling and downscaling processes, the CAST AI Autoscaler ensures your Kubernetes cluster maintains optimal resource utilization and cost efficiency.