How it works
Rebalancing is a CAST AI feature that brings your clusters to an optimal, up-to-date state. During this process, suboptimal nodes are automatically replaced with new ones that are more cost-efficient and run the latest node configuration settings.
Rebalancing works by taking all the workloads running in your cluster and finding the best way to distribute them among the cheapest nodes. Rebalancing is based on the same algorithms that drive the CAST AI Autoscaling engine to find optimal node configurations for your workloads. The only difference is that all workloads are run through them, rather than just unschedulable pods.
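To make the idea of "run every workload through the placement logic" concrete, here is a minimal sketch. It is not the CAST AI algorithm, which this page does not describe in detail. The `NodeType` catalog, the prices, and the first-fit-decreasing heuristic are illustrative assumptions, and only CPU requests are considered.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class NodeType:
    """Hypothetical catalog entry; names and prices are made up."""
    name: str
    cpu: int
    hourly_price: float


@dataclass
class Node:
    node_type: NodeType
    pods: list = field(default_factory=list)  # CPU request of each placed pod

    @property
    def free_cpu(self):
        return self.node_type.cpu - sum(self.pods)


def pack(pod_cpu_requests, node_type):
    """First-fit decreasing: largest pods first, each on the first node with room."""
    nodes = []
    for cpu in sorted(pod_cpu_requests, reverse=True):
        if cpu > node_type.cpu:
            raise ValueError(f"pod requesting {cpu} CPU cannot fit on {node_type.name}")
        target = next((n for n in nodes if n.free_cpu >= cpu), None)
        if target is None:
            target = Node(node_type)
            nodes.append(target)
        target.pods.append(cpu)
    return nodes


def cheapest_plan(pod_cpu_requests, catalog):
    """Pack ALL pods onto each usable node type and keep the cheapest result."""
    if not pod_cpu_requests:
        raise ValueError("no pods to place")
    largest = max(pod_cpu_requests)
    plans = []
    for node_type in catalog:
        if largest <= node_type.cpu:
            nodes = pack(pod_cpu_requests, node_type)
            plans.append((len(nodes) * node_type.hourly_price, node_type, nodes))
    if not plans:
        raise ValueError("no node type can host the largest pod")
    return min(plans, key=lambda plan: plan[0])


if __name__ == "__main__":
    catalog = [NodeType("small", 8, 0.40), NodeType("large", 32, 1.20)]
    cost, node_type, nodes = cheapest_plan([6] * 5 + [2] * 3, catalog)
    print(f"{len(nodes)} x {node_type.name} at {cost:.2f}/hour")
```

An autoscaler would call something like `pack` only for pods that are currently unschedulable. The sketch passes in every pod, which is the difference the paragraph above describes.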
Purpose
The rebalancing process has multiple purposes:
- Rebalance the cluster during the initial onboarding to immediately achieve cost savings. The rebalancer aims to make it easy to start using CAST AI by running your cluster through the CAST AI algorithms and reshaping it into an optimal state during onboarding.
- Remove fragmentation, which is a normal byproduct of everyday cluster execution. Autoscaling is a reactive process that aims to satisfy unschedulable pods. As these reactive decisions accumulate, your cluster might become too fragmented. Consider this example: you scale up your workloads by 1 replica every hour, and each replica requests 6 CPU. After a day, the cluster will end up with 24 new nodes with 8 CPU capacity each. Every node leaves 2 CPU idle, so you will have 48 unused, fragmented CPUs. The rebalancer aims to solve this by consolidating the workloads onto fewer, cheaper nodes, reducing waste.
- Replace specific nodes due to cost inefficiency or an outdated node configuration. During the rebalancing operation, the targeted nodes are replaced with an optimal set of nodes running the latest node configuration settings.
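The arithmetic in the fragmentation example can be checked with a short sketch. The 6 CPU request, 8 CPU node size, and 24 hourly scale-ups come from the example above. The 48 CPU consolidated node size is a hypothetical value chosen for illustration, and the sketch ignores the CPU that a real node reserves for system components.

```python
import math

REPLICA_CPU = 6   # CPU requested by each new replica (from the example)
NODE_CPU = 8      # capacity of each reactively added node (from the example)
REPLICAS = 24     # one replica added per hour for a day (from the example)


def reactive_unused_cpu(replicas, replica_cpu, node_cpu):
    """Unused CPU when every replica triggers its own new node."""
    return replicas * (node_cpu - replica_cpu)


def consolidated_unused_cpu(replicas, replica_cpu, node_cpu):
    """Return (unused CPU, node count) when replicas share nodes of one size."""
    per_node = node_cpu // replica_cpu
    if per_node == 0:
        raise ValueError("node is too small for a single replica")
    nodes = math.ceil(replicas / per_node)
    return nodes * node_cpu - replicas * replica_cpu, nodes


if __name__ == "__main__":
    print("reactive waste:", reactive_unused_cpu(REPLICAS, REPLICA_CPU, NODE_CPU))
    # Hypothetical 48 CPU nodes: 8 replicas fit on each one.
    print("consolidated (unused, nodes):",
          consolidated_unused_cpu(REPLICAS, REPLICA_CPU, 48))
```

Whether the consolidated layout is actually cheaper depends on the prices of the node sizes involved, which this sketch does not model.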
Scope
Rebalancing can be executed on the whole cluster or on a specific set of nodes.
- To run rebalancing on the whole cluster, choose Cluster --> Rebalance.
- To rebalance a subset of nodes, first select the nodes using Cluster --> Node list, then choose Actions --> Rebalance nodes.
After assessing the scope of the operation, generate a rebalancing plan to review the planned changes and their effect on the cluster composition and costs. Only nodes that don't have any problematic workloads are eligible for rebalancing. To reduce the number of problematic workloads and avoid service disruption, check the Preparation for rebalancing guide.
Execution
Rebalancing is executed in three sequential phases:
- Create new optimal nodes.
- Drain old suboptimal nodes.
- Delete old suboptimal nodes.
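The ordering of the three phases can be illustrated with a small in-memory simulation. This is only a sketch of the sequence, not how CAST AI or Kubernetes performs it. The `ClusterNode` type, the `rebalance` function, and the round-robin pod placement are hypothetical stand-ins for real provisioning and scheduling.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterNode:
    """Hypothetical in-memory node; pods are just names."""
    name: str
    pods: list = field(default_factory=list)


def rebalance(cluster, old_names, new_names):
    """Run create -> drain -> delete on a dict of name -> ClusterNode.

    Returns a log of the actions in the order they happened.
    """
    if not new_names:
        raise ValueError("at least one replacement node is required")
    log = []

    # Phase 1: create the new nodes first, so capacity exists before any pod moves.
    for name in new_names:
        cluster[name] = ClusterNode(name)
        log.append(f"create {name}")

    # Phase 2: drain the old nodes. Pods are moved round-robin here, which is
    # an assumption; in a real cluster the scheduler decides placement.
    slot = 0
    for old in old_names:
        node = cluster[old]
        while node.pods:
            pod = node.pods.pop(0)
            cluster[new_names[slot % len(new_names)]].pods.append(pod)
            slot += 1
        log.append(f"drain {old}")

    # Phase 3: delete the old nodes only after all of them are empty.
    for old in old_names:
        del cluster[old]
        log.append(f"delete {old}")
    return log


if __name__ == "__main__":
    cluster = {
        "old-1": ClusterNode("old-1", ["api", "worker"]),
        "old-2": ClusterNode("old-2", ["cache"]),
    }
    for entry in rebalance(cluster, ["old-1", "old-2"], ["new-1"]):
        print(entry)
```

Creating the replacement nodes before draining means pods always have somewhere to go, at the cost of briefly running both sets of nodes.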