How it works

Rebalancing is a CAST AI feature that brings your cluster to an optimal, up-to-date state. During this process, suboptimal nodes are automatically replaced with new ones that are more cost-efficient and run the latest node configuration settings.

Rebalancing works by taking all the workloads running in your cluster and finding the optimal way to distribute them among the cheapest nodes.

Rebalancing uses the same algorithms that drive the CAST AI Autoscaling engine to find optimal node configurations for your workloads. The only difference is that all workloads are run through them rather than just unschedulable pods.

Purpose

The rebalancing process has multiple purposes:

  1. Rebalance the cluster during the initial onboarding to immediately achieve cost savings. The rebalancer makes it easy to start using CAST AI by running your cluster through the CAST AI algorithms and reshaping it into an optimal state.

  2. Remove fragmentation, which is a normal byproduct of everyday cluster operation. Autoscaling is a reactive process that aims to satisfy unschedulable pods. As these reactive decisions accumulate, your cluster can become fragmented.

    Consider this example: you scale up a workload by one replica every hour, and each replica requests 6 CPUs. After a day, the cluster will have 24 new nodes with 8 CPUs of capacity each, leaving 48 unused, fragmented CPUs (2 per node). The rebalancer aims to solve this by consolidating the workloads onto fewer, cheaper nodes, reducing waste.

  3. Replace specific nodes due to cost inefficiency or an outdated node configuration. During the rebalancing operation, targeted nodes are replaced with an optimal set of nodes running the latest node configuration settings.
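
The arithmetic behind the fragmentation example in item 2 can be checked with a short sketch. The replica count, CPU request, and 8-CPU node size come from the example; the larger node sizes (32 and 48 CPUs) and the helper names are hypothetical, chosen only to illustrate consolidation. This is not the CAST AI algorithm, which also weighs cost and other constraints.

```python
import math

REPLICAS = 24      # one new replica per hour for a day (from the example)
REPLICA_CPU = 6    # CPU request per replica (from the example)


def nodes_needed(replicas: int, replica_cpu: int, node_cpu: int) -> int:
    """Nodes required when identical replicas are packed onto one node size."""
    per_node = node_cpu // replica_cpu  # whole replicas that fit on one node
    if per_node == 0:
        raise ValueError("node is too small for a single replica")
    return math.ceil(replicas / per_node)


def unused_cpus(replicas: int, replica_cpu: int, node_cpu: int) -> int:
    """CPUs provisioned but not requested by any replica."""
    provisioned = nodes_needed(replicas, replica_cpu, node_cpu) * node_cpu
    return provisioned - replicas * replica_cpu


# 8 CPUs is the node size from the example; 32 and 48 are hypothetical sizes.
for node_cpu in (8, 32, 48):
    print(
        f"{node_cpu}-CPU nodes: "
        f"{nodes_needed(REPLICAS, REPLICA_CPU, node_cpu)} nodes, "
        f"{unused_cpus(REPLICAS, REPLICA_CPU, node_cpu)} unused CPUs"
    )
```

With 8-CPU nodes only one replica fits per node, so every node strands 2 CPUs. Larger nodes let several replicas share a node, which is the effect consolidation relies on.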

Scope

You can rebalance the entire cluster or only a specific set of nodes.

  1. To rebalance the whole cluster, choose Cluster --> Rebalance.
  2. To rebalance a subset of nodes, first select the nodes in Cluster --> Node list, then choose Actions --> Rebalance nodes.

After choosing the scope of the operation, generate a rebalancing plan to review the planned changes and their effect on cluster composition and costs. Only nodes without problematic workloads are considered for rebalancing.

To reduce the number of problematic workloads and avoid service disruption, check the Preparation for the rebalancing guide.

Execution

Rebalancing consists of three distinct phases:

  1. Create new optimal nodes.
  2. Drain old, suboptimal nodes.
  3. Delete old, suboptimal nodes. Nodes are deleted one by one as soon as they have been drained.
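
The ordering of the three phases can be illustrated with a minimal simulation. The function and node names are hypothetical and nothing here calls a CAST AI API. The sketch also assumes old nodes are drained one after another, which the description above does not specify; it only states that each node is deleted as soon as it has been drained.

```python
def rebalance(old_nodes: list[str], new_nodes: list[str]) -> list[tuple[str, str]]:
    """Return the ordered (phase, node) events of a simulated rebalancing run."""
    events = []

    # Phase 1: create every replacement node before touching the old ones.
    for node in new_nodes:
        events.append(("create", node))

    # Phases 2 and 3: drain each old node, then delete it right away
    # instead of waiting for the remaining nodes to finish draining.
    for node in old_nodes:
        events.append(("drain", node))
        events.append(("delete", node))

    return events


for phase, node in rebalance(["old-a", "old-b"], ["new-1"]):
    print(f"{phase:>6}  {node}")
```

Creating the replacements first presumably ensures that pods evicted during draining have capacity to move to.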
