How it works

The CAST AI Autoscaler removes nodes when it detects that they have not been utilized for a certain period of time. The goal of removing empty nodes is to save costs and reduce waste.

The Autoscaler only removes empty nodes. Nodes end up empty because all pods running on the node have been deleted, possibly due to ReplicaSet scale down or Job completion.
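
The empty-node rule above can be illustrated with a minimal sketch. This is not CAST AI's implementation: `NodeState`, `nodes_to_delete`, and `empty_ttl_seconds` are hypothetical names, and the sketch assumes a node counts as empty when it runs no pods at all and that the waiting period is a single configurable value.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class NodeState:
    """Hypothetical view of a node: its name, its pods, and when it became empty."""
    name: str
    pods: List[str] = field(default_factory=list)
    empty_since: Optional[float] = None  # timestamp of the first scan that saw it empty


def nodes_to_delete(nodes: List[NodeState], now: float, empty_ttl_seconds: float) -> List[str]:
    """Return names of nodes that have stayed empty for at least empty_ttl_seconds."""
    removable = []
    for node in nodes:
        if node.pods:
            # The node is in use again, so reset its empty timer.
            node.empty_since = None
            continue
        if node.empty_since is None:
            # First scan that sees the node empty: start the timer.
            node.empty_since = now
        if now - node.empty_since >= empty_ttl_seconds:
            removable.append(node.name)
    return removable
```

Calling `nodes_to_delete` on each scan keeps the timer per node, so a node that receives a pod again before the waiting period ends is never reported for removal.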

Downscaling options

Two levels of downscaling are available:

  • Node deletion policy - removes nodes that are empty and no longer running any workloads. For example, when a job you're running completes, the node it ran on may become empty, and CAST AI will automatically remove it to avoid waste.
  • Evictor - continuously compacts pods onto fewer nodes, creating empty nodes that can be removed by the Node deletion policy (if you choose to enable it). Evictor actively bin-packs your cluster, moving pods around to achieve higher node utilization.

Bin packing flow

CAST AI implemented the Evictor component to solve the bin-packing problem.

  1. Evictor scans your cluster once per minute.
  2. It identifies pods running on underutilized nodes and checks whether they could be scheduled elsewhere in your cluster.
  3. When such a combination of nodes and pods is found, Evictor cordons the node, drains it, and moves your workloads to another node, reducing waste by bin-packing your cluster.
  4. Once a node becomes empty, the Node deletion policy, if enabled, removes it.
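
The flow above can be sketched as a small simulation. It is only an illustration under stated assumptions, not Evictor's actual algorithm: the source does not say how "underutilized" is defined, so the `threshold` parameter is hypothetical, only CPU requests are considered, and the placement check uses a simple first-fit-decreasing heuristic. `Node`, `plan_eviction`, and `apply_plan` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical node model: CPU capacity and pods with their CPU requests."""
    name: str
    cpu_capacity: float
    pods: Dict[str, float] = field(default_factory=dict)  # pod name -> CPU request

    @property
    def free(self) -> float:
        return self.cpu_capacity - sum(self.pods.values())

    @property
    def utilization(self) -> float:
        return sum(self.pods.values()) / self.cpu_capacity


def plan_eviction(nodes: List[Node], threshold: float = 0.5) -> Optional[Tuple[str, Dict[str, str]]]:
    """Find one underutilized node whose pods all fit on the remaining nodes.

    Returns (node name, {pod: target node}) or None if no node can be emptied.
    The threshold is an assumption made for this sketch.
    """
    candidates = sorted(
        (n for n in nodes if n.pods and n.utilization < threshold),
        key=lambda n: n.utilization,
    )
    for candidate in candidates:
        free = {n.name: n.free for n in nodes if n is not candidate}
        placement: Dict[str, str] = {}
        # First-fit decreasing: place the largest pods first.
        for pod, cpu in sorted(candidate.pods.items(), key=lambda kv: kv[1], reverse=True):
            target = next((name for name, cap in free.items() if cap >= cpu), None)
            if target is None:
                break  # this pod fits nowhere else, so try the next candidate
            free[target] -= cpu
            placement[pod] = target
        else:
            return candidate.name, placement
    return None


def apply_plan(nodes: List[Node], plan: Tuple[str, Dict[str, str]]) -> None:
    """Simulate the drain: move every pod of the source node to its target."""
    by_name = {n.name: n for n in nodes}
    source_name, placement = plan
    source = by_name[source_name]
    for pod, target in placement.items():
        by_name[target].pods[pod] = source.pods.pop(pod)
```

Running `plan_eviction` and then `apply_plan` in a loop until no plan is found mimics the repeated scans: each pass empties at most one node, which an empty-node policy could then remove.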

Evictor explained using a train scenario