Getting started
Workload Autoscaler automatically scales your workloads' resource requests up or down to ensure optimal performance and cost-effectiveness.
To start using workload optimization, you need to install the Workload Autoscaler component along with the custom resource definitions (CRDs) for the recommendation objects.
Note
Your cluster must be running in automated optimization mode, as workload optimization relies on the cluster controller to create the recommendation objects in the cluster.
Installation
You can install the Workload Autoscaler component using an install script. Obtain the script through our API, or visit the CAST AI console and navigate to the workload optimization page.
Install Workload Autoscaler component
Install script via API
Install script in the console
Install metrics-server
How Workload Autoscaler works
Metrics collection and recommendation generation
Recommendations are regenerated every 30 minutes. By default, the memory recommendation is the maximum usage over the last 5 days plus 10% overhead, and the CPU recommendation is the 80th percentile of usage over the last 5 days.
Note
All generated recommendations take the workload's current requests and limits into account.
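To make the default configuration concrete, here is a minimal sketch of the recommendation math described above. The function names, sample values, and nearest-rank percentile method are illustrative assumptions, not the actual CAST AI implementation.

```python
# Hypothetical sketch of the default recommendation math; names and the
# percentile method are assumptions, not the actual Workload Autoscaler code.
import math

def recommend_cpu(cpu_usage_samples, percentile=0.80):
    """CPU: 80th percentile of usage over the lookback window (nearest-rank)."""
    ordered = sorted(cpu_usage_samples)
    rank = max(1, math.ceil(percentile * len(ordered)))
    return ordered[rank - 1]

def recommend_memory(mem_usage_samples, overhead=0.10):
    """Memory: maximum usage over the lookback window plus 10% overhead."""
    return max(mem_usage_samples) * (1 + overhead)

# Example samples standing in for 5 days of usage (cores / MiB).
cpu = [0.2, 0.3, 0.25, 0.9, 0.4, 0.35, 0.3, 0.28, 0.31, 0.33]
mem = [512, 530, 520, 610, 540, 555, 560, 545, 550, 548]

print(recommend_cpu(cpu))              # → 0.35
print(round(recommend_memory(mem)))    # → 671
```

Note how a single usage spike (0.9 cores) does not inflate the CPU recommendation, while the memory recommendation follows the observed maximum.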
Applying recommendations automatically
Once a recommendation lands in the cluster, the Workload Autoscaler component is notified that a recommendation has been created or updated. Workload Autoscaler then proceeds in the following order:
- It acts as an admission webhook for pods: when a pod matching the recommendation target is created, it mutates the pod so that its requests/limits are set to the values defined in the recommendation.
- It finds the controller and triggers an update to re-create the pods that controller manages (for example, for a deployment object, it adds an annotation to the pod template).
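The webhook step above can be sketched as a mutation that overrides container resources with the recommended values. This is an illustrative sketch, assuming a plain-dict pod spec and a hypothetical recommendation shape; it is not the actual Workload Autoscaler code.

```python
# Illustrative sketch of the admission-webhook mutation: override a pod's
# container requests/limits with recommended values. All names are hypothetical.
import copy

def apply_recommendation(pod, recommendation):
    """Return a copy of the pod with requests/limits set per the recommendation."""
    patched = copy.deepcopy(pod)
    for container in patched["spec"]["containers"]:
        rec = recommendation.get(container["name"])
        if rec is None:
            continue  # no recommendation for this container; leave it untouched
        resources = container.setdefault("resources", {})
        resources["requests"] = dict(rec["requests"])
        if "limits" in rec:
            resources["limits"] = dict(rec["limits"])
    return patched

pod = {
    "spec": {
        "containers": [
            {"name": "app",
             "resources": {"requests": {"cpu": "500m", "memory": "256Mi"}}}
        ]
    }
}
recommendation = {"app": {"requests": {"cpu": "350m", "memory": "671Mi"}}}

patched = apply_recommendation(pod, recommendation)
print(patched["spec"]["containers"][0]["resources"]["requests"]["cpu"])  # → 350m
```

In a real mutating webhook this change would be expressed as a JSON patch in the admission response rather than by copying the object.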
Supported controllers
Workload Autoscaler currently supports deployments and rollouts.
Default behavior
By default, deployments are updated immediately, which may result in the restart of pods.
Rollouts are updated in a deferred manner: Workload Autoscaler waits for pods to restart naturally before applying new recommendations, for example during a new service release or when a pod dies because of a business or technical error.
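The default behavior can be summarized as a mode lookup per controller kind. This is a minimal sketch with hypothetical mode names ("immediate", "deferred"); these are not actual product configuration keys.

```python
# Minimal sketch of the default apply-mode selection; mode names are hypothetical.
DEFAULT_MODES = {"Deployment": "immediate", "Rollout": "deferred"}

def plan_update(controller_kind):
    """Immediate mode re-creates pods right away; deferred mode waits for a
    natural restart, at which point the webhook applies the recommended values."""
    mode = DEFAULT_MODES.get(controller_kind)
    if mode is None:
        raise ValueError(f"unsupported controller kind: {controller_kind}")
    return "restart pods now" if mode == "immediate" else "wait for natural restart"

print(plan_update("Deployment"))  # → restart pods now
print(plan_update("Rollout"))     # → wait for natural restart
```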
For more information on immediate and deferred recommendation application modes, see Immediate vs. deferred scaling.