Rightsizing recommendations and Workload Autoscaling

How long does it take to start generating enough data to create useful recommendations?

30 minutes is enough for CAST AI to start creating valuable recommendations.


Could you please clarify whether pods will still need to restart after the optimization is applied?

Resizing support is on its way.

The Kubernetes feature is available from 1.27 in alpha and 1.28 in beta. Read more: Kubernetes 1.27: In-place Resource Resize for Kubernetes Pods (alpha).
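When that Kubernetes feature is enabled (via the `InPlacePodVerticalScaling` feature gate), a container can declare how each resource reacts to a resize through `resizePolicy`. A minimal sketch with assumed names (`resize-demo`, `app`, and the image are illustrative, not CAST AI defaults):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resize-demo            # hypothetical name
spec:
  containers:
    - name: app
      image: nginx             # example image
      resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired       # CPU can be resized in place
        - resourceName: memory
          restartPolicy: RestartContainer  # memory changes restart the container
      resources:
        requests:
          cpu: "250m"
          memory: "256Mi"
```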


Does our automatic workload rightsizing take HPA into account?

No, CAST AI doesn't take HPA into account.


With Workload Autoscaler, is the castai-agent VPA still needed?

Currently, it's still needed. The agent is a special case for Workload Autoscaler because it is the source of its metrics: if the agent dies, Workload Autoscaler is effectively 'blind'. For any other workload in the cluster, Workload Autoscaler would eventually upscale it based on incoming metrics.


Is an HPA-aware Workload Autoscaler on the roadmap? Or is there a potential workaround?

Workload Autoscaler will soon implement a way to scale pods not only vertically based on resource usage but also horizontally.


Are there any plans for Workload Autoscaler to manage DaemonSets and StatefulSets?

Yes, that is part of our roadmap and one of our milestones.


If I turn on Workload Autoscaler for a deployment and set CPU overhead to 1% and memory overhead to 10%, will it automatically add more pods to the deployment?

No. Currently, Workload Autoscaler works by modifying pod requests on pod recreation (scheduling); it does not modify the Deployment's replica count. We recommend using KEDA for horizontal scaling.
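For the horizontal side, a KEDA `ScaledObject` targeting the same Deployment is one option. A minimal sketch, assuming a Deployment named `my-app` (hypothetical) and CPU-utilization-based scaling:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler    # hypothetical name
spec:
  scaleTargetRef:
    name: my-app         # the Deployment to scale horizontally
  minReplicaCount: 2
  maxReplicaCount: 10
  triggers:
    - type: cpu
      metricType: Utilization
      metadata:
        value: "80"      # target average CPU utilization (%)
```

With this in place, KEDA adjusts replica counts while Workload Autoscaler continues to right-size the requests of each pod.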


Do we have a way to support Workload Autoscaler declaratively at the workload and namespace level?

Currently, CAST AI doesn't support this.


Does Workload Autoscaler work with Rollout CRDs? Specifically, apps using Argo Rollouts?

Currently, CAST AI doesn't support this.


Is it possible to adjust limits with Workload Autoscaler?

We currently set the memory limit to 1.5 × the memory request. This gives our platform time to detect a memory increase in the workload and adjust recommendations before the pod is Out-Of-Memory killed.
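As a sketch of that arithmetic (the 1.5 multiplier is the only value stated above; the helper name is ours, not a CAST AI API):

```python
# Sketch of the stated rule: memory limit = 1.5 * memory request.
# memory_limit_mib is a hypothetical helper name for illustration only.

def memory_limit_mib(request_mib: float, multiplier: float = 1.5) -> float:
    """Return the memory limit derived from the request."""
    return request_mib * multiplier

# A pod requesting 512 MiB gets a 768 MiB limit, leaving
# 256 MiB of headroom before an OOM kill would occur.
print(memory_limit_mib(512))  # → 768.0
```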