Scaling modes

Immediate

If the autoscaler mode is set to immediate, it will check if a new recommendation meets the user-set optimization thresholds. It won't be applied if the recommendation doesn't meet these thresholds. If it does pass the threshold, Workload Autoscaler will automatically modify pod requests per the recommendation.

Moreover, Workload Autoscaler will also apply new recommendations upon natural pod restarts, such as a new service release or when a pod dies due to a business or technical error. This helps to avoid unnecessary pod restarts.

Deferred

If the scaling mode is set to deferred, Workload Autoscaler will not initiate a forced restart of the pod. Instead, it will apply the recommendation whenever external factors initiate pod restarts.