Pod startup recommendations
Workload Autoscaler rightsizes CPU requests to a steady-state optimum. For workloads that need more CPU to initialize than to run at steady state, this creates two problems:
- Slow startup — the pod cannot get the CPU it needs to start quickly, because the lowered request limits what the scheduler reserves for it.
- HPA spikes — reduced CPU requests mean higher measured utilization. If an HPA is targeting CPU utilization, the spike during startup causes it to scale out replicas until the workload stabilizes.
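The HPA effect described above follows from how utilization is computed: usage divided by the CPU request. The sketch below uses assumed numbers (240m of startup usage, an 80% target, 4 replicas) and the standard HPA scaling rule, ignoring the HPA tolerance band and assuming every pod is in its startup phase at once. It is an illustration only, not Workload Autoscaler code.

```python
import math

def cpu_utilization_percent(usage_m: int, request_m: int) -> float:
    """CPU utilization as an HPA sees it: usage relative to the request."""
    return 100.0 * usage_m / request_m

def hpa_desired_replicas(current_replicas: int, current_util: float, target_util: float) -> int:
    """Standard HPA rule: ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_util / target_util)

# Assumed values, for illustration only (millicores and percent).
startup_usage_m = 240
original_request_m = 300
optimized_request_m = 150
target_util = 80.0
replicas = 4

util_with_original = cpu_utilization_percent(startup_usage_m, original_request_m)
util_with_optimized = cpu_utilization_percent(startup_usage_m, optimized_request_m)

print(util_with_original, hpa_desired_replicas(replicas, util_with_original, target_util))
print(util_with_optimized, hpa_desired_replicas(replicas, util_with_optimized, target_util))
```

With the original request, the same startup usage sits at the target and no scaling occurs. With the halved request, measured utilization doubles and the HPA asks for twice as many replicas.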
Startup recommendations address both problems by generating a two-phase resource profile. A newly created pod receives the workload's original CPU requests during the startup period, then Workload Autoscaler transitions it to the optimized recommendation via in-place pod resize once the startup period ends.
How it works
A startup recommendation contains two phases:
- Startup phase: The pod receives the workload's original CPU requests — the values from the workload spec before optimization. This ensures the workload has sufficient CPU to initialize reliably.
- Post-startup phase: After the configured startup period elapses, Workload Autoscaler applies the optimized recommendation via in-place resize, without restarting the pod.
The startup phase lasts for the workload's configured startup period. For example, if a workload has a 5-minute startup period configured, the startup phase resources are applied for 5 minutes before the in-place resize to the optimized recommendation.
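As a rough model of the two phases, the following sketch picks the CPU request that applies to a pod based on its age. The function name and parameters are hypothetical, and treating the boundary at exactly the startup period as "elapsed" is an assumption. In a real cluster the resize happens when the controller reconciles, so the switch is not instantaneous.

```python
from datetime import timedelta

def active_cpu_request(
    pod_age: timedelta,
    startup_period: timedelta,
    original_cpu_m: int,
    optimized_cpu_m: int,
) -> int:
    """Return the CPU request (millicores) that applies at a given pod age.

    Hypothetical helper: the startup phase uses the original request,
    the post-startup phase uses the optimized recommendation.
    """
    if pod_age < startup_period:
        return original_cpu_m  # startup phase
    return optimized_cpu_m     # post-startup phase (applied via in-place resize)

startup_period = timedelta(minutes=5)
print(active_cpu_request(timedelta(minutes=2), startup_period, 300, 150))
print(active_cpu_request(timedelta(minutes=7), startup_period, 300, 150))
```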
CPU only: Startup recommendations apply to CPU only. Memory requests remain the same across both phases.
When a startup recommendation is active, the Recommendation object in the cluster reflects both phases:
apiVersion: autoscaling.cast.ai/v2
kind: Recommendation
metadata:
  name: nginx-deployment
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  ...
  # startup phase: original workload CPU requests
  recommendation:
    - containerName: nginx
      requests:
        cpu: 300m
        memory: 300Mi
  startup:
    duration: "5m"
    postStartup:
      verticalHash: abc1e2323asda
      recommendation:
        - containerName: nginx
          requests:
            cpu: 150m
            memory: 150Mi
Prerequisites
For startup recommendations to be generated:
- Workload Autoscaler: v0.63.1 or later
- Kubernetes: v1.33 or later — the cluster must support in-place pod resizing
- Startup period: The workload must have a startup period configured. Without a startup period, no startup phase is generated.
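The prerequisites above can be expressed as a simple pre-flight check. This is a minimal sketch with hypothetical function names. It assumes plain numeric version strings such as `v1.33` or `v0.63.1` (vendor suffixes like `-gke.100` are not handled), and it does not cover the feature flag, which cannot be checked locally.

```python
from typing import List, Optional, Tuple

MIN_AUTOSCALER = (0, 63, 1)  # Workload Autoscaler v0.63.1
MIN_KUBERNETES = (1, 33)     # Kubernetes v1.33 (in-place pod resize)

def parse_version(version: str) -> Tuple[int, ...]:
    """Turn 'v1.33' or '0.63.1' into a comparable tuple of integers."""
    return tuple(int(part) for part in version.lstrip("v").split("."))

def missing_prerequisites(
    autoscaler_version: str,
    kubernetes_version: str,
    startup_period_seconds: Optional[int],
) -> List[str]:
    """Return the list of unmet prerequisites (empty if all are met)."""
    missing = []
    if parse_version(autoscaler_version) < MIN_AUTOSCALER:
        missing.append("Workload Autoscaler v0.63.1 or later")
    if parse_version(kubernetes_version) < MIN_KUBERNETES:
        missing.append("Kubernetes v1.33 or later")
    if not startup_period_seconds or startup_period_seconds <= 0:
        missing.append("configured startup period")
    return missing

print(missing_prerequisites("v0.63.1", "v1.33", 300))
print(missing_prerequisites("v0.62.0", "v1.32", None))
```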
Limited Availability Feature: This feature is currently available through feature flags. Contact us to enable access for your organization.
Considerations
Enabling startup recommendations can reduce cluster bin-packing efficiency. Because newly scheduled pods temporarily hold their original (higher) CPU requests during the startup period, the cluster must reserve capacity for those values at scheduling time. After the startup period passes and the in-place resize completes, the reserved headroom is no longer needed — but it cannot be immediately reclaimed for other pods. This results in nodes that are slightly over-provisioned for the duration of each startup period.
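To get a feel for the size of this overhead, the extra CPU reserved at any moment can be estimated as the gap between the original and optimized requests, multiplied by the number of pods currently in their startup phase. The helper below is a hypothetical back-of-the-envelope estimate, not a figure reported by Workload Autoscaler.

```python
def startup_headroom_millicores(
    original_cpu_m: int,
    optimized_cpu_m: int,
    pods_in_startup: int,
) -> int:
    """Estimate extra CPU (millicores) reserved by pods in their startup phase."""
    per_pod_gap = max(original_cpu_m - optimized_cpu_m, 0)
    return per_pod_gap * pods_in_startup

# Assumed example: 10 pods rolling out, 300m original vs. 150m optimized.
print(startup_headroom_millicores(300, 150, 10))
```

In this assumed rollout, 1.5 cores are temporarily reserved beyond what the optimized recommendation would require.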
Behavior details
Startup recommendations are generated selectively. A startup phase is only added when it meaningfully differs from the post-startup recommendation:
| Scenario | Behavior |
|---|---|
| Optimized CPU is higher than the original spec | A regular single-phase recommendation is generated |
| Resources not defined in the original spec | A regular single-phase recommendation is generated |
| Startup and post-startup CPU differ by less than 100m | A regular single-phase recommendation is generated because the difference is too small to warrant two phases |
| OOMKill event | Memory is unchanged between phases, so OOMKill handling is unaffected |
| Surge detected | The startup phase is added only if the startup CPU exceeds the post-startup recommendation by at least 100m |
| Mid-rollout change from two-phase to regular | Startup-phase resources are updated to the regular recommendation values via standard reconciliation |
| Mid-rollout change from regular to two-phase | Pods still in the startup period receive startup resources; pods past the startup period receive post-startup resources |
| In-place resize not possible | Falls back to a standard single-phase recommendation. This can occur when JVM heap rightsizing is used alongside startup recommendations. |
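The selection rules in the table can be summarized as a small decision function. This is a sketch of the documented behavior under stated assumptions, not the actual implementation: the function name, parameters, and return values are hypothetical, CPU values are in millicores, and a missing original request is modeled as `None`.

```python
from typing import Optional

MIN_CPU_DIFF_M = 100  # minimum startup vs. post-startup gap, per the table

def recommendation_kind(
    original_cpu_m: Optional[int],
    optimized_cpu_m: int,
    in_place_resize_possible: bool = True,
) -> str:
    """Return 'two-phase' or 'single-phase' for a workload container."""
    if original_cpu_m is None:
        return "single-phase"  # resources not defined in the original spec
    if optimized_cpu_m > original_cpu_m:
        return "single-phase"  # optimized CPU is higher than the original
    if original_cpu_m - optimized_cpu_m < MIN_CPU_DIFF_M:
        return "single-phase"  # difference too small to warrant two phases
    if not in_place_resize_possible:
        return "single-phase"  # fallback when the pod cannot be resized in place
    return "two-phase"

print(recommendation_kind(300, 150))         # gap of 150m
print(recommendation_kind(300, 250))         # gap of 50m
print(recommendation_kind(None, 150))        # no original request
print(recommendation_kind(300, 150, False))  # resize not possible
```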
