Node-aware DaemonSet sizing
Early Access Feature: This feature is in early access. It may undergo changes based on user feedback and continued development. We recommend testing in non-production environments first and welcome your feedback to help us improve.
Some workloads — most notably DaemonSets, but also any pod whose footprint is naturally proportional to the node it runs on (log shippers, metric agents, CSI drivers, service meshes) — are hard to size with a single static request. A request that fits a small node wastes capacity on a large one, and a request that fits a large node gets evicted on a small one.
The node allocatable percentage resource strategy lets you express requests as a percentage of the node's allocatable CPU and memory, resolved at pod admission time. Workload Autoscaler reads the percentage from the workload's resource recommendation, looks up the node the pod is bound to, and rewrites the pod's requests and limits to a concrete value before the pod is admitted.
The strategy is currently only configurable via workload annotations. Scaling policy and UI support will follow.
How it works
When you enable the strategy on a workload, Workload Autoscaler:
- Picks up the configuration on the next reconciliation and stores the percentages alongside the workload's existing recommendation, together with a static fallback request and limit derived from observed usage.
- As each pod is admitted, reads the node assigned to the pod and computes the request as a percentage of that node's allocatable CPU and memory:
  - cpu request = node allocatable CPU × cpuPercent / 100
  - memory request = node allocatable memory × memoryPercent / 100
- Clamps the resolved request to the workload's min and max constraints (if set), then derives the limit from the clamped request using the workload's existing limit strategy (multiplier, noLimit, or maintainRatio in annotations).
- Stamps the resolved request and limit into the pod spec.
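The admission-time steps above can be sketched in a few lines of Python. This is an illustration only, not Workload Autoscaler's implementation: the function name `resolve_resource`, its parameters, the integer units (millicores or bytes), and the rounding behavior are all assumptions made for the example, and only the multiplier limit strategy is modeled.

```python
from typing import Optional, Tuple


def resolve_resource(
    node_allocatable: int,
    percent: float,
    min_request: Optional[int] = None,
    max_request: Optional[int] = None,
    limit_multiplier: Optional[float] = None,
) -> Tuple[int, Optional[int]]:
    """Return (request, limit) in the same unit as node_allocatable.

    Hypothetical helper: use millicores for CPU or bytes for memory.
    """
    # Step 1: take the configured percentage of the node's allocatable capacity.
    # Rounding to the nearest integer is an assumption of this sketch.
    request = round(node_allocatable * percent / 100)

    # Step 2: clamp to the workload's min/max constraints, if set.
    if min_request is not None:
        request = max(request, min_request)
    if max_request is not None:
        request = min(request, max_request)

    # Step 3: derive the limit from the clamped request.
    # limit_multiplier=None stands in for a "no limit" strategy here.
    limit = None if limit_multiplier is None else round(request * limit_multiplier)
    return request, limit


# 5% of 16000m allocatable CPU with a 1.5x limit multiplier.
request, limit = resolve_resource(16_000, 5, limit_multiplier=1.5)
```

The order matters in this sketch: the limit is computed from the clamped request, so a min or max constraint changes the limit as well as the request.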
If the target node cannot be determined at admission — for example, when an unbound pod that doesn't pin a node via nodeAffinity is admitted before scheduling — Workload Autoscaler applies the static fallback request and limit instead. Pods are never rejected.
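A minimal sketch of that fallback decision, assuming CPU values in millicores and a multiplier limit strategy. The function `resources_at_admission` and its parameters are hypothetical names chosen for the example, not part of any Workload Autoscaler API.

```python
from typing import Optional, Tuple

Resources = Tuple[int, int]  # (request, limit) in millicores


def resources_at_admission(
    node_allocatable_cpu: Optional[int],
    cpu_percent: float,
    limit_multiplier: float,
    fallback: Resources,
) -> Resources:
    """Use the percentage when the node is known, else the static fallback."""
    if node_allocatable_cpu is None:
        # Target node unknown at admission: apply the fallback, never reject.
        return fallback
    request = round(node_allocatable_cpu * cpu_percent / 100)
    return request, round(request * limit_multiplier)


# Unbound pod: the static fallback (illustrative values) is used unchanged.
unbound = resources_at_admission(None, 5, 1.5, fallback=(250, 375))
# Pod bound to a node with 8000m allocatable CPU: the percentage is resolved.
bound = resources_at_admission(8_000, 5, 1.5, fallback=(250, 375))
```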
The strategy fully overrides the requests and limits that Workload Autoscaler would otherwise apply for the configured resource.
Not compatible with keepLimits: If a workload's policy keeps existing limits (keepLimits in annotations) for CPU or memory, the percentage strategy is not applied to that resource. The workload falls back to the recommender's request and the manifest's original limit. Use multiplier, noLimit, or maintainRatio instead.
Compatibility
Minimum component versions required for node-aware DaemonSet sizing:
| Component | Minimum version |
|---|---|
| castai-workload-autoscaler | v0.105.0 |
| castai-agent | v0.123.1 |
Note: The percentage is resolved at pod admission. Existing pods keep whatever requests and limits they were admitted with and pick up new values the next time they're recreated.
Configuration
Configure the strategy via the workloads.cast.ai/configuration annotation on the workload (Deployment, StatefulSet, DaemonSet, etc.):

    metadata:
      annotations:
        workloads.cast.ai/configuration: |
          vertical:
            resourceStrategy:
              type: nodeAllocatablePercentage
              nodeAllocatablePercentage:
                cpuPercent: 5
                memoryPercent: 2

You can set cpuPercent only, memoryPercent only, or both. The resource that is not configured falls back to whatever the rest of the Workload Autoscaler policy dictates (target utilization, recommender output, and so on).
Limits are derived from the resolved request using the workload's existing CPU and memory limit strategy. See Annotations reference for the available limit options.
Settings reference
| Field | Type | Required | Range | Description |
|---|---|---|---|---|
| resourceStrategy.type | string | Yes | nodeAllocatablePercentage | Selects the strategy. |
| resourceStrategy.nodeAllocatablePercentage.cpuPercent | float | One of cpuPercent/memoryPercent is required | (0, 100] | Percentage of node-allocatable CPU to request. |
| resourceStrategy.nodeAllocatablePercentage.memoryPercent | float | One of cpuPercent/memoryPercent is required | (0, 100] | Percentage of node-allocatable memory to request. |
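The constraints in this table can be checked before an annotation is applied. The sketch below is a hypothetical pre-flight check written for illustration; `validate_percentages` is not a Workload Autoscaler function, and the product's own validation may differ in its details.

```python
from typing import Optional


def validate_percentages(
    cpu_percent: Optional[float] = None,
    memory_percent: Optional[float] = None,
) -> None:
    """Raise ValueError if the settings break the documented constraints."""
    # At least one of the two percentages must be configured.
    if cpu_percent is None and memory_percent is None:
        raise ValueError("set cpuPercent, memoryPercent, or both")
    for name, value in (("cpuPercent", cpu_percent), ("memoryPercent", memory_percent)):
        # Range is (0, 100]: zero is excluded, 100 is allowed.
        if value is not None and not (0 < value <= 100):
            raise ValueError(f"{name} must be in (0, 100], got {value}")


# Matches the configuration shown earlier: 5% CPU, 2% memory.
validate_percentages(cpu_percent=5, memory_percent=2)
```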
Example
Consider a DaemonSet annotated to request 5% of node-allocatable CPU and 2% of node-allocatable memory, with limits set to 1.5× the resolved request:

    metadata:
      annotations:
        workloads.cast.ai/configuration: |
          vertical:
            resourceStrategy:
              type: nodeAllocatablePercentage
              nodeAllocatablePercentage:
                cpuPercent: 5
                memoryPercent: 2
            cpu:
              limit:
                type: multiplier
                multiplier: 1.5
            memory:
              limit:
                type: multiplier
                multiplier: 1.5

On a 16 vCPU / 64 GiB node, each pod is sized to:
| Resource | Request | Limit |
|---|---|---|
| CPU | 800m | 1200m |
| Memory | ~1.28 GiB | ~1.92 GiB |
On a 4 vCPU / 16 GiB node, each pod is sized to:
| Resource | Request | Limit |
|---|---|---|
| CPU | 200m | 300m |
| Memory | ~327 MiB | ~491 MiB |
The same workload definition produces different requests on different nodes — no extra configuration, no per-node policies.
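The figures in both tables can be reproduced with a short calculation. The sketch below is illustrative and treats each node's nominal capacity as its allocatable capacity, as the example does; on a real node, allocatable is somewhat lower than capacity because of system and kubelet reservations, so actual values come out slightly smaller. The helper name `size_pod` is hypothetical.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Values taken from the example annotation.
CPU_PERCENT, MEMORY_PERCENT, LIMIT_MULTIPLIER = 5, 2, 1.5


def size_pod(allocatable_millicores: int, allocatable_bytes: int) -> dict:
    """Compute requests and limits for one node (hypothetical helper)."""
    cpu_request = allocatable_millicores * CPU_PERCENT / 100
    memory_request = allocatable_bytes * MEMORY_PERCENT / 100
    return {
        "cpu_request_m": cpu_request,
        "cpu_limit_m": cpu_request * LIMIT_MULTIPLIER,
        "memory_request_bytes": memory_request,
        "memory_limit_bytes": memory_request * LIMIT_MULTIPLIER,
    }


large = size_pod(16_000, 64 * GIB)  # 16 vCPU / 64 GiB node
small = size_pod(4_000, 16 * GIB)   # 4 vCPU / 16 GiB node

for name, sized in (("large", large), ("small", small)):
    print(
        name,
        f"cpu={sized['cpu_request_m']:.0f}m/{sized['cpu_limit_m']:.0f}m",
        f"memory={sized['memory_request_bytes'] / MIB:.2f}MiB"
        f"/{sized['memory_limit_bytes'] / MIB:.2f}MiB",
    )
```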
Limitations
- CPU and memory only. Ephemeral storage is not supported.
- Not compatible with keepLimits for the same resource. When a configuration combines the percentage strategy with keepLimits on the same resource, Workload Autoscaler does not apply the percentage to that resource. Use multiplier, noLimit, or maintainRatio instead.
- Target node must be known at admission. Workload Autoscaler only resolves the percentage when the pod has spec.nodeName set, or when a nodeAffinity rule pins it to a single node by metadata.name. DaemonSet pods always meet this requirement, because the DaemonSet controller pins each pod to its target node when it creates the pod. For Deployments, StatefulSets, and other workloads where the pod is admitted before scheduling, the pod gets the static fallback request and limit unless it pins a node via nodeAffinity.
- Resolved per pod, not retroactively. Resizing a node in place does not resize the pods already running on it; they pick up new values the next time they're recreated.
- Annotations only. There is no UI or scaling policy support yet; configure the strategy on each workload directly.
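To check whether a given pod meets the "target node must be known" requirement, you can inspect its spec. The sketch below works on a pod spec loaded as a plain dictionary (for example, from `kubectl get pod -o json`) and uses standard Kubernetes Pod API field names. The function `pinned_node_name` is hypothetical, and the matching rules Workload Autoscaler actually applies may be stricter or looser than this conservative reading.

```python
from typing import Optional


def pinned_node_name(pod_spec: dict) -> Optional[str]:
    """Return the single node a pod is tied to, or None if it isn't pinned."""
    # Case 1: the pod is already bound to a node.
    if pod_spec.get("nodeName"):
        return pod_spec["nodeName"]

    # Case 2: required node affinity selects exactly one node by metadata.name.
    affinity = pod_spec.get("affinity") or {}
    node_affinity = affinity.get("nodeAffinity") or {}
    required = node_affinity.get("requiredDuringSchedulingIgnoredDuringExecution") or {}
    terms = required.get("nodeSelectorTerms") or []

    # Multiple terms are ORed by Kubernetes, so this sketch only accepts
    # a single term (an assumption, chosen to stay conservative).
    if len(terms) != 1:
        return None
    for field in terms[0].get("matchFields") or []:
        values = field.get("values") or []
        if (
            field.get("key") == "metadata.name"
            and field.get("operator") == "In"
            and len(values) == 1
        ):
            return values[0]
    return None


# A pod pinned through node affinity rather than spec.nodeName.
affinity_pinned = {
    "affinity": {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                    {
                        "matchFields": [
                            {"key": "metadata.name", "operator": "In", "values": ["node-b"]}
                        ]
                    }
                ]
            }
        }
    }
}
node = pinned_node_name(affinity_pinned)
```

A `None` result corresponds to the case where the pod would receive the static fallback request and limit.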
Troubleshooting
- Pod has the static fallback request, not the percentage-derived one. Workload Autoscaler couldn't determine the target node at admission. Confirm the pod has spec.nodeName set or a nodeAffinity rule that pins it to a single node by metadata.name. Check the Workload Autoscaler logs for fallback messages.
- The percentage strategy isn't being applied at all. Check that the workload doesn't combine the strategy with keepLimits on the same resource. If it does, Workload Autoscaler falls back to the standard recommendation for that resource. Either remove keepLimits or remove the percentage for that resource.
- Limits look higher than expected. Limits are derived from the resolved request, that is, after min/max constraints are applied. If you set a min of 500m and a limit multiplier of 1.5, a node where 5% would be 200m still produces a request of 500m and a limit of 750m.