Legacy Annotations Reference (Deprecated)

🚧

Important: Annotation Deprecation Notice

The v1 annotation format is deprecated but still supported for backward compatibility. We strongly recommend migrating to the new unified configuration format (v2) for future compatibility and access to the latest features. See Workload Autoscaler Configuration.

Configuration via Annotations v1

All settings are also available as annotations on the workload controller. When any workloads.cast.ai annotation is detected on a workload, that workload is considered managed by annotations. This allows for flexible configuration, combining annotations and scaling policies.
Changes to the settings via the API/UI are not permitted for workloads with annotations. When a workload does not have an annotation for a specific setting, the default or scaling policy value is used.
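To illustrate the fallback behavior, the truncated manifest below (the workload name is illustrative) pins only the CPU target via annotation; every other setting falls back to the default or the applicable scaling policy value:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: partially-annotated-app   # illustrative name
  annotations:
    # Only this setting is pinned via annotation. Because a
    # workloads.cast.ai annotation is present, the workload is
    # considered annotation-managed and API/UI edits are rejected.
    workloads.cast.ai/cpu-target: "p90"
    # No memory-target annotation is set, so the default (max)
    # or the scaling policy value applies instead.
```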

📘

Note

Workloads can be managed through a combination of annotations and scaling policies. For example, you can set the workloads.cast.ai/scaling-policy annotation on a workload and toggle vertical autoscaling on/off in the scaling policy itself. This provides more flexibility in managing workload configurations.
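As a sketch of that hybrid setup (the workload and policy names below are made up), a workload can bind itself to a named policy by annotation while the policy itself remains the place where vertical autoscaling is toggled:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: billing-api   # illustrative workload name
  annotations:
    # Bind this workload to a named scaling policy; the policy
    # (not an annotation) then decides whether vertical
    # autoscaling is on or off for this workload.
    workloads.cast.ai/scaling-policy: "my-batch-policy"   # hypothetical policy name
```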

The annotations generally follow a pattern of workloads.cast.ai/{resource}-{setting}. Currently, the available resources are cpu and memory. Available settings:

| Annotation | Possible Values | Default | Info | Required |
| --- | --- | --- | --- | --- |
| `workloads.cast.ai/vertical-autoscaling` | `on`, `off` | - | Automated vertical scaling. | Optional |
| `workloads.cast.ai/scaling-policy` | any valid k8s annotation value | `default` | Specifies the scaling policy name to use. When set, this annotation allows the workload to be managed by both annotations and the specified scaling policy. The scaling policy can control global settings such as enabling/disabling vertical autoscaling. | Optional |
| `workloads.cast.ai/apply-type` | `immediate`, `deferred` | `immediate` | Configures the autoscaler operating mode used to apply recommendations. Use `immediate` to apply recommendations as soon as the thresholds are passed. Note: `immediate` mode can cause pod restarts. Use `deferred` to apply recommendations only on natural pod restarts. | Optional |
| `workloads.cast.ai/vertical-downscale-apply-type` | `immediate`, `deferred` | - | Configures the autoscaler operating mode specifically for downscaling operations, allowing different behavior for upscaling and downscaling. When used in combination with `workloads.cast.ai/apply-type`, it provides fine-grained control over scaling operations. | Optional |
| `workloads.cast.ai/memory-event-apply-type` | `immediate`, `deferred` | - | Configures the autoscaler operating mode specifically for memory-related events, such as OOMKill or node memory-pressure eviction. | Optional |
| `workloads.cast.ai/{resource}-overhead` | float >= 0 | cpu: `0`, memory: `0.1` | Overhead expressed as a fraction, e.g., 10% is expressed as `0.1`. | Optional |
| `workloads.cast.ai/{resource}-target` | `max`, `p{x}` | cpu: `p80`, memory: `max` | The `x` in `p{x}` is the target percentile, an integer between 0 and 99. | Optional |
| `workloads.cast.ai/{resource}-apply-threshold` | float >= 0 | cpu: `0.1`, memory: `0.1` | The amount by which a recommendation must differ from the current requests before it is applied. For example, a 10% difference is expressed as `0.1`. | Optional |
| `workloads.cast.ai/{resource}-max` | `4Gi`, `60m`, etc. | - | The upper limit for the recommendation. Recommendations won't exceed this value. | Optional |
| `workloads.cast.ai/{resource}-min` | `4Gi`, `60m`, etc. | - | The lower limit for the recommendation. Min cannot be greater than max. | Optional |
| `workloads.cast.ai/{resource}-look-back-period-seconds` | 86400 <= int <= 604800 | `86400` (24h) | The duration of the look-back period applied to the metric query when generating a recommendation. | Optional |
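The memory-event mode from the table can be combined with the generic apply mode. The fragment below is an illustrative sketch (the workload name is made up) that defers routine recommendations but reacts immediately to memory events:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cache-worker   # illustrative name
  annotations:
    # Routine recommendations wait for natural pod restarts...
    workloads.cast.ai/apply-type: "deferred"
    # ...but OOMKill / memory-pressure events are acted on at once.
    workloads.cast.ai/memory-event-apply-type: "immediate"
```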

workloads.cast.ai/vertical-downscale-apply-type

The workloads.cast.ai/vertical-downscale-apply-type annotation is fully compatible with the workloads.cast.ai/apply-type annotation and is meant to be used in combination with it. This allows for fine-grained control over both upscaling and downscaling. Here's how they interact:

  1. If both annotations are set to the same value (both immediate or both deferred), the behavior is the same as setting apply-type alone.
  2. If apply-type is set to immediate and vertical-downscale-apply-type is set to deferred:
    • Upscaling operations will be applied immediately.
    • Downscaling operations will be deferred to natural pod restarts.
  3. If apply-type is set to deferred and vertical-downscale-apply-type is set to immediate:
    • Upscaling operations will be deferred to natural pod restarts.
    • Downscaling operations will be applied immediately.

Example config:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  labels:
    app: my-app
  annotations:
    workloads.cast.ai/vertical-autoscaling: "on" # enable vertical automatic scaling
    workloads.cast.ai/scaling-policy: "my-custom" # select my-custom scaling policy
    workloads.cast.ai/apply-type: "immediate" # apply recommendations immediately for upscaling
    workloads.cast.ai/vertical-downscale-apply-type: "deferred" # defer downscaling to natural pod restarts

    workloads.cast.ai/cpu-overhead:                 "0"      # 0%
    workloads.cast.ai/cpu-apply-threshold:          "0.05"   # 5%
    workloads.cast.ai/cpu-target:                   "p80"    # 80th percentile
    workloads.cast.ai/cpu-max:                      "400m"   # max 0.4 cpu
    workloads.cast.ai/cpu-min:                      "120m"   # min 0.12 cpu
    workloads.cast.ai/cpu-look-back-period-seconds: "259200" # 3 days

    workloads.cast.ai/memory-overhead:                 "0.1"    # 10%
    workloads.cast.ai/memory-apply-threshold:          "0.05"   # 5%
    workloads.cast.ai/memory-target:                   "max"    # max usage
    workloads.cast.ai/memory-max:                      "2Gi"    # max 2Gi
    workloads.cast.ai/memory-min:                      "1Gi"    # min 1Gi
    workloads.cast.ai/memory-look-back-period-seconds: "172800" # 2 days
```

Configuration Errors

If the workload manifest contains an invalid configuration, for example workloads.cast.ai/vertical-autoscaling: "unknown-value", the configuration will not be updated: the old configuration values remain in effect until the erroneous configuration is fixed, and the error is shown in the workload details in the CAST AI Console. Scaling policy names are not restricted character-wise, so any value can be set, but a non-existent policy is treated as an invalid configuration.