Node Templates

What is it?

Node templates are a key part of the Autoscaler component. They allow users to define virtual buckets of constraints and properties for nodes to be added to the cluster during upscaling. Users can:

  • Specify various constraints to limit the inventory of instance types to be used;
  • Select a preferred instance resource offering (e.g., build a spot-only cluster without the need for specific tolerations or selectors);
  • Define how CAST AI should manage spot instance interruptions;
  • Specify labels and taints to be applied on the nodes;
  • Link a Node configuration to be applied when creating new nodes.

When a user enables CAST AI, the default Node template is created in the cluster. This template is essential and serves as the default Autoscaler behavior. In addition to the default template, users can set up multiple templates to suit their specific use case.

Default Node template
name: default-by-castai
Properties:
- Mandatory - created by CAST AI during cluster onboarding and can't be removed
- Can be switched off
- Used in autoscaling when workloads are not explicitly designated to trigger cluster upscaling through other Node templates
- Can't be renamed

Custom Node template
name: user-defined
Properties:
- Optional - created by the user, can be deleted
- Can be switched off
- Activated by workloads using autoscaling labels that are also defined in the Node template
- Can't be renamed after creation

πŸ“˜

For the CAST AI autoscaler to function, the Unscheduled Pods policy and at least one Node template (default or not) must be enabled.

Main attributes

The list below outlines the Node template attributes and their availability across cloud providers.

Custom labels and taints (all cloud providers)
Choose this if custom labels or taints should be applied to the nodes created by CAST AI. Based on the selected properties, a matching nodeSelector and toleration must be applied to the workload to trigger autoscaling.

Node configuration link (all cloud providers)
By default, Node templates mostly focus on the type of resources that need to be scheduled. To learn how to configure the scheduling of your resources and how to use other kinds of attributes on provisioned nodes, see Node configuration.

Resource offering (all cloud providers; the interruption prediction model is AWS only)
This attribute tells the Autoscaler which node resource offering (spot or on-demand) it should create. There is an option to configure both offerings, in which case workloads targeted to run on spot nodes must be marked accordingly. Check this guide.

Additional configuration settings are available when you want the Autoscaler to use spot nodes:

- Spot fallback – when spot capacity is unavailable in the cloud, CAST AI will create temporary on-demand fallback nodes.
- Interruption prediction model – AWS only: CAST AI can react to AWS rebalancing notifications or use its own ML model to predict spot interruptions and proactively rebalance affected nodes. See this guide for more details.
- Diversified spot instances – by default, CAST AI seeks the most cost-effective instances without assessing your cluster's current composition. To limit the impact of a potential mass spot reclaim, you can instruct the Autoscaler to evaluate and enhance the diversity of spot nodes in the cluster, but this may increase your costs. Read more.

Processor architecture (all cloud providers)
You can select x86_64, ARM64, or both architectures for the nodes created by CAST AI. When using a multi-architecture Node template, also use the nodeSelector kubernetes.io/arch: "arm64" to ensure that the Pod lands on an ARM node (see the example after this list).

Azure only: to provision ARM nodes, ensure that they are supported in the region and that quota for Standard_DS2_v2 VMs is available. Contact the support team, as this feature has to be enabled for the organization.

GPU-enabled instances (AWS, GCP)
Choose this attribute to run workloads on GPU-enabled instances only. Once you select it, Instance constraints are extended with GPU-related properties.

Apply Instance constraints (all cloud providers)
Apply additional constraints on the instances to be selected, such as:

- Instance family;
- Min/Max CPU;
- Min/Max Memory;
- Compute-optimized;
- Storage-optimized;
- GPU manufacturer, name, count;
- Bare metal (AWS only);
- OS selection (Windows 2019 or Linux) – Azure only; ensure that quota for Standard_DS2_v2 VMs is available. Contact the support team, as this feature has to be enabled for the organization.

Custom instances (GCP)
The Autoscaler will consider GCP custom VM types in addition to predefined machines. The extended memory setting allows the Autoscaler to determine whether a custom VM with an increased amount of memory per vCPU would be the most optimal choice.
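
As an illustration of the multi-architecture note above, here is a sketch of a pod that must land on an ARM node. It assumes a multi-architecture template named spark-jobs (the name used in later examples) that taints its nodes; adjust the selector and toleration to your own template.

apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
spec:
  nodeSelector:
    # Target nodes created from the assumed template named spark-jobs
    scheduling.cast.ai/node-template: spark-jobs
    # Ensure the Pod lands on an ARM64 node of the multi-architecture template
    kubernetes.io/arch: "arm64"
  tolerations:
    # Needed only when the template taints its nodes (shouldTaint: true)
    - key: scheduling.cast.ai/node-template
      value: spark-jobs
      operator: Equal
      effect: NoSchedule
  containers:
    - name: busybox
      image: busybox:1.28
      args:
        - sleep
        - "1200"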

Create a Node template

You can create Node templates in several ways, for example through the CAST AI console or the API.

Sometimes, when you create a Node template, you may want to associate it with a custom Node configuration to be used when provisioning nodes. You can achieve this by linking the template with a custom Node configuration.

Using the shouldTaint flag

While creating a Node template, you can choose whether the nodes created by the CAST AI Autoscaler should be tainted. This is controlled through the shouldTaint property in the API payload.
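
As a minimal, hedged sketch of where this flag lives (surrounding fields are omitted and the template name is illustrative), shouldTaint is a boolean in the Node template definition sent to the API:

{
  "name": "spark-jobs",
  "shouldTaint": true
}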

🚧

When shouldTaint is set to false

Since no taints will be applied on the nodes created by CAST AI, any pods deployed to the cluster, even those without a nodeSelector, might get scheduled on these nodes. This effect might not always be desirable.

When shouldTaint is set to true

The created nodes carry the scheduling.cast.ai/node-template taint, so workloads must both select the template and tolerate the taint:

apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
spec:
  nodeSelector:
    scheduling.cast.ai/node-template: spark-jobs
  tolerations:
    - key: scheduling.cast.ai/node-template
      value: spark-jobs
      operator: Equal
      effect: NoSchedule
  containers:
    - name: busybox
      image: busybox:1.28
      args:
        - sleep
        - "1200"

When shouldTaint is set to false

No taint is applied to the nodes, so a matching nodeSelector alone is enough:

apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
spec:
  nodeSelector:
    scheduling.cast.ai/node-template: spark-jobs
  containers:
    - name: busybox
      image: busybox:1.28
      args:
        - sleep
        - "1200"

Using nodeSelector

You can use nodeSelector to schedule pods on the nodes created using the template. By default, you construct nodeSelector using the template name. However, you can opt for a custom label if it better suits your use case.

Using a Node template name in nodeSelector

apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
spec:
  nodeSelector:
    scheduling.cast.ai/node-template: spark-jobs
  containers:
    - name: busybox
      image: busybox:1.28
      args:
        - sleep
        - "1200"

Using multiple custom labels in nodeSelector

If you have a Node template with multiple custom labels, such as custom-label-key-1=custom-label-value-1 and custom-label-key-2=custom-label-value-2, you can schedule your pods on a node created from that Node template by providing a nodeSelector with all the custom labels, as shown below:

apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
spec:
  nodeSelector:
    custom-label-key-1: custom-label-value-1
    custom-label-key-2: custom-label-value-2
  containers:
    - name: busybox
      image: busybox:1.28
      args:
        - sleep
        - "1200"

Using nodeAffinity

You can use nodeAffinity to schedule pods on the nodes created using the template. By default, you construct nodeAffinity using the template name. However, you may choose to use a custom label to fit your use case better. The only supported nodeAffinity operator is In.

Using the Node template name in nodeAffinity

apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
            - key: scheduling.cast.ai/node-template
              operator: In
              values:
                - "spark-jobs"
  containers:
    - name: busybox
      image: busybox:1.28
      args:
        - sleep
        - "1200"

Using a Node template's custom labels in nodeAffinity

If you have a Node template with multiple custom labels, such as custom-label-key-1=custom-label-value-1 and custom-label-key-2=custom-label-value-2, you can schedule your pods on a node created from that Node template by providing nodeAffinity with all the custom labels, as shown below:

apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
            - key: custom-label-key-1
              operator: In
              values:
                - "custom-label-value-1"
            - key: custom-label-key-2
              operator: In
              values:
                - "custom-label-value-2"
  containers:
    - name: busybox
      image: busybox:1.28
      args:
        - sleep
        - "1200"

Using a mix of nodeAffinity and nodeSelector

If you have a Node template with multiple custom labels, such as custom-label-key-1=custom-label-value-1 and custom-label-key-2=custom-label-value-2, you can schedule your pods on a node created from that Node template by providing nodeAffinity and nodeSelector with all the custom labels, as shown below:

apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
            - key: custom-label-key-1
              operator: In
              values:
                - "custom-label-value-1"
  nodeSelector:
    custom-label-key-2: custom-label-value-2
  containers:
    - name: busybox
      image: busybox:1.28
      args:
        - sleep
        - "1200"

More Specific Requirements/Constraints

Templates also support further refinement of instance type selection via additional nodeAffinity/nodeSelector values. This works as long as the additional constraints don't conflict with the template constraints.

Here's an example with some assumptions:

  • The template has no constraints;
  • The template is named my-template;
  • The template has custom labels enabled and they are as follows:
product: "my-product"
team: "my-team"

Here's an example deployment that would further specify what a pod supports/needs:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-deployment
  labels:
    app: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: "team"
                operator: In
                values: ["my-team"]
                # Pick only nodes that are compute-optimized
              - key: "scheduling.cast.ai/compute-optimized"
                operator: In
                values: ["true"]
                # Pick only nodes that are storage-optimized
              - key: "scheduling.cast.ai/storage-optimized"
                operator: In
                values: ["true"]
      nodeSelector:
        # template selector (can also be in affinities)
        product: "my-product"
        team: "my-team"
      tolerations:
        # Storage-optimized nodes will also have a taint, so we need to tolerate it.
        - key: "scheduling.cast.ai/storage-optimized"
          operator: Exists
        # Toleration for the template taint
        - key: "scheduling.cast.ai/node-template"
          value: "my-template"
          operator: "Equal"
          effect: "NoSchedule"
      containers:
      - name: nginx
        image: k8s.gcr.io/nginx-slim:0.8
        ports:
        - containerPort: 80
        resources:
          requests:
            cpu: 700m
            memory: 700Mi
          limits:
            memory: 700Mi

Using this example, you'd get a storage-optimized, compute-optimized node on a template that doesn't have those requirements for the whole pool.

Note: If the template itself has one of those constraints, there is currently no way to loosen the requirements for individual pods based on their nodeAffinity/nodeSelector values.

Node template matching

By default, workloads must specify node selectors or affinity that match all terms defined in the Node template's Custom labels. If at least one term is not matched, then the Node template is not considered.

Multiple Node templates can be matched by a workload. In such cases, the templates are scored primarily by the number of terms that the Node template contains (there are additional scoring criteria). The greater the number of terms, the higher the score.

Partial Node template matching functionality

Partial matching means a Node template is considered for a workload as long as the workload specifies at least one of the template's terms, rather than all of them. To enable partial Node template matching, go to Autoscaler settings, navigate to the Unscheduled pods policy, and select Partial node template matching.

When partial matching is enabled, the Unscheduled Pods Policy Applied event will include information on how the Node template was selected. This can be used to troubleshoot autoscaler decisions.
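
For illustration, assume a Node template with the two custom labels used in the earlier examples (custom-label-key-1 and custom-label-key-2) and partial matching enabled. The following sketch shows a pod that specifies only one of those labels yet can still be matched to the template; with partial matching disabled, both labels would be required:

apiVersion: v1
kind: Pod
metadata:
  name: busybox-sleep
spec:
  nodeSelector:
    # Only one of the template's two custom labels is specified;
    # with partial matching enabled, the template is still considered.
    custom-label-key-1: custom-label-value-1
  containers:
    - name: busybox
      image: busybox:1.28
      args:
        - sleep
        - "1200"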

Node template selection

When multiple Node templates are matched, they are sorted, and the first one is selected to create a node for the workload. The sorting criteria, in order of their priority, are:

  • Node templates where all terms were matched are sorted higher.
  • The count of matched terms, in descending order.
  • The average price per CPU of instance types within the Node template, in descending order.
  • The name of the template, in alphabetical order.

Support for Dedicated (Sole tenant) nodes

πŸ“˜

This is a new feature currently supported only on GKE clusters

Please note that the sole tenancy node group must be shared with the GKE cluster's project or organization.

Dedicated nodes (also known as Sole tenant nodes) are dedicated hosts in the customer's cloud account that can host VMs that are used in the Kubernetes clusters. The setup for GKE clusters can be found in the Google Cloud documentation.

The CAST AI Autoscaler can be instructed to prefer dedicated hosts for as long as their capacity is available by setting the following parameters in the Node template:

Affinity
The affinity rules required for choosing the dedicated node group. Each rule is constructed using:
- Key
- Values
- Operator:
  - IN: the key's value is in the listed values
  - NotIn: the key's value is not in the listed values
  - Exists: the key exists
  - DoesNotExist: the key does not exist
  - Gt: greater than
  - Lt: less than

azName
The availability zone of the dedicated node group.

InstanceTypes
Instance types of the dedicated node group.

Example of configuration setup in the API response:

    "dedicatedNodeAffinity": [
      {
        "instanceTypes": [
          "c2-node-60-240"
        ],
        "name": "test-nodegroup",
        "affinity": [
          {
            "operator": "IN",
            "key": "stnodegroup",
            "values": [
              "test-group"
            ]
          },
          {
            "operator": "IN",
            "key": "compute.googleapis.com/project",
            "values": [
              "test-123"
            ]
          }
        ],
        "azName": "europe-west1-b"

Please note that all constraints configured in the node template will be respected (e.g. min CPU set to 16).

Once the dedicated host capacity is saturated, CAST AI will not add additional dedicated hosts to the node group. Instead, it will fall back to multi-tenant instances configured in the Node template. During the fallback to multi-tenant instances, the same availability zone will be selected as configured in the dedicatedNodeAffinity.

🚧

Currently, the Rebalancer does not simulate the deletion of instances running on dedicated nodes; therefore, rebalancing the whole cluster might produce some suboptimal results.