Node configuration

What is Node configuration?

The CAST AI provisioner allows you to set node configuration parameters that the platform will apply to provisioned nodes. Node configuration on its own does not influence workload placement. Its sole purpose is to apply user-provided configuration settings on the node during the provisioning process.

A cluster can have multiple node configurations linked to various node templates. However, only one node configuration can be selected as the default, which the CAST AI Autoscaler will use when no other configuration applies.

📘

Note

You can link node configuration to multiple node templates, but one node template can have just a single node configuration link.

You can manage node configurations in the UI (Autoscaler → Node configuration), via the API, or with Terraform.

Shared configuration options

The following table provides a list of supported cloud-agnostic configuration parameters:

| Configuration | Description | Default value |
| --- | --- | --- |
| Root volume ratio | CPU to storage (GiB) ratio | 1 CPU: 0 GiB |
| Initial disk size | The base size of the disk attached to the node | 100 GiB |
| Image | Image to be used when building a CAST AI provisioned node. See virtual machine image choice below for cloud-specific behaviors. | The latest available for the Kubernetes release, based on an OS chosen by CAST AI |
| SSH key | Base64-encoded public key or AWS key ID | "" |
| Subnets | Subnet IDs for CAST AI provisioned nodes | All subnets pointing to NAT/Internet Gateways inside the cluster's VPC |
| Instance tags | Tags/VM labels to be applied to CAST AI provisioned nodes | "" |
| Kubelet configuration | A set of values that will be added or overwritten in the Kubelet configuration (JSON) | {} |
| Init script | A bash script to be run when building the node | "" |
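The init script is an ordinary shell script executed while the node is being provisioned. As an illustration only (the sysctl value and marker file below are hypothetical examples, not CAST AI defaults), such a script might look like:

```shell
#!/bin/bash
# Illustrative init script; runs once during node provisioning.
set -euo pipefail

# Example: apply a custom kernel setting
echo "vm.max_map_count=262144" > /etc/sysctl.d/99-custom.conf
sysctl --system

# Example: leave a marker recording when the node was provisioned
date --utc > /var/tmp/provisioned-at
```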

EKS-specific subnet rules

📘

Note

In EKS, only subnets that match one of the rules below are allowed to be added to a node configuration:

  • Associated with a route table that has a 0.0.0.0/0 route to an Internet Gateway (a public subnet). The subnet must also have "MapPublicIpOnLaunch: true" set.
  • Associated with a route table that has a 0.0.0.0/0 route to a Transit Gateway (a private subnet).
  • Associated with a route table that has a 0.0.0.0/0 route to a NAT Gateway (a private subnet).

If CAST AI cannot detect a routable subnet (a subnet that has access to the Internet), you can add the tag cast.ai/routable=true to the subnet. CAST AI will then treat a subnet with this tag as having Internet access.
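If you manage subnets with the AWS CLI, the tag can be added with a command along these lines (the subnet ID is a placeholder):

```shell
# Mark a subnet as routable for CAST AI (subnet ID is a placeholder)
aws ec2 create-tags \
  --resources subnet-0123456789abcdef0 \
  --tags Key=cast.ai/routable,Value=true
```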

Some configuration options are cloud provider-specific. See the tables below:

EKS-specific configuration options

| Configuration | Description | Default value |
| --- | --- | --- |
| Security groups | Security group IDs for nodes provisioned by CAST AI | Tagged and CAST AI SG |
| Instance profile ARN | Instance profile ARN for CAST AI provisioned nodes | cast-<cluster-name>-eks-<cluster-id> (only the last 8 digits of the cluster ID) |
| Dns-cluster-ip | Override the IP address to be used for DNS queries within the cluster | "" |
| Container runtime | Container runtime engine selection: docker or containerd | Unspecified |
| Docker configuration | A set of values that will be overwritten in the Docker daemon configuration (JSON) | {} |
| Volume type | EBS volume type to be used for provisioned nodes | gp3 |
| Volume IOPS | EBS volume IOPS value to be used for provisioned nodes | 3000 |
| KMS Key ARN | Customer-managed KMS encryption key to be used when encrypting EBS volumes | Unspecified |
| Volume throughput | EBS volume throughput in MiB/s to be used for provisioned nodes | 125 |
| Use IMDS v1 | When true, both IMDSv1 and IMDSv2 are enabled; otherwise, only IMDSv2 is allowed | True |
| Target Groups | A list of ARNs and (optional) ports. New instances are automatically registered with all given load balancer target groups upon creation. | Unspecified |
| Image Family | The OS family used when provisioning nodes. Possible values: FAMILY_AL2, FAMILY_AL2023, FAMILY_BOTTLEROCKET. | Amazon Linux 2 (FAMILY_AL2) |

🚧

Kubelet configuration

Note that kubeReserved is not supported in EKS configurations.

KMS key for EBS volume

The key that you provide for the encryption of EBS volumes must include the following policy statement:

{
    "Sid": "Allow access through EBS for all principals in the account that are authorized to use EBS",
    "Effect": "Allow",
    "Principal": {
        "AWS": "*"
    },
    "Action": [
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:Encrypt",
        "kms:DescribeKey",
        "kms:Decrypt",
        "kms:CreateGrant"
    ],
    "Resource": "*",
    "Condition": {
        "StringEquals": {
            "kms:CallerAccount": "<account_id>",
            "kms:ViaService": "ec2.<region>.amazonaws.com"
        }
    }
}
module "kms" {
  source = "terraform-aws-modules/kms/aws"

  description = "EBS key"
  key_usage   = "ENCRYPT_DECRYPT"

  # Policy
  key_statements = [
    {
      sid = "Allow access through EBS for all principals in the account that are authorized to use EBS"
      principals = [
        {
          type        = "AWS"
          identifiers = ["*"]
        }
      ]
      actions = [
        "kms:Encrypt",
        "kms:Decrypt",
        "kms:ReEncrypt*",
        "kms:GenerateDataKey*",
        "kms:CreateGrant",
        "kms:DescribeKey"
      ]
      resources = ["*"]
      conditions = [
        {
          test     = "StringEquals"
          variable = "kms:ViaService"
          values   = ["ec2.${var.cluster_region}.amazonaws.com"]
        },
        {
          test     = "StringEquals"
          variable = "kms:CallerAccount"
          values   = [data.aws_caller_identity.current.account_id]
        }
      ]
    }
  ]

  # Aliases
  aliases = ["mycompany/ebs"]

  tags = {
    Terraform   = "true"
    Environment = "dev"
  }
}

Load balancer target groups prerequisites

To use the target groups functionality, the CAST AI IAM role must be extended with additional permissions. An example IAM policy is provided below. The sample policy allows registering against all target groups; it can be restricted to specific resources by replacing the wildcards (*) with appropriate values.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "castai-targetgroup-registration",
            "Effect": "Allow",
            "Action": "elasticloadbalancing:RegisterTargets",
            "Resource": "arn:aws:elasticloadbalancing:<region>:<account>:targetgroup/*/*"
        }
    ]
}

If target groups are configured for a node configuration but the permission is missing, nodes will be created but not registered with the target groups. A notification will be sent detailing which target groups failed to be updated.

Maximum Pods formula

The Maximum Pods formula dynamically determines the maximum number of pods that can run on a node in your EKS cluster. This calculation serves two primary purposes:

  • It informs the kubelet about the maximum number of pods that can be hosted on a node.
  • It assists the CAST AI Autoscaler in planning and optimizing cluster resources.

You can optimize your cluster's pod distribution and resource utilization by fine-tuning this formula.

Formula variables

Maximum Pods formulas are constructed using the following variables:

| Variable | Description | Range/Value | Default |
| --- | --- | --- | --- |
| NUM_IP_PER_PREFIX | Number of IPv4 addresses per prefix. Affects the calculation when prefix delegation is used. | 0-256 | 0 |
| NUM_MAX_NET_INTERFACES | Maximum number of network interfaces | Instance-specific | N/A |
| NUM_IP_PER_INTERFACE | Number of IPv4 addresses per interface | Instance-specific | N/A |
| NUM_CPU | Number of CPUs | Instance-specific | N/A |
| NUM_RAM_GB | Amount of RAM in GB | Instance-specific | N/A |

CAST AI provides several preset formulas to choose from.

Example formulas

The Maximum Pods formula can be tailored to different cluster configurations and requirements. Here are some common examples:

Default EKS formula

NUM_MAX_NET_INTERFACES * (NUM_IP_PER_INTERFACE - 1) + 2

This is often the default formula for EKS clusters. It calculates the maximum pods based on available network interfaces and IPs, reserving one IP per interface and adding 2 for system pods.

Prefix delegation formula

NUM_MAX_NET_INTERFACES * (NUM_IP_PER_INTERFACE - 1) * NUM_IP_PER_PREFIX + 2

Use this formula when you have prefix delegation configured. It accounts for the additional IPs available through prefix delegation.
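To make the arithmetic concrete, here is a short sketch of the prefix delegation formula in plain Python; the instance values are illustrative, not taken from a specific instance type:

```python
# Worked example of the prefix delegation formula.
def max_pods_prefix_delegation(n_interfaces: int, ips_per_interface: int,
                               ips_per_prefix: int) -> int:
    # Reserve one IP per interface, multiply by addresses per delegated
    # prefix, and add 2 for system pods.
    return n_interfaces * (ips_per_interface - 1) * ips_per_prefix + 2

# e.g. 3 interfaces, 10 IPs per interface, /28 prefixes (16 addresses each)
print(max_pods_prefix_delegation(3, 10, 16))  # 3 * 9 * 16 + 2 = 434
```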

Reserved network interface formula

(NUM_MAX_NET_INTERFACES - 1) * (NUM_IP_PER_INTERFACE - 1) + 2

This formula reserves one entire network interface.

Fixed value

512

Simple scalar values are allowed. This example sets a fixed maximum of 512 pods per node, regardless of other factors.

These examples demonstrate the flexibility of the Maximum Pods formula. You can create custom formulas that best suit your cluster's specific needs and constraints.

How the Maximum Pods formula works

The Maximum Pods formula is an expression that evaluates to a number. We use the Common Expression Language (CEL) to evaluate it. Here is how it operates:

  • Input: The formula takes the node's characteristics as input variables (e.g., NUM_MAX_NET_INTERFACES, NUM_CPU).
  • Calculation: The expression is evaluated using the CEL library.
  • Output: The result is interpreted as an integer, which becomes the maximum number of pods allowed on the node.
  • Application: The calculated value is used by the kubelet to limit pod scheduling and by the CAST AI Autoscaler for capacity planning.

For example, let's break down a simple formula:

NUM_MAX_NET_INTERFACES * (NUM_IP_PER_INTERFACE - 1) + 2

If a node has 4 network interfaces (NUM_MAX_NET_INTERFACES = 4) and 10 IPs per interface (NUM_IP_PER_INTERFACE = 10), the calculation would be:

4 * (10 - 1) + 2 = 4 * 9 + 2 = 36 + 2 = 38

Thus, this node would be allowed to run a maximum of 38 pods.
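The same arithmetic can be sketched in ordinary code (CEL is what is used in production; plain Python stands in for it here):

```python
# Plain-Python stand-in for the CEL evaluation of the default EKS formula.
def max_pods(num_max_net_interfaces: int, num_ip_per_interface: int) -> int:
    # Reserve one IP per interface, then add 2 for system pods.
    return num_max_net_interfaces * (num_ip_per_interface - 1) + 2

print(max_pods(4, 10))  # 38, matching the worked example above
```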

The formula's flexibility allows for complex calculations that can account for various resource constraints and operational requirements. When you select or create a formula, you define the logic for this pod limit calculation.

Custom formulas

While the CAST AI Console UI provides several preset formulas for configuring an EKS node, you can define a custom formula using Terraform for more advanced configurations.

Using Bottlerocket with CAST AI

Bottlerocket is a Linux distribution designed and optimized for container orchestration. It contains only the essential components required for this purpose, minimizing the attack surface and enforcing container best practices. This section provides guidance on using Bottlerocket with CAST AI, including instructions on passing custom configurations and important considerations to remember. Useful resources are linked throughout to further understand Bottlerocket and its use cases. To start, here is a Bottlerocket overview:
Container Host - Bottlerocket - Amazon Web Services

Using Bottlerocket

You can use Bottlerocket on CAST AI by specifying the appropriate Amazon Machine Image (AMI) ID or name in CAST AI node configurations. This works out of the box for standard setups, but applying custom configurations requires a slightly different approach due to Bottlerocket's unique architecture.

Specifying the appropriate Amazon Machine Image (AMI) ID

Specifying the appropriate Amazon Machine Image (AMI) name

Limitations with Advanced Node Configuration

When using Bottlerocket AMIs, many of the fields in the advanced section of CAST AI's node configurations do not apply and cannot be used to pass custom configurations. This limitation is due to Bottlerocket's security-focused architecture, which does not provide a default shell in its container. Consequently, any changes to the AMI configuration must be passed through the settings exposed in the Bottlerocket API.

For a full list of settings available in Bottlerocket’s API, refer to the Bottlerocket Settings API Reference.

Passing Custom Configurations via Init Script

Outside of CAST AI, custom configurations are typically passed to Bottlerocket AMIs in TOML format. Here's an example:

[settings.kubernetes]
api-server = "${endpoint}"
cluster-certificate = "${cluster_auth_base64}"
cluster-name = "${cluster_name}"
${additional_userdata}

[settings.kubernetes.node-labels]
"ingress" = "allowed"
"environment" = "prod"

[settings.kubernetes.system-reserved]
cpu = "100m"
memory = "256Mi"
ephemeral-storage = "2Gi"

[settings.kubernetes.kube-reserved]
cpu = "100m"
memory = "256Mi"
ephemeral-storage = "512Mi"

[settings.kubernetes.eviction-hard]
"memory.available" = "10%"

# Hardening based on <https://github.com/bottlerocket-os/bottlerocket/blob/develop/SECURITY_GUIDANCE.md>

[settings.kernel]
lockdown = "integrity"

[settings.host-containers.admin]
enabled = false
source = "328549459982.dkr.ecr.eu-central-1.amazonaws.com/bottlerocket-admin:v0.7.2"

[settings.host-containers.control]
enabled = false

Converting TOML Settings to Init Script Format

In CAST AI, you can pass custom Bottlerocket configurations through the init script in the advanced section of node configurations. However, to do this, the settings must be converted to a single-line format compatible with the init script.

Below is an example that converts the TOML-style settings listed above into a format compatible with the init script:

📘

Note

The shebang #!/bin/bash is required to pass our init script validation, even though Bottlerocket AMIs do not have a default shell.

#!/bin/bash
settings.kubernetes.node-labels."ingress"="allowed"
settings.kubernetes.node-labels."environment"="prod"

settings.kubernetes.system-reserved.cpu="100m"
settings.kubernetes.system-reserved.memory="256Mi"
settings.kubernetes.system-reserved.ephemeral-storage="2Gi"

settings.kubernetes.kube-reserved.cpu="100m"
settings.kubernetes.kube-reserved.memory="256Mi"
settings.kubernetes.kube-reserved.ephemeral-storage="512Mi"

settings.kubernetes.eviction-hard."memory.available"="10%"

settings.kernel.lockdown="integrity"

settings.host-containers.admin.enabled=false
settings.host-containers.admin.source="328549459982.dkr.ecr.eu-central-1.amazonaws.com/bottlerocket-admin:v0.7.2"

settings.host-containers.control.enabled=false

Bottlerocket Settings Reference

The remaining fields in the advanced section of CAST AI's node configuration cannot be used to pass these configurations. Instead, you must use the settings exposed in the Bottlerocket API and set them within the init script.

Here is a visual of what sections of advanced node configuration settings can be used with Bottlerocket:

Customers migrating from other AMIs to Bottlerocket must convert their custom configurations into this Bottlerocket-compatible style and can only use the settings exposed in the API reference shared above.

Using Bootstrap Containers for Advanced Configurations

Any custom configurations outside of those settings, or the ability to change settings dynamically at boot, require the use of what Bottlerocket calls a bootstrap container.

Here is an example that dynamically adjusts the maximum number of pods per node based on the instance type:

settings.bootstrap-containers.max-pods-calculator.source = "docker.io/kisahm/bottlerocket-bootstrap-max-pods:v0.2"
settings.bootstrap-containers.max-pods-calculator.essential = false
settings.bootstrap-containers.max-pods-calculator.mode = "always"
settings.bootstrap-containers.max-pods-calculator.user-data = "ZXhwb3J0IEFERElUSU9OQUxfT1BUSU9OUz0iLS1jdXN0b20tY25pIGNpbGl1bSAtLWNpbGl1bS1maXJzdC1pbnRlcmZhY2UtaW5kZXggMSIK"

See the GitHub repository for detailed instructions: Bottlerocket Bootstrap Custom Max Pods

Connecting to Bottlerocket Nodes

Connecting to Bottlerocket nodes requires using an admin or control host container with special privileges. Guidance is provided in the Bottlerocket documentation.

Using Bottlerocket with CAST AI provides a secure foundation for container orchestration. While Bottlerocket's minimalistic design limits the use of traditional configuration methods, you can effectively pass custom configurations through the init script by converting them into the appropriate format. For advanced or dynamic configurations, bootstrap containers offer a solution to extend functionality.

GKE-specific configuration options

| Configuration | Description | Default value |
| --- | --- | --- |
| Network tags | A string to be added to the tags field of a GCP VM resource | Empty |
| Max pods per node | Maximum number of pods to be hosted on a node | 110 |
| Boot disk | Boot disk storage type | balanced, as per GCP documentation |
| Use Local SSD-backed ephemeral storage | Attach local ephemeral storage backed by Local SSD volumes. Check the GCP documentation for more details. | False |

AKS-specific configuration options

| Configuration | Description | Default value |
| --- | --- | --- |
| Max pods per node | Maximum number of pods to be hosted on a node | 30 |
| OS Disk | The type of managed OS disk | Standard SSD |

📘

Note

Kubelet configuration is not supported in AKS.

Virtual machine image choice

When CAST AI provisions a node, it must choose an appropriate VM image. This choice is crucial because the OS and version of the image determine the correct bootstrapping logic and instance type support and are critical to ensuring the node joins the cluster successfully. For advanced use cases, CAST AI offers several options in the node configuration.

EKS

EKS supports a combination of the Image and Image Family fields to control OS choice.

  • Image family: Determines the provisioning logic based on OS. If not provided, a default family is used for all operations (currently Amazon Linux 2).
  • Image: Used to determine the actual image choice more precisely. The system supports three scenarios for this field:
  1. AMI ID (e.g., ami-1234567890abcdef0): A single item. Must point to a specific AMI. If the AMI architecture does not match the instance type, provisioning will fail. Use architecture restrictions in the Node template to avoid this scenario. The AMI must match the image family (default or provided value), or provisioning will fail.
  2. Search string (e.g., amazon-eks-node-*): The search matches the name filter in aws describe-images and can include wildcards. The search can result in multiple images, and the system will choose the latest image in the list based on instance type architecture and Kubernetes version (if part of the image's name). If no images match the instance type architecture or the images are from a different family than the Image family field, provisioning will fail.
  3. Empty: A default search will be performed based on the Image family. This search looks for public Amazon-owned images and will consider instance type architecture and Kubernetes versions to choose the proper image.
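The search-string behavior corresponds to the name filter of the AWS CLI's describe-images call, so you can preview what a given pattern matches with something like the following (the pattern and owner are examples):

```shell
# Preview which AMIs a search string would match
aws ec2 describe-images \
  --owners amazon \
  --filters "Name=name,Values=amazon-eks-node-*" \
  --query "Images[].{Name:Name,Arch:Architecture}" \
  --output table
```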

Sample scenarios and suggested configuration:

| Scenario | Suggested setup |
| --- | --- |
| Hands-off approach; let CAST AI choose. | Empty Image and Image family. |
| I want to use a specific OS family and let CAST AI choose the latest image based on the instance architecture and Kubernetes version. | Select Image family; leave the Image field empty. |
| I want to use private or third-party AMI images and let CAST AI choose the image based on instance architecture. | Add a search string in Image that matches the required images. Select the proper image family (if different from the default). For multi-architecture instances, the list must include images for both arm64 and x86. |
| I want to use private or third-party AMI images that do not have architecture-agnostic builds but let CAST AI choose the latest release. | Add a search string in Image. Select the proper image family (if different from the default). Add architecture constraints to node templates. |
| I want to use a specific golden AMI. | Enter the AMI in the Image field. Select the Image family (if different from the default) that matches the OS. Add architecture constraints to node templates. |

GKE/AKS

For GKE and AKS, the image field can be used to control the node bootstrapping logic (for Linux).

  • The reference must point to a specific image.
  • If the image does not match the instance type architecture (for example, ARM64 image for x86 node), node provisioning will fail.
  • Changing the value might require a successful reconciliation to recreate CAST AI-owned node pools.
  • If an image is not provided, the default behavior is to use the OS image captured when creating the castpool node pool.

How to create a node configuration

A default node configuration is created during cluster onboarding in the CAST AI-managed mode.

You can choose to modify this configuration or create a new one. If you want a new node configuration to be applied to all newly provisioned nodes, you must mark it as the default.

Node configurations are versioned, and when the CAST AI provisioner adds a new node, the latest version of the node configuration is applied.

A new configuration can't be applied to an existing node. To apply an updated node configuration to a node or a set of nodes, delete the existing nodes and wait until the Autoscaler replaces them with new ones, or rebalance the cluster (fully or partially).
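To replace a single node manually, you can drain and delete it with kubectl and let the Autoscaler provision a replacement; the node name below is a placeholder:

```shell
# Drain workloads off the node, then remove it (node name is a placeholder)
kubectl drain ip-10-0-1-23.eu-central-1.compute.internal \
  --ignore-daemonsets --delete-emptydir-data
kubectl delete node ip-10-0-1-23.eu-central-1.compute.internal
```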

Kubelet configuration examples

You can find all available Kubelet settings in the Kubernetes documentation – Kubelet Configuration. Please refer to the documentation matching your cluster's version.

For example, if you want to add some specific custom taints during node startup, you could do it with the following snippet:

{
    "registerWithTaints": [
        {
            "effect": "NoSchedule",
            "key": "nodes-service-critical",
            "value": "true"
        }
    ]
}

The second example configures kubelet image pulling and kube API limits:

{
    "eventBurst": 20,
    "eventRecordQPS": 10,
    "kubeAPIBurst": 20,
    "kubeAPIQPS": 10,
    "registryBurst": 20,
    "registryPullQPS": 10
}

Create node configuration with the CAST AI Terraform provider

Use the castai_node_configuration resource from the CAST AI Terraform provider.

Reference example:

resource "castai_node_configuration" "test" {
  name           = local.name
  cluster_id     = castai_eks_cluster.test.id
  disk_cpu_ratio = 5
  subnets        = aws_subnet.test[*].id
  tags           = {
    env = "development"
  }
  eks {
    instance_profile_arn = aws_iam_instance_profile.test.arn
    dns_cluster_ip       = "10.100.0.10"
    security_groups      = [aws_security_group.test.id]
  }
}