Commitments
The commitments feature set is Cast AI's generic approach to utilizing Reserved Instances (AWS, Azure), Savings Plans (AWS), and Committed Use Discounts (GCP) in autoscaling.
Cast AI uses Azure Reservations, AWS Reserved Instances (RIs), AWS Savings Plans, or GCP instances procured through resource-based committed use discounts (CUDs) to scale up a cluster and maximize utilization of the customer's long-term commitments.
AWS Savings Plans are a flexible pricing model that provides significant savings on compute usage. Unlike Reserved Instances that are tied to specific instance types and availability zones, Savings Plans offer greater flexibility by applying to any instance family, size, operating system, tenancy, or region within the commitment scope. Cast AI supports both Compute Savings Plans (covering all AWS compute services) and EC2 Instance Savings Plans (for specific instance families in particular regions).
Supported providers
| Provider | Supported |
|---|---|
| GCP | + |
| Azure | + |
| AWS | + |
| Cast AI Anywhere | - |
How it works
Commitments are uploaded at the organization level using a script: Committed Use Discounts for GCP, Reserved Instances and Savings Plans for AWS, and Reservations for Azure. One CUD, Reserved Instance, or Savings Plan in the user's cloud account equals one commitment in the Cast AI platform.
For AWS and Azure, the import workflow includes an auto-enablement option (selected by default) that automatically enables newly imported commitments for autoscaler usage and assigns them to all clusters. This ensures immediate utilization without requiring manual activation. For GCP, or if you disable auto-enablement during import, commitments must be manually enabled and assigned.
Once uploaded, commitments can be managed in the following ways:
- Enabled or disabled: A commitment that is assigned to a cluster but disabled is not used in autoscaling.
- Cast AI can be restricted to use only a specified percentage of an uploaded commitment by setting the allowed percentage of utilization in the commitment settings.
- Assigned to one or multiple clusters: If a commitment is uploaded to Cast AI but is not assigned to a cluster and is not enabled, it is not used in autoscaling; however, its utilization is still tracked.
- Cast AI supports assigning multiple commitments to all existing and newly added clusters as long as they match by region.
- Deleted: The commitment can be deleted from the Cast AI inventory, after which it is no longer tracked or utilized. This action does not affect the commitment record in the cloud account.
When autoscaling and tracking commitment usage, Cast AI supports flex sizing, meaning an instance of any size can be provisioned as long as it is within the same instance family covered by the commitment.
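As a rough illustration of flex sizing, the sketch below checks whether a candidate instance type belongs to the same family as the one a commitment covers. It assumes AWS-style names of the form `family.size`; the helpers `instance_family` and `covered_by_commitment` are hypothetical names for this example and are not part of any Cast AI API.

```python
def instance_family(instance_type: str) -> str:
    """Return the family part of an AWS-style type, e.g. 'm5' for 'm5.2xlarge'."""
    family, _, size = instance_type.partition(".")
    if not family or not size:
        raise ValueError(f"not an AWS-style instance type: {instance_type!r}")
    return family


def covered_by_commitment(candidate: str, committed: str) -> bool:
    """Flex sizing: any size qualifies as long as the family matches."""
    return instance_family(candidate) == instance_family(committed)
```

Under these assumptions, `covered_by_commitment("m5.4xlarge", "m5.large")` is true, while a `c5.large` candidate would not match an `m5` commitment.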
Setting up commitments
Accessing the commitments page
Navigate to the Commitments page through the left sidebar:
- In the left navigation panel, expand the Optimization section
- Select Commitments from the submenu
Uploading commitments
To upload commitments to Cast AI:
- Click Upload commitments: On the Commitments page, click Upload commitments in the top right corner.
- Select your cloud provider: If you have connected clusters from more than one Cloud Service Provider (CSP) in your Cast AI organization, choose AWS, Azure, or GCP from the available options in the upload flow. For single-CSP organizations, the CSP is detected automatically.
- Configure auto-enablement (AWS and Azure only): By default, the Enable commitments for autoscaler usage and assign to all clusters option is selected. When enabled, newly imported commitments are automatically enabled for autoscaler usage and assigned to all clusters, ensuring immediate utilization of Reserved Instances and Savings Plans without requiring manual activation. You can disable this option if you prefer to manually enable and assign commitments after import.
- Run the integration script: Copy and execute the provided script to establish the connection between Cast AI and your cloud account.
Cast AI automatically pulls commitment data, including Reserved Instances, Savings Plans (AWS), Azure Reservations, or GCP Committed Use Discounts.
Once the upload is complete, your commitments will appear in the table where you can manage their assignments and utilization settings.
Managing commitment assignments
After uploading commitments, you can control how they're utilized:
Enable and disable commitments: Toggle individual commitments on or off to control their use in autoscaling decisions.
Assign to clusters: Allocate commitments to specific clusters or enable auto-assignment for newly added compatible clusters.
Set utilization scope: Restrict Cast AI to use only a specified percentage of a commitment to accommodate services outside of Kubernetes that also use the commitment.
Upscaling the cluster using commitments
When Cast AI scales up your cluster, it follows specific rules to maximize your commitment utilization. Here's how the process works:
Regional matching: Commitments must be in the same region as your cluster to be used by the Autoscaler.
Priority order: When you have both Reserved Instances and Savings Plans, Cast AI uses Reserved Instances first (they provide the highest savings), then moves to Savings Plans once RIs are fully utilized.
Commitment scope: The Autoscaler uses up to 100% of each assigned commitment. When a commitment reaches full capacity, scaling continues with regular On-Demand or Spot Instances.
Organization-wide tracking: Commitment usage is tracked across your entire organization. If any cluster uses an instance type covered by a commitment, it counts toward that commitment's utilization, even if the cluster isn't directly assigned to that commitment.
Multi-cluster assignments: You can assign a single commitment to multiple clusters. During simultaneous scaling events, this may temporarily exceed the commitment capacity. Use scheduled rebalancing to optimize resource allocation during off-peak hours.
Pricing behavior: Commitment-covered instances appear as $0 in Cast AI tools (Rebalancer, Cost Monitoring) since they're prepaid. The Autoscaler always prioritizes these instances when scheduling conditions are met.
Commitment lifecycle: The Autoscaler stops using commitments when they expire or reach their maximum CPU capacity.
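The rules above can be sketched as a small planning function. This is only an illustration under stated assumptions, not Cast AI's implementation: the `Commitment` class, its fields, the `kind` labels, and `plan_scale_up` are hypothetical, and capacity is measured in CPUs only for simplicity.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Commitment:
    name: str
    kind: str          # "reserved_instance" or "savings_plan" (illustrative labels)
    region: str
    total_cpu: float
    used_cpu: float
    expires: date

    def free_cpu(self) -> float:
        return max(self.total_cpu - self.used_cpu, 0.0)


def plan_scale_up(cpu_needed, cluster_region, commitments, today):
    """Return (allocations, uncovered_cpu) for one scale-up request."""
    # Regional matching and lifecycle: skip other regions, expired or full commitments.
    usable = [
        c for c in commitments
        if c.region == cluster_region and c.expires > today and c.free_cpu() > 0
    ]
    # Priority order: Reserved Instances first, then Savings Plans.
    usable.sort(key=lambda c: 0 if c.kind == "reserved_instance" else 1)

    allocations = []
    remaining = cpu_needed
    for c in usable:
        if remaining <= 0:
            break
        take = min(c.free_cpu(), remaining)
        allocations.append((c.name, take))
        remaining -= take
    # Whatever is left would be served by regular On-Demand or Spot Instances.
    return allocations, remaining
```

For example, with a Savings Plan offering 16 free CPUs and a Reserved Instance offering 4 free CPUs in the cluster's region, a request for 10 CPUs takes 4 from the Reserved Instance and 6 from the Savings Plan.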
Using commitments with Spot Instances
Cast AI prioritizes commitment-covered instances when scaling up your cluster. The Autoscaler treats commitment-covered capacity as On-Demand instances and provisions them first to maximize commitment utilization. Once commitment capacity is exhausted, the Autoscaler selects the next cheapest option based on your node template configuration.
Maximizing commitment utilization with Spot Instance fallback
Cast AI automatically prioritizes commitment-covered capacity when scaling your cluster. To set up workloads that maximize commitment usage and fall back to Spot Instances when commitments are exhausted:
Requirements:
- Your node template must include both On-Demand and Spot Instance offerings
- Workloads must have Spot tolerations (applied directly in manifests or via Pod Mutations)
Scaling behavior with these settings:
- Cast AI provisions commitment-covered instances first (treated as On-Demand capacity)
- Once commitments reach full utilization, new workloads with Spot tolerations are scheduled on Spot Instances
- If Spot capacity is unavailable, workloads fall back to regular On-Demand instances
Note: If your node template is configured for Spot-only (without On-Demand offering enabled), the Autoscaler will exclusively use Spot Instances regardless of commitment availability.
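The fallback order described above can be expressed as a simple decision function. This is a simplified sketch, not the Autoscaler's actual logic: `choose_capacity` and its boolean parameters are hypothetical, and price comparison between offerings is deliberately left out.

```python
def choose_capacity(template_on_demand, template_spot, pod_tolerates_spot,
                    commitment_available, spot_available):
    """Return 'commitment', 'spot', 'on_demand', or None for one scale-up decision."""
    # Commitments are treated as On-Demand capacity, so the node template
    # must include the On-Demand offering for them to be considered.
    if template_on_demand and commitment_available:
        return "commitment"
    # Spot is used only when the template offers it and the workload tolerates it.
    if template_spot and pod_tolerates_spot and spot_available:
        return "spot"
    # Final fallback: regular On-Demand capacity.
    if template_on_demand:
        return "on_demand"
    return None
```

With a Spot-only template (`template_on_demand=False`), the function never returns `"commitment"`, which mirrors the note above.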
Prioritized utilization
The Prioritized Utilization feature enables allocating commitment-covered resources among different clusters based on each cluster's business priority. This feature allows high-priority clusters, such as production environments, to have first access to capacity, ensuring they primarily run on these discounted resources during peak times for maximum stability and cost efficiency. Lower-priority clusters, like non-production environments, can be set up to utilize Spot Instances when commitment-covered resources are unavailable. During off-peak hours, when higher-priority clusters scale down, lower-priority clusters can access any freed-up capacity.
The feature allows users to set a priority level for each cluster assigned to the commitment. As clusters upscale or downscale, the system dynamically adjusts resource allocation based on these priorities, ensuring that the highest-priority clusters get the most out of the commitments. When upscaling, a top-priority cluster's Autoscaler ignores the commitment usage of lower-priority clusters and considers the commitment fully available, counting only its own utilization. When the second-priority cluster upscales, the commitment calculation counts its own utilization and that of the top-priority cluster but ignores the utilization of all lower-priority clusters, and so on. This behavior can lead to over-utilization of the assigned commitments, which should be addressed by running the Rebalancer on the lower-priority clusters.
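To make the priority calculation concrete, here is a minimal sketch of how much of a commitment each cluster would consider free when it scales up. The function name, the dictionaries, and the convention that a lower number means a higher priority are assumptions made for this example; clusters sharing a priority level are counted against each other here, which the text above does not specify.

```python
def available_for_cluster(total_cpu, usage_by_cluster, priority_by_cluster, cluster):
    """CPU of a commitment that `cluster` considers free when it scales up.

    Lower number = higher priority (an assumption of this sketch).
    """
    own_priority = priority_by_cluster[cluster]
    # Count only usage from this cluster and from higher-priority clusters.
    counted = sum(
        used for name, used in usage_by_cluster.items()
        if priority_by_cluster[name] <= own_priority
    )
    return max(total_cpu - counted, 0)


usage = {"prod": 40, "staging": 30, "dev": 50}
priority = {"prod": 1, "staging": 2, "dev": 3}
free_for_prod = available_for_cluster(100, usage, priority, "prod")
free_for_staging = available_for_cluster(100, usage, priority, "staging")
free_for_dev = available_for_cluster(100, usage, priority, "dev")
```

In this example the three clusters together use 120 CPUs of a 100-CPU commitment, yet `prod` still sees free capacity. That gap is the over-utilization the Rebalancer is meant to correct on lower-priority clusters.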
Reporting utilization of commitments
Reporting currently provides only a snapshot of present utilization; historical data cannot be reviewed.
Troubleshooting
Commitments not being utilized
If the Autoscaler is provisioning Spot or On-Demand instances when you have available commitment capacity, verify:
- Your node template includes the On-Demand offering (commitments are treated as On-Demand capacity)
- The commitment is enabled and assigned to your cluster
- The commitment matches your cluster's region
- The commitment has available capacity (not fully utilized by other clusters or workloads)
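The checklist above can be written down as a small diagnostic helper. The function and its parameters are hypothetical and would need to be filled in by hand from what you see in the console; it does not call any Cast AI API.

```python
def why_commitment_unused(template_has_on_demand, enabled, assigned_to_cluster,
                          commitment_region, cluster_region, utilization):
    """Return a list of likely reasons, mirroring the checklist above.

    `utilization` is a fraction, where 1.0 means the commitment is fully used.
    """
    reasons = []
    if not template_has_on_demand:
        reasons.append("node template lacks the On-Demand offering")
    if not enabled or not assigned_to_cluster:
        reasons.append("commitment is not enabled and assigned to the cluster")
    if commitment_region != cluster_region:
        reasons.append("commitment region does not match the cluster region")
    if utilization >= 1.0:
        reasons.append("commitment has no available capacity")
    return reasons
```

An empty list means none of the four checks failed, so the cause lies elsewhere.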
Key formulas
The following formula is used to calculate commitment utilization:
Commitment utilization = (Provisioned CPU ÷ Total CPU in the commitment) × 0.5 + (Provisioned MEM ÷ Total MEM in the commitment) × 0.5
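As a worked example of this formula, the sketch below computes utilization for a commitment with illustrative numbers. The function name and the sample values are assumptions made for the example.

```python
def commitment_utilization(provisioned_cpu, total_cpu, provisioned_mem, total_mem):
    """Equal-weight average of CPU and memory utilization."""
    if total_cpu <= 0 or total_mem <= 0:
        raise ValueError("commitment totals must be positive")
    return (provisioned_cpu / total_cpu) * 0.5 + (provisioned_mem / total_mem) * 0.5


# 48 of 64 CPUs and 128 of 256 GiB of memory provisioned:
# (48 / 64) * 0.5 + (128 / 256) * 0.5 = 0.375 + 0.25 = 0.625
example = commitment_utilization(48, 64, 128, 256)
```

Because CPU and memory are weighted equally, a commitment whose CPU is fully provisioned but whose memory is only half used would report 75% utilization.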
Known limitations
The current implementation of commitment management has the following limitations:
- Commitment utilization is not recalculated continuously. It is recalculated only when a node is added or deleted and when a cluster is onboarded or removed.
- If you upload commitments but haven't connected any clusters from that cloud provider to Cast AI, the commitments won't appear in the user interface.
- Commitments are not supported for Cast AI Anywhere deployments.
