Feature reference
Karpenter Enterprise brings Cast AI's optimization capabilities to clusters running open-source Karpenter. This page provides an overview of available features and how they integrate with your existing Karpenter setup.
For a conceptual introduction to the Karpenter Enterprise suite, see Karpenter Enterprise overview.
Feature availability
The following features are available for Karpenter-managed clusters:
| Feature | Description | Karpenter integration |
|---|---|---|
| Evictor | Workload consolidation with Container Live Migration support | Works alongside Karpenter's consolidation |
| Rebalancer | Cluster-wide cost optimization through Node selection and replacement | Coordinates with Karpenter provisioning |
| Spot intelligence | Interruption prediction and reliability | Enhances Karpenter's Spot handling |
| Workload Autoscaler | Continuous workload rightsizing | Feeds optimized requests to Karpenter |
| Pod mutations | Automated Pod spec adjustments | Simplifies workload configuration |
| Cost reporting | Savings analysis and cost monitoring | Read-only analysis of Karpenter clusters |
How Cast AI features work with Karpenter
Cast AI features are designed to extend Karpenter rather than replace it. The integration follows these principles:
Karpenter remains the provisioner
Node creation and deletion continue to flow through Karpenter. Cast AI influences decisions by modifying Karpenter CRDs and providing optimization signals, but Karpenter executes the actual infrastructure changes.
CRD-native configuration
Where possible, Cast AI stores configuration in Kubernetes-native formats. Your existing NodePools and EC2NodeClasses remain the source of truth for provisioning constraints.
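To illustrate what stays authoritative, the sketch below shows a minimal NodePool that references an EC2NodeClass. It assumes the Karpenter `karpenter.sh/v1` API on AWS. The resource names, instance categories, and CPU limit are hypothetical placeholders, not values that Cast AI requires.

```yaml
# Minimal sketch, assuming the karpenter.sh/v1 API on AWS.
# All names and values below are hypothetical placeholders.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: general-purpose            # hypothetical NodePool name
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default              # hypothetical EC2NodeClass name
      requirements:
        # Provisioning constraints defined here remain the source of truth.
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
  limits:
    cpu: "200"                     # illustrative cap on total provisioned CPU
```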
Incremental enablement
Each feature can be enabled independently. You can start with cost reporting only, then gradually enable optimization features as you build confidence.
Feature details
Evictor
Cast AI's Evictor integrates with Karpenter to improve Node utilization while minimizing workload disruption.
Karpenter has native consolidation, but it packs Nodes according to requested resources. Evictor integrates with Workload Autoscaler and Container Live Migration, which lets it consolidate based on actual resource usage and reach higher utilization than Karpenter's consolidation alone.
What it adds to Karpenter:
- Coordination with Workload Autoscaler to consolidate Pods according to optimized resource requests, even when those requests are still pending and have not yet been applied to the Pod
- Container Live Migration support for eligible workloads, preserving Pod state and TCP connections when Pods move between Nodes, with graceful fallback to traditional eviction
- Progressive consolidation that respects Pod Disruption Budgets
- Tighter bin-packing that considers actual workload usage patterns, not just static resource requests
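Pod Disruption Budgets are standard Kubernetes objects, so the budget that consolidation respects is whatever you have already defined. As a minimal sketch, assuming a hypothetical Deployment labeled `app: checkout` that runs at least three replicas, the following budget keeps two replicas available during any voluntary disruption:

```yaml
# Standard policy/v1 PodDisruptionBudget.
# The name, label, and minAvailable value are hypothetical.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: checkout-pdb
spec:
  minAvailable: 2          # never evict below two ready Pods
  selector:
    matchLabels:
      app: checkout        # must match the Pod labels of the workload
```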
How it differs from standard Cast AI:
| Aspect | With Karpenter | Standard Cast AI |
|---|---|---|
| Node selection | Evictor identifies candidates; Karpenter handles Node lifecycle | Evictor works with Cast AI Autoscaler directly |
| Consolidation trigger | Coordinates with Karpenter's consolidation settings | Cast AI controls consolidation timing |
| Node deletion | Karpenter deletes empty Nodes | Cast AI Autoscaler deletes Nodes |
Coordinating with Karpenter consolidation
When Evictor is enabled, Cast AI marks Nodes with karpenter.sh/do-not-disrupt to prevent Karpenter from consolidating them. This ensures Evictor and Karpenter's consolidation don't conflict.
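In upstream Karpenter, `karpenter.sh/do-not-disrupt` is set as an annotation with the value `"true"`. Assuming Cast AI applies it in that form, a marked Node would carry metadata along these lines (the Node name is hypothetical):

```yaml
# Sketch of Node metadata after marking; assumes the upstream
# Karpenter annotation form. The Node name is a placeholder.
apiVersion: v1
kind: Node
metadata:
  name: ip-10-0-1-23.ec2.internal
  annotations:
    karpenter.sh/do-not-disrupt: "true"   # Karpenter skips voluntary disruption
```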
Evictor respects your NodePool's consolidateAfter setting. Evictor will not consolidate a Node until both its own grace period and Karpenter's consolidateAfter window have passed. For example, if your NodePool specifies:

    disruption:
      consolidateAfter: 30m
      consolidationPolicy: WhenEmptyOrUnderutilized

Evictor will wait at least 30 minutes before considering the Node for consolidation, preventing conflicts with Karpenter's consolidation policy.
Protecting Nodes from consolidation
To exclude specific Nodes from Evictor consolidation, apply the autoscaling.cast.ai/removal-disabled label. This works similarly to Karpenter's karpenter.sh/do-not-disrupt annotation and lets you manually protect critical Nodes.
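A protected Node might look like the following sketch. The label key comes from the text above. The value `"true"` and the Node name are assumptions made for illustration, so confirm the expected value in the Evictor documentation.

```yaml
# Sketch of a Node excluded from Evictor consolidation.
# The label value "true" is an assumption; the Node name is a placeholder.
apiVersion: v1
kind: Node
metadata:
  name: ip-10-0-2-45.ec2.internal
  labels:
    autoscaling.cast.ai/removal-disabled: "true"
```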
For Evictor documentation, see Evictor.
Rebalancer
The Rebalancer optimizes your entire cluster by identifying Nodes that could be replaced with more cost-effective alternatives for your workloads.
What it adds to Karpenter:
- Cross-NodePool optimization that Karpenter doesn't perform natively
- Awareness of Reserved Instances and Savings Plans
- Coordinated replacements that maintain workload stability
- Integration with Workload Autoscaler to rebalance based on optimized resource requirements
- Container Live Migration support for zero-downtime Node replacements
Rebalancer capabilities not available with Karpenter:
When using Rebalancer with Karpenter, the following advanced drain controls from standard Cast AI Rebalancer are not available:
- Aggressive mode for faster consolidation
- Graceful eviction for controlled disruption
- Paused drain configuration
These limitations exist because Karpenter manages the Node lifecycle. Rebalancer coordinates with Karpenter by cordoning Nodes and allowing Karpenter to handle the actual replacement and drain process.
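Cordoning is a standard Kubernetes operation that sets `spec.unschedulable` on the Node so the scheduler stops placing new Pods there. As a rough sketch (the Node name is hypothetical), a cordoned Node looks like this:

```yaml
# Sketch of a cordoned Node; existing Pods keep running,
# but no new Pods are scheduled onto it.
apiVersion: v1
kind: Node
metadata:
  name: ip-10-0-3-67.ec2.internal   # placeholder name
spec:
  unschedulable: true
```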
How it differs from standard Cast AI:
| Aspect | With Karpenter | Standard Cast AI |
|---|---|---|
| Node replacement | Rebalancer cordons Nodes; Karpenter provisions replacements | Cast AI handles both cordoning and provisioning |
| Instance selection | Influences Karpenter via CRD modifications | Cast AI selects instances directly |
| Commitment awareness | No native commitments integration | Native commitments integration |
| Drain controls | Limited by Karpenter's drain behavior | Full control over drain timing and aggressiveness |
For Rebalancer documentation, see Rebalancer.
Spot intelligence
Cast AI improves Karpenter's Spot Instance handling with predictive capabilities and reliability improvements.
What it adds to Karpenter:
- Spot reliability model — Steers toward historically stable Spot pools
- Interruption prediction — Identifies at-risk Nodes before AWS announces interruptions
- Spot fallback recovery — Automatically returns to Spot when capacity becomes available again
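For context on fallback, upstream Karpenter prefers Spot when a NodePool allows both capacity types and launches On-Demand when Spot capacity is unavailable. The fragment below is a sketch of a NodePool that permits both, assuming the standard `karpenter.sh/capacity-type` label. It shows only the relevant part of the spec.

```yaml
# Fragment of a NodePool spec (not a complete manifest).
# Allowing both capacity types lets Karpenter fall back to On-Demand.
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
```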
How interruption prediction works:
Cast AI's Spot interruption prediction operates differently depending on whether Container Live Migration is enabled:
- Without CLM: Cast AI signals Karpenter to handle the replacement through Karpenter's standard interruption workflow
- With CLM: Cast AI coordinates a proactive rebalancing operation with zero-downtime Pod migration before the interruption occurs
How it differs from standard Cast AI:
| Aspect | With Karpenter | Standard Cast AI |
|---|---|---|
| Pool selection | Influences Karpenter's instance type priorities | Cast AI selects pools directly |
| Fallback handling | Monitors Karpenter's fallback Nodes for recovery | Native fallback and recovery |
| Prediction response | Signals Karpenter to replace at-risk Nodes | Direct Node replacement |
For Spot handling documentation, see Spot Instances and Spot Handler.
Workload Autoscaler
Workload Autoscaler continuously rightsizes workloads based on actual resource usage.
What it adds to Karpenter:
- Automatic adjustment of CPU and memory requests to match actual usage
- Tighter bin-packing as rightsized workloads require less capacity
- Integration with Evictor and Rebalancer for coordinated optimization
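The effect of rightsizing shows up in the container's resource requests, which are what Karpenter uses when sizing Nodes. The sketch below uses a hypothetical Deployment with invented values purely to show which fields change; the numbers are not measurements or recommendations.

```yaml
# Hypothetical Deployment after rightsizing; all values are illustrative.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/api:1.0   # placeholder image
          resources:
            requests:
              cpu: 250m       # illustrative: reduced from an original 1000m
              memory: 512Mi   # illustrative: reduced from an original 2Gi
```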
How it differs from standard Cast AI:
| Aspect | With Karpenter Enterprise | Standard Cast AI |
|---|---|---|
| Request updates | Workload Autoscaler updates requests; Karpenter sees new requirements | Same behavior |
| Node impact | Karpenter may consolidate as requests decrease | Cast AI coordinates this directly with Evictor |
| Scaling policies | Applied identically | Applied identically |
Workload Autoscaler behavior is largely identical whether you're using Karpenter Enterprise or Cast AI's Autoscaler, because it operates at the workload level independently of Node provisioning.
For Workload Autoscaler documentation, see Workload Autoscaling.
Pod mutations
Pod mutations automate Pod spec adjustments to simplify workload configuration and reduce manual effort for teams.
What it adds to Karpenter:
- Automatic application of labels, tolerations, and NodeSelectors
- Simplified onboarding without modifying Deployment manifests
- Consistent Pod configuration across workloads
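To make the list above concrete, the sketch below shows what a Pod spec could look like after a mutation has added a label, a node selector, and a toleration. It shows only the resulting Pod, not Cast AI's mutation configuration format, which this page does not cover. Every name and value is hypothetical except `karpenter.sh/nodepool`, the label Karpenter sets on the Nodes it provisions.

```yaml
# Hypothetical result of a Pod mutation; the fields marked "added"
# were not in the original Deployment manifest.
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker
  labels:
    app: batch-worker
    team: data                        # added label (hypothetical)
spec:
  nodeSelector:
    karpenter.sh/nodepool: batch      # added; "batch" is a hypothetical NodePool
  tolerations:
    - key: workload-type              # added; hypothetical taint key
      operator: Equal
      value: batch
      effect: NoSchedule
  containers:
    - name: worker
      image: registry.example.com/worker:1.0   # placeholder image
```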
How it differs from standard Cast AI:
Pod mutations work identically with Karpenter and standard Cast AI. The mutations apply to Pod specs before creation, independent of which autoscaler provisions Nodes.
Cost reporting
The savings report and other cost monitoring capabilities provide visibility into your cluster's optimization potential without making any changes.
What it provides:
- Current vs. optimized cost comparison
- Node utilization and bin-packing analysis
- Commitment utilization tracking
- Spot adoption opportunities
- Workload rightsizing recommendations
How it differs from standard Cast AI:
Cost reporting works identically for Karpenter clusters. The analysis examines your current state and models what Cast AI optimization could achieve.
For general cost monitoring, see Cost Monitoring.
Features not available with Karpenter
Some Cast AI capabilities require tighter integration with Node scheduling than the Karpenter-layered approach allows:
| Feature | Why it is not available | Alternative |
|---|---|---|
| Pod Pinner | Requires Cast AI Autoscaler's scheduling integration | Use Karpenter's native Pod affinity |
| Cluster hibernation | Requires direct control over Node lifecycle | Use Karpenter's NodePool weight=0 for manual scaling |
To benefit from these capabilities, consider migrating to Cast AI Autoscaler.
Related resources
- Karpenter Enterprise overview — Conceptual introduction
- Getting started — Connect your cluster
