Hosted components

Cast AI components hosted on customer clusters.

The Cast AI connection process installs several components into a customer's cluster in phases, providing different levels of functionality:

  • Phase 1: Provides visibility into connected clusters without the ability to tune them. This phase operates in a read-only mode.
  • Phase 2: Enables full functionality of the Cast AI platform, primarily for cluster optimization. In this phase, Cast AI can instruct clusters and Cloud Providers to reorganize resources for optimal performance.

Phase 1 Component - Cast AI Kubernetes Agent

The Cast AI Agent is the first component installed when connecting a new cluster. It runs as a Pod in a dedicated Cast AI namespace:

❯ kubectl get pods -n castai-agent
NAME                                         READY   STATUS    RESTARTS   AGE
castai-agent-7f9d7ff65b-8qm7p                1/1     Running   0          78m
castai-agent-cpvpa-56f749fb-n2wzp            1/1     Running   0          22d
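
To confirm that the agent came up after connecting a cluster, one option is to wait for the Pods in the namespace to become Ready. The following is a minimal sketch, not an official Cast AI procedure: the function name `wait_for_castai_pods`, the default timeout, and the `KUBECTL` override variable are assumptions made for illustration; only the `castai-agent` namespace comes from the listing above.

```shell
#!/usr/bin/env bash
# Sketch: block until every Pod in the Cast AI namespace reports Ready.
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.

CASTAI_NAMESPACE="castai-agent"   # namespace shown in the listing above

wait_for_castai_pods() {
  # $1: optional timeout; 120s is a hypothetical default, adjust as needed.
  local timeout="${1:-120s}"
  "${KUBECTL:-kubectl}" wait pods --all \
    --namespace "$CASTAI_NAMESPACE" \
    --for=condition=Ready \
    --timeout="$timeout"
}
```

Calling `wait_for_castai_pods 180s` returns a non-zero exit code if any Pod in the namespace is still not Ready when the timeout expires.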

Phase 2 Autoscaling Components

When a connected cluster is promoted to Phase 2, Cast AI installs additional components to enable cost savings through cluster management:

❯ kubectl get pods -n castai-agent
NAME                                             READY   STATUS    RESTARTS   AGE
castai-agent-7f9d7ff65b-8qm7p                    1/1     Running   0          80m
castai-agent-7f9d7ff65b-kf2zp                    1/1     Running   0          5h7m
castai-agent-cpvpa-56f749fb-n2wzp                1/1     Running   0          22d
castai-cluster-controller-757997ff6c-r6x25       1/1     Running   0          27d
castai-cluster-controller-757997ff6c-xw54g       1/1     Running   0          27d
castai-evictor-5684748495-kl2q4                  1/1     Running   0          22d
castai-kvisor-787c5dd946-gmzs5                   1/1     Running   0          6d18h
castai-spot-handler-44shj                        1/1     Running   0          43m

  • The Cluster Controller executes actions received from the central platform, such as accepting newly created nodes into the cluster.
  • The Evictor removes pods from underutilized nodes to reduce the overall number of cluster nodes.
  • The Spot Handler monitors scheduled events (provided by the Instance Metadata Service) and relays them to the central platform. It is installed as a DaemonSet rather than a regular Deployment.
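
Because the Spot Handler is a DaemonSet while the other components are Deployments, listing both workload kinds side by side is a quick way to see how each component is deployed. This is a small sketch; the function name `list_castai_workloads` and the `KUBECTL` override variable are assumptions for illustration, and the exact resource names in your cluster may differ.

```shell
#!/usr/bin/env bash
# Sketch: list Deployments and DaemonSets in the Cast AI namespace.
# The Spot Handler is expected under DaemonSets, the rest under Deployments.

list_castai_workloads() {
  "${KUBECTL:-kubectl}" get deployments,daemonsets --namespace castai-agent
}
```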

Phase 2 Security Component - Kvisor

  • Kvisor performs image vulnerability scanning, Kubernetes YAML manifest linting, and provides CIS security recommendations.

Component upgrade methods

Cast AI components installed in your cluster are upgraded using different methods. Understanding which components upgrade automatically versus those requiring manual intervention helps maintain optimal cluster operation.

The table below outlines the upgrade method for each Cast AI component:

| Product | Component | Upgrade Method | Frequency | Description |
|---|---|---|---|---|
| Cluster Autoscaling | Agent | Manual* | N/A | Must be manually upgraded by running the upgrade script or the helm command |
| Cluster Autoscaling | Evictor | Auto* | Upon new release | Automatically upgraded by Cast AI as soon as new versions are available |
| Cluster Autoscaling | Spot-handler | Manual* | N/A | Must be manually upgraded using the helm command |
| Cluster Autoscaling | Cluster Controller | Manual* | Manual process | Cluster Controller updates are handled through a manual process by Cast AI |
| Cluster Autoscaling | Pod Pinner | Auto | Upon new release | Automatically upgraded by Cast AI as soon as new versions are available |
| Workload Autoscaling | Workload Autoscaler | Manual | N/A | Must be manually upgraded using the helm command |
| Security | kvisor | Manual | N/A | Must be manually upgraded using the helm command |
| Reporting | gpu-metrics-exporter | Manual | N/A | Must be manually upgraded using the helm command |
| Reporting | Egressd exporter | Manual | N/A | Must be manually upgraded using the helm command |
| Generic | audit-logs-receiver | Manual | N/A | Must be manually upgraded using the helm command |

* See the "Automatic upgrades" section below.

Automatic upgrades

Components marked as "Auto" are automatically upgraded by Cast AI to ensure you always have the latest features and security updates. These upgrades typically occur shortly after a new version is released. Cluster administrators do not need to take any action for these components.

While the cluster-controller can theoretically update itself by receiving an update action from Cast AI, these updates are managed through a manual internal process. By default, the cluster-controller cannot update other components, such as castai-evictor, castai-spot-handler, or castai-agent. To enable this, you can explicitly bind a role, such as cluster-admin, to the castai-cluster-controller service account, which allows the cluster-controller to manage all other Cast AI components automatically. For more details, see the Cluster controller auto-update documentation.
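
Binding a role to the castai-cluster-controller service account could look like the sketch below. It assumes the service account lives in the `castai-agent` namespace shown in the Pod listings; the binding name `castai-cluster-controller-admin`, the function name, and the `KUBECTL` override variable are hypothetical and chosen only for this example. Consult the Cluster controller auto-update documentation for the supported procedure.

```shell
#!/usr/bin/env bash
# Sketch: grant the cluster-controller service account the cluster-admin role.
# Assumption: the service account is in the castai-agent namespace.

NAMESPACE="castai-agent"
SERVICE_ACCOUNT="castai-cluster-controller"
BINDING_NAME="castai-cluster-controller-admin"   # hypothetical binding name

bind_cluster_admin() {
  # Creates a ClusterRoleBinding from cluster-admin to the service account.
  "${KUBECTL:-kubectl}" create clusterrolebinding "$BINDING_NAME" \
    --clusterrole=cluster-admin \
    --serviceaccount="${NAMESPACE}:${SERVICE_ACCOUNT}"
}
```

Note that cluster-admin grants unrestricted access to the whole cluster, so review whether this level of permission is acceptable in your environment before running `bind_cluster_admin`.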

Self-Managed Component Options

For customers who prefer to manage their own update schedules, we provide self-managed installation options for several components. Self-managed components can be updated using tools like Argo CD or Helm on your preferred schedule, giving you greater control over your infrastructure.

Manual upgrades

Components marked as "Manual" require cluster administrators to perform upgrades when new versions are released. These upgrades can typically be performed using helm commands or upgrade scripts provided in the component documentation.

For detailed instructions, refer to the dedicated documentation section of each manually upgraded component.
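
As a rough illustration of a helm-based manual upgrade, the sketch below lists the installed releases and then upgrades one of them while keeping its existing values. Several names here are assumptions rather than facts from this page: the repository alias `castai-helm`, the release and chart names passed as arguments, the function names, and the `HELM` override variable. Run `list_castai_releases` first and follow the component's own documentation for the authoritative command.

```shell
#!/usr/bin/env bash
# Sketch: manually upgrade one Cast AI component with helm.
# Assumption: the Cast AI chart repository was added under the alias below.

HELM_REPO_ALIAS="castai-helm"   # assumed repository alias
CASTAI_NAMESPACE="castai-agent"

list_castai_releases() {
  # Shows release names and chart versions currently installed.
  "${HELM:-helm}" list --namespace "$CASTAI_NAMESPACE"
}

upgrade_castai_component() {
  # $1: release name, $2: chart name (both taken from your own cluster/docs).
  if [ "$#" -ne 2 ]; then
    echo "usage: upgrade_castai_component <release> <chart>" >&2
    return 2
  fi
  local release="$1" chart="$2"
  "${HELM:-helm}" repo update
  "${HELM:-helm}" upgrade "$release" "${HELM_REPO_ALIAS}/${chart}" \
    --namespace "$CASTAI_NAMESPACE" \
    --reuse-values
}
```

For example, `upgrade_castai_component castai-kvisor castai-kvisor` would upgrade a release named castai-kvisor, assuming that is how the release and chart are named in your setup.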

Note: Always check the release notes before upgrading manually upgraded components to understand potential impacts and required actions.