Troubleshooting Cast AI components

Solutions for resolving issues with Cast AI agent, cluster controller, and other platform components.

Upgrading the agent

For clusters onboarded via console

To check the version of the agent running on your cluster, use the following command:

kubectl describe pod castai-agent -n castai-agent | grep castai-hub/library/agent:v

To find out whether a newer version is available, compare this against the latest release listed in our GitHub repository.
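As an alternative to grepping the pod description, the image tag can be read from the deployment spec. The following is a minimal sketch, assuming the agent runs as a deployment named castai-agent in the castai-agent namespace and that its image reference carries an explicit tag. The helper names agent_images and image_tag are hypothetical and not part of any Cast AI tooling.

```shell
# Print the image reference(s) of the castai-agent deployment.
# Assumes the deployment is named castai-agent in the castai-agent namespace.
agent_images() {
  kubectl get deployment castai-agent -n castai-agent \
    -o jsonpath='{.spec.template.spec.containers[*].image}'
}

# Print everything after the last ":" of an image reference, i.e. its tag.
# Assumes the reference has an explicit tag and no digest suffix.
image_tag() {
  printf '%s\n' "${1##*:}"
}
```

With a single-container deployment, running image_tag "$(agent_images)" prints only the version tag, which is easier to compare against the release list.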

To upgrade the Cast AI agent, follow these steps:

  1. Go to Connect cluster.
  2. Select the correct cloud service provider.
  3. Run the provided script.

If the upgrade fails with an error such as MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable, delete the existing agent deployment and then repeat step 3:

kubectl delete deployment -n castai-agent castai-agent

The latest version of the Cast AI agent is now deployed in your cluster.
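To confirm that the upgraded agent actually came up, you can wait for the deployment rollout to complete. This is a sketch only: the helper name wait_for_agent and the 120s default timeout are assumptions chosen for illustration.

```shell
# Wait for the castai-agent deployment to finish rolling out.
# $1: optional timeout (defaults to 120s, an arbitrary illustrative value).
wait_for_agent() {
  kubectl rollout status deployment/castai-agent -n castai-agent \
    --timeout="${1:-120s}"
}
```

Calling wait_for_agent returns a non-zero exit code if the rollout does not complete within the timeout, which makes it usable in scripts.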

For clusters onboarded via Terraform

By default, the Terraform modules do not specify the castai-agent Helm chart version. As a result, the latest available castai-agent Helm chart is installed when the cluster is onboarded, but re-running Terraform does not upgrade the agent when new versions are released.

We are working on a permanent solution. In the meantime, provide a specific agent version through the TF variable agent-version and reapply the Terraform plan. For valid castai-agent Helm chart versions, see the releases of the Cast AI helm-charts repository.
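The published chart versions can also be listed from the command line. The sketch below assumes Helm is installed locally and reuses the castai-helm repository URL that appears elsewhere on this page. The function name list_agent_chart_versions is hypothetical.

```shell
# List published versions of the castai-agent Helm chart.
list_agent_chart_versions() {
  # Adding a repository that is already configured with the same URL is harmless.
  helm repo add castai-helm https://castai.github.io/helm-charts >/dev/null
  helm repo update >/dev/null
  # --versions lists every available chart version instead of only the latest.
  helm search repo castai-helm/castai-agent --versions
}
```

The output includes a CHART VERSION column and an APP VERSION column. Which of the two the Terraform variable expects is best confirmed against the module's own documentation.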


Deleted agent

If you delete the Cast AI agent deployment from the cluster, you can reinstall it by rerunning the script from the Connect cluster screen. Please ensure you choose the correct cloud service provider.


Custom secret management

There are many technologies for managing Secrets in GitOps. Some store the encrypted secret data in a git repository and use a cluster add-on to decrypt it during deployment. Others use a reference to an external secret manager/vault.

The agent Helm chart provides the apiKeySecretRef parameter to enable the use of Cast AI with custom secret managers.

# Name of secret with Token to be used for authorizing agent access to the API
# apiKey and apiKeySecretRef are mutually exclusive
# The referenced secret must provide the token in .data["API_KEY"]
apiKeySecretRef: ""
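Whatever secret manager you use, the resulting Kubernetes Secret must expose the token under the API_KEY key. For illustration only, the sketch below creates such a Secret by hand and checks that the key is present. In a GitOps setup the Secret would normally be produced by your secret manager instead. The function names and the assumption that the Secret lives in the castai-agent namespace are illustrative.

```shell
# Create a generic Secret holding the token under the API_KEY key.
# $1: secret name (hypothetical, pick your own), $2: Cast AI API token.
create_castai_secret() {
  kubectl create secret generic "$1" -n castai-agent \
    --from-literal=API_KEY="$2"
}

# Succeed only if the Secret named $1 has a non-empty API_KEY entry.
# The value is checked for presence and never printed.
secret_has_api_key() {
  [ -n "$(kubectl get secret "$1" -n castai-agent -o jsonpath='{.data.API_KEY}')" ]
}
```

For example, secret_has_api_key my-secret && echo ok prints ok only when the key exists, which is a quick check to run before pointing apiKeySecretRef at the Secret.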

An example of the Cast AI agent

Here's an example of using a Cast AI agent helm chart with a custom secret:

helm repo add castai-helm https://castai.github.io/helm-charts
helm repo update
helm upgrade --install castai-agent castai-helm/castai-agent -n castai-agent \
  --set apiKeySecretRef=<your-custom-secret> \
  --set clusterID=<your-cluster-id>

An example of the Cast AI cluster controller

Here's an example of using the Cast AI cluster controller Helm chart with a custom secret:

helm repo add castai-helm https://castai.github.io/helm-charts
helm repo update
helm upgrade --install cluster-controller castai-helm/castai-cluster-controller -n castai-agent \
  --set castai.apiKeySecretRef=<your-custom-secret> \
  --set castai.clusterID=<your-cluster-id>