Upgrading Kubernetes version
This guide outlines the recommended steps for upgrading your Kubernetes cluster managed by CAST AI. Following these steps ensures a smooth upgrade process and minimizes potential disruptions to your workloads.
Before you begin
Before deciding to upgrade the Kubernetes version in your clusters, review the Kubernetes release notes for any new features or deprecations that may affect your workloads.
Before starting the upgrade process, ensure that:
- You have access to your cloud provider's console (AWS, Azure, or GCP).
- You have the necessary permissions to modify your Kubernetes cluster.
- You have access to the CAST AI console.
Upgrade process
Step 1: Upgrade the control plane
- Log in to your cloud provider's console.
- Navigate to your Kubernetes cluster management section.
- Initiate the control plane upgrade process.
Note
Upgrade the control plane separately from the node pools. CAST AI automatically handles node pool upgrades after the control plane upgrade, which helps avoid scaling delays during the update.
- Wait for the control plane upgrade to complete successfully.
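One way to confirm that the control plane reports the new version is to inspect the output of `kubectl version -o json`. The following is a minimal sketch, not part of the CAST AI tooling: the function names are illustrative, and it assumes `kubectl` is installed and pointed at the upgraded cluster.

```python
import json
import re
import subprocess


def parse_server_version(version_json: str) -> tuple[int, int]:
    """Extract (major, minor) from `kubectl version -o json` output."""
    info = json.loads(version_json)
    # gitVersion looks like "v1.30.2" and may carry a provider suffix.
    git_version = info["serverVersion"]["gitVersion"]
    match = re.match(r"v(\d+)\.(\d+)", git_version)
    if match is None:
        raise ValueError(f"unrecognized server version: {git_version!r}")
    return int(match.group(1)), int(match.group(2))


def control_plane_at_least(version_json: str, target: tuple[int, int]) -> bool:
    """True if the API server reports at least the target (major, minor)."""
    return parse_server_version(version_json) >= target


def fetch_version_json() -> str:
    """Run kubectl against the current context (assumes kubectl is on PATH)."""
    result = subprocess.run(
        ["kubectl", "version", "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

For example, `control_plane_at_least(fetch_version_json(), (1, 30))` would return `True` once the API server reports version 1.30 or later; the target here is a placeholder for whichever version you upgraded to.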
Note
For exact steps on how to perform a control plane upgrade using your cloud provider's console, refer to your cloud provider's documentation.
Step 2: Reconcile in CAST AI
After the control plane upgrade is finished:
- Navigate to the CAST AI console.
- Locate your cluster and click the "Trigger reconcile" button.
Note
Alternatively, you can wait for the auto-reconcile, which occurs every 30 minutes.
The reconciliation process will:
- Initiate the creation of new images for node pools (this can take 20-30 minutes).
- Ensure that all new nodes added by CAST AI use the upgraded control plane version.
Tip
For AKS clusters, you can confirm the image update by checking the Azure Compute Gallery for new images with the current date.
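The timings above can be combined into a rough estimate of how long it takes before new images are ready. This is only an illustrative calculation: it assumes image creation starts as soon as reconciliation runs, and it uses the 30-minute auto-reconcile interval and the 20-30 minute image build time quoted in this guide.

```python
AUTO_RECONCILE_INTERVAL_MIN = 30  # auto-reconcile interval stated in this guide
IMAGE_BUILD_MIN = (20, 30)        # image creation time range stated in this guide


def wait_range_minutes(manual_trigger: bool) -> tuple[int, int]:
    """Return (best, worst) minutes until new node images should be ready."""
    # With a manual trigger there is no wait for the next auto-reconcile.
    reconcile_wait = 0 if manual_trigger else AUTO_RECONCILE_INTERVAL_MIN
    best = IMAGE_BUILD_MIN[0]  # best case: reconcile starts immediately
    worst = reconcile_wait + IMAGE_BUILD_MIN[1]
    return best, worst
```

Under these assumptions, triggering the reconcile manually gives a range of 20 to 30 minutes, while waiting for the auto-reconcile can stretch the worst case to about 60 minutes.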
Step 3: (Optional) Full node replacement
To replace all existing nodes with new ones using the upgraded version:
- Set up a scheduled rebalance with the following parameters:
  - Minimum node age: 0
  - Target savings: 0%
  This ensures all old nodes are replaced.
- Generate and run a rebalancing plan.
For detailed instructions on scheduled rebalancing, see our Scheduled rebalancing guide.
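After the rebalance finishes, you may want to check whether any nodes still run the old version. The sketch below is an assumption-laden helper rather than a CAST AI feature: it parses the output of `kubectl get nodes -o json` and compares each node's `status.nodeInfo.kubeletVersion` with a target version that you supply. The function names are illustrative.

```python
import json
import re


def kubelet_major_minor(version: str) -> tuple[int, int]:
    """Parse a kubelet version string such as "v1.30.2" into (major, minor)."""
    match = re.match(r"v(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized kubelet version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def outdated_nodes(nodes_json: str, target: tuple[int, int]) -> list[str]:
    """Names of nodes whose kubelet is older than the target (major, minor)."""
    items = json.loads(nodes_json)["items"]
    return [
        node["metadata"]["name"]
        for node in items
        if kubelet_major_minor(node["status"]["nodeInfo"]["kubeletVersion"]) < target
    ]
```

An empty list suggests that every node already runs the target version; any names returned are candidates for another rebalance.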
Troubleshooting
If you encounter issues during the upgrade process, follow these steps:
Nodes failing to be added after rebalance
If you initiated a rebalance without reconciling first, and nodes are timing out or failing to be added:
- For AKS clusters:
  - Delete the castpool from the Azure cloud console.
- Trigger a reconciliation from the CAST AI UI.
- After the reconciliation is complete, re-run the rebalance to add nodes successfully.
Next steps
After successfully upgrading your Kubernetes cluster:
- Verify that all your workloads are running correctly.
- Update any client applications or tools that interact with your cluster to ensure compatibility with the new Kubernetes version.
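As a starting point for the workload check, the following sketch lists pods that are not in a healthy phase. It assumes the input is the output of `kubectl get pods --all-namespaces -o json`, and the function name is illustrative. A pod in the `Running` phase is not necessarily ready, so treat this as a first pass rather than a complete health check.

```python
import json

# Pod phases treated as healthy in this sketch; everything else is reported.
HEALTHY_PHASES = {"Running", "Succeeded"}


def unhealthy_pods(pods_json: str) -> list[tuple[str, str, str]]:
    """Return (namespace, name, phase) for pods not in a healthy phase."""
    pods = json.loads(pods_json)["items"]
    report = []
    for pod in pods:
        phase = pod.get("status", {}).get("phase", "Unknown")
        if phase not in HEALTHY_PHASES:
            meta = pod["metadata"]
            report.append((meta["namespace"], meta["name"], phase))
    return report
```

Pods reported as `Pending` shortly after a rebalance may simply be waiting for new nodes, so it can help to re-run the check after a few minutes.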