CAST AI provides a set of CronJobs that pause and resume a Kubernetes cluster on a defined schedule. While the cluster is paused, CAST AI components continue to run on a single designated node, and the rest of the cluster capacity is removed. When the cluster resumes, CAST AI uses its standard Autoscaler capabilities to provide the most cost-efficient nodes for the pending pods.
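Pausing and resuming a cluster is handled by two CronJobs, hibernate-pause and hibernate-resume. The pause job performs the first four steps below, and the resume job performs the last one: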
Disable Unscheduled Pod Policy (to prevent growing cluster)
Prepare Hibernation node (node that will stay hosting essential components)
Mark essential Deployments with Hibernation toleration
Delete all other nodes (only hibernation node should stay running)
Re-enable Unscheduled Pod Policy to allow the cluster to expand to the needed size
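Once the CronJobs are installed (see the installation steps later on this page), the pause steps above can be exercised without waiting for the schedule. The following is a minimal sketch, assuming the CronJob is named hibernate-pause and lives in the castai-agent namespace, as elsewhere on this page. The Job name manual-pause and the helper function names are hypothetical.

```shell
#!/bin/sh
# Sketch: run the pause CronJob once, outside its schedule, then count nodes.
# Assumptions: CronJob "hibernate-pause" in namespace "castai-agent";
# "manual-pause" is a hypothetical name for the one-off Job.

NAMESPACE="castai-agent"

# Create a one-off Job from the pause CronJob's job template.
trigger_pause() {
  kubectl create job manual-pause --from=cronjob/hibernate-pause -n "$NAMESPACE"
}

# Print the number of nodes currently registered in the cluster.
node_count() {
  kubectl get nodes --no-headers | wc -l | tr -d ' '
}

# Usage (left commented out so that sourcing this file changes nothing):
# trigger_pause
# node_count
```

According to the description above, only the hibernation node should remain once the pause job has finished, so node_count is a quick way to check the result.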
Set the HIBERNATE_NODE environment variable to override the default node sizing selections. Make sure the size selected is appropriate for your cloud.
Run this command to install the Hibernate CronJobs:
kubectl apply -f https://raw.githubusercontent.com/castai/hibernate/main/deploy.yaml
Create an API token with Full Access permissions and encode it as base64:
echo -n "98349587234524jh523452435kj2h4k5h2k34j5h2kj34h5k23h5k2345jhk2" | base64
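One detail worth checking when encoding a real token: GNU base64 wraps its output at 76 columns by default, so a long token can produce a multi-line value. The helper below is a small sketch that strips those line breaks. The function name encode_api_key is hypothetical.

```shell
#!/bin/sh
# Encode a token as single-line base64.
# printf '%s' avoids adding a trailing newline to the input, and
# tr removes any line wrapping that base64 adds to long output.
encode_api_key() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Usage with a placeholder (not a real token):
# encode_api_key "CASTAI-API-KEY-REPLACE-ME"
```

The resulting single-line string is the value to place in the Secret.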
Use this value to update the Secret:
apiVersion: v1
kind: Secret
metadata:
  name: castai-hibernate
  namespace: castai-agent
type: Opaque
data:
  API_KEY: >-
    CASTAI-API-KEY-REPLACE-ME-WITH-ABOVE==
Or, for convenience, use a one-liner:
kubectl get secret castai-hibernate -n castai-agent -o json | jq --arg API_KEY "$(echo -n 9834958-CASTAI-API-KEY-REPLACE-ME-5k2345jhk2 | base64)" '.data["API_KEY"]=$API_KEY' | kubectl apply -f -
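After either approach, it can help to read the key back and confirm that it decodes to the original token. This is a minimal sketch using the Secret name, namespace, and key shown above. The function name stored_api_key is hypothetical.

```shell
#!/bin/sh
# Sketch: print the API key as stored in the Secret, decoded from base64.
# Note: this writes the token to the terminal, so use it with care.
# Some older macOS versions of base64 expect -D or --decode instead of -d.
stored_api_key() {
  kubectl get secret castai-hibernate -n castai-agent \
    -o jsonpath='{.data.API_KEY}' | base64 -d
}

# Usage: compare against the token you intended to store.
# [ "$(stored_api_key)" = "$MY_TOKEN" ] && echo "secret matches"
```

MY_TOKEN in the usage line is a placeholder for a variable holding the unencoded token.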
The cloud provider defaults to AKS. For a different provider, change the "Cloud" environment variable in both CronJobs to one of EKS, GKE, or AKS.
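One way to make that change on an installed cluster is with kubectl set env. The sketch below assumes the CronJobs are named hibernate-pause and hibernate-resume in the castai-agent namespace, and that the variable is spelled as written above. Since the exact spelling in the manifest may differ, list the current variables first and override CLOUD_VAR if needed. The function names are hypothetical.

```shell
#!/bin/sh
# Sketch: point both hibernate CronJobs at a given cloud provider.
# Assumption: the variable name is taken from this page; verify it with
# list_cronjob_env and set CLOUD_VAR accordingly before changing anything.

NAMESPACE="castai-agent"
CLOUD_VAR="${CLOUD_VAR:-Cloud}"

# Show the environment variables currently defined on each CronJob.
list_cronjob_env() {
  for cj in hibernate-pause hibernate-resume; do
    kubectl set env "cronjob/$cj" -n "$NAMESPACE" --list
  done
}

# Set the cloud provider on both CronJobs after validating the value.
set_cloud() {
  case "$1" in
    EKS|GKE|AKS) ;;
    *) echo "unsupported cloud: $1 (expected EKS, GKE or AKS)" >&2; return 1 ;;
  esac
  for cj in hibernate-pause hibernate-resume; do
    kubectl set env "cronjob/$cj" -n "$NAMESPACE" "$CLOUD_VAR=$1"
  done
}

# Usage:
# list_cronjob_env
# set_cloud EKS
```

The same pattern could be used for the HIBERNATE_NODE variable mentioned earlier, with a node size that is valid for your cloud.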
To install with Helm instead, add the CAST AI Helm charts repository:
helm repo add castai-helm https://castai.github.io/helm-charts
helm repo update
Now install the chart, replacing the cloud and apiKey values with your own:
helm upgrade -i castai-hibernate castai-helm/castai-hibernate -n castai-agent --set cloud=<AKS|EKS|GKE> --set apiKey=< CASTAI-API-KEY-REPLACE-ME-WITH-BASE64_ENCODE>
Update the hibernate-pause and hibernate-resume CronJob schedules according to business needs.
# Update the hibernate-pause schedule according to business needs.
pauseCronSchedule: "0 22 * * 1-5"
# Update the hibernate-resume schedule according to business needs.
resumeCronSchedule: "0 7 * * 1-5"
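In standard cron syntax, the example values above mean pause at 22:00 and resume at 07:00, Monday to Friday. Kubernetes evaluates CronJob schedules in the time zone of the kube-controller-manager unless the CronJob sets .spec.timeZone. Whether this chart exposes that field is not stated here, so check the resulting times on your cluster. The sketch below shows one possible way to pass the two values on the command line instead of editing a values file. The function name set_schedules is hypothetical.

```shell
#!/bin/sh
# Sketch: apply pause/resume schedules through Helm values.
# Release, chart, namespace and value names are the ones used on this page.
# --reuse-values keeps previously supplied settings such as cloud and apiKey.
# Note: commas in a --set value would need escaping; these examples have none.
set_schedules() {
  pause="$1"
  resume="$2"
  helm upgrade castai-hibernate castai-helm/castai-hibernate -n castai-agent \
    --reuse-values \
    --set "pauseCronSchedule=$pause" \
    --set "resumeCronSchedule=$resume"
}

# Usage: quote the expressions so the shell does not expand "*".
# set_schedules "0 22 * * 1-5" "0 7 * * 1-5"
```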
To upgrade this component to the latest version, run the following commands:
helm repo add castai-helm https://castai.github.io/helm-charts
helm repo update
helm upgrade castai-hibernate castai-helm/castai-hibernate --reuse-values -n castai-agent
The values for each CAST AI Helm chart are described in this GitHub repository.