Spot only cluster

How it works

CAST AI gives customers the flexibility to run all or a portion of their workloads on spot instances without modifying manifest files. To achieve this, a Mutating Admission Webhook needs to be installed and configured.

When a pod is about to be scheduled, the CAST AI Mutating Admission Webhook (in short, the mutating webhook) mutates the workload manifest, for example by adding a spot toleration, to influence where the Kubernetes Scheduler places the pod.

Mutating Admission Webhook

CAST AI Mutating Admission Webhook presets:

  • Spot-only
  • Spot-only except kube-system
  • Partial Spot
  • Custom
  • [Coming soon] Intelligent placement on Rebalancing

!!! note "Running pods will not be affected"
    The webhook only mutates pods during scheduling, so running pods keep their current placement. Over time, all pods should be rescheduled and, in turn, mutated: application owners release a new version of a workload, which reschedules all of its replicas, or Evictor or Rebalancing removes older nodes and sends their pods for rescheduling.

If you'd like to initiate mutation for a whole namespace immediately, run this command, which recreates the pods of every Deployment in that namespace:

```shell
kubectl -n {NAMESPACE} rollout restart deploy
```
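After the restart, it can help to confirm that the new pods were actually mutated. The helper below is a minimal sketch, not part of the CAST AI tooling: the function name `show_pod_placement` is hypothetical, and it only prints each pod's node selector and toleration keys so you can compare them with the Spot toleration and node selector described on the Spot instances page.

```shell
# Hypothetical helper: list node selectors and toleration keys for all pods
# in a namespace, to see which pods the webhook has mutated.
# $1 = namespace to inspect
show_pod_placement() {
  kubectl -n "$1" get pods \
    -o custom-columns='NAME:.metadata.name,NODE_SELECTOR:.spec.nodeSelector,TOLERATION_KEYS:.spec.tolerations[*].key'
}

# Usage (replace {NAMESPACE} with your namespace):
#   show_pod_placement {NAMESPACE}
```

Pods that were created before the webhook was installed and have not been restarted should show no spot-related selector or toleration.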

Spot-only

Preset allSpot.

The Spot-only mutating webhook marks all workloads in your cluster as suitable for spot instances, causing the autoscaler to prefer spot instances when scaling the cluster up. Because this makes the cluster more cost-efficient, this mode is recommended for Development and Staging environments, batch job processing clusters, and similar cases. The CAST AI autoscaler creates spot instances only if the pod has a Spot toleration (see Spot instances). The mutating webhook adds the Spot toleration and the Spot node selector to all workloads being scheduled.

Install Spot-only

To run all pods (including kube-system) on spot instances, use:

```shell
helm repo add castai-helm https://castai.github.io/helm-charts
helm upgrade -i --create-namespace -n castai-pod-node-lifecycle castai-pod-node-lifecycle \
    castai-helm/castai-pod-node-lifecycle \
    --set staticConfig.preset=allSpot
```

Spot-only except kube-system

Preset allSpotExceptKubeSystem.

This mode works the same as the Spot-only mode, but it forces all pods in the kube-system namespace onto on-demand nodes. It is recommended for clusters where high availability of the control plane is vitally important while other pods can tolerate spot interruptions.

Install Spot-only except kube-system

To run all pods excluding kube-system on spot instances, use:

```shell
helm repo add castai-helm https://castai.github.io/helm-charts
helm upgrade -i --create-namespace -n castai-pod-node-lifecycle castai-pod-node-lifecycle \
    castai-helm/castai-pod-node-lifecycle \
    --set staticConfig.preset=allSpotExceptKubeSystem
```

Partial Spot

Preset partialSpot.

When running 100% of pods on spot instances is not desirable, you can use a ratio, for example 60% of pods on stable on-demand instances and the remaining 40% of pods in the same ReplicaSet (Deployment / StatefulSet) on spot instances. This conservative configuration ensures that there are enough pods on stable compute for the base load, but still allows significant savings for pods above the base load by putting them on spot instances. This setup is recommended for all types of environments, from Production to Development.
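To get a feel for what a ratio means for a given ReplicaSet, the arithmetic can be sketched in a few lines of shell. This is an illustration only: the function name `spot_split` is hypothetical, and rounding the spot share down is an assumption (inferred from the "up to ... on spot" wording used for the spot-percentage annotation), not a documented guarantee of the webhook.

```shell
# Hypothetical helper: estimate how one ReplicaSet is split between spot and on-demand.
# $1 = number of replicas, $2 = spot percentage
# Assumption: the spot share is rounded down, so on-demand never gets less than its share.
spot_split() {
  replicas=$1
  spot_percentage=$2
  spot=$(( replicas * spot_percentage / 100 ))   # integer division rounds down
  on_demand=$(( replicas - spot ))
  echo "spot=${spot} on-demand=${on_demand}"
}

spot_split 10 40   # prints: spot=4 on-demand=6
spot_split 3 40    # prints: spot=1 on-demand=2
```

Under this assumption, small ReplicaSets lean toward on-demand: with 3 replicas at 40%, only one pod lands on spot.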

Install Partial Spot

To run 40% of workload pods on spot instances and keep the remaining pods of the same ReplicaSet on on-demand instances, use:

```shell
helm repo add castai-helm https://castai.github.io/helm-charts
helm upgrade -i --create-namespace -n castai-pod-node-lifecycle castai-pod-node-lifecycle \
    castai-helm/castai-pod-node-lifecycle \
    --set staticConfig.preset=partialSpot
```

To set a custom ratio for partial Spot, replace 70 with a percentage value in the range [1-99]:

```shell
helm repo add castai-helm https://castai.github.io/helm-charts
helm upgrade -i --create-namespace -n castai-pod-node-lifecycle castai-pod-node-lifecycle \
    castai-helm/castai-pod-node-lifecycle \
    --set staticConfig.defaultToSpot=false --set staticConfig.spotPercentageOfReplicaSet=70
```

Custom

No preset.

This mode can be adjusted to match the needs and requirements of your cluster. Instead of choosing a specific preset, you configure the behavior yourself.

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| staticConfig.defaultToSpot | boolean | true | Should the webhook add spot tolerations and node selectors to all pods which don't match other rules? |
| staticConfig.spotPercentageOfReplicaSet | int | 0 | The percentage of pods (per ReplicaSet) which should be put on Spot instances. Acceptable values [1-100]. 0 means the feature is turned off. |
| staticConfig.ignorePods | list of PodAffinityTerm | [] | Terms describing the label selectors for pods which should be ignored by the webhook. |
| staticConfig.forcePodsToSpot | list of PodAffinityTerm | [] | Terms describing the label selectors for pods which should be put on Spot instances. |
| staticConfig.forcePodsToOnDemand | list of PodAffinityTerm | [] | Terms describing the label selectors for pods which should be put on on-demand instances. |

The schema of the PodAffinityTerm object is described in the official Kubernetes API documentation. The topologyKey property is ignored, and the namespaceSelector property is not yet supported.

Install Custom

Here is an example of a values.yaml with custom rules defined:

```yaml
staticConfig:
  defaultToSpot: true
  spotPercentageOfReplicaSet: 0
  ignorePods:
    - labelSelector:
        matchLabels:
          app.kubernetes.io/name: ignored-pod
  forcePodsToSpot:
    - labelSelector:
        matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
              - spot-pod-1
              - spot-pod-2
  forcePodsToOnDemand:
    - namespaces:
        - kube-system
```

To install the webhook with these custom rules, execute this command:

```shell
helm repo add castai-helm https://castai.github.io/helm-charts
helm upgrade -i --create-namespace -n castai-pod-node-lifecycle castai-pod-node-lifecycle \
    castai-helm/castai-pod-node-lifecycle \
    --values values.yaml
```
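Once the install command has finished, you may want to check that the webhook pods are running and that the release picked up your custom rules. The following is a minimal sketch; the function name `check_webhook_release` is hypothetical, while the namespace and release name are the ones used in the install command.

```shell
# Hypothetical helper: show the webhook pods and the user-supplied values
# of the Helm release installed above.
check_webhook_release() {
  # Pods of the webhook in the namespace created by the install command
  kubectl -n castai-pod-node-lifecycle get pods
  # Values that were passed to the release (for example, from values.yaml)
  helm -n castai-pod-node-lifecycle get values castai-pod-node-lifecycle
}

# Usage:
#   check_webhook_release
```

The output of `helm get values` should list the same staticConfig entries as your values.yaml.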

Workload level override

The mutating webhook is a cluster-level configuration, but exceptions can be enforced per Deployment or StatefulSet using annotations.

| Annotation Name | Value | Location | Effect |
| --- | --- | --- | --- |
| scheduling.cast.ai/lifecycle | "on-demand" | Deployment or StatefulSet | All Pods will be scheduled on on-demand instances |
| scheduling.cast.ai/lifecycle | "spot" | Deployment or StatefulSet | All Pods will be scheduled on spot instances |
| scheduling.cast.ai/spot-percentage | "65" [1-99] | Deployment or StatefulSet | Override Partial Spot configuration, schedule up to 65% on spot and remaining (at least 35%) on on-demand |

```shell
kubectl patch deployment resilient-app -p '{"spec": {"template":{"metadata":{"annotations":{"scheduling.cast.ai/lifecycle":"spot"}}}}}'
kubectl patch deployment sensitive-app -p '{"spec": {"template":{"metadata":{"annotations":{"scheduling.cast.ai/lifecycle":"on-demand"}}}}}'
kubectl patch deployment conservative-app -p '{"spec": {"template":{"metadata":{"annotations":{"scheduling.cast.ai/spot-percentage":"50"}}}}}'
```
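Because the patch body has the same shape for every annotation, it can be generated rather than typed by hand. The sketch below assumes nothing beyond the patch structure shown above; the function name `annotation_patch` and the StatefulSet name `queue-workers` are hypothetical.

```shell
# Hypothetical helper: build the pod-template annotation patch used above.
# $1 = annotation name, $2 = annotation value (always emitted as a JSON string)
annotation_patch() {
  printf '{"spec":{"template":{"metadata":{"annotations":{"%s":"%s"}}}}}\n' "$1" "$2"
}

annotation_patch scheduling.cast.ai/spot-percentage 65
# prints: {"spec":{"template":{"metadata":{"annotations":{"scheduling.cast.ai/spot-percentage":"65"}}}}}

# Usage with a StatefulSet (the name queue-workers is a placeholder):
#   kubectl patch statefulset queue-workers -p "$(annotation_patch scheduling.cast.ai/lifecycle on-demand)"
```

Changing the pod template triggers a rollout, so the new pods are rescheduled with the override applied.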

Troubleshooting

The mutating webhook ignores these types of pods:

  • Bare pods without a ReplicaSet controller
  • Pods in the "castai-pod-node-lifecycle" namespace
  • Pods with TopologySpreadConstraints with TopologyKey=Lifecycle

DaemonSets get the Spot toleration by default, so DaemonSet pods can run on both spot and on-demand nodes.

The CAST AI mutating webhook pods write logs to stdout.
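Since the logs go to standard output, they can be read with kubectl. The helper below is a minimal sketch: the function name `webhook_logs` is hypothetical, and it assumes the webhook runs in the castai-pod-node-lifecycle namespace used by the install commands.

```shell
# Hypothetical helper: print recent log lines from every webhook pod.
# $1 = number of lines per pod (optional, defaults to 100)
webhook_logs() {
  for pod in $(kubectl -n castai-pod-node-lifecycle get pods -o name); do
    echo "== ${pod} =="
    kubectl -n castai-pod-node-lifecycle logs "$pod" --tail="${1:-100}"
  done
}

# Usage:
#   webhook_logs 50
```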

If the cluster has Deployments with 1000+ replicas, set higher memory requests and limits by appending these parameters to the Helm command:

```shell
--set resources.requests.memory=1G --set resources.limits.memory=1G
```
