Pod Pinner

Beta release

A recently released feature for which we are actively gathering community feedback.

Pod Pinner is a CAST AI in-cluster component, similar to the CAST AI agent, cluster controller, and others. It addresses the misalignment between the actions of the CAST AI Autoscaler and the Kubernetes cluster scheduler. While the CAST AI Autoscaler bin-packs pods efficiently and creates nodes in a cost-optimized manner, it is the Kubernetes cluster scheduler that determines the actual placement of pods on nodes. Pods may therefore end up on different nodes than those anticipated by the CAST AI Autoscaler, which leads to suboptimal placement, fragmentation, and unnecessary resource waste. Pod Pinner integrates the CAST AI Autoscaler's decisions into your cluster, allowing those decisions to override the decisions of the Kubernetes cluster scheduler. By reducing this fragmentation, installing Pod Pinner can directly increase savings in the cluster.

Installation

  1. Check whether your cluster has the castai-pod-pinner deployment in the castai-agent namespace. If it does, make sure it has 0 replicas running. If it does not, rerun the phase 2 onboarding script. More information is available in the onboarding documentation.
  2. Contact our support to enable Pod Pinner for your cluster on the CAST AI side.
  3. Finally, scale up the castai-pod-pinner deployment to exactly 1 replica.
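
Steps 1 and 3 can be sketched as kubectl invocations. The following Python snippet is a minimal sketch that only builds the command lines, using the deployment and namespace names from the steps above. The helper names (`get_replicas_command`, `scale_command`) are hypothetical, and the guard that allows only 0 or 1 replicas is an assumption drawn from this page rather than a documented limit.

```python
import shlex

# Names taken from the installation steps above.
NAMESPACE = "castai-agent"
DEPLOYMENT = "castai-pod-pinner"


def get_replicas_command() -> list[str]:
    """Build the kubectl command that prints the deployment's desired replica count."""
    return [
        "kubectl", "get", "deployment", DEPLOYMENT,
        "--namespace", NAMESPACE,
        "--output", "jsonpath={.spec.replicas}",
    ]


def scale_command(replicas: int) -> list[str]:
    """Build the kubectl command that scales the Pod Pinner deployment."""
    # Assumption for this sketch: only 0 (disabled) or 1 (enabled) make sense.
    if replicas not in (0, 1):
        raise ValueError("Pod Pinner should run with 0 or exactly 1 replica")
    return [
        "kubectl", "scale", "deployment", DEPLOYMENT,
        "--namespace", NAMESPACE,
        f"--replicas={replicas}",
    ]


if __name__ == "__main__":
    # Print the commands instead of running them. To execute one against the
    # current kube context, pass the list to subprocess.run(cmd, check=True).
    for cmd in (get_replicas_command(), scale_command(1)):
        print(shlex.join(cmd))
```

Running `scale_command(1)` only after support has enabled Pod Pinner for the cluster keeps the order of the steps intact; `scale_command(0)` gives the command for scaling back down.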

We suggest keeping the Pod Pinner pod as stable as possible, especially during rebalancing. You can do so by applying the same approach you use for castai-agent. For instance, you can add the autoscaling.cast.ai/removal-disabled: "true" label/annotation to the pod. If the Pod Pinner pod restarts during rebalancing, pods will not get pinned to the nodes as the Rebalancer expects. The Kubernetes cluster scheduler will schedule them instead, which may result in suboptimal pod placement.
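
One way to add that key is a merge patch on the deployment's pod template, so that every Pod Pinner pod carries it. The snippet below is a minimal sketch that builds such a patch and the matching `kubectl patch` command. The helper names are hypothetical, and because the text above says "label/annotation" without choosing one, the sketch supports both forms rather than asserting which one CAST AI reads.

```python
import json

# Key and value taken from the paragraph above.
REMOVAL_DISABLED_KEY = "autoscaling.cast.ai/removal-disabled"


def removal_disabled_patch(as_annotation: bool = False) -> dict:
    """Return a merge patch that sets the key on the deployment's pod template."""
    field = "annotations" if as_annotation else "labels"
    return {
        "spec": {
            "template": {
                "metadata": {field: {REMOVAL_DISABLED_KEY: "true"}},
            }
        }
    }


def patch_command(patch: dict) -> list[str]:
    """Build the kubectl command that applies the patch to castai-pod-pinner."""
    return [
        "kubectl", "patch", "deployment", "castai-pod-pinner",
        "--namespace", "castai-agent",
        "--type", "merge",
        "--patch", json.dumps(patch),
    ]


if __name__ == "__main__":
    # Show the patch body; the command list can be passed to subprocess.run.
    print(json.dumps(removal_disabled_patch(), indent=2))
```

Changing the pod template triggers a rollout, which restarts the Pod Pinner pod, so a patch like this is best applied while no rebalancing is in progress.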

Note that you can scale down the castai-pod-pinner deployment at any time. The cluster then reverts to normal behavior: the Kubernetes scheduler takes over pod scheduling, with no other negative impact on the cluster.

Logs

You can access logs in the Pod Pinner pod to see what decisions are being made. Here is a list of important logs:

| Example | Meaning |
| --- | --- |
| node placeholder created | A node placeholder has been created. This placeholder will be used by the real node when it joins the cluster. |
| pod pinned | A pod has been successfully bound to a node. Such logs always come after the creation of node placeholder. |
| node placeholder not found | This log appears when Pod Pinner tries to bind a pod to a non-existing node. This may occur if Pod Pinner failed to create the node placeholder. |
| pinning pod | This log occurs when the Pod Pinner's webhook intercepts a pod creation and binds it to a node. This happens during rebalancing. |
| node placeholder deleted | A node placeholder has been deleted. This happens when a node fails to get created in the cloud and Pod Pinner needs to clean up the created placeholder. |
| failed streaming pod pinning actions, restarting... | The connection between the Pod Pinner pod and CAST AI has been reset. This is expected to occasionally happen and will not negatively impact your cluster. |
| http: TLS handshake error from 10.0.1.135:48024: EOF | This log appears as part of the certificate rotation performed by the webhook. This is a non-issue log and will not negatively impact the cluster. |
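
When scanning a large amount of log output, it can help to tag each line with the meaning listed above. The following is a minimal sketch of such a lookup. It assumes that the messages appear as substrings of each log line, since the exact log format is not specified here, and the category names ("info", "warning", "benign") are illustrative labels invented for this sketch, not levels emitted by Pod Pinner.

```python
from typing import Optional

# (message fragment, illustrative category, short meaning) based on the list above.
KNOWN_LOGS = [
    ("node placeholder created", "info", "placeholder created for a node that will join"),
    ("node placeholder not found", "warning", "bind attempted against a non-existing node"),
    ("node placeholder deleted", "info", "placeholder cleaned up after a failed node creation"),
    ("pod pinned", "info", "pod successfully bound to a node"),
    ("pinning pod", "info", "webhook intercepted a pod creation during rebalancing"),
    ("failed streaming pod pinning actions", "benign", "connection to CAST AI was reset"),
    ("TLS handshake error", "benign", "webhook certificate rotation"),
]


def classify(line: str) -> Optional[tuple[str, str]]:
    """Return (category, meaning) for a known Pod Pinner log line, else None."""
    for fragment, category, meaning in KNOWN_LOGS:
        if fragment in line:
            return category, meaning
    return None


if __name__ == "__main__":
    sample = "http: TLS handshake error from 10.0.1.135:48024: EOF"
    print(classify(sample))
```

Lines that return `None` are simply not covered by the list above and may still deserve a look.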

Good to Know

Failed pod status reason: OutOf{resource}

OutOfcpu, OutOfmemory, OutOf{resource} pod statuses happen when the scheduler schedules a pod on a node but the kubelet rejects it due to lack of some resource. These are Failed pods that CAST AI and the Kubernetes control-plane know how to ignore.

This happens when many pods are upscaled at the same time. The scheduler has various optimisations to deal with large bursts of pods, so it makes scheduling decisions in parallel. Sometimes those decisions conflict, resulting in pods scheduled on nodes where they don't fit. This is especially common in GKE. If you see this status, there is no cause for concern: the control plane will eventually clean those pods up after a few days.

Pods might get this status when the Kubernetes scheduler takes over scheduling decisions due to a blip in Pod Pinner's availability. However, this does not negatively impact the cluster as Kubernetes recreates the pods.

Failed pod status reason: NodeAffinity

If you are using spot-webhook, your cluster may run into this issue, which puts pods in Failed status. It occurs because Pod Pinner is unaware of changes that other webhooks apply to pods when it binds them to nodes. As a result, the node selectors Pod Pinner assumes for a pod may differ from the ones the pod actually has.

As with OutOf{resource} pod status, this is simply a visual inconvenience as the pod will get recreated by Kubernetes.
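
To see which Failed pods fall into the two categories described in this section, you can filter the pod list by status reason. The snippet below is a minimal sketch that works on the JSON produced by `kubectl get pods --all-namespaces --output json`. It assumes the reason is exposed in each pod's `status.reason` field, and the function names and sample data are hypothetical.

```python
import json


def is_ignorable_failure(pod: dict) -> bool:
    """True for Failed pods whose reason is OutOf{resource} or NodeAffinity."""
    status = pod.get("status", {})
    reason = status.get("reason") or ""
    return status.get("phase") == "Failed" and (
        reason.startswith("OutOf") or reason == "NodeAffinity"
    )


def ignorable_failed_pods(pod_list: dict) -> list[str]:
    """Return 'namespace/name (reason)' for each matching pod in a pod list."""
    return [
        f'{p["metadata"]["namespace"]}/{p["metadata"]["name"]} ({p["status"]["reason"]})'
        for p in pod_list.get("items", [])
        if is_ignorable_failure(p)
    ]


# Hypothetical sample data in the shape of a Kubernetes pod list.
SAMPLE = {
    "items": [
        {"metadata": {"namespace": "default", "name": "web-1"},
         "status": {"phase": "Failed", "reason": "OutOfcpu"}},
        {"metadata": {"namespace": "default", "name": "web-2"},
         "status": {"phase": "Running"}},
        {"metadata": {"namespace": "batch", "name": "job-1"},
         "status": {"phase": "Failed", "reason": "NodeAffinity"}},
        {"metadata": {"namespace": "batch", "name": "job-2"},
         "status": {"phase": "Failed", "reason": "Evicted"}},
    ]
}

if __name__ == "__main__":
    # With real data, replace SAMPLE with json.load(sys.stdin) and pipe in
    # the kubectl output.
    print(json.dumps(ignorable_failed_pods(SAMPLE), indent=2))
```

Pods with other failure reasons, such as the `Evicted` entry in the sample, are deliberately left out because this section does not cover them.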