Pod Pinner

Pod Pinner addresses the misalignment between the actions of the CAST AI Autoscaler and the Kubernetes cluster scheduler.

For example, while the CAST AI Autoscaler efficiently binpacks pods and creates nodes in the cluster in a cost-optimized manner, the Kubernetes cluster scheduler determines the actual placement of pods on nodes. This can lead to suboptimal pod placement, fragmentation, and unnecessary resource waste, as pods may end up on different nodes than those anticipated by the CAST AI Autoscaler.

Pod Pinner enables the integration of the CAST AI Autoscaler's decisions into your cluster, allowing it to override the decisions of the Kubernetes cluster scheduler. Installing Pod Pinner can directly enhance savings in the cluster. Pod Pinner is a CAST AI in-cluster component, similar to the CAST AI agent, cluster controller, and others.

Limitations

🚧

Notice

Pod Pinner may conflict with Spot-webhook; we do not recommend using them together at this time.

Using them together may result in some failed pods during scheduling. This is because Pod Pinner is unaware of changes applied by other webhooks when binding pods to nodes. While these failed pods are typically recreated by Kubernetes without negative impact, we are working on improving compatibility between Pod Pinner and Spot-webhook to fully address this issue.

Installation and version upgrade

For newly onboarded clusters, the latest version of the Pod Pinner castware component, castai-pod-pinner, is installed automatically. To verify the installation:

  1. Check that the castai-pod-pinner deployment is available in the castai-agent namespace:
$ kubectl get deployments.apps   -n castai-agent
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
castai-agent                1/1     1            1           15m
castai-agent-cpvpa          1/1     1            1           15m
castai-cluster-controller   2/2     2            2           15m
castai-evictor              0/0     0            0           15m
castai-kvisor               1/1     1            1           15m
castai-pod-pinner           2/2     2            2           15m

Helm

Option 1: CAST AI-Managed (default)

By default, CAST AI manages Pod Pinner, including automatic upgrades.

  1. Check the currently installed Pod Pinner chart version. If it's >= 1.0.0, an upgrade is not needed.
    You can check the version with the following command:
$ helm list -n castai-agent --filter castai-pod-pinner
NAME             	NAMESPACE   	REVISION	UPDATED                                	STATUS  	CHART                  	APP VERSION
castai-pod-pinner	castai-agent	11      	2024-09-26 11:40:00.245427517 +0000 UTC	deployed	castai-pod-pinner-1.0.2	v1.0.0
  2. If the version is < 1.0.0, run the following commands to install or upgrade Pod Pinner to the latest version:
helm repo add castai-helm https://castai.github.io/helm-charts
helm repo update
helm upgrade --install castai-pod-pinner castai-helm/castai-pod-pinner -n castai-agent

After installation or an upgrade to version >= 1.0.0, Pod Pinner will automatically be scaled to 2 replicas and managed by CAST AI, as indicated by the charts.cast.ai/managed=true label applied to the pods of the castai-pod-pinner deployment. All subsequent Pod Pinner versions will be upgraded automatically.
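
To confirm that CAST AI is managing your Pod Pinner installation, you can list the pods carrying the managed label. This is an illustrative check; the exact output depends on your cluster:
# List CAST AI-managed pods, including the castai-pod-pinner replicas
kubectl get pods -n castai-agent -l charts.cast.ai/managed=true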

Option 2: Self-Managed

To control the Pod Pinner version yourself:

helm upgrade -i castai-pod-pinner castai-helm/castai-pod-pinner -n castai-agent --set managedByCASTAI=false

This prevents CAST AI from automatically managing and upgrading Pod Pinner.
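
You can verify that the override took effect by inspecting the user-supplied values of the release. This is an illustrative check; the output format depends on your Helm and chart versions:
# Show the values supplied for the release; managedByCASTAI should be false
helm get values castai-pod-pinner -n castai-agent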

Re-running Onboarding Script

You can also install Pod Pinner by re-running the phase 2 onboarding script. For more information, see the cluster onboarding documentation.

Terraform Users

For Terraform users, you can manage Pod Pinner installation and configuration through your Terraform scripts. This allows for version control and infrastructure-as-code management of Pod Pinner settings.

Enabling/Disabling Pod Pinner

Pod Pinner is enabled by default but can be disabled in the CAST AI console.

If you disable Pod Pinner this way, the deployment will be scaled down to 0 replicas and not auto-upgraded by CAST AI.
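
After disabling Pod Pinner, you can confirm the scale-down with a quick check like the following (the exact output will vary by cluster):
# The deployment remains installed but should show 0/0 ready replicas
kubectl get deployment castai-pod-pinner -n castai-agent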

Autoscaler Settings UI

To enable/disable Pod Pinner:

  1. Navigate to Autoscaler settings in the CAST AI console.
  2. You'll find the Pod Pinner option under the "Unscheduled pods policy" section.
  3. Check/uncheck the "Enable pod pinning" box to activate/deactivate the Pod Pinner. Disabling Pod Pinner will scale down the deployment to 0 replicas and turn off auto-upgrades.

📘

Note

When Pod Pinner is disabled through the console and the charts.cast.ai/managed=true label is present, CAST AI will scale the deployment down to 0 replicas regardless of any manual changes. To control Pod Pinner manually while keeping it active, use the self-managed installation option described above.

Ensuring stability

We suggest keeping the Pod Pinner pod as stable as possible, especially during rebalancing. You can do so by applying the same approach you use for castai-agent.

For instance, you can add the autoscaling.cast.ai/removal-disabled: "true" label or annotation to the pod. If the Pod Pinner pod restarts during rebalancing, pods won't get pinned to nodes as the Rebalancer expects, which may result in suboptimal placement because the Kubernetes cluster scheduler will schedule them instead.
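
As a sketch of one way to apply this, you could patch the annotation onto the deployment's pod template with kubectl. Keep in mind that if CAST AI manages the component, an automatic chart upgrade may overwrite manual patches, so you may prefer to set it through your own deployment tooling instead:
# Add the removal-disabled annotation to the pod template so new pods inherit it
kubectl -n castai-agent patch deployment castai-pod-pinner --type merge \
  -p '{"spec":{"template":{"metadata":{"annotations":{"autoscaling.cast.ai/removal-disabled":"true"}}}}}'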

📘

Note

You can scale down the castai-pod-pinner deployment at any time. The cluster will continue to behave normally; the only effect is that the Kubernetes scheduler takes over pod scheduling.
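
For example, the following scales the deployment down to zero replicas:
# Temporarily hand pod scheduling back to the Kubernetes scheduler
kubectl scale deployment castai-pod-pinner -n castai-agent --replicas=0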

Logs

You can access logs in the Pod Pinner pod to see what decisions are being made. Here is a list of the most important log entries:
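
To view the logs, you can use standard kubectl commands. For example, the following tails one replica of the deployment; if you need a specific pod, list the pods first:
# Print recent log entries from one castai-pod-pinner replica
kubectl logs -n castai-agent deployment/castai-pod-pinner --tail=100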

| Example | Meaning |
| --- | --- |
| node placeholder created | A node placeholder has been created. The real node will use this placeholder when it joins the cluster. |
| pod pinned | A pod has been successfully bound to a node. Such logs always appear after the node placeholder is created. |
| node placeholder not found | This log appears when Pod Pinner tries to bind a pod to a non-existing node. This may occur if Pod Pinner fails to create the node placeholder. |
| pinning pod | This log occurs when Pod Pinner's webhook intercepts a pod creation and binds it to a node. This happens during rebalancing. |
| node placeholder deleted | A node placeholder has been deleted. This happens when a node fails to be created in the cloud and Pod Pinner must clean up the placeholder it created. |
| failed streaming pod pinning actions, restarting... | The connection between the Pod Pinner pod and CAST AI has been reset. This is expected to happen occasionally and will not negatively impact your cluster. |
| http: TLS handshake error from 10.0.1.135:48024: EOF | This log appears as part of the certificate rotation performed by the webhook. It is a non-issue and will not negatively impact the cluster. |

Troubleshooting

Failed pod status reason: OutOf{resource}

OutOfcpu, OutOfmemory, and other OutOf{resource} pod statuses happen when the scheduler places a pod on a node, but the kubelet rejects it because the node no longer has enough of that resource. These are Failed pods that both CAST AI and the Kubernetes control plane know how to ignore.

This happens when many pods are upscaled at the same time. The scheduler has various optimizations to deal with large bursts of pods, so it makes scheduling decisions in parallel. Sometimes those decisions conflict, resulting in pods scheduled on nodes where they don't fit; this is especially common on GKE. If you see this status, no action is required: the control plane eventually cleans these pods up after a few days.

Pods might get this status when the Kubernetes scheduler takes over scheduling decisions due to a blip in Pod Pinner's availability. However, this does not negatively impact the cluster as Kubernetes recreates the pods.
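
If you want to find these Failed pods, or remove them before the control plane's own cleanup, a check along the following lines should work. Note that the field selector matches any Failed pod, not only OutOf{resource} ones, and <namespace> is a placeholder for the affected namespace:
# List Failed pods across the cluster, e.g. OutOfcpu / OutOfmemory
kubectl get pods --all-namespaces --field-selector=status.phase=Failed

# Optionally delete them in a given namespace ahead of the automatic cleanup
kubectl delete pods -n <namespace> --field-selector=status.phase=Failed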

Failed pod status reason: Node affinity

If you use Spot-webhook, your cluster may encounter this issue, which puts pods in a Failed status. This occurs because Pod Pinner is unaware of changes other webhooks apply to pods when binding them to nodes, so it may bind a pod based on node selectors that no longer match the pod's actual spec.

As with the OutOf{resource} pod status, this is simply a visual inconvenience, as Kubernetes will recreate the pod.