Evictor
Evictor continuously compacts pods into fewer nodes, creating empty nodes that can be removed following the Node deletion policy, if you choose to enable it. This mechanism is called “bin-packing.”
To avoid any downtime, Evictor will only consider applications with multiple replicas. You can lift this restriction by ticking the box next to “Use aggressive mode.” If you do so, Evictor will consider single-replica applications as well. This mode is not recommended for production clusters.
How does Evictor work?
Evictor automatically follows the sequence of steps below:
- First, it identifies underutilized nodes as candidates for eviction.
- Then it automatically moves pods to other nodes. Learn more about the bin-packing mechanism here.
- Once the node is empty, it gets deleted from the cluster via the Node deletion policy (which should be enabled).
- Evictor returns to the first step, constantly looking for nodes that are good candidates for eviction.
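The loop above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Evictor's actual algorithm: nodes are modeled as lists of pod CPU requests, every node has the same hypothetical `capacity`, the least-loaded node is tried first, and deleting the dictionary entry stands in for the Node deletion policy. The names `compact_once` and `compact` are made up for this example.

```python
from typing import Dict, List, Optional


def compact_once(nodes: Dict[str, List[int]], capacity: int) -> Optional[str]:
    """Try to empty one node by moving all of its pods onto other non-empty nodes.

    Returns the name of the removed node, or None if no node could be emptied.
    """
    # Step 1: consider the least-loaded non-empty nodes first (assumed heuristic).
    candidates = sorted((n for n in nodes if nodes[n]), key=lambda n: sum(nodes[n]))
    for cand in candidates:
        # Free CPU on every other node that already runs pods.
        free = {n: capacity - sum(pods) for n, pods in nodes.items() if n != cand and pods}
        plan = []
        # Step 2: place the largest pods first, each on the first node with room.
        for cpu in sorted(nodes[cand], reverse=True):
            target = next((n for n in free if free[n] >= cpu), None)
            if target is None:
                break  # this candidate cannot be fully drained
            free[target] -= cpu
            plan.append((cpu, target))
        else:
            # Every pod found a new home: apply the moves.
            for cpu, target in plan:
                nodes[target].append(cpu)
            # Step 3: the empty node is removed (stand-in for the Node deletion policy).
            del nodes[cand]
            return cand
    return None


def compact(nodes: Dict[str, List[int]], capacity: int) -> List[str]:
    """Step 4: repeat until no further node can be emptied."""
    removed = []
    while (name := compact_once(nodes, capacity)) is not None:
        removed.append(name)
    return removed
```

With three nodes of capacity 4 holding pods `{"a": [2], "b": [1], "c": [1]}`, the sketch drains `b` and then `c` onto `a`, leaving a single node.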
How Evictor avoids downtime
Evictor follows certain rules to avoid downtime. In order for the node to be considered a candidate for possible removal due to bin-packing, all of the pods running on the node must meet the following criteria:
- A pod must be replicated: it should be managed by a Controller (e.g. ReplicaSet, ReplicationController, Deployment) that has more than one replica (see Overrides)
- A pod must not be part of a StatefulSet
- A pod must not be marked as non-evictable (see Overrides)
- All static pods (YAMLs defined in the node's /etc/kubernetes/manifests by default) are considered evictable
- All DaemonSet-controller pods are considered evictable
- Pod disruption budgets are respected
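The criteria above can be read as a predicate over the pods on a node. The following is a minimal sketch, assuming a simplified `Pod` model invented for this example; it is not Evictor's implementation. Annotations and labels are merged into one `marks` dictionary, pod disruption budgets are not modeled, and two behaviors are assumptions because the text does not specify them: `disposable` is checked before `removal-disabled` when both are set, and aggressive mode relaxes only the replica-count requirement.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Annotation/label keys taken from the overrides section.
REMOVAL_DISABLED = "autoscaling.cast.ai/removal-disabled"
DISPOSABLE = "autoscaling.cast.ai/disposable"
REPLICATED_KINDS = {"ReplicaSet", "ReplicationController", "Deployment"}


@dataclass
class Pod:
    """Hypothetical, simplified view of a pod used only for this sketch."""
    owner_kind: Optional[str] = None  # e.g. "Deployment", "StatefulSet", "DaemonSet"
    replicas: int = 1                 # replica count of the owning controller
    is_static: bool = False           # defined in the node's manifests directory
    marks: Dict[str, str] = field(default_factory=dict)  # annotations and labels, merged


def pod_is_evictable(pod: Pod, aggressive: bool = False) -> bool:
    if pod.marks.get(DISPOSABLE) == "true":
        return True   # assumed precedence: disposable overrides the other rules
    if pod.marks.get(REMOVAL_DISABLED) == "true":
        return False
    if pod.is_static or pod.owner_kind == "DaemonSet":
        return True
    if pod.owner_kind not in REPLICATED_KINDS:
        return False  # covers StatefulSet pods and pods without a controller
    return aggressive or pod.replicas > 1


def node_is_candidate(pods: List[Pod],
                      node_marks: Optional[Dict[str, str]] = None,
                      aggressive: bool = False) -> bool:
    """A node qualifies only if it is not protected and every pod on it is evictable."""
    if (node_marks or {}).get(REMOVAL_DISABLED) == "true":
        return False
    return all(pod_is_evictable(p, aggressive) for p in pods)
```

For example, `node_is_candidate([Pod("DaemonSet"), Pod("Deployment", replicas=2)])` holds in this model, while a single StatefulSet pod on the node makes it ineligible unless that pod is marked disposable.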
Aggressive Mode
In more fault-tolerant systems, you can achieve even higher waste reduction by turning aggressive mode on. In this scenario, Evictor will bin-pack not only multi-replica applications but single-replica ones as well. Note that a job pod running on a node that gets evicted will be interrupted. When using this mode, add the autoscaling.cast.ai/removal-disabled="true" annotation to job pods so that jobs are not restarted.
Override Evictor rules for pods and nodes
- autoscaling.cast.ai/removal-disabled="true"
  - Node: Annotation / Label
  - Pod: Annotation / Label
  - Description: Evictor won't try to evict a node with this annotation/label or a node running a pod annotated/labeled with this value.
- autoscaling.cast.ai/disposable="true"
  - Pod: Annotation / Label
  - Description: Evictor will treat the Pod as evictable despite any of the other rules.
- beta.evictor.cast.ai/disposable="true" (deprecated)
  - Pod: Annotation
  - Description: Evictor will treat this Pod as evictable despite any of the other rules.
- beta.evictor.cast.ai/eviction-disabled="true" (deprecated)
  - Node: Annotation / Label
  - Pod: Annotation
  - Description: Evictor won't try to evict a node with this annotation or a node running a pod annotated with this value.
Examples of override commands
Annotate a pod so that Evictor won't evict the node it is running on (the same annotation can be applied to a node as well):
kubectl annotate pods <pod-name> autoscaling.cast.ai/removal-disabled="true"
Label or annotate a node to prevent the eviction of pods as well as the removal of the node (even when it's empty):
kubectl label nodes <node-name> autoscaling.cast.ai/removal-disabled="true"
kubectl annotate nodes <node-name> autoscaling.cast.ai/removal-disabled="true"
You can also annotate a pod to make it disposable, irrespective of other criteria that would normally make the pod un-evictable. Here is an example of a disposable pod manifest:
apiVersion: v1
kind: Pod
metadata:
  name: disposable-pod
  annotations:
    autoscaling.cast.ai/disposable: "true"
spec:
  containers:
    - name: nginx
      image: nginx:1.14.2
      ports:
        - containerPort: 80
      resources:
        requests:
          cpu: '1'
        limits:
          cpu: '1'
Due to the applied annotation, the pod will be targeted for eviction even though it is not replicated.
Advanced configuration options
A list of more advanced configuration options can be found at: helm-charts/castai-evictor
Troubleshooting
Evictor policy is not allowed to be turned on
- Evictor is unavailable on the policies page when CAST AI has detected an existing Evictor installation.
- In such a scenario, CAST AI will not try to manage Evictor settings or upgrades.
- If you want CAST AI to manage Evictor configuration and upgrade it to the most recent version, you need to remove the current installation first.
How to check the logs
To check Evictor logs, run the following command:
kubectl logs -l app.kubernetes.io/name=castai-evictor -n castai-agent
Manually install Evictor
Evictor will compact your pods into fewer nodes, creating empty nodes that will be removed by the Node deletion policy. To install Evictor, run this command:
helm repo add castai-helm https://castai.github.io/helm-charts
helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent --set dryRun=false
This process will take some time. Also, by default, Evictor will not cause any downtime to single-replica Deployments, StatefulSets, or pods without a ReplicaSet, which means that nodes running such pods can't be removed gracefully. Familiarize yourself with the rules and available overrides in order to set up Evictor to meet your needs.
For Evictor to run in aggressive mode (so that it also considers single-replica applications), pass the following parameters:
--set dryRun=false,aggressiveMode=true
For Evictor to run in scoped mode, which removes only nodes created by CAST AI when using the scoped autoscaler, pass the following parameters:
--set dryRun=false,scopedMode=true
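The parameters above are ordinary comma-separated key=value pairs passed to a single Helm --set flag, so several of them can be combined. As a small illustration, the helper below (a hypothetical function named `helm_set_flag`, not part of any CAST AI or Helm tooling) renders a dictionary of chart values into such a flag, writing booleans in lowercase. It is a sketch that assumes simple scalar values; values containing commas would need escaping.

```python
from typing import Dict, Union

Value = Union[bool, int, str]


def helm_set_flag(values: Dict[str, Value]) -> str:
    """Render chart values as one comma-separated `--set` argument."""
    def render(value: Value) -> str:
        # bool must be checked first because bool is a subclass of int in Python.
        if isinstance(value, bool):
            return "true" if value else "false"
        return str(value)

    return "--set " + ",".join(f"{key}={render(val)}" for key, val in values.items())


if __name__ == "__main__":
    print(helm_set_flag({"dryRun": False, "aggressiveMode": True}))
```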
By default, Evictor will only impact nodes that are older than 5 minutes.
If you wish to change the grace period before a node can be considered for eviction, set the nodeGracePeriodMinutes parameter to the desired time in minutes. This is useful for slow-to-start nodes – it prevents them from being marked for eviction before they can start taking on workloads.
--set dryRun=false,nodeGracePeriodMinutes=8
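To make the grace period concrete, the check can be pictured as a comparison between a node's age and the configured number of minutes. The function below is a sketch with a made-up name (`past_grace_period`); treating a node of exactly the configured age as still protected is an assumption based on the phrase "older than 5 minutes".

```python
from datetime import datetime, timedelta, timezone


def past_grace_period(created_at: datetime, now: datetime, grace_minutes: int = 5) -> bool:
    """Return True once the node is strictly older than the grace period."""
    return now - created_at > timedelta(minutes=grace_minutes)


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    six_minutes_old = now - timedelta(minutes=6)
    print(past_grace_period(six_minutes_old, now))                   # default of 5 minutes
    print(past_grace_period(six_minutes_old, now, grace_minutes=8))  # nodeGracePeriodMinutes=8
```

With nodeGracePeriodMinutes=8, a node created six minutes ago would still be skipped in this model, giving a slow-to-start node time to take on workloads.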
Manually upgrade Evictor
- Check the Evictor version you are currently using:
  helm ls -n castai-agent
- Update the Helm chart repository to make sure that your Helm command is aware of the latest charts:
  helm repo update
- Install the latest Evictor version:
  helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent --set dryRun=false --set image.repository=us-docker.pkg.dev/castai-hub/library/castai-evictor
- Check whether the Evictor version was changed:
  helm ls -n castai-agent