Evictor
Learn how to enable and configure CAST AI's Evictor, a bin-packing component that continuously compacts pods into fewer nodes for cost savings.
Evictor continuously compacts pods into fewer nodes, creating empty nodes that can be removed following the Node deletion policy. This mechanism is called "bin-packing."
To prevent any downtime, Evictor will only target applications that have multiple replicas. However, you can override this by enabling an aggressive mode. When activated, Evictor will also consider applications with only a single replica. It is important to note that this mode is not advised for use in production clusters.
How does Evictor work?
Evictor automatically follows the sequence of steps below:
- First, it identifies underutilized nodes as candidates for eviction.
- Then it automatically moves pods to other nodes. Learn more about the bin-packing mechanism here.
- Once the node is empty, it gets deleted from the cluster via the Node deletion policy (which should be enabled).
- Evictor returns to the first step, constantly looking for nodes that are good candidates for eviction.
Note
Evictor does not use percentage-based utilization thresholds. Instead, it considers the overall cluster capacity and workload distribution when making eviction decisions. This means that even a node with high utilization might be evicted if its workloads can be efficiently redistributed across other nodes with sufficient capacity.
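To make the note above concrete, here is a minimal sketch of a capacity-based check, as opposed to a utilization threshold. It is an illustration only, not Evictor's actual algorithm: the first-fit-decreasing strategy, the restriction to CPU requests, and the names plan_drain, candidate_pods, and free_cpu_by_node are all assumptions made for this example.

```python
from typing import Dict, List, Optional


def plan_drain(
    candidate_pods: List[float],
    free_cpu_by_node: Dict[str, float],
) -> Optional[Dict[int, str]]:
    """Try to place every pod of a candidate node onto other nodes.

    candidate_pods: CPU requests (in cores) of the pods on the candidate node.
    free_cpu_by_node: unallocated CPU (in cores) on each *other* node.
    Returns a mapping of pod index -> target node name, or None when at
    least one pod fits nowhere (the node then cannot be emptied).
    """
    free = dict(free_cpu_by_node)  # work on a copy; the input is not mutated
    placement: Dict[int, str] = {}
    # Assumption: first-fit decreasing, i.e. the largest pods are placed first.
    order = sorted(
        range(len(candidate_pods)),
        key=lambda i: candidate_pods[i],
        reverse=True,
    )
    for i in order:
        request = candidate_pods[i]
        target = next((name for name, cpu in free.items() if cpu >= request), None)
        if target is None:
            return None
        free[target] -= request
        placement[i] = target
    return placement


# A heavily used node (3.5 cores requested) can still be drained when the
# rest of the cluster has room for all of its pods.
busy_node_pods = [2.0, 1.0, 0.5]
other_nodes = {"node-b": 2.5, "node-c": 1.5}
print(plan_drain(busy_node_pods, other_nodes))
```

The point of the sketch is that the decision depends on where the pods could go, not on how full the candidate node is.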
How Evictor avoids downtime
Evictor follows certain rules to avoid downtime. For the node to be considered a candidate for possible removal due to bin-packing, all of the pods running on the node must meet the following criteria:
- A pod must be replicated: it should be managed by a Controller (e.g. ReplicaSet, ReplicationController, Deployment) that has more than one replica (see Overrides).
- A pod must not be part of a StatefulSet, unless it is marked or targeted as disposable by a label, annotation, or Advanced Configuration.
- A pod must not be marked as non-evictable (see Overrides).
- All static pods (YAMLs defined in the node's /etc/kubernetes/manifests by default) are considered evictable.
- All DaemonSet-controller pods are considered evictable.
- Pod disruption budgets are respected.
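The criteria above can be read as a per-pod predicate that must hold for every pod on a node. The following is a rough sketch of that reading, not Evictor's implementation: the PodInfo type and both function names are hypothetical, pod disruption budgets are left out, and the choice to check the non-evictable mark before the disposable mark is an assumption (the source does not state which one wins when both are set).

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Controller kinds named in the criteria as "replicated" owners.
REPLICATED_KINDS = {"ReplicaSet", "ReplicationController", "Deployment"}


@dataclass
class PodInfo:
    """Hypothetical, simplified view of a pod for this example."""
    owner_kind: Optional[str] = None  # None models a bare pod without a controller
    owner_replicas: int = 0
    is_static: bool = False
    removal_disabled: bool = False    # marked as non-evictable
    disposable: bool = False          # marked or targeted as disposable


def pod_allows_node_removal(pod: PodInfo) -> bool:
    if pod.removal_disabled:
        return False  # assumption: the non-evictable mark is checked first
    if pod.disposable:
        return True   # disposable overrides the remaining rules
    if pod.is_static or pod.owner_kind == "DaemonSet":
        return True   # static and DaemonSet pods are considered evictable
    if pod.owner_kind == "StatefulSet":
        return False  # excluded unless disposable (handled above)
    return pod.owner_kind in REPLICATED_KINDS and pod.owner_replicas > 1


def node_is_candidate(pods: Iterable[PodInfo]) -> bool:
    """A node qualifies only if every pod on it passes the check."""
    return all(pod_allows_node_removal(p) for p in pods)


web = PodInfo(owner_kind="ReplicaSet", owner_replicas=3)
logging_agent = PodInfo(owner_kind="DaemonSet")
single = PodInfo(owner_kind="Deployment", owner_replicas=1)

print(node_is_candidate([web, logging_agent]))  # True
print(node_is_candidate([web, single]))         # False: single-replica pod blocks the node
```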
Aggressive Mode
In more fault-tolerant systems, you can achieve even higher waste reduction by turning the aggressive mode on. In this scenario, Evictor will bin-pack not only multi-replica applications but single-replica ones as well.
Note: if you have a job pod running on a node to be evicted, that job will get interrupted.
Note: when using this mode, make sure to have the removal-disabled annotation on a job pod to avoid restarting the job.
Note: aggressive mode does not affect how StatefulSets are handled, meaning Evictor follows the default behavior.
Override Evictor rules for pods and nodes
- autoscaling.cast.ai/removal-disabled="true"
  - Node: Annotation / Label
  - Pod: Annotation / Label
  - Description: Evictor won't try to evict a node with this annotation/label or a node running a pod annotated/labeled with this value.
- autoscaling.cast.ai/disposable="true"
  - Pod: Annotation / Label
  - Description: Evictor will treat the Pod as evictable despite any of the other rules.
- cluster-autoscaler.kubernetes.io/safe-to-evict="false"
  - Pod: Annotation
  - Description: Evictor won't try to evict a pod annotated with this value.
- cluster-autoscaler.kubernetes.io/safe-to-evict="true"
  - Pod: Annotation
  - Description: Evictor will treat the Pod as evictable despite any of the other rules.
- beta.evictor.cast.ai/disposable="true" (deprecated)
  - Pod: Annotation
  - Description: Evictor will treat this Pod as evictable despite any of the other rules.
- beta.evictor.cast.ai/eviction-disabled="true" (deprecated)
  - Node: Annotation / Label
  - Pod: Annotation
  - Description: Evictor won't try to evict a node with this annotation or a node running a pod annotated with this value.
Examples of override commands
Annotate a pod so Evictor won't evict a node running an annotated pod (can be applied to a node as well).
kubectl annotate pods <pod-name> autoscaling.cast.ai/removal-disabled="true"
Label or annotate a node to prevent the eviction of pods as well as the removal of the node (even when it's empty):
kubectl label nodes <node-name> autoscaling.cast.ai/removal-disabled="true"
kubectl annotate nodes <node-name> autoscaling.cast.ai/removal-disabled="true"
You can also annotate a pod to make it disposable, irrespective of other criteria that would normally make the pod un-evictable. Here is an example of a disposable pod manifest:
apiVersion: v1
kind: Pod
metadata:
  name: disposable-pod
  annotations:
    autoscaling.cast.ai/disposable: "true"
spec:
  containers:
    - name: nginx
      image: nginx:1.14.2
      ports:
        - containerPort: 80
      resources:
        requests:
          cpu: '1'
        limits:
          cpu: '1'
Due to the applied annotation, the pod will be targeted for eviction even though it is not replicated.
Configuration
A list of configuration settings can be found at: helm-charts/castai-evictor
Advanced configuration
Advanced configuration can be utilized for more granular optimization, as it allows specific resources to be targeted or shielded from eviction. Users can provide additional rules for Evictor, enabling it to target specific nodes and/or pods, thereby overriding the default behavior of Evictor.
Use case examples
Marking Job kind pods as removalDisabled if matchers are fulfilled:
evictionConfig:
  - podSelector:
      namespace: "namespace"
      kind: Job
      labelSelector:
        matchLabels:
          pod.cast.ai/name: "cron-job"
          app.kubernetes.io/active: "true"
    settings:
      removalDisabled:
        enabled: true
Marking pods of the Job kind in a namespace as removalDisabled:
evictionConfig:
  - podSelector:
      namespace: "namespace"
      kind: Job
    settings:
      removalDisabled:
        enabled: true
Marking a node as disposable:
evictionConfig:
  - nodeSelector:
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: castai-node
          app.kubernetes.io/instance: instance
    settings:
      disposable:
        enabled: true
Disposing pod(s) using the disposable flag:
evictionConfig:
  - podSelector:
      namespace: "replicaset-ns"
      kind: ReplicaSet
      labelSelector:
        matchExpressions:
          - key: pod.cast.ai/flag
            operator: In
            values:
              - "true"
    settings:
      disposable:
        enabled: true
Applying aggressive mode even if Evictor is running without aggressive mode turned on:
evictionConfig:
  - podSelector:
      namespace: "namespace"
      kind: ReplicaSet
      labelSelector:
        matchExpressions:
          - key: pod.cast.ai/flag
            operator: In
            values:
              - "true"
    settings:
      aggressive:
        enabled: true
Selectors
Selectors specify criteria for matching specific resources, e.g. pods or nodes. If a resource matches the selector, Evictor applies the targeting mode specified in the settings. If no matches are found, or if matches are found but the mode's enabled flag is set to false, the default Evictor rules are applied to the given pod or node.
Selector | Target | Supported Keys |
---|---|---|
podSelector | Pod | namespace, kind, labelSelector |
nodeSelector | Node | labelSelector |
Selector Keys
Selectors have specific matchers that decide whether a targeting mode should be applied to a resource. matchLabels and matchExpressions follow the same structure as in Kubernetes (see the Kubernetes documentation).
Name | Type | Description |
---|---|---|
namespace | string | Specifies namespace to match for pod |
kind | string | Pod owner kind matcher |
labelSelector | object | Holds matchLabels and matchExpressions array of label keys, operators and values |
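Since labelSelector follows the Kubernetes structure, its matching can be sketched with standard Kubernetes label-selector semantics: every matchLabels entry and every matchExpressions entry must hold. The function below is a hypothetical illustration of those semantics, not code taken from Evictor, and whether Evictor accepts every operator shown here is an assumption.

```python
from typing import Any, Dict


def matches_label_selector(labels: Dict[str, str], selector: Dict[str, Any]) -> bool:
    """Return True when `labels` satisfies a Kubernetes-style label selector."""
    # matchLabels: every key must be present with exactly the given value.
    for key, value in selector.get("matchLabels", {}).items():
        if labels.get(key) != value:
            return False
    # matchExpressions: all expressions are ANDed together.
    for expr in selector.get("matchExpressions", []):
        key, operator = expr["key"], expr["operator"]
        values = expr.get("values", [])
        if operator == "In":
            ok = key in labels and labels[key] in values
        elif operator == "NotIn":
            ok = key not in labels or labels[key] not in values
        elif operator == "Exists":
            ok = key in labels
        elif operator == "DoesNotExist":
            ok = key not in labels
        else:
            raise ValueError(f"unsupported operator: {operator}")
        if not ok:
            return False
    return True


selector = {
    "matchLabels": {"pod-label": "pod-label-value"},
    "matchExpressions": [
        {"key": "pod.cast.ai/flag", "operator": "In", "values": ["true"]},
        {"key": "pod.cast.ai/name", "operator": "Exists"},
    ],
}
pod_labels = {
    "pod-label": "pod-label-value",
    "pod.cast.ai/flag": "true",
    "pod.cast.ai/name": "api",
}
print(matches_label_selector(pod_labels, selector))                         # True
print(matches_label_selector({"pod-label": "pod-label-value"}, selector))   # False
```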
Settings
Settings hold one of the supported targeting modes and specify whether that mode is enabled.
Name | Supported Targeting Modes | Mode Enable Key |
---|---|---|
settings | removalDisabled, aggressive, disposable | enabled |
Targeting Modes
A targeting mode specifies the eviction type for the matched resource. The specified mode takes precedence and overrides the default Evictor behavior.
Name | Description |
---|---|
removalDisabled | Resource is not removed |
aggressive | Apply aggressive mode to targeted resource |
disposable | A resource is eligible to be removed |
NOTE: aggressive targeting works the same as turning it on from the console, but only for a certain pod or node. If aggressive mode is turned on in the console and also specified in the advanced configuration, the advanced configuration has no additional effect.
Targeting Mode Keys
Name | Type | Description |
---|---|---|
enabled | boolean | Specifies whether to apply the targeting mode when a match is successful |
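Putting selectors, settings, and the enabled key together, the resolution of a targeting mode for a pod can be sketched as follows. This is a simplified, hypothetical reading of the tables above rather than Evictor's code: the function name is invented, only matchLabels is handled, selector keys that are omitted are assumed to match anything, and the "first enabled match wins" ordering is an assumption.

```python
from typing import Any, Dict, List, Optional

TARGETING_MODES = ("removalDisabled", "aggressive", "disposable")


def resolve_pod_mode(
    rules: List[Dict[str, Any]],
    namespace: str,
    owner_kind: str,
    labels: Dict[str, str],
) -> Optional[str]:
    """Return the targeting mode for a pod, or None for default Evictor rules."""
    for rule in rules:
        selector = rule.get("podSelector")
        if selector is None:
            continue  # nodeSelector rules are out of scope for this sketch
        if "namespace" in selector and selector["namespace"] != namespace:
            continue
        if "kind" in selector and selector["kind"] != owner_kind:
            continue
        wanted = selector.get("labelSelector", {}).get("matchLabels", {})
        if any(labels.get(k) != v for k, v in wanted.items()):
            continue
        # The pod matched; the mode applies only when its enabled flag is true.
        for mode in TARGETING_MODES:
            if rule.get("settings", {}).get(mode, {}).get("enabled") is True:
                return mode
    return None  # no enabled match: fall back to default rules


rules = [
    {
        "podSelector": {"namespace": "namespace", "kind": "Job"},
        "settings": {"removalDisabled": {"enabled": True}},
    }
]
print(resolve_pod_mode(rules, "namespace", "Job", {}))         # removalDisabled
print(resolve_pod_mode(rules, "namespace", "ReplicaSet", {}))  # None
```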
How to pass Advanced Configuration?
Standard installation flow
Pass the advanced configuration using either the console interface or a config map.
During Evictor initialisation, a ConfigMap (CM) named castai-evictor-config is created in the castai-agent namespace.
Edit the castai-evictor-config CM data, placing the desired evictionConfig contents in YAML format under the config.yaml: | key, e.g.:
apiVersion: v1
kind: ConfigMap
metadata:
  name: "castai-evictor-config"
  labels:
    helm.sh/chart: castai-evictor-1
    app.kubernetes.io/name: castai-evictor
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/version: "version"
    app.kubernetes.io/managed-by: Helm
data:
  config.yaml: |
    evictionConfig:
      - nodeSelector:
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: castai-node
              app.kubernetes.io/instance: instance
        settings:
          removalDisabled:
            enabled: true
      - podSelector:
          namespace: "namespace"
          kind: ReplicaSet
          labelSelector:
            matchLabels:
              pod.cast.ai/name: castai-pod
        settings:
          disposable:
            enabled: true
      - podSelector:
          namespace: "namespace"
          kind: ReplicaSet
          labelSelector:
            matchExpressions:
              - key: pod.cast.ai/flag
                operator: In
                values:
                  - "true"
              - key: pod.cast.ai/name
                operator: Exists
            matchLabels:
              pod-label: "pod-label-value"
        settings:
          disposable:
            enabled: true
Manual installation
To pass Evictor Advanced Configuration with a manual Evictor installation, provide it through helm, either by pointing to the created YAML file with the file location flag or by passing it as a string with the set flag. You can find a guide on how to install Evictor manually here.
- --set-file customConfig=<path_to_file>
- --set customConfig="<config_string>"
Troubleshooting
Evictor policy is not allowed to be turned on
- The reason why Evictor is unavailable in the policies page is that CAST AI has detected an already existing Evictor installation.
- In such a scenario, CAST AI will not try to manage Evictor settings or upgrades.
- If you want CAST AI to manage Evictor configurations and upgrade to the most recent version, you need to remove the current installation first.
How to check the logs
To check Evictor logs, run the following command:
kubectl logs -l app.kubernetes.io/name=castai-evictor -n castai-agent
Manually install Evictor
Evictor will compact your pods into fewer nodes, creating empty nodes that will be removed by the Node deletion policy. To install Evictor, run this command:
helm repo add castai-helm https://castai.github.io/helm-charts
helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent --set dryRun=false
This process will take some time. Also, by default, Evictor will not cause any downtime to single-replica Deployments, StatefulSets, or pods without a ReplicaSet, meaning that nodes running them can't be removed gracefully. Familiarize yourself with the rules and available overrides in order to set up Evictor to meet your needs.
For Evictor to run in aggressive mode (considering applications with a single replica as well), pass the following parameters:
--set dryRun=false,aggressiveMode=true
For Evictor to run in scoped mode (only removing nodes created by CAST AI when using the scoped autoscaler), pass the following parameters:
--set dryRun=false,scopedMode=true
By default, Evictor will only impact nodes that are older than 5 minutes.
If you wish to change the grace period before a node can be considered for eviction, set the nodeGracePeriodMinutes parameter to the desired time in minutes. This is useful for slow-to-start nodes: it prevents them from being marked for eviction before they can start taking on workloads.
--set dryRun=false,nodeGracePeriodMinutes=8
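The effect of the grace period can be pictured as a simple age check on each node. The sketch below is only an illustration of that idea: the function name and the use of the node's creation time as the reference point are assumptions, not details confirmed by the source.

```python
from datetime import datetime, timedelta, timezone


def past_grace_period(created_at: datetime, now: datetime, grace_minutes: int = 5) -> bool:
    """Return True once a node is older than the grace period.

    grace_minutes mirrors nodeGracePeriodMinutes; 5 is the default from the text.
    Assumption: node age is measured from its creation timestamp.
    """
    return now - created_at > timedelta(minutes=grace_minutes)


created = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
six_minutes_later = created + timedelta(minutes=6)

print(past_grace_period(created, six_minutes_later))                   # True with the default of 5
print(past_grace_period(created, six_minutes_later, grace_minutes=8))  # False with the 8-minute setting
```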
Manually upgrade Evictor
- Check the Evictor version you are currently using:
  helm ls -n castai-agent
- Update the Helm chart repository to make sure that your Helm command is aware of the latest charts:
  helm repo update
- Install the latest Evictor version:
  helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent --set dryRun=false
- Check whether the Evictor version was changed:
  helm ls -n castai-agent
Eviction of StatefulSet pods
Support for eviction of StatefulSet pods was added in the 0.26.3 Helm release and requires a Helm release update due to RBAC changes (additional read permissions for PersistentVolume, PersistentVolumeClaim, and StorageClass).
Evictor excludes StatefulSet pods from eviction by default. StatefulSet pods have to be explicitly marked or targeted as disposable so that Evictor can consider them for eviction. Evicting StatefulSet pods can be disruptive to the application and should be used cautiously. StatefulSet pods should have K8s probes reflecting the application state and a PodDisruptionBudget configured to minimize the impact of eviction.
StatefulSet pods can be marked or targeted for eviction by:
- Labeling or annotating the pod using:
  autoscaling.cast.ai/disposable="true"
- Targeting all StatefulSet pods using a pod selector in the Advanced Configuration:
  evictionConfig:
    - podSelector:
        kind: StatefulSet
      settings:
        disposable:
          enabled: true
- Targeting replicated StatefulSet pods using a pod selector in the Advanced Configuration:
  evictionConfig:
    - podSelector:
        kind: StatefulSet
        replicasMin: 2
      settings:
        disposable:
          enabled: true
- Targeting labeled StatefulSet pods using a pod selector in the Advanced Configuration:
  evictionConfig:
    - podSelector:
        kind: StatefulSet
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: database
      settings:
        disposable:
          enabled: true
- Targeting all pods (including StatefulSet pods) running on a targeted node using a node selector in the Advanced Configuration:
  evictionConfig:
    - nodeSelector:
        labelSelector:
          matchLabels:
            app.kubernetes.io/name: castai-node
      settings:
        disposable:
          enabled: true