Evictor continuously compacts pods into fewer nodes, creating empty nodes that can be removed following the Node deletion policy. This mechanism is called “bin-packing.”

To prevent downtime, Evictor only targets applications that have multiple replicas. You can override this by enabling aggressive mode, in which Evictor also considers applications with a single replica. Note that aggressive mode is not recommended for production clusters.

How does Evictor work?

Evictor automatically follows the sequence of steps below:

  • First, it identifies underutilized nodes as candidates for eviction.
  • Then it automatically moves pods to other nodes. Learn more about the bin packing mechanism here.
  • Once the node is empty, it gets deleted from the cluster via the Node deletion policy (which should be enabled).
  • Evictor returns to the first step, constantly looking for nodes that are good candidates for eviction.

How Evictor avoids downtime

Evictor follows certain rules to avoid downtime. For the node to be considered a candidate for possible removal due to bin-packing, all of the pods running on the node must meet the following criteria:

  • A pod must be replicated: it should be managed by a Controller (e.g. ReplicaSet, ReplicationController, Deployment) that has more than one replica (see Overrides)
  • A pod must not be part of a StatefulSet
  • A pod must not be marked as non-evictable (see Overrides)
  • All static pods (YAMLs defined in the node's /etc/kubernetes/manifests by default) are considered evictable
  • All DaemonSet-controller pods are considered evictable
  • Pod disruption budgets are respected
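As a minimal illustration of these rules, the pods of a Deployment like the following (names are hypothetical) would be considered evictable: each pod is managed by a ReplicaSet with more than one replica and carries no annotation blocking eviction.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web            # hypothetical name
spec:
  replicas: 2          # more than one replica, so Evictor may move these pods
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: nginx
          image: nginx:1.14.2
```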

🚧

Aggressive Mode

In more fault-tolerant systems, you can achieve even higher waste reduction by turning aggressive mode on. In this scenario, Evictor will bin-pack not only multi-replica applications but single-replica ones as well.

Note: if you have a job pod running on a node to be evicted, that job will get interrupted.

Note: when using this mode, make sure to add the removal-disabled annotation to job pods to avoid restarting those jobs.

Note: aggressive mode does not affect how StatefulSets are handled; Evictor follows its default behavior for them.
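To keep a specific job from being interrupted under aggressive mode, the removal-disabled annotation mentioned above can be set on the Job's pod template. A hypothetical sketch (Job name and image are placeholders):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: nightly-report        # hypothetical name
spec:
  template:
    metadata:
      annotations:
        autoscaling.cast.ai/removal-disabled: "true"  # Evictor won't evict the node running this pod
    spec:
      restartPolicy: Never
      containers:
        - name: report
          image: busybox:1.36
          command: ["sh", "-c", "echo done"]
```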

Override Evictor rules for pods and nodes

  • autoscaling.cast.ai/removal-disabled="true"

    • Node: Annotation / Label
    • Pod: Annotation / Label
    • Description: Evictor won't try to evict a node with this annotation/label or a node running a pod annotated/labeled with this value.
  • autoscaling.cast.ai/disposable="true"

    • Pod: Annotation / Label
    • Description: Evictor will treat the Pod as evictable despite any of the other rules.
  • cluster-autoscaler.kubernetes.io/safe-to-evict="false"

    • Pod: Annotation
    • Description: Evictor won't try to evict a pod annotated with this value.
  • cluster-autoscaler.kubernetes.io/safe-to-evict="true"

    • Pod: Annotation
    • Description: Evictor will treat the Pod as evictable despite any of the other rules
  • beta.evictor.cast.ai/disposable="true" (deprecated)

    • Pod: Annotation
    • Description: Evictor will treat this Pod as Evictable despite any of the other rules.
  • beta.evictor.cast.ai/eviction-disabled="true" (deprecated)

    • Node: Annotation / Label
    • Pod: Annotation
    • Description: Evictor won't try to evict a node with this annotation or a node running a pod annotated with this value.

Examples of override commands

Annotate a pod so that Evictor won't evict the node running it (the annotation can be applied to a node as well).

kubectl annotate pods <pod-name> autoscaling.cast.ai/removal-disabled="true"

Label or annotate a node to prevent the eviction of pods as well as the removal of the node (even when it's empty):

kubectl label nodes <node-name> autoscaling.cast.ai/removal-disabled="true"
kubectl annotate nodes <node-name> autoscaling.cast.ai/removal-disabled="true"
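The same effect can be achieved declaratively for a single pod using the cluster-autoscaler annotation listed above; for example, this hypothetical manifest marks a pod as non-evictable:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-pod             # hypothetical name
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"  # Evictor won't evict this pod
spec:
  containers:
    - name: nginx
      image: nginx:1.14.2
```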

You can also annotate a pod to make it disposable, irrespective of other criteria that would normally make the pod un-evictable. Here is an example of a disposable pod manifest:

apiVersion: v1
kind: Pod
metadata:
  name: disposable-pod
  annotations:
    autoscaling.cast.ai/disposable: "true"
spec:
  containers:
    - name: nginx
      image: nginx:1.14.2
      ports:
        - containerPort: 80
      resources:
        requests:
          cpu: '1'
        limits:
          cpu: '1'

Due to the applied annotation, the pod will be targeted for eviction even though it is not replicated.

Configuration

A list of configuration settings can be found at: helm-charts/castai-evictor

Advanced configuration

Advanced configuration enables more granular optimization by letting you target specific resources for eviction or shield them from it. You can provide additional rules that target specific nodes and/or pods, overriding Evictor's default behavior.

Use case examples

Using Evictor Advanced Configuration to mark Job pods as removalDisabled when the matchers are fulfilled:

evictionConfig:
  - podSelector:
      namespace: "namespace"
      kind: Job
      labelSelector:
        matchLabels:
          pod.cast.ai/name: "cron-job"
          app.kubernetes.io/active: "true"
    settings:
      removalDisabled:
        enabled: true

Marking pods of the Job kind in a namespace as removalDisabled:

evictionConfig:
  - podSelector:
      namespace: "namespace"
      kind: Job
    settings:
      removalDisabled:
        enabled: true

Marking node as disposable:

evictionConfig:
  - nodeSelector:
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: castai-node
          app.kubernetes.io/instance: instance
    settings:
      disposable:
        enabled: true

Marking pod(s) as disposable using the disposable flag:

evictionConfig:
  - podSelector:
      namespace: "statefulSet-ns"
      kind: StatefulSet
      labelSelector:
        matchExpressions:
          - key: pod.cast.ai/flag
            operator: In
            values:
              - "true"
    settings:
      disposable:
        enabled: true

Applying aggressive mode even if Evictor is running without aggressive mode turned on:

evictionConfig:
  - podSelector:
      namespace: "namespace"
      kind: ReplicaSet
      labelSelector:
        matchExpressions:
          - key: pod.cast.ai/flag
            operator: In
            values:
              - "true"
    settings:
      aggressive:
        enabled: true

Selectors

Selectors specify criteria for matching specific resources, e.g., pods or nodes. If a resource matches a selector, the eviction mode specified in its settings is applied. If no match is found, or a match is found but the mode's enabled flag is set to false, default Evictor rules apply to that pod or node.

| Selector | Target | Supported Keys |
| --- | --- | --- |
| podSelector | Pod | namespace, kind, labelSelector |
| nodeSelector | Node | labelSelector |

Selector Keys

Selectors have specific matchers for deciding whether a targeting mode should be applied to a resource. matchLabels and matchExpressions follow the same structure as in the Kubernetes label selector documentation.

| Name | Type | Description |
| --- | --- | --- |
| namespace | string | Specifies the namespace to match for a pod |
| kind | string | Pod owner kind matcher |
| labelSelector | object | Holds matchLabels and a matchExpressions array of label keys, operators, and values |
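Putting the three keys together, a single podSelector entry might look like the following sketch (the namespace, kind, and label values are placeholders):

```yaml
podSelector:
  namespace: "my-namespace"      # placeholder namespace
  kind: ReplicaSet               # matches pods owned by a ReplicaSet
  labelSelector:
    matchLabels:
      app: my-app                # placeholder label
    matchExpressions:
      - key: tier
        operator: In
        values:
          - "backend"
```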

Settings

Settings hold one of the supported targeting modes and if that mode is enabled.

| Name | Supported Targeting Modes | Mode Enable Key |
| --- | --- | --- |
| settings | removalDisabled, aggressive, disposable | enabled |

Targeting Modes

Targeting mode - specifies the eviction type for the matched resource. The specified mode takes precedence and overrides default Evictor behavior.

| Name | Description |
| --- | --- |
| removalDisabled | Resource is not removed |
| aggressive | Aggressive mode is applied to the targeted resource |
| disposable | Resource is eligible to be removed |

NOTE: aggressive targeting works the same as turning aggressive mode on from the console, but only for a specific pod or node. If aggressive mode is turned on in the console and also specified in the advanced configuration, the configuration has no additional effect.

Targeting Mode Keys

| Name | Type | Description |
| --- | --- | --- |
| enabled | boolean | Specifies whether to apply the selector's mode when a match is successful |

How to pass Advanced Configuration?

Standard installation flow

Pass the advanced configuration using either the console interface or a config map.

During Evictor initialization, a ConfigMap (CM) named castai-evictor-config is created in the castai-agent namespace.

Edit the castai-evictor-config ConfigMap data, placing the desired evictionConfig YAML under the config.yaml: | key, e.g.:

apiVersion: v1
kind: ConfigMap
metadata:
  name: "castai-evictor-config"
  labels:
    helm.sh/chart: castai-evictor-1
    app.kubernetes.io/name: castai-evictor
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/version: "version"
    app.kubernetes.io/managed-by: Helm
data:
  config.yaml: |
    evictionConfig:
      - nodeSelector:
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: castai-node
              app.kubernetes.io/instance: instance
        settings:
          removalDisabled:
            enabled: true
      - podSelector:
          namespace: "namespace"
          kind: ReplicaSet
          labelSelector:
            matchLabels:
              pod.cast.ai/name: castai-pod
        settings:
          disposable:
            enabled: true
      - podSelector:
          namespace: "namespace"
          kind: ReplicaSet
          labelSelector:
            matchExpressions:
              - key: pod.cast.ai/flag
                operator: In
                values:
                  - "true"
              - key: pod.cast.ai/name
                operator: Exists
            matchLabels:
              pod-label: "pod-label-value"
        settings:
          disposable:
            enabled: true

Manual installation

To pass Evictor Advanced Configuration during manual Evictor installation, provide it through Helm: either point to the created YAML file with the file location flag, or pass the configuration as a string with the set flag. You can find a guide on how to install Evictor manually here.

  • --set-file customConfig=<path_to_file>
    
  • --set customConfig="<config_string>"
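As a sketch, assuming the chart repository has already been added as in the manual installation guide below, the file-based flag could be combined with the install command like this (keep <path_to_file> as a placeholder for your YAML file):

```shell
helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent \
  --set dryRun=false \
  --set-file customConfig=<path_to_file>
```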
    

Troubleshooting

Evictor policy is not allowed to be turned on

  • The reason why Evictor is unavailable in the policies page is that CAST AI has detected an already existing Evictor installation.
  • In such a scenario, CAST AI will not try to manage Evictor settings or upgrades.
  • If you want CAST AI to manage Evictor configurations and upgrade to the most recent version, you need to remove the current installation first.

How to check the logs

To check Evictor logs, run the following command:

kubectl logs -l app.kubernetes.io/name=castai-evictor -n castai-agent

Manually install Evictor

Evictor will compact your pods into fewer nodes, creating empty nodes that will be removed by the Node deletion policy. To install Evictor, run this command:

helm repo add castai-helm https://castai.github.io/helm-charts
helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent --set dryRun=false

This process will take some time. By default, Evictor will not cause any downtime: it does not evict single-replica Deployments, StatefulSets, or pods without a ReplicaSet, which means nodes running such workloads can't be removed gracefully. Familiarize yourself with the rules and available overrides in order to set up Evictor to meet your needs.

For Evictor to run in aggressive mode (considering applications with a single replica as well), pass the following parameters:

--set dryRun=false,aggressiveMode=true

For Evictor to run in scoped mode (only removing nodes created by CAST AI when using the scoped autoscaler), pass the following parameters:

--set dryRun=false,scopedMode=true

By default, Evictor will only impact nodes that are older than 5 minutes.

If you wish to change the grace period before a node can be considered for eviction, set the nodeGracePeriodMinutes parameter to the desired time in minutes. This is useful for slow-to-start nodes, as it prevents them from being marked for eviction before they can start taking on workloads.

--set dryRun=false,nodeGracePeriodMinutes=8

Manually upgrade Evictor

  • Check the Evictor version you are currently using:

    helm ls -n castai-agent
    
  • Update the Helm chart repository to make sure that your Helm command is aware of the latest charts:

    helm repo update
    
  • Install the latest Evictor version:

    helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent --set dryRun=false
    
  • Check whether the Evictor version was changed:

    helm ls -n castai-agent