Evictor

Learn how to enable and configure CAST AI's Evictor, a bin-packing component that continuously compacts pods into fewer nodes for cost savings.

Evictor continuously compacts pods into fewer nodes, creating empty nodes that can be removed following the Node deletion policy. This mechanism is called “bin-packing.”

To prevent any downtime, Evictor will only target applications that have multiple replicas. However, you can override this by enabling an aggressive mode. When activated, Evictor will also consider applications with only a single replica. It is important to note that this mode is not advised for use in production clusters.

How does Evictor work?

Evictor automatically follows the sequence of steps below:

  • First, it identifies underutilized nodes as candidates for eviction.
  • Then it automatically moves pods to other nodes. Learn more about the bin packing mechanism here.
  • Once the node is empty, it gets deleted from the cluster via the Node deletion policy (which should be enabled).
  • Evictor returns to the first step, constantly looking for nodes that are good candidates for eviction.

📘

Note

Evictor does not use percentage-based utilization thresholds. Instead, it considers the overall cluster capacity and workload distribution when making eviction decisions. This means that even a node with high utilization might be evicted if its workloads can be efficiently redistributed across other nodes with sufficient capacity.

How Evictor avoids downtime

Evictor follows certain rules to avoid downtime. For the node to be considered a candidate for possible removal due to bin-packing, all of the pods running on the node must meet the following criteria:

  • A pod must be replicated: it should be managed by a Controller (e.g. ReplicaSet, ReplicationController, Deployment) that has more than one replica (see Overrides)
  • A pod must not be part of a StatefulSet, unless it is marked or targeted as disposable by a label, annotation, or Advanced Configuration (see Eviction of StatefulSet pods)
  • A pod must not be marked as non-evictable (see Overrides)
  • All static pods (YAMLs defined in the node's /etc/kubernetes/manifests by default) are considered evictable
  • All DaemonSet-controller pods are considered evictable
  • Pod disruption budgets are respected
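
Since Evictor respects pod disruption budgets, a PodDisruptionBudget is one way to bound how many replicas of an application can be evicted at the same time. The following is a minimal sketch using the standard Kubernetes policy/v1 API; the name web-pdb, the app: web label, and the minAvailable value are illustrative assumptions, not values taken from CAST AI:

```yaml
# Hypothetical example: keep at least 2 pods of the "web" application
# available while pods are being evicted. All names and values are placeholders.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: web
```

With a budget like this, Kubernetes rejects eviction requests that would take the number of available matching pods below minAvailable, so the node stays in place until enough replicas are ready elsewhere.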

🚧

Aggressive Mode

In more fault-tolerant systems, you can achieve even higher waste reduction by turning aggressive mode on. In this scenario, Evictor will bin-pack not only multi-replica applications but single-replica ones as well.

Note: if you have a job pod running on a node to be evicted, that job will get interrupted.

Note: when using this mode, make sure the job pod has the autoscaling.cast.ai/removal-disabled="true" annotation (see Overrides) so that the job is not interrupted and restarted.

Note: aggressive mode does not affect how StatefulSets are handled; Evictor follows its default behavior for them.
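
For the job-related notes above, one way to make sure every pod created by a Job carries the annotation is to set it in the Job's pod template. The following is a sketch only; the Job name report-job, the image, and the command are hypothetical placeholders:

```yaml
# Hypothetical Job whose pods are protected from eviction by Evictor.
apiVersion: batch/v1
kind: Job
metadata:
  name: report-job
spec:
  template:
    metadata:
      annotations:
        # Annotation from the Overrides list; it is copied onto each pod of the Job.
        autoscaling.cast.ai/removal-disabled: "true"
    spec:
      restartPolicy: Never
      containers:
        - name: report
          image: busybox:1.36
          command: ["sh", "-c", "echo generating report && sleep 600"]
```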

Override Evictor rules for pods and nodes

  • autoscaling.cast.ai/removal-disabled="true"

    • Node: Annotation / Label
    • Pod: Annotation / Label
    • Description: Evictor won't try to evict a node with this annotation/label or a node running a pod annotated/labeled with this value.
  • autoscaling.cast.ai/disposable="true"

    • Pod: Annotation / Label
    • Description: Evictor will treat the Pod as evictable despite any of the other rules.
  • cluster-autoscaler.kubernetes.io/safe-to-evict="false"

    • Pod: Annotation
    • Description: Evictor won't try to evict a pod annotated with this value.
  • cluster-autoscaler.kubernetes.io/safe-to-evict="true"

    • Pod: Annotation
    • Description: Evictor will treat the Pod as evictable despite any of the other rules
  • beta.evictor.cast.ai/disposable="true" (deprecated)

    • Pod: Annotation
    • Description: Evictor will treat this Pod as evictable despite any of the other rules.
  • beta.evictor.cast.ai/eviction-disabled="true" (deprecated)

    • Node: Annotation / Label
    • Pod: Annotation
    • Description: Evictor won't try to evict a node with this annotation or a node running a pod annotated with this value.

Examples of override commands

Annotate a pod so that Evictor won't evict the node it is running on (the same annotation can be applied to a node as well):

kubectl annotate pods <pod-name> autoscaling.cast.ai/removal-disabled="true"
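
An annotation added with kubectl annotate is stored on that pod object only, so a replacement pod created by its controller will not carry it. To keep the override across pod restarts, the annotation can instead be set in the workload's pod template. The following is a sketch, assuming a hypothetical Deployment named api:

```yaml
# Hypothetical Deployment: every pod created from this template
# carries the removal-disabled annotation.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
      annotations:
        autoscaling.cast.ai/removal-disabled: "true"
    spec:
      containers:
        - name: api
          image: nginx:1.14.2
```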

Label or annotate a node to prevent the eviction of pods as well as the removal of the node (even when it's empty):

kubectl label nodes <node-name> autoscaling.cast.ai/removal-disabled="true"
kubectl annotate nodes <node-name> autoscaling.cast.ai/removal-disabled="true"

You can also annotate a pod to make it disposable, irrespective of other criteria that would normally make the pod un-evictable. Here is an example of a disposable pod manifest:

apiVersion: v1
kind: Pod
metadata:
  name: disposable-pod
  annotations:
    autoscaling.cast.ai/disposable: "true"
spec:
  containers:
    - name: nginx
      image: nginx:1.14.2
      ports:
        - containerPort: 80
      resources:
        requests:
          cpu: '1'
        limits:
          cpu: '1'

Due to the applied annotation, the pod will be targeted for eviction even though it is not replicated.

Configuration

A list of configuration settings can be found at: helm-charts/castai-evictor

Advanced configuration

Advanced configuration can be utilized for more granular optimization, as it allows specific resources to be targeted or shielded from eviction. Users can provide additional rules for Evictor, enabling it to target specific nodes and/or pods, thereby overriding the default behavior of Evictor.

Use case examples

Marking pods of the Job kind as removalDisabled when they are in the given namespace and carry the given labels:

evictionConfig:
  - podSelector:
      namespace: "namespace"
      kind: Job
      labelSelector:
        matchLabels:
          pod.cast.ai/name: "cron-job"
          app.kubernetes.io/active: "true"
    settings:
      removalDisabled:
        enabled: true

Marking pods of the Job kind in a namespace as removalDisabled:

evictionConfig:
  - podSelector:
      namespace: "namespace"
      kind: Job
    settings:
      removalDisabled:
        enabled: true

Marking a node as disposable:

evictionConfig:
  - nodeSelector:
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: castai-node
          app.kubernetes.io/instance: instance
    settings:
      disposable:
        enabled: true

Marking pods as disposable using the disposable flag:

evictionConfig:
  - podSelector:
      namespace: "replicaset-ns"
      kind: ReplicaSet
      labelSelector:
        matchExpressions:
          - key: pod.cast.ai/flag
            operator: In
            values:
              - "true"
    settings:
      disposable:
        enabled: true

Applying aggressive mode even if Evictor is running without aggressive mode turned on:

evictionConfig:
  - podSelector:
      namespace: "namespace"
      kind: ReplicaSet
      labelSelector:
        matchExpressions:
          - key: pod.cast.ai/flag
            operator: In
            values:
              - "true"
    settings:
      aggressive:
        enabled: true

Selectors

Selectors specify criteria for matching specific resources, e.g., pods or nodes. If a resource satisfies the selector, the targeting mode specified in settings is applied to it. If no match is found, or if a match is found but the mode's enabled flag is set to false, the default Evictor rules are applied to that pod or node.
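
To illustrate the fallback described above, here is a sketch built only from the keys documented in this section. The first entry is applied to matching pods; the second entry leaves the default Evictor rules in place because its enabled flag is false. The namespace names are placeholders:

```yaml
evictionConfig:
  # Applied: pods owned by a Job in the "batch" namespace are protected from removal.
  - podSelector:
      namespace: "batch"
      kind: Job
    settings:
      removalDisabled:
        enabled: true
  # Not applied: the selector may match, but enabled is false,
  # so the default Evictor rules are used for these pods.
  - podSelector:
      namespace: "staging"
      kind: ReplicaSet
    settings:
      disposable:
        enabled: false
```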

| Selector | Target | Supported Keys |
|---|---|---|
| podSelector | Pod | namespace, kind, labelSelector |
| nodeSelector | Node | labelSelector |

Selector Keys

Selectors have specific matchers that decide whether the targeting mode should be applied to a resource. matchLabels and matchExpressions follow the same structure as Kubernetes label selectors (see the Kubernetes documentation).

| Name | Type | Description |
|---|---|---|
| namespace | string | Specifies namespace to match for pod |
| kind | string | Pod owner kind matcher |
| labelSelector | object | Holds matchLabels and matchExpressions array of label keys, operators and values |

Settings

Settings hold one of the supported targeting modes and a flag indicating whether that mode is enabled.

| Name | Supported Targeting Modes | Mode Enable Key |
|---|---|---|
| settings | removalDisabled, aggressive, disposable | enabled |

Targeting Modes

A targeting mode specifies the eviction type for the matched resource. The specified mode takes precedence over the default Evictor behavior.

| Name | Description |
|---|---|
| removalDisabled | Resource is not removed |
| aggressive | Apply aggressive mode to targeted resource |
| disposable | A resource is eligible to be removed |

Note: aggressive targeting works the same as turning aggressive mode on from the console, but only for the targeted pod or node. If aggressive mode is turned on in the console and also specified in the advanced configuration, the configuration entry has no additional effect.

Targeting Mode Keys

| Name | Type | Description |
|---|---|---|
| enabled | boolean | Specifies whether to apply the targeting mode when a match is successful |

How to pass Advanced Configuration?

Standard installation flow

Pass the advanced configuration using either the console interface or a config map.

During Evictor initialisation, a ConfigMap named castai-evictor-config is created in the castai-agent namespace.

To change the configuration, edit the data of the castai-evictor-config ConfigMap and place the desired evictionConfig YAML under the config.yaml: | key, e.g.:

apiVersion: v1
kind: ConfigMap
metadata:
  name: "castai-evictor-config"
  labels:
    helm.sh/chart: castai-evictor-1
    app.kubernetes.io/name: castai-evictor
    app.kubernetes.io/instance: release-name
    app.kubernetes.io/version: "version"
    app.kubernetes.io/managed-by: Helm
data:
  config.yaml: |
    evictionConfig:
      - nodeSelector:
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: castai-node
              app.kubernetes.io/instance: instance
        settings:
          removalDisabled:
            enabled: true
      - podSelector:
          namespace: "namespace"
          kind: ReplicaSet
          labelSelector:
            matchLabels:
              pod.cast.ai/name: castai-pod
        settings:
          disposable:
            enabled: true
      - podSelector:
          namespace: "namespace"
          kind: ReplicaSet
          labelSelector:
            matchExpressions:
              - key: pod.cast.ai/flag
                operator: In
                values:
                  - "true"
              - key: pod.cast.ai/name
                operator: Exists
            matchLabels:
              pod-label: "pod-label-value"
        settings:
          disposable:
            enabled: true

Manual installation

To pass the Evictor Advanced Configuration with a manual Evictor installation, provide it through Helm, either as a path to the created YAML file using the --set-file flag or as a string using the --set flag. You can find a guide on installing Evictor manually here.

  • --set-file customConfig=<path_to_file>
    
  • --set customConfig="<config_string>"
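
As an illustration, the file passed with --set-file could look like the sketch below. This assumes that customConfig accepts the same evictionConfig content as the config.yaml key of the ConfigMap shown earlier; the file name evictor-config.yaml and the label value are placeholders:

```yaml
# evictor-config.yaml (hypothetical file name)
# Assumption: same structure as the config.yaml content of castai-evictor-config.
evictionConfig:
  - nodeSelector:
      labelSelector:
        matchLabels:
          app.kubernetes.io/name: castai-node
    settings:
      removalDisabled:
        enabled: true
```

Under that assumption, the file would be referenced as --set-file customConfig=evictor-config.yaml in the helm upgrade --install command.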
    

Troubleshooting

Evictor policy is not allowed to be turned on

  • Evictor is unavailable on the policies page because Cast AI has detected an existing Evictor installation.
  • Cast AI will not try to manage Evictor settings or upgrades in such a scenario.
  • If you want Cast AI to manage Evictor configurations and upgrade to the most recent version, you must first remove the current installation.
  • If you want to manage Evictor yourself, the helm chart values can contain the corresponding configurations available in the UI.

How to check the logs

To check Evictor logs, run the following command:

kubectl logs -l app.kubernetes.io/name=castai-evictor -n castai-agent

Manually install Evictor

Evictor will compact your pods into fewer nodes, creating empty nodes that the Node deletion policy will remove. To install Evictor, run this command:

helm repo add castai-helm https://castai.github.io/helm-charts
helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent --set dryRun=false

This process will take some time. Also, by default, Evictor will not cause any downtime to single-replica Deployments, StatefulSets, or pods without a ReplicaSet, which means that nodes running such pods can't be removed gracefully. Familiarize yourself with the rules and available overrides in order to set up Evictor to meet your needs.

To run Evictor in aggressive mode (so that it also considers applications with a single replica), pass the following parameters:

--set dryRun=false,aggressiveMode=true

To run Evictor in scoped mode (removing only nodes created by CAST AI when using the scoped autoscaler), pass the following parameters:

--set dryRun=false,scopedMode=true

By default, Evictor will only impact nodes that are older than 5 minutes.

If you wish to change the grace period before a node can be considered for eviction, set the nodeGracePeriodMinutes parameter to the desired time in minutes. This is useful for slow-to-start nodes – it prevents them from being marked for eviction before they can start taking on workloads.

--set dryRun=false,nodeGracePeriodMinutes=8
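
When several of these parameters are set together, the same values can be kept in a Helm values file instead of a long --set string. The following is a sketch that uses only the parameters mentioned above; the file name evictor-values.yaml is an assumption:

```yaml
# evictor-values.yaml (hypothetical file name)
dryRun: false
aggressiveMode: false       # set to true to also consider single-replica applications
scopedMode: false           # set to true to only remove nodes created by CAST AI
nodeGracePeriodMinutes: 8   # minimum node age before it is considered for eviction
```

The file can be passed with Helm's standard -f (--values) flag, e.g. helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent -f evictor-values.yaml.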

Manually upgrade Evictor

  • Check the Evictor version you are currently using:

    helm ls -n castai-agent
    
  • Update the Helm chart repository to make sure that your Helm command is aware of the latest charts:

    helm repo update
    
  • Install the latest Evictor version:

    helm upgrade --install castai-evictor castai-helm/castai-evictor -n castai-agent --set dryRun=false
    
  • Check whether the Evictor version was changed:

    helm ls -n castai-agent
    

Eviction of StatefulSet pods

Support for evicting StatefulSet pods was added in the 0.26.3 Helm release and requires a Helm release update due to RBAC changes (additional read permissions for PersistentVolume, PersistentVolumeClaim, and StorageClass).

Evictor excludes StatefulSet pods from eviction by default; they have to be explicitly marked or targeted as disposable before Evictor will consider them for eviction. Evicting StatefulSet pods can be disruptive to the application and should be done cautiously. StatefulSet pods should have Kubernetes probes that reflect the application state and a PodDisruptionBudget configured to minimize the impact of eviction.

The StatefulSet pods can be marked or targeted for eviction by:

  • Labeling or annotating the pod using:

    autoscaling.cast.ai/disposable="true"
    
  • Targeting all StatefulSet pods using a pod selector in the Advanced Configuration:

    evictionConfig:
      - podSelector:
          kind: StatefulSet
        settings:
          disposable:
            enabled: true
    
  • Targeting replicated StatefulSet pods using a pod selector in the Advanced Configuration:

    evictionConfig:
      - podSelector:
          kind: StatefulSet
          replicasMin: 2
        settings:
          disposable:
            enabled: true
    
  • Targeting labeled StatefulSet pods using a pod selector in the Advanced Configuration:

    evictionConfig:
      - podSelector:
          kind: StatefulSet
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: database
        settings:
          disposable:
            enabled: true
    
  • Targeting all pods (including StatefulSet pods) running on a targeted node using a node selector in the Advanced Configuration:

    evictionConfig:
      - nodeSelector:
          labelSelector:
            matchLabels:
              app.kubernetes.io/name: castai-node
        settings:
          disposable:
            enabled: true
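
As recommended earlier in this section, StatefulSet pods that are made eligible for eviction should expose their state through probes and be covered by a PodDisruptionBudget. The following is a minimal sketch using standard Kubernetes resources; the database name, the image, the port, and the probe timings are illustrative assumptions, and a headless Service named database is assumed to exist:

```yaml
# Hypothetical StatefulSet with a readiness probe reflecting application state.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: database
spec:
  serviceName: database   # assumes a headless Service with this name exists
  replicas: 3
  selector:
    matchLabels:
      app.kubernetes.io/name: database
  template:
    metadata:
      labels:
        app.kubernetes.io/name: database
    spec:
      containers:
        - name: redis
          image: redis:7
          ports:
            - containerPort: 6379
          readinessProbe:
            tcpSocket:
              port: 6379
            initialDelaySeconds: 5
            periodSeconds: 10
---
# Allow at most one pod of the StatefulSet to be disrupted at a time.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: database-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: database
```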