Kubernetes permissions

Kubernetes Service Accounts and permissions used by CAST AI components

CAST AI components running on customers' clusters use predefined Service Accounts and the associated permissions to perform their functions, for example sending data about cluster state.
This section describes in detail all required Service Accounts and the permissions granted to CAST AI components.

Kubernetes Service Accounts used by CAST AI components

Each CAST AI component installed into a customer's cluster uses a dedicated Service Account.
This setup allows fine-grained permission tuning for each component:

» kubectl get serviceAccounts -n castai-agent
NAME                        SECRETS   AGE
castai-agent                1         46h
castai-cluster-controller   1         4h20m
castai-evictor              1         4h20m
castai-spot-handler         1         4h20m
default                     1         46h
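
As a rough illustration of how a dedicated Service Account is tied to its permissions, a binding along the following lines could be used. The Service Account name and namespace are taken from the listing above; the ClusterRole and binding names are hypothetical, and the manifests actually installed by CAST AI may differ.

```yaml
# Sketch only: binds the castai-agent Service Account to a ClusterRole.
# "castai-agent-example" is a hypothetical name used for illustration.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: castai-agent-example
subjects:
  - kind: ServiceAccount
    name: castai-agent          # from the listing above
    namespace: castai-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: castai-agent-example    # hypothetical ClusterRole name
```

Because every component has its own Service Account, each one can be bound to a separate role with only the rules it needs.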

CAST AI Kubernetes Agent permissions (Phase 1)

The CAST AI Kubernetes Agent must be able to collect cluster operational details (snapshots) and send them to the central platform so that optimisation opportunities can be estimated.
It must therefore be granted the following cluster-wide permissions:

| API Group | Resources | Verbs |
| --- | --- | --- |
| core | pods, nodes, replicationcontrollers, persistentvolumeclaims, persistentvolumes, services | get, list, watch |
| core | namespaces | get |
| apps | deployments, replicasets, daemonsets, statefulsets | get, list, watch |
| storage.k8s.io | storageclasses, csinodes | get, list, watch |
| batch | jobs | get, list, watch |
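
For readers who prefer to see these rules as RBAC objects, the table above could be expressed as a ClusterRole roughly like the one below. This is a sketch derived from the table only; the role name is hypothetical and the manifest shipped with the agent may be structured differently.

```yaml
# Sketch only: read-only, cluster-wide rules matching the table above.
# The name "castai-agent-example" is hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: castai-agent-example
rules:
  - apiGroups: [""]               # "" denotes the core API group
    resources:
      - pods
      - nodes
      - replicationcontrollers
      - persistentvolumeclaims
      - persistentvolumes
      - services
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get"]
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets", "daemonsets", "statefulsets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["storage.k8s.io"]
    resources: ["storageclasses", "csinodes"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["get", "list", "watch"]
```

None of these rules includes a write verb, which matches the agent's Phase 1 role of collecting snapshots.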

The CAST AI Kubernetes Agent's resource consumption depends heavily on the cluster size.
The agent's resource limits must therefore be adjustable in proportion to the size of the cluster.
For that purpose, the Cluster Proportional Vertical Autoscaler patches the CAST AI Kubernetes Agent's deployment with re-estimated limits, which requires the following permission:

| API Group | Resources | Verbs | Description |
| --- | --- | --- | --- |
| apps | deployments | patch | Used only to patch castai-agent deployment |
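
One way such a narrowly scoped permission can be written is with `resourceNames`, which limits the rule to a single named object. The sketch below assumes a namespaced Role in the castai-agent namespace; whether CAST AI uses a Role or a ClusterRole here is not stated in this document, and the role name is hypothetical.

```yaml
# Sketch only: allows patching the castai-agent deployment and nothing else.
# Using a namespaced Role and the name below are assumptions.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: castai-agent-autoscaler-example   # hypothetical
  namespace: castai-agent
rules:
  - apiGroups: ["apps"]
    resources: ["deployments"]
    resourceNames: ["castai-agent"]       # restricts the rule to this deployment
    verbs: ["patch"]
```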

CAST AI Cluster Controller (Phase 2)

The CAST AI Cluster Controller component is installed when a connected cluster is promoted to Phase 2, which enables cost savings by managing the customer's cluster:

» kubectl get deployments -n castai-agent
NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
castai-agent                1/1     1            1           43h
castai-cluster-controller   2/2     2            2           64m
castai-evictor              0/0     0            0           64m

Cluster-wide permissions used by Cluster Controller

The Cluster Controller operates mostly at the cluster level, as it performs the operations required to optimize the cost of the customer's cluster:

| API Group | Resources | Verbs | Description |
| --- | --- | --- | --- |
| core | namespaces | get | |
| core | pods, nodes | get, list | |
| core | nodes | patch, update | Used for node draining and patching |
| core | pods, nodes | delete | |
| core | pods/eviction | create | |
| certificates.k8s.io | certificatesigningrequests | get, list, delete, create | Used for creating a new certificate when adding a node to the cluster |
| certificates.k8s.io | certificatesigningrequests/approval | patch, update | Used for creating a new certificate when adding a node to the cluster |
| certificates.k8s.io | signers | approve | Applicable only for kubelet |
| core | events | list, create, patch | |
| rbac.authorization.k8s.io | roles, clusterroles, clusterrolebindings | get, patch, update, delete, escalate | Applicable for all CAST AI components |
| core | namespaces | delete | Applicable only for CAST AI Kubernetes Agent |
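
The rules listed above could be written as a ClusterRole along the following lines. This is a sketch built from the table alone: the role name is hypothetical, and the "Applicable only for ..." notes suggest that some rules are further restricted (for example with `resourceNames`), but the exact restrictions are not given here, so they appear only as comments.

```yaml
# Sketch only: cluster-wide rules for the Cluster Controller, from the table.
# The name "castai-cluster-controller-example" is hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: castai-cluster-controller-example
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["pods", "nodes"]
    verbs: ["get", "list", "delete"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["patch", "update"]          # node draining and patching
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]
  - apiGroups: ["certificates.k8s.io"]
    resources: ["certificatesigningrequests"]
    verbs: ["get", "list", "delete", "create"]
  - apiGroups: ["certificates.k8s.io"]
    resources: ["certificatesigningrequests/approval"]
    verbs: ["patch", "update"]
  - apiGroups: ["certificates.k8s.io"]
    resources: ["signers"]
    verbs: ["approve"]                  # limited to kubelet signers; names not listed here
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["list", "create", "patch"]
  - apiGroups: ["rbac.authorization.k8s.io"]
    resources: ["roles", "clusterroles", "clusterrolebindings"]
    verbs: ["get", "patch", "update", "delete", "escalate"]   # for CAST AI components
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["delete"]                   # only for the CAST AI Kubernetes Agent namespace
```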

Namespace-wide (castai-agent) permissions used by Cluster Controller

One of the main tasks of the Cluster Controller is to perform upgrades of CAST AI components.
The Cluster Controller is granted all permissions in the castai-agent namespace, which is required for current and future updates.
Additionally, the Cluster Controller is granted two cluster-wide permissions so that it can manage the RBAC of CAST AI components and delete the CAST AI namespace (see above).
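
A grant of all permissions within a single namespace is typically expressed as a Role with wildcard rules plus a RoleBinding. The sketch below assumes that form; the object names are hypothetical, while the namespace and Service Account name come from the listings earlier in this section.

```yaml
# Sketch only: full access inside the castai-agent namespace.
# Object names ending in "-example" are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: castai-cluster-controller-example
  namespace: castai-agent
rules:
  - apiGroups: ["*"]
    resources: ["*"]
    verbs: ["*"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: castai-cluster-controller-example
  namespace: castai-agent
subjects:
  - kind: ServiceAccount
    name: castai-cluster-controller
    namespace: castai-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: castai-cluster-controller-example
```

Because a Role is namespaced, the wildcard rules apply only to objects in castai-agent and not to the rest of the cluster.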

CAST AI Evictor (Phase 2)

When a cluster is onboarded with CAST AI for cost optimisation (Phase 2), more components are installed than just the Cluster Controller.
One of these components is the Evictor, whose responsibility is to minimize the number of nodes used by the cluster.

Cluster-wide permissions used by Evictor

When installed, the Evictor manipulates non-CAST AI pods, so it requires a set of cluster-wide permissions:

| API Group | Resources | Verbs | Description |
| --- | --- | --- | --- |
| core | events | create, patch | |
| core | nodes | get, list, watch, patch, update | Used to find a suitable node for eviction |
| core | pods | get, list, watch, patch, update, create, delete | List pods to find a suitable node for eviction and delete a stuck pod from a node |
| apps | replicasets | get | Used to find out whether it's safe to evict a pod (it belongs to a ReplicaSet and has replicas) |
| core | pods/eviction | create | Used for pod eviction |
| coordination.k8s.io | leases | * | Used for leader election when only a single instance may be active |
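
Expressed as an RBAC object, the Evictor's rules might look like the ClusterRole below. It is a sketch derived from the table only, with a hypothetical role name; the manifest installed by CAST AI may differ.

```yaml
# Sketch only: cluster-wide rules for the Evictor, from the table above.
# The name "castai-evictor-example" is hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: castai-evictor-example
rules:
  - apiGroups: [""]
    resources: ["events"]
    verbs: ["create", "patch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get", "list", "watch", "patch", "update"]
  - apiGroups: [""]
    resources: ["pods"]
    verbs: ["get", "list", "watch", "patch", "update", "create", "delete"]
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["get"]                 # check that a pod belongs to a ReplicaSet
  - apiGroups: [""]
    resources: ["pods/eviction"]
    verbs: ["create"]              # creating an eviction evicts the pod
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["*"]                   # leader election
```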