Cloud permissions

Permission setups CAST AI uses with cloud providers (AWS/GCP/Azure)

When a cluster enters automated cost optimization mode, the CAST AI central system can start performing operations at the cloud provider (AWS/GCP/Azure) level. One example of such an operation is a request to add a node to the cluster.

Performing such operations requires relevant credentials and permissions specific to your Cloud Service Provider. This guide describes permission setups for AWS, GCP and Azure.

AWS

AWS user created by CAST AI

The onboarding script creates a dedicated AWS user for CAST AI to request and manage AWS resources on your behalf.

The user name follows the cast-eks-<cluster name> convention:

» aws iam list-users --output text | grep cast-eks-
USERS   arn:aws:iam::123456789012:user/cast-eks-some-cluster   2022-05-12T12:48:47+00:00   /   123456789012345678901   cast-eks-some-cluster
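As an alternative to piping through grep, the same filter can be expressed with the AWS CLI's own --query option. The following is a minimal sketch; the function name list_cast_users is illustrative and is not part of the onboarding script.

```shell
# List IAM user names that follow the cast-eks-<cluster name> convention.
# list_cast_users is a hypothetical helper name used for illustration only.
list_cast_users() {
    aws iam list-users \
        --query "Users[?starts_with(UserName, 'cast-eks-')].UserName" \
        --output text
}
```

Calling list_cast_users prints the matching user names on one line, separated by tabs.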

AWS permissions used by CAST AI

Once the AWS user is created, the following policies are attached to it:

| Policy | Type | Description |
| --- | --- | --- |
| AmazonEC2ReadOnlyAccess | AWS managed policy | Used to fetch details about Virtual Machines. |
| IAMReadOnlyAccess | AWS managed policy | Used to fetch required data from IAM. |
| CastEKSPolicy | Managed policy | CAST AI policy for creating and removing Virtual Machines when managing Cluster nodes. |

You can validate these policies by combining results from the following commands (for details on how to do that, please consult the AWS documentation):

aws iam list-user-policies --user-name <user name>
aws iam list-attached-user-policies --user-name <user name>
aws iam list-groups-for-user --user-name <user name>
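One way to combine the three commands is to wrap them in a single helper that prints each result under a label. This is only a sketch: the function name show_user_policies and the labels are assumptions, and policies inherited through groups still need to be inspected per group.

```shell
# Print every policy source that applies to an IAM user: inline policies,
# attached managed policies, and the groups the user belongs to.
# show_user_policies is a hypothetical helper name.
show_user_policies() {
    user_name="$1"
    echo "Inline policies:"
    aws iam list-user-policies --user-name "$user_name" \
        --query 'PolicyNames[]' --output text
    echo "Attached policy ARNs:"
    aws iam list-attached-user-policies --user-name "$user_name" \
        --query 'AttachedPolicies[].PolicyArn' --output text
    echo "Groups (inspect their policies separately):"
    aws iam list-groups-for-user --user-name "$user_name" \
        --query 'Groups[].GroupName' --output text
}
```

For example, show_user_policies cast-eks-some-cluster would list the policy sources for the sample user shown earlier.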

The results also contain the policy ARNs, which you need in order to inspect the permissions. To inspect them, use the following commands:

» aws iam list-policy-versions --policy-arn arn:aws:iam::123456789012:policy/CastEKSPolicy
{
    "Versions": [
        {
            "VersionId": "v83",
            "IsDefaultVersion": true,
            "CreateDate": "2022-05-12T12:49:01+00:00"
        },
        {
            "VersionId": "v82",
            "IsDefaultVersion": false,
            "CreateDate": "2022-05-12T09:53:58+00:00"
        }
    ]
}

... and then:

» aws iam get-policy-version --policy-arn arn:aws:iam::123456789012:policy/CastEKSPolicy --version-id v83
{
    "PolicyVersion": {
        "Document": {
            "Version": "2012-10-17",
            "Statement": [
                {
                    "Sid": "PassRoleEC2",
                    "Action": "iam:PassRole",
                    "Effect": "Allow",
                    "Resource": "arn:aws:iam::*:role/*",
                    "Condition": {
                        "StringEquals": {
                            "iam:PassedToService": "ec2.amazonaws.com"
                        }
                    }
                },
                {
                    "Sid": "NonResourcePermissions",
                    "Effect": "Allow",
                    "Action": [
                        "iam:CreateServiceLinkedRole",
                        "ec2:CreateKeyPair",
                        "ec2:DeleteKeyPair",
                        "ec2:CreateTags",
                        "ec2:ImportKeyPair"
                    ],
                    "Resource": "*"
                },
                {
                    "Sid": "RunInstancesPermissions",
                    "Effect": "Allow",
                    "Action": "ec2:RunInstances",
                    "Resource": [
                        "arn:aws:ec2:*:028075177508:network-interface/*",
                        "arn:aws:ec2:*:028075177508:security-group/*",
                        "arn:aws:ec2:*:028075177508:volume/*",
                        "arn:aws:ec2:*:028075177508:key-pair/*",
                        "arn:aws:ec2:*::image/*"
                    ]
                }
            ]
        },
        "VersionId": "v83",
        "IsDefaultVersion": true,
        "CreateDate": "2022-05-12T12:49:01+00:00"
    }
}
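The two steps above can be chained so that the default version does not have to be copied by hand. A minimal sketch, assuming the AWS CLI is configured for the account that owns the policy; the function name show_default_policy_document is illustrative.

```shell
# Print the policy document of the default (active) version of a policy.
# show_default_policy_document is a hypothetical helper name.
show_default_policy_document() {
    policy_arn="$1"
    # Look up which version is currently the default one.
    version_id="$(aws iam get-policy --policy-arn "$policy_arn" \
        --query 'Policy.DefaultVersionId' --output text)"
    # Fetch only the document of that version.
    aws iam get-policy-version --policy-arn "$policy_arn" \
        --version-id "$version_id" \
        --query 'PolicyVersion.Document' --output json
}
```

For example: show_default_policy_document arn:aws:iam::123456789012:policy/CastEKSPolicy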

AWS permissions with access granted using a cross-account IAM role

When enabling cost optimization for a connected cluster, you can grant permissions using a cross-account IAM role.

This feature creates a dedicated cluster user in the CAST AI AWS account. The trust policy on the role defined in your AWS account allows that user to assume the role.

Keeping role definitions and users in separate AWS accounts makes it possible to store the user's credentials on the CAST AI side without handing them over when running the onboarding script. This improves security.

From the customer's perspective, the role contains the same set of permissions as in the regular flow, in which the user is created in your AWS account. You can verify this using the following commands:

aws iam list-attached-role-policies --role-name <role name>
aws iam list-role-policies --role-name <role name>

Additionally, you can create the trust relationship with the following policy document:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::123456789012:user/cast-crossrole-f8f82b9c-d375-40d2-9483-123456789012"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}
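To confirm which principal is trusted by an existing role, the trust policy can be read back with aws iam get-role. A minimal sketch; the function name show_trusted_principals is an assumption, and the query only covers statements that use an AWS principal, as in the document above.

```shell
# Print the AWS principals allowed to assume a role, as recorded in the
# role's trust policy. show_trusted_principals is a hypothetical name.
show_trusted_principals() {
    aws iam get-role --role-name "$1" \
        --query 'Role.AssumeRolePolicyDocument.Statement[].Principal.AWS' \
        --output text
}
```

The output should contain the ARN of the cast-crossrole-... user if the trust relationship is in place.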

GCP

The GCP service account used by CAST AI

The onboarding script creates a dedicated GCP service account that CAST AI uses to request and manage GCP resources on your behalf.

The service account name follows the castai-gke-<cluster-name-hash> convention. You can verify the service account by running:

gcloud iam service-accounts describe castai-gke-<cluster-name-hash>@<your-gcp-project>.iam.gserviceaccount.com

The service account created by CAST AI includes the following roles:

| Role name | Description |
| --- | --- |
| castai.gkeAccess | A CAST AI-managed role used to handle CAST AI add/delete node operations. You can find a full list of permissions below. |
| container.developer | A GCP-managed role for full access to Kubernetes API objects inside the Kubernetes cluster. |
| iam.serviceAccountUser | A GCP-managed role to allow running operations as a service account. |
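The roles actually bound to the service account can be read from the project's IAM policy. The following sketch uses the flatten/filter pattern of gcloud projects get-iam-policy; the function name list_service_account_roles is illustrative, and the second output column is included on the assumption that you also want to see the title of any IAM condition attached to a binding.

```shell
# List the roles granted to a service account in a project's IAM policy,
# together with the title of any condition on the binding.
# list_service_account_roles is a hypothetical helper name.
list_service_account_roles() {
    project="$1"
    sa_email="$2"
    gcloud projects get-iam-policy "$project" \
        --flatten="bindings[].members" \
        --filter="bindings.members:serviceAccount:${sa_email}" \
        --format="table(bindings.role, bindings.condition.title)"
}
```

For example: list_service_account_roles <your-gcp-project> castai-gke-<cluster-name-hash>@<your-gcp-project>.iam.gserviceaccount.com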

IAM Conditions

When creating the service account, you can enforce conditional, attribute-based access on the iam.serviceAccountUser role.
The role can either allow the service account to access and act as all other service accounts, or be scoped to the service accounts used by node pools in the GKE cluster. The scoped option is more secure and therefore recommended, and the onboarding script uses it by default.

When onboarding the cluster with Terraform, you can use the castai_gke_iam module to specify which method you want to use. You can find an example here.

List of castai.gkeAccess role permissions:

» gcloud iam roles describe --project=<your-project-name> castai.gkeAccess

description: Role to manage GKE cluster via CAST AI
etag: example-tag
includedPermissions:
- compute.addresses.use
- compute.disks.create
- compute.disks.setLabels
- compute.disks.use
- compute.images.useReadOnly
- compute.instanceGroupManagers.get
- compute.instanceGroupManagers.update
- compute.instanceGroups.get
- compute.instanceTemplates.create
- compute.instanceTemplates.delete
- compute.instanceTemplates.get
- compute.instanceTemplates.list
- compute.instances.create
- compute.instances.delete
- compute.instances.get
- compute.instances.list
- compute.instances.setLabels
- compute.instances.setMetadata
- compute.instances.setServiceAccount
- compute.instances.setTags
- compute.instances.start
- compute.instances.stop
- compute.networks.use
- compute.networks.useExternalIp
- compute.subnetworks.get
- compute.subnetworks.use
- compute.subnetworks.useExternalIp
- compute.zones.get
- compute.zones.list
- container.certificateSigningRequests.approve
- container.clusters.get
- container.clusters.update
- container.operations.get
- serviceusage.services.list
- resourcemanager.projects.getIamPolicy
name: projects/<your-project-name>/roles/castai.gkeAccess
stage: ALPHA
title: Role to manage GKE cluster via CAST AI
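To check for a single permission without reading the whole list, the describe output can be searched directly. A minimal sketch; the function name role_has_permission is an assumption, and it relies on the default YAML output shown above, where each permission appears on its own line as "- <permission>".

```shell
# Succeed (exit status 0) if a custom role includes the given permission.
# role_has_permission is a hypothetical helper name; it matches whole
# lines of the default YAML output of `gcloud iam roles describe`.
role_has_permission() {
    project="$1"
    role="$2"
    permission="$3"
    gcloud iam roles describe --project="$project" "$role" \
        | grep -Fqx -e "- ${permission}"
}
```

For example: role_has_permission <your-project-name> castai.gkeAccess compute.instances.create && echo "present"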

Azure

An overview of Azure permissions used by CAST AI

The onboarding script creates a dedicated Azure app registration for CAST AI to request and manage Azure resources on your behalf.

App registration naming follows this convention: CAST.AI ${CLUSTER_NAME}-${CASTAI_CLUSTER_ID:0:8}.

The created CAST AI app registration has a custom role bound to it. The custom role name follows this naming convention: CastAKSRole-${CASTAI_CLUSTER_ID:0:8}.

The role has a predefined list of permissions scoped only to the managed cluster's resource groups.
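The ${CASTAI_CLUSTER_ID:0:8} notation is a bash substring expansion that keeps the first 8 characters of the cluster ID. The sketch below reproduces both naming conventions in portable shell; the function names and the sample cluster name and ID in the usage note are made up for illustration.

```shell
# Keep the first 8 characters of a CAST AI cluster ID. Using cut here is
# equivalent to the bash expansion ${CASTAI_CLUSTER_ID:0:8}.
short_cluster_id() {
    printf '%s' "$1" | cut -c1-8
}

# Build the app registration name.
# $1 = cluster name, $2 = CAST AI cluster ID
app_registration_name() {
    printf 'CAST.AI %s-%s\n' "$1" "$(short_cluster_id "$2")"
}

# Build the custom role name.
# $1 = CAST AI cluster ID
custom_role_name() {
    printf 'CastAKSRole-%s\n' "$(short_cluster_id "$1")"
}
```

With a hypothetical cluster ID of f8f82b9c-d375-40d2-9483-123456789012 and a cluster named my-cluster, app_registration_name prints "CAST.AI my-cluster-f8f82b9c" and custom_role_name prints "CastAKSRole-f8f82b9c".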

List of CastAKSRole role permissions:

ROLE_NAME="CastAKSRole-${CASTAI_CLUSTER_ID:0:8}"
ROLE_DEF='{
   "Name": "'"$ROLE_NAME"'",
   "Description": "CAST.AI role used to manage '"$CLUSTER_NAME"' AKS cluster",
   "IsCustom": true,
   "Actions": [
       "Microsoft.Compute/*/read",
       "Microsoft.Compute/virtualMachines/*",
       "Microsoft.Compute/virtualMachineScaleSets/*",
       "Microsoft.Compute/disks/write",
       "Microsoft.Compute/disks/delete",
       "Microsoft.Compute/disks/beginGetAccess/action",
       "Microsoft.Compute/galleries/write",
       "Microsoft.Compute/galleries/delete",
       "Microsoft.Compute/galleries/images/write",
       "Microsoft.Compute/galleries/images/delete",
       "Microsoft.Compute/galleries/images/versions/write",
       "Microsoft.Compute/galleries/images/versions/delete",
       "Microsoft.Compute/snapshots/write",
       "Microsoft.Compute/snapshots/delete",
       "Microsoft.Network/*/read",
       "Microsoft.Network/networkInterfaces/write",
       "Microsoft.Network/networkInterfaces/delete",
       "Microsoft.Network/networkInterfaces/join/action",
       "Microsoft.Network/networkSecurityGroups/join/action",
       "Microsoft.Network/virtualNetworks/subnets/join/action",
       "Microsoft.Network/applicationGateways/backendhealth/action",
       "Microsoft.Network/applicationGateways/backendAddressPools/join/action",
       "Microsoft.Network/applicationSecurityGroups/joinIpConfiguration/action",
       "Microsoft.Network/loadBalancers/backendAddressPools/write",
       "Microsoft.Network/loadBalancers/backendAddressPools/join/action",
       "Microsoft.ContainerService/*/read",
       "Microsoft.ContainerService/managedClusters/start/action",
       "Microsoft.ContainerService/managedClusters/stop/action",
       "Microsoft.ContainerService/managedClusters/runCommand/action",
       "Microsoft.ContainerService/managedClusters/agentPools/*",
       "Microsoft.Resources/*/read",
       "Microsoft.Resources/tags/write",
       "Microsoft.Authorization/locks/read",
       "Microsoft.Authorization/roleAssignments/read",
       "Microsoft.Authorization/roleDefinitions/read",
       "Microsoft.ManagedIdentity/userAssignedIdentities/assign/action"
   ],
   "AssignableScopes": [
       "/subscriptions/'"$SUBSCRIPTION_ID"'/resourceGroups/'"$CLUSTER_GROUP"'",
       "/subscriptions/'"$SUBSCRIPTION_ID"'/resourceGroups/'"$NODE_GROUP"'"
   ]
}'
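Once the role exists, its actions can be read back from Azure and compared with the list above. A minimal sketch, assuming the Azure CLI is logged in to the subscription that holds the role and that the role definition is returned with a permissions array, as in current Azure CLI output; the function name show_role_actions is illustrative.

```shell
# Print the actions of a role definition, one per line.
# show_role_actions is a hypothetical helper name.
# $1 = role name, for example the value of ROLE_NAME above.
show_role_actions() {
    az role definition list --name "$1" \
        --query '[0].permissions[0].actions[]' --output tsv
}
```

For example: show_role_actions "$ROLE_NAME"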