Currently CAST AI supports only kOps clusters running on AWS.
To connect your cluster, log in to the CAST AI console and navigate to the Connect cluster window, kOps tab. Copy the provided script and run it in your terminal or cloud shell. Make sure that kubectl is installed and can access your cluster.
The script will create the following Kubernetes objects related to the CAST AI agent:
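Before running the script, it can help to confirm that kubectl is on your PATH and points at the intended cluster. The following is a minimal sketch of such a check; the function name `check_cluster_access` is hypothetical and not part of the CAST AI script.

```shell
# Hypothetical helper (not part of the CAST AI script): verifies that kubectl
# is installed and can reach the cluster selected by the current context.
check_cluster_access() {
  command -v kubectl >/dev/null 2>&1 || { echo "kubectl not found on PATH" >&2; return 1; }
  # Print the active context so you can confirm it is the kOps cluster you intend to connect.
  kubectl config current-context || return 1
  # Listing nodes fails if the API server is unreachable or the credentials are invalid.
  kubectl get nodes >/dev/null || { echo "kubectl cannot reach the cluster" >&2; return 1; }
  echo "cluster access OK"
}
```

Running `check_cluster_access` prints the current context name followed by a confirmation line, or an error message if either step fails.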
- namespace and deployment
- serviceaccount and secret
- clusterrole and clusterrolebinding
- role and rolebinding
After installation, your cluster name will appear below the connection instructions as well as in the Cluster list. From there, you can open the cluster details and explore a detailed savings estimate based on your cluster configuration.
The agent will run in a read-only mode, providing savings suggestions without applying any actual modifications.
To unlock all the benefits and enable automatic cost optimization, CAST AI needs to have access to your cluster. The following section describes the steps required to onboard the kOps cluster on the CAST AI console. To make it less troublesome, we created a script that automates most of the steps.
- AWS CLI: a command-line tool for working with AWS services using commands in your command-line shell. For more information, see Installing AWS CLI.
- jq: a lightweight command-line JSON processor. For more information, see the jq documentation.
- IAM permissions: the IAM security principal that you're using must have permissions to work with AWS IAM and related resources. Additionally, you should have access to the kOps cluster that you wish to onboard on the CAST AI console.
- The CAST AI agent has to be running on the cluster.
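The tool prerequisites above can be checked from the shell before onboarding. This is a minimal sketch; `missing_tools` is a hypothetical helper name introduced here for illustration.

```shell
# Hypothetical helper: prints the name of each listed tool that is not on PATH.
# Prints nothing when every tool is available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}
```

For example, `missing_tools aws jq kubectl` lists whichever of the three still needs to be installed. To see which IAM principal the AWS CLI is currently using, run `aws sts get-caller-identity`.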
To onboard your cluster, go to the Available Savings report and click on the Start saving or Enable CAST AI button. The button's name will depend on the number of optimizations available for your cluster.
Follow the instructions in the pop-up window to create and use AWS credentials.
That’s it! Your cluster is onboarded. Now you can enable optimization policies to keep your cluster configuration optimal.
Actions performed by the onboarding script¶
The script will perform the following actions:
- Create the cast-kops-*cluster-name* IAM user with the required permissions to manage the cluster:
  - Manage instances in the specified cluster, restricted to the cluster VPC
  - Manage autoscaling groups in the specified cluster
  - Manage EC2 Node Groups in the specified cluster
- Create the CASTKopsPolicyV2 managed policy used to manage the kOps cluster. The policy contains the following permissions:
  - Create & delete instance profiles
  - Create & manage roles
  - Create & manage EC2 security groups, key pairs, and tags
  - Run EC2 instances
  - Create and manage the Lambda function
- Create the CASTKopsRestrictedaccess inline policy to manage cluster-specific resources.
- Create the CastLambdaRoleForSpot role used to manage Spot interruption events, with AWS managed permission policies applied.
- Update the aws-auth ConfigMap to map the newly created IAM user to the cluster.
- Create and print the AWS AccessKeyId and SecretAccessKey, which can then be added to the CAST AI console and assigned to the corresponding kOps cluster. The AccessKeyId and SecretAccessKey are used by CAST AI to make programmatic calls to AWS and are stored in CAST AI's secret store, which runs on Google's Secret Manager.
Write permissions are scoped to a single kOps cluster: the IAM user won't have access to resources of any other clusters in the AWS account.
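After the script finishes, the resulting IAM setup can be inspected with the AWS CLI. The sketch below assumes the user name follows the cast-kops-*cluster-name* pattern exactly and that the policies are attached to that user; `show_cast_iam_policies` is a hypothetical helper name.

```shell
# Hypothetical helper: lists the policies on the IAM user that the onboarding
# script creates. Assumes the user is named cast-kops-<cluster-name>.
show_cast_iam_policies() {
  cluster_name="$1"
  [ -n "$cluster_name" ] || { echo "usage: show_cast_iam_policies <cluster-name>" >&2; return 2; }
  user_name="cast-kops-${cluster_name}"
  # Managed policies attached to the user (CASTKopsPolicyV2 would appear here
  # if the script attached it to this user).
  aws iam list-attached-user-policies --user-name "$user_name" || return 1
  # Inline policies embedded in the user (CASTKopsRestrictedaccess would appear
  # here under the same assumption).
  aws iam list-user-policies --user-name "$user_name"
}
```

Call it with the cluster name you onboarded, for example `show_cast_iam_policies my-cluster`.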
Manual credential onboarding¶
To complete the steps mentioned above manually (without our script), be aware that when you create a cluster, the IAM entity user or role that creates the cluster (such as a federated user) is automatically granted system:masters permissions in the cluster's RBAC configuration in the control plane. To grant additional AWS users or roles the ability to interact with your cluster, you need to edit the aws-auth ConfigMap within Kubernetes. For more information, see Managing users or IAM roles for your cluster.
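When onboarding manually, it is useful to look at the existing user mappings before editing them. The following is a minimal sketch that assumes the aws-auth ConfigMap lives in the kube-system namespace and stores user mappings under the `mapUsers` key; `show_aws_auth_users` is a hypothetical helper name.

```shell
# Hypothetical helper: prints the IAM user mappings from the aws-auth ConfigMap.
# Assumes the ConfigMap is in kube-system and uses the mapUsers data key.
show_aws_auth_users() {
  kubectl -n kube-system get configmap aws-auth -o jsonpath='{.data.mapUsers}'
}
```

To change the mappings, `kubectl -n kube-system edit configmap aws-auth` opens the ConfigMap in your editor.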
Usage of AWS services¶
CAST AI relies on the agent that runs inside the customer's cluster. The following services are consumed during operation:
- A portion of EC2 node resources from the customer's cluster. The CAST AI agent uses the Cluster Proportional Vertical Autoscaler to consume the minimum resources required for the size of the cluster
- A small amount of network traffic to communicate with the CAST AI SaaS
- Lambda function to handle Spot Instance interruptions
- EC2 instances, their storage, and intra-cluster network traffic to manage the Kubernetes cluster and perform autoscaling
- IAM resources as detailed in the onboarding section
Custom taints on kOps v1.17 with kube-router¶
There's a known issue on kOps v1.17: nodes with custom taints are not able to join the cluster when the cluster is used with the kube-router networking component. This happens because kube-router doesn't have the tolerations required to start on nodes with custom taints.
Impact: CAST AI won't be able to add any nodes with custom taints (e.g. Spot) for impacted clusters.
Resolution: add the following tolerations to the kube-router DaemonSet in your kOps v1.17 cluster:

    tolerations:
      - effect: NoSchedule
        operator: Exists
      - effect: NoExecute
        operator: Exists
      - key: CriticalAddonsOnly
        operator: Exists
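One way to apply the tolerations above is a `kubectl patch` against the DaemonSet. This is a sketch under stated assumptions, not the documented procedure: it assumes the DaemonSet is named kube-router and lives in the kube-system namespace, and the variable and function names are hypothetical. Note that a JSON merge patch replaces the entire tolerations list instead of appending to it, so review any existing tolerations first.

```shell
# JSON merge patch carrying the three tolerations from the resolution.
# A merge patch replaces the whole tolerations list rather than appending to it.
KUBE_ROUTER_TOLERATIONS_PATCH='{"spec":{"template":{"spec":{"tolerations":[{"effect":"NoSchedule","operator":"Exists"},{"effect":"NoExecute","operator":"Exists"},{"key":"CriticalAddonsOnly","operator":"Exists"}]}}}}'

# Hypothetical helper; assumes the DaemonSet is named kube-router in kube-system.
patch_kube_router_tolerations() {
  kubectl -n kube-system patch daemonset kube-router --type merge -p "$KUBE_ROUTER_TOLERATIONS_PATCH"
}
```

After patching, `kubectl -n kube-system get daemonset kube-router -o yaml` shows whether the tolerations are present in the pod template.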