GKE service account impersonation
Cast AI supports GKE service account impersonation as an alternative to using service account keys. This method enhances security by eliminating the need for key management while allowing Cast AI to access your GCP resources.
How impersonation works
Cast AI uses a two-tier service account architecture for GKE impersonation:
Organization-level impersonation service account: Cast AI creates one impersonation service account per organization, shared across all clusters within that organization. This service account uses the naming pattern cast-gke-<hash>@prod-cast-identity.iam.gserviceaccount.com.
Cluster-specific service accounts: Each cluster maintains its own dedicated GCP service account in your project, following the castai-gke-<cluster-name-hash> naming convention for direct resource management.
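To make the two tiers easier to tell apart when auditing IAM bindings, here is a minimal sketch that classifies a service account email by the naming patterns above. The function name classify_service_account is hypothetical. The sketch assumes that the hash portion is any non-empty string and that cluster-specific accounts use the standard <name>@<project>.iam.gserviceaccount.com email form.

```shell
#!/usr/bin/env bash
# Classify a service account email by the naming patterns described above.
# Assumption: the <hash> part is any non-empty string; its exact format is not
# documented here.
classify_service_account() {
  local email="$1"
  case "$email" in
    cast-gke-?*@prod-cast-identity.iam.gserviceaccount.com)
      echo "organization-impersonation" ;;
    castai-gke-?*@*.iam.gserviceaccount.com)
      echo "cluster-specific" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, `classify_service_account "$CASTAI_SERVICE_ACCOUNT"` could be used as a sanity check on the value returned by the registration call before granting it any role.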
Prerequisites
Before setting up impersonation, ensure you have:
- An existing GCP service account with the necessary permissions for your GKE cluster
- The Cast AI API token and cluster ID
- The gcloud CLI configured with appropriate permissions
- The jq command-line JSON processor installed
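The checklist above can be turned into a quick preflight check. The following is a sketch only: the helper names are hypothetical, and it assumes the API token and cluster ID are supplied through the CASTAI_API_TOKEN and CASTAI_CLUSTER_ID environment variables, as in the script later on this page. It checks that the tools exist, not that gcloud has the right permissions.

```shell
#!/usr/bin/env bash
# Print the name of every listed command that is not found on PATH.
missing_commands() {
  local cmd
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
  done
}

# Print the name of every listed environment variable that is unset or empty.
missing_variables() {
  local name
  for name in "$@"; do
    [[ -n "${!name:-}" ]] || echo "$name"
  done
}

# Fail with a summary when any tool or variable is missing.
preflight() {
  local missing
  missing="$(missing_commands gcloud jq curl; missing_variables CASTAI_API_TOKEN CASTAI_CLUSTER_ID)"
  if [[ -n "$missing" ]]; then
    echo "Missing prerequisites: ${missing//$'\n'/ }" >&2
    return 1
  fi
}
```

Calling `preflight || exit 1` at the top of a setup script stops it before any API call is made.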
Setup process
Using the Terraform module (optional)
Cast AI provides a Terraform module for GKE IAM impersonation that automates the IAM policy configuration process. This module handles the complex IAM policy bindings and conditions required for impersonation.
If you use the Terraform module, you'll still need to call the /gcp-create-sa API endpoint to complete the registration with Cast AI's system. This API call is required for Cast AI's internal cluster linking and cannot be replaced by infrastructure provisioning alone.
Step 1: Register the impersonation service account
Call the /gcp-create-sa API endpoint to register your service account for impersonation. This step is mandatory for every cluster, even when managing service accounts through infrastructure-as-code.
curl -X POST \
-H "X-API-Key: $CASTAI_API_TOKEN" \
"$CASTAI_API_URL/v1/kubernetes/external-clusters/$CASTAI_CLUSTER_ID/gcp-create-sa" \
-d '{
"gke": {
"project_id": "'"$PROJECT_ID"'",
"gke_sa_impersonate": "'"$SERVICE_ACCOUNT_EMAIL"'"
}
}'
Note: This API call returns the same impersonation service account for all clusters in your organization. While the service account remains consistent, each API call is required to:
- Link the GCP project to the specific cluster in Cast AI's system
- Associate the impersonated service account with the cluster configuration
- Enable proper permissions mapping for cluster operations
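The request body in Step 1 splices shell variables directly into a JSON string, which produces invalid JSON if a value ever contains a quote or backslash. As a hedged sketch, the hypothetical helper below rejects such values before building the same body. It is not part of Cast AI's tooling.

```shell
#!/usr/bin/env bash
# Build the /gcp-create-sa request body from a project ID and a service
# account email. Empty values and values containing a double quote, a
# backslash, or whitespace are rejected, so the output is always valid JSON
# without needing an escaper.
build_create_sa_payload() {
  local project_id="$1" sa_email="$2" value
  for value in "$project_id" "$sa_email"; do
    case "$value" in
      ''|*'"'*|*'\'*|*[[:space:]]*)
        echo "invalid value: '$value'" >&2
        return 1 ;;
    esac
  done
  printf '{"gke":{"project_id":"%s","gke_sa_impersonate":"%s"}}\n' \
    "$project_id" "$sa_email"
}
```

The result can then be passed to curl as `-d "$(build_create_sa_payload "$PROJECT_ID" "$SERVICE_ACCOUNT_EMAIL")"`.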
Step 2: Configure service account permissions
Grant the following permissions to your service account:
| Role or permission | Purpose |
|---|---|
| roles/iam.serviceAccountUser | Allows Cast AI to act as the service account |
| roles/iam.serviceAccountTokenCreator | Enables token generation for impersonation |
| compute.subnetworks.useExternalIp | Required for network operations |
| compute.networks.useExternalIp | Required for network operations |
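One way to confirm that the two roles in the table were actually granted is to compare the roles bound on the service account with the required list. The sketch below covers only the comparison step. The helper name missing_roles is hypothetical, and it does not check the two compute permissions.

```shell
#!/usr/bin/env bash
# Read granted role names from stdin, one per line, and print each required
# role (given as arguments) that is absent. Prints nothing when every
# required role is present.
missing_roles() {
  local granted role
  granted="$(cat)"
  for role in "$@"; do
    # -x matches whole lines only; -F treats the role name as a literal string.
    grep -qxF -- "$role" <<<"$granted" || echo "$role"
  done
}

# Possible usage (an untested assumption about the gcloud output format):
# gcloud iam service-accounts get-iam-policy "$SERVICE_ACCOUNT_EMAIL" \
#   --project "$PROJECT_ID" \
#   --flatten="bindings[].members" \
#   --filter="bindings.members:serviceAccount:$CASTAI_SERVICE_ACCOUNT" \
#   --format="value(bindings.role)" \
#   | missing_roles roles/iam.serviceAccountUser roles/iam.serviceAccountTokenCreator
```

Empty output from the pipeline would mean both roles are bound to the Cast AI impersonation service account.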
Step 3: Set configuration variables
Configure your environment with the following settings:
- Set CASTAI_IMPERSONATE=true in your environment variables
- Include both gkeSaImpersonate and projectId fields in your GKE configuration block
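Because the flag is documented as CASTAI_IMPERSONATE=true, a script that gates on it may want to compare against that exact value. A plain non-empty test (-n) would also treat values such as false or 0 as enabled. The helper below is a hypothetical sketch; this page does not state whether Cast AI's own tooling accepts other values.

```shell
#!/usr/bin/env bash
# Succeed only when CASTAI_IMPERSONATE is exactly "true".
# Unset, empty, "false", "0", and any other value all count as disabled.
impersonation_enabled() {
  [[ "${CASTAI_IMPERSONATE:-}" == "true" ]]
}
```

A setup script could then branch with `if impersonation_enabled; then ... fi`.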
Implementation example
The following script demonstrates the complete impersonation setup process:
if [[ -n "$CASTAI_IMPERSONATE" ]]; then
  echo "Registering service account for impersonation: $SERVICE_ACCOUNT_EMAIL"

  # Register service account with Cast AI
  RESPONSE=$(curl -sSL --write-out "HTTP_STATUS:%{http_code}" \
    -X POST -H "X-API-Key: $CASTAI_API_TOKEN" \
    "$CASTAI_API_URL/v1/kubernetes/external-clusters/$CASTAI_CLUSTER_ID/gcp-create-sa" \
    -d '{"gke":{"project_id":"'"$PROJECT_ID"'","gke_sa_impersonate":"'"$SERVICE_ACCOUNT_EMAIL"'"}}')
  # Split the status code appended by --write-out from the response body
  RESPONSE_STATUS=$(echo "$RESPONSE" | tr -d '\n' | sed -e 's/.*HTTP_STATUS://')
  RESPONSE_BODY=$(echo "$RESPONSE" | sed -e 's/HTTP_STATUS:.*//g')
  if [[ "$RESPONSE_STATUS" != "200" ]]; then
    echo "Failed to register service account for impersonation. HTTP status: $RESPONSE_STATUS"
    echo "$RESPONSE_BODY"
    exit 1
  fi

  # Extract Cast AI service account from response
  CASTAI_SERVICE_ACCOUNT=$(echo "$RESPONSE_BODY" | jq -r '.serviceAccountEmail')
  echo "Cast AI impersonation service account: $CASTAI_SERVICE_ACCOUNT"
  if [[ "$CASTAI_SERVICE_ACCOUNT" == "" || "$CASTAI_SERVICE_ACCOUNT" == "null" ]]; then
    echo "Failed to retrieve Cast AI service account from response"
    echo "$RESPONSE_BODY"
    exit 1
  fi

  # Clean up existing IAM policy bindings on the service account.
  # Errors are ignored because the bindings may not exist yet.
  echo "Removing existing IAM policy bindings"
  gcloud iam service-accounts remove-iam-policy-binding "$SERVICE_ACCOUNT_EMAIL" \
    --member="serviceAccount:$CASTAI_SERVICE_ACCOUNT" \
    --project "$PROJECT_ID" \
    --role='roles/iam.serviceAccountUser' \
    --all --no-user-output-enabled >/dev/null 2>&1
  gcloud iam service-accounts remove-iam-policy-binding "$SERVICE_ACCOUNT_EMAIL" \
    --member="serviceAccount:$CASTAI_SERVICE_ACCOUNT" \
    --project "$PROJECT_ID" \
    --role='roles/iam.serviceAccountTokenCreator' \
    --all --no-user-output-enabled >/dev/null 2>&1

  # Grant token creator permissions
  echo "Configuring impersonation permissions"
  gcloud iam service-accounts add-iam-policy-binding "$SERVICE_ACCOUNT_EMAIL" \
    --member="serviceAccount:$CASTAI_SERVICE_ACCOUNT" \
    --role="roles/iam.serviceAccountTokenCreator" \
    --condition="title=AlwaysTrueCondition,description=This condition is always true,expression=true" \
    --project "$PROJECT_ID"

  # Grant impersonation permissions with conditional access
  gcloud iam service-accounts add-iam-policy-binding "$SERVICE_ACCOUNT_EMAIL" \
    --member="serviceAccount:$CASTAI_SERVICE_ACCOUNT" \
    --role="roles/iam.serviceAccountUser" \
    --condition="title=SpecificServiceAccountCondition,description=Allow impersonation only for Cast AI service account,expression=request.auth.claims.email == \"$CASTAI_SERVICE_ACCOUNT\"" \
    --project "$PROJECT_ID"

  echo "Waiting for IAM permissions to propagate (180 seconds)"
  sleep 180

  # Update cluster configuration
  echo "Updating cluster configuration with impersonation settings"
  RESPONSE=$(curl -sSL --write-out "HTTP_STATUS:%{http_code}" \
    -X POST -H "X-API-Key: $CASTAI_API_TOKEN" \
    "$CASTAI_API_URL/v1/kubernetes/external-clusters/$CASTAI_CLUSTER_ID" \
    -d '{"credentials":"{}"}')
  RESPONSE_STATUS=$(echo "$RESPONSE" | tr -d '\n' | sed -e 's/.*HTTP_STATUS://')
  RESPONSE_BODY=$(echo "$RESPONSE" | sed -e 's/HTTP_STATUS:.*//g')
  if [[ "$RESPONSE_STATUS" == "200" ]]; then
    echo "Impersonation setup completed successfully"
  else
    echo "Failed to update cluster configuration with impersonation settings"
    echo "Error details: HTTP $RESPONSE_STATUS - $RESPONSE_BODY"
    exit 1
  fi
fi
Troubleshooting
HTTP 500 Internal Server Error during cluster update
Symptoms: The cluster update fails with a 500 Internal Server Error after successfully configuring IAM permissions.
Common causes:
- The /gcp-create-sa API endpoint was not called before attempting the cluster update
- Mixing impersonation and non-impersonation authentication methods in the same script
- Insufficient IAM permissions on the service account
- IAM permission changes have not yet propagated
Resolution steps:
- Verify that you called /gcp-create-sa before updating the cluster configuration
- Ensure your script uses either impersonation or key-based authentication consistently
- Confirm all required permissions are properly configured on your service account
- Wait at least 3 minutes after configuring IAM permissions before updating the cluster
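As an alternative to a single fixed wait, the cluster update can be retried at intervals until it succeeds or a time budget runs out. The following is a generic sketch, not Cast AI's documented approach. The helper name retry_until_success is hypothetical, and a 500 response may have causes other than propagation delay, so retries will not fix every case listed above.

```shell
#!/usr/bin/env bash
# Run a command until it succeeds, sleeping between attempts.
# Usage: retry_until_success <max_attempts> <delay_seconds> <command> [args...]
# Returns 0 on the first success, or 1 once max_attempts have all failed.
retry_until_success() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "Giving up after $attempt attempts" >&2
      return 1
    fi
    echo "Attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$(( attempt + 1 ))
  done
}
```

For example, `retry_until_success 10 20 update_cluster_config` would try up to ten times with 20 seconds between attempts, for at most 180 seconds of waiting. Here update_cluster_config stands for a function you would write yourself that wraps the cluster update request and returns non-zero on any status other than 200.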
