AI Enabler settings
Learn how to configure AI Enabler settings to control LLM proxying behavior, optimize costs, and manage prompt data.
Service account token scopes
Service account tokens in Cast AI can be scoped at two levels: organization or cluster. For AI Enabler, the token scope affects which providers and models are accessible, and how usage is tracked.
Understanding token scopes
| Token scope | Description | Supported model types |
|---|---|---|
| Organization-scoped | Grants access to all AI Enabler resources across your organization. This is the default scope for service accounts and is required for SaaS provider models. | SaaS providers (OpenAI, Anthropic, etc.) and hosted models |
| Cluster-scoped | Restricts access to AI Enabler resources associated with a specific cluster. Use this for stricter environment separation, such as different tokens for staging versus production. | Hosted/self-hosted models only |
SaaS provider models (OpenAI, Anthropic, Google Gemini, Mistral, and others) always require organization-scoped tokens because they operate at the organization level. Cluster-scoped tokens are only supported for hosted model deployments running in your Kubernetes clusters.
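The scope rules in the table above can be sketched as a small compatibility check. This is only an illustration of the stated rule; the `TokenScope`, `ModelType`, and `scope_supports` names are hypothetical and not part of any Cast AI SDK.

```python
from enum import Enum


class TokenScope(Enum):
    ORGANIZATION = "organization"
    CLUSTER = "cluster"


class ModelType(Enum):
    SAAS = "saas"      # e.g. OpenAI, Anthropic
    HOSTED = "hosted"  # deployed in your Kubernetes clusters


def scope_supports(scope: TokenScope, model_type: ModelType) -> bool:
    """Return True if a token with this scope can call this model type directly."""
    if scope is TokenScope.ORGANIZATION:
        # Organization-scoped tokens cover SaaS providers and hosted models.
        return True
    # Cluster-scoped tokens are limited to hosted model deployments.
    return model_type is ModelType.HOSTED
```

A check like this could run client-side before sending a request, so a cluster-scoped token is never used to call a SaaS model directly.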
Note: To create a cluster-scoped service account, navigate to Manage organization > Access control > Service Accounts and select specific cluster access when configuring the service account. See Creating service accounts for detailed instructions.
Behavior with cluster-scoped tokens
When you use a cluster-scoped token with AI Enabler, the following behaviors apply:
Provider filtering
Requests made with cluster-scoped tokens only have access to providers and models associated with that specific cluster. For hosted model deployments, this means:
- You only see hosted models deployed to that cluster
- Provider lists returned by the API are filtered to show cluster-relevant options
- Attempts to access models from other clusters will fail
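The filtering behavior listed above can be modeled roughly as follows. This is a conceptual sketch, not the proxy's actual implementation: the `HostedModel` type, the cluster identifiers, and the use of `LookupError` for a failed lookup are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostedModel:
    name: str
    cluster_id: str


def visible_models(models: list[HostedModel], token_cluster_id: str) -> list[HostedModel]:
    """Keep only the hosted models deployed to the token's cluster."""
    return [m for m in models if m.cluster_id == token_cluster_id]


def get_model(models: list[HostedModel], token_cluster_id: str, name: str) -> HostedModel:
    """Look up a model by name within the token's cluster; fail otherwise."""
    for model in visible_models(models, token_cluster_id):
        if model.name == name:
            return model
    # Hypothetical failure mode; the real API's error response is not shown here.
    raise LookupError(f"model {name!r} is not available in cluster {token_cluster_id!r}")
```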
Usage tracking
The AI Enabler proxy correctly attributes usage to the appropriate cluster when using cluster-scoped tokens. This ensures accurate cost reporting and analytics per cluster.
Settings endpoint
The settings endpoint (/v1/llm/settings) works with both organization-scoped and cluster-scoped tokens. When using a cluster-scoped token:
- Settings queries return the configuration relevant to that cluster
- Settings updates apply to the cluster context
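As a minimal sketch, a settings query could be prepared like this. Only the `/v1/llm/settings` path comes from this page; the base URL is a placeholder, and the bearer-token `Authorization` header is an assumption, so check the API reference for the actual host and authentication scheme. The request is built but not sent.

```python
from urllib.request import Request

# Assumption: placeholder host; substitute your actual AI Enabler API base URL.
BASE_URL = "https://ai-enabler.example.invalid"


def build_settings_request(token: str, base_url: str = BASE_URL) -> Request:
    """Prepare a GET request for the settings endpoint without sending it."""
    return Request(
        url=base_url.rstrip("/") + "/v1/llm/settings",
        headers={
            # Assumption: bearer-token auth; the real scheme may differ.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )
```

The same request shape applies to both token scopes; per the behavior above, a cluster-scoped token would receive the configuration relevant to its cluster.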
Fallback model behavior
When a hosted model has a SaaS provider configured as a fallback, the fallback continues to work even when using cluster-scoped tokens. This is because:
- The primary request uses your cluster-scoped token to access the hosted model
- If the hosted model is unavailable (hibernating, scaling, or erroring), the system routes to the fallback
- The SaaS fallback operates at the organization level, using the organization context associated with your cluster
This ensures service continuity without requiring separate organization-scoped tokens for fallback scenarios.
Note: Fallback routing to SaaS providers is transparent to your application. The cluster-scoped token handles authentication, and the system manages the fallback routing internally.
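The fallback decision described in this section can be sketched as a simple routing function. This is a conceptual model only: the state labels, the `"hosted"` return value, and the `choose_target` name are hypothetical, and the real routing happens inside the AI Enabler proxy.

```python
from typing import Optional

# Hypothetical labels for the unavailable conditions named in the text.
UNAVAILABLE_STATES = {"hibernating", "scaling", "erroring"}


def choose_target(hosted_state: str, fallback_provider: Optional[str]) -> str:
    """Pick where a request goes: the hosted model, or its SaaS fallback."""
    if hosted_state not in UNAVAILABLE_STATES:
        # Primary path: the cluster-scoped token reaches the hosted model.
        return "hosted"
    if fallback_provider is not None:
        # Fallback path: handled at the organization level by the system.
        return fallback_provider
    raise RuntimeError("hosted model unavailable and no fallback configured")
```

The caller's token does not appear in the routing decision, which mirrors the point that no separate organization-scoped token is needed for the fallback path.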
API Reference
For developers looking to override settings programmatically, here are the available request headers:
| Header | Type | Values | Description |
|---|---|---|---|
| X-Provider-Name | String | Provider name | Route the request to a specific registered provider. The provider must belong to your Cast AI organization. |
Remember that header overrides have the highest priority and will take precedence over both API key settings and organization settings for the specific request.
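A per-request override can be attached by adding the header to the outgoing request. The helper below is a hypothetical convenience function, and the provider name in the usage note is only an example value; only the `X-Provider-Name` header itself comes from the table above.

```python
def with_provider_override(headers: dict[str, str], provider_name: str) -> dict[str, str]:
    """Return a copy of the headers with X-Provider-Name set for this request."""
    if not provider_name.strip():
        raise ValueError("provider_name must not be empty")
    merged = dict(headers)  # copy, so the caller's dict is left unchanged
    merged["X-Provider-Name"] = provider_name
    return merged
```

For example, `with_provider_override({"Accept": "application/json"}, "my-openai")` yields headers that route a single request to a provider registered under that name, assuming such a provider exists in your organization.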
Troubleshooting
Settings changes not taking effect
Check the following:
- Look for request headers that might be overriding your settings
- Verify the API key you're using doesn't have overriding settings that conflict with your expectations
- Verify the API key you're using is scoped to the correct resource (cluster or organization)
- Allow a few minutes for changes to propagate
Remember the priority order: Headers → API Key Settings → Organization Settings. A higher-priority setting will always override a lower-priority one.
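When debugging, it can help to model the priority order explicitly. The sketch below uses Python's `collections.ChainMap`, which returns the value from the first mapping that contains a key; the setting name `provider` and the layer contents are hypothetical examples, not actual AI Enabler setting keys.

```python
from collections import ChainMap


def effective_settings(header_overrides: dict, api_key_settings: dict, org_settings: dict) -> ChainMap:
    """Combine the three layers so earlier (higher-priority) layers win."""
    # Lookup order: headers, then API key settings, then organization settings.
    return ChainMap(header_overrides, api_key_settings, org_settings)


# Example: the organization default is overridden at the API key level,
# and a request header overrides both for this one request.
settings = effective_settings(
    header_overrides={"provider": "from-header"},
    api_key_settings={"provider": "from-api-key"},
    org_settings={"provider": "from-org"},
)
```

If a setting resolves to an unexpected value, checking each layer in this order usually shows which one is responsible.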
For additional assistance with AI Enabler, contact Cast AI support or visit our community Slack channel.
