
GPU utilization

Updated 22 days ago

