Configuring Kvisor features
Warning: The Cast AI Kubernetes Security feature set is undergoing significant changes. Some features shown in this documentation are being deprecated and others are moving to the cluster view in the console. Screenshots and navigation paths may not reflect the current product. Updated documentation is in progress.
Which upgrade method to use: The Helm commands on this page use the umbrella chart (castai-helm/castai) by default. If you need to use a different method:
- castctl: To upgrade all Cast AI components at once without managing Helm flags:
castctl castware upgrade
This preserves your existing configuration. See the castctl documentation for installation and authentication instructions.
- Individual charts: If you installed each component as a separate Helm release (e.g., for ArgoCD or custom GitOps), replace the release name and chart reference with the component-specific ones (e.g., castai-workload-autoscaler and castai-helm/castai-workload-autoscaler) and remove the corresponding value prefix (autoscaler.castai-workload-autoscaler. in that example; autoscaler.castai-kvisor. for the commands on this page).
Not sure which method you used? Run helm list -n castai-agent. A single release named castai means umbrella chart; separate releases like castai-workload-autoscaler mean individual charts.
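The release-name check described above can be scripted. The following is a minimal sketch, not an official Cast AI tool: detect_install_method is a hypothetical helper that reads release names from standard input, one per line, and applies the rule stated above.

```shell
# Classify the install method from a list of Helm release names on stdin.
# Prints "umbrella", "individual", or "unknown".
# detect_install_method is a hypothetical helper name, not part of any CLI.
detect_install_method() {
  names=$(cat)
  if printf '%s\n' "$names" | grep -qx 'castai'; then
    # A release named exactly "castai" indicates the umbrella chart.
    echo umbrella
  elif printf '%s\n' "$names" | grep -q '^castai-'; then
    # Separate castai-* releases indicate individual charts.
    echo individual
  else
    echo unknown
  fi
}

# Usage against a live cluster (requires helm and cluster access):
#   helm list -n castai-agent -q | detect_install_method
```

The -q flag makes helm list print release names only, which keeps the parsing trivial.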
This guide explains how to configure various features of the Kvisor security agent to enhance your Kubernetes security posture. Kvisor provides several specialized monitoring and scanning capabilities that can be enabled and customized according to your needs.
Configuration overview
Kvisor supports multiple configuration options that can be set via Helm during installation or upgrade. The basic format for enabling or modifying features is:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.[configuration-option]=[value]
Configuration-only changes: This command upgrades all components in the umbrella chart, not just Kvisor. To change configuration without upgrading component versions, pin the chart to your current version by adding --version <your-current-chart-version> to the command. Run helm list -n castai-agent to check your current version.
All supported configuration values are found in the Kvisor Helm chart values.yaml.
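When several options change together, it can help to generate the command and review it before running it. The sketch below only prints the command. kvisor_upgrade_cmd and the CHART_VERSION variable are hypothetical names introduced for illustration; the chart reference, namespace, and flags are the ones used on this page.

```shell
# Print (do not run) the upgrade command for one or more Kvisor options.
# Each argument is "option=value", relative to the autoscaler.castai-kvisor prefix.
# CHART_VERSION (hypothetical variable) pins the chart for configuration-only changes.
kvisor_upgrade_cmd() {
  printf 'helm upgrade castai castai-helm/castai -n castai-agent'
  if [ -n "${CHART_VERSION:-}" ]; then
    printf ' --version %s' "$CHART_VERSION"
  fi
  printf ' --reset-then-reuse-values'
  for opt in "$@"; do
    # Quote each assignment so the printed command is safe to paste.
    printf " --set 'autoscaler.castai-kvisor.%s'" "$opt"
  done
  printf '\n'
}

# Example:
#   kvisor_upgrade_cmd controller.extraArgs.image-scan-interval=60s
```

Printing rather than executing keeps the helper safe to run anywhere, and leaves the decision to apply the change with the operator.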
Scanning frequencies and intervals
Cast AI performs different types of security scans at various intervals. You can customize these frequencies to meet your specific requirements:
Image vulnerability scanning
Default behavior:
- Detection interval: Checks for new images every 30 seconds
- Concurrent scans: 1 image scanned at a time
- Trigger: Scans start automatically when new running images are detected
Configure image scan detection interval
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.controller.extraArgs.image-scan-interval=60s
Configure concurrent image scans
To increase scanning performance for environments with many images:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.controller.extraArgs.image-concurrent-scans=6
Compliance checking
Default behavior:
- Scan frequency: Every 60 seconds
- Monitoring: Continuous configuration monitoring
Configure compliance scan interval
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.controller.extraArgs.kube-linter-scan-interval=120s
Summary of scan frequencies
| Security Feature | Default Frequency | Configuration Parameter |
|---|---|---|
| Image vulnerability scanning | Every 30 seconds | image-scan-interval |
| Compliance checks | Every 60 seconds | kube-linter-scan-interval |
| Attack paths | Every 3 hours | Not configurable |
| Runtime anomalies | Real-time | Event-driven |
Image scanning configuration
Use a custom image
You can configure Kvisor to use a custom image from your private registry:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.image.repository=my-kvisor-repository \
--set autoscaler.castai-kvisor.image.tag=my-tag \
--set 'autoscaler.castai-kvisor.imagePullSecrets[0].name=my-pull-secret'
Private image scanning
For detailed instructions on configuring private image scanning, refer to the dedicated Private Image Scanning documentation.
Excluding Namespaces from Scanning
You may want to exclude specific namespaces from image scanning, particularly for third-party monitoring tools or systems that manage their own security. To configure namespace exclusions:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set 'autoscaler.castai-kvisor.controller.extraArgs.image-scan-ignored-namespaces=namespace1\,namespace2'
For example, to exclude the Dynatrace namespace from scanning:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
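--set autoscaler.castai-kvisor.controller.extraArgs.image-scan-ignored-namespaces=dynatrace
You can specify multiple namespaces by separating them with commas. Escape each comma with a backslash (\,) and quote the argument, because Helm's --set otherwise treats a comma as the start of a new key=value pair.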
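Helm's --set parser treats an unescaped comma as a separator between assignments, so a multi-namespace value generally needs each comma written as \,. The sketch below builds such a value from a list of namespace names. ignored_namespaces_value is a hypothetical helper, and the monitoring namespace in the usage comment is a placeholder.

```shell
# Join namespace names into a single --set value with Helm-escaped commas.
# ignored_namespaces_value is a hypothetical helper name.
ignored_namespaces_value() {
  value=''
  for ns in "$@"; do
    if [ -z "$value" ]; then
      value=$ns
    else
      # Append a literal backslash-comma so Helm keeps this as one string value.
      value="$value\\,$ns"
    fi
  done
  printf '%s\n' "$value"
}

# Usage (requires helm and cluster access; "monitoring" is a placeholder):
#   helm upgrade castai castai-helm/castai -n castai-agent \
#     --reset-then-reuse-values \
#     --set "autoscaler.castai-kvisor.controller.extraArgs.image-scan-ignored-namespaces=$(ignored_namespaces_value dynatrace monitoring)"
```

A single namespace passes through unchanged, so the same helper works for both cases.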
Network traffic monitoring
Kvisor can collect Kubernetes network flows using eBPF. This feature provides visibility into pod-to-pod and pod-to-external communications, which is valuable for security analysis and network optimization.
To enable network traffic monitoring:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.castai.apiKey=<your-api-token> \
--set autoscaler.castai-kvisor.castai.clusterID=<your-cluster-id> \
--set autoscaler.castai-kvisor.agent.enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.netflow-enabled=true
Note: If you have the egressd component running, uninstall it before enabling network traffic monitoring in Kvisor:
helm uninstall <egressd-release-name> -n castai-agent
helm uninstall takes a release name rather than a chart reference. Run helm list -n castai-agent to find the name of the egressd release.
Enrich network flows with cloud context: To enrich netflow data with VPC, subnet, and region information from your cloud provider, see Cloud network context.
Resource statistics monitoring
Kvisor can collect resource usage statistics from containers and nodes. These monitoring capabilities are controlled by two independent flags, each enabling a distinct category of metrics.
Kubernetes storage metrics
Kvisor collects storage utilization data for both ephemeral storage and persistent volumes across your cluster nodes. This data surfaces per-node storage usage in the Node list of the Cast AI console.
Storage metrics collection requires both the Kvisor controller and agent to be enabled.
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.castai.apiKey=<your-api-token> \
--set autoscaler.castai-kvisor.castai.clusterID=<your-cluster-id> \
--set autoscaler.castai-kvisor.controller.enabled=true \
--set autoscaler.castai-kvisor.agent.enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.storage-stats-enabled=true
In addition to storage utilization on the node itself, Kvisor can collect metrics on the cloud volumes in use. Collection happens inside the controller component and requires cloud access permissions to fetch volume-related information. Currently, only AWS and GCP are supported.
AWS
If IMDS (Instance Metadata Service) access is enabled for the Kvisor controller pod, no additional configuration is required. Kvisor will automatically use the node's instance role to authenticate with AWS.
If IMDS access is not available (e.g., it has been restricted at the instance or pod level), you must set up IRSA (IAM Roles for Service Accounts) and annotate the Kvisor controller service account with the appropriate IAM role ARN. The IAM role must have the following permission:
ec2:DescribeVolumes
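As an illustration of the permission above, the following sketch prints a minimal IAM policy document that grants only ec2:DescribeVolumes. kvisor_volume_policy is a hypothetical helper name. The Resource is "*" because EC2 Describe actions do not support resource-level restrictions.

```shell
# Print a minimal IAM policy document granting the permission listed above.
# kvisor_volume_policy is a hypothetical helper name.
kvisor_volume_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "ec2:DescribeVolumes",
      "Resource": "*"
    }
  ]
}
EOF
}

# Usage (requires the AWS CLI; the policy name is a placeholder):
#   kvisor_volume_policy > policy.json
#   aws iam create-policy --policy-name <your-policy-name> \
#     --policy-document file://policy.json
```

Attach the resulting policy to the IAM role that the Kvisor controller service account assumes through IRSA.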
Once authentication is configured, enable cloud volume metrics collection:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.castai.apiKey=<your-api-token> \
--set autoscaler.castai-kvisor.castai.clusterID=<your-cluster-id> \
--set autoscaler.castai-kvisor.controller.enabled=true \
--set autoscaler.castai-kvisor.agent.enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.storage-stats-enabled=true \
--set autoscaler.castai-kvisor.controller.extraArgs.cloud-provider=aws \
--set autoscaler.castai-kvisor.controller.extraArgs.cloud-provider-aws-region=<your-aws-region> \
--set autoscaler.castai-kvisor.controller.extraArgs.cloud-provider-storage-sync-enabled=true
Replace <your-aws-region> with the AWS region your cluster is running in.
GCP
Cloud volume metrics collection on GCP requires Workload Identity to be configured so that the Kvisor controller service account can authenticate with the GCP API. The bound GCP service account must have the following permissions in your project:
- compute.disks.list
- compute.instances.list
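One way to grant exactly these two permissions is a custom IAM role. The sketch below prints a gcloud command that would create one. kvisor_gcp_role_cmd and the default role ID kvisorVolumeMetrics are hypothetical names chosen for illustration; a predefined role that already includes both permissions would work as well.

```shell
# Print (do not run) a gcloud command that creates a custom role
# containing the two permissions listed above.
# kvisor_gcp_role_cmd and the default role ID are hypothetical names.
kvisor_gcp_role_cmd() {
  project_id=$1
  role_id=${2:-kvisorVolumeMetrics}
  printf 'gcloud iam roles create %s --project=%s --permissions=%s\n' \
    "$role_id" "$project_id" 'compute.disks.list,compute.instances.list'
}

# Example:
#   kvisor_gcp_role_cmd <your-gcp-project-id>
```

After creating the role, bind it to the GCP service account that Workload Identity maps to the Kvisor controller service account.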
Once Workload Identity is configured, enable cloud volume metrics collection:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.castai.apiKey=<your-api-token> \
--set autoscaler.castai-kvisor.castai.clusterID=<your-cluster-id> \
--set autoscaler.castai-kvisor.controller.enabled=true \
--set autoscaler.castai-kvisor.agent.enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.storage-stats-enabled=true \
--set autoscaler.castai-kvisor.controller.extraArgs.cloud-provider=gcp \
--set autoscaler.castai-kvisor.controller.extraArgs.cloud-provider-gcp-project-id=<your-gcp-project-id> \
--set autoscaler.castai-kvisor.controller.extraArgs.cloud-provider-storage-sync-enabled=true
Replace <your-gcp-project-id> with your GCP project ID.
Pressure Stall Information (PSI) metrics
Kvisor can collect Pressure Stall Information (PSI) metrics from containers and nodes, along with CPU, memory, and I/O usage statistics. PSI metrics indicate how often workloads experience delays due to contention for CPU, memory, or I/O resources, helping you identify performance bottlenecks before they affect application behavior.
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.castai.apiKey=<your-api-token> \
--set autoscaler.castai-kvisor.castai.clusterID=<your-cluster-id> \
--set autoscaler.castai-kvisor.agent.enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.stats-enabled=true
GPU metrics
Kvisor collects GPU metrics from NVIDIA GPUs in your cluster using DCGM Exporter. The GPU export pipeline and dcgm-exporter run only on nodes with GPUs attached, regardless of cloud provider — Kvisor itself continues to run on all cluster nodes.
To enable GPU metrics collection:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.castai.apiKey=<your-api-token> \
--set autoscaler.castai-kvisor.castai.clusterID=<your-cluster-id> \
--set autoscaler.castai-kvisor.agent.gpu.enabled=true
Migrating from gpu-metrics-exporter
If you are currently using the standalone gpu-metrics-exporter component, follow these steps to migrate:
- Upgrade Kvisor and enable GPU metrics:
helm repo update castai-helm
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.castai.apiKey=<your-api-token> \
--set autoscaler.castai-kvisor.castai.clusterID=<your-cluster-id> \
--set autoscaler.castai-kvisor.agent.gpu.enabled=true
- Uninstall the old component:
helm uninstall gpu-metrics-exporter -n castai-agent
Important: Kvisor detects whether gpu-metrics-exporter is present in the cluster and will not report GPU metrics while it remains installed. Uninstall the old component for GPU metrics to start flowing.
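To confirm the old component is gone after the migration, you can check the Helm release list. The sketch below is a hypothetical helper, release_present, that reads release names from standard input. It checks only the Helm release named in the uninstall step, which may differ from how Kvisor itself detects the exporter.

```shell
# Succeeds (exit status 0) if the named release appears in the list on stdin.
# release_present is a hypothetical helper name.
release_present() {
  # -x matches whole lines only; -F treats the name as a literal string.
  grep -qxF -- "$1"
}

# Usage against a live cluster (requires helm and cluster access):
#   if helm list -n castai-agent -q | release_present gpu-metrics-exporter; then
#     echo "gpu-metrics-exporter is still installed"
#   fi
```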
Runtime security monitoring
Runtime Security monitoring enables real-time detection of anomalous activities in your cluster. This feature requires installing the Kvisor agent as a DaemonSet on all nodes.
To enable runtime security:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.castai.apiKey=<your-api-token> \
--set autoscaler.castai-kvisor.castai.clusterID=<your-cluster-id> \
--set autoscaler.castai-kvisor.agent.enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.ebpf-events-enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.file-hash-enricher-enabled=true
For detailed information on runtime security, refer to the Runtime Security documentation.
Resource configuration
For large clusters or environments with intensive monitoring requirements, you may need to increase the resources allocated to Kvisor:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.controller.resources.requests.cpu=100m \
--set autoscaler.castai-kvisor.controller.resources.requests.memory=2Gi \
--set autoscaler.castai-kvisor.controller.resources.limits.memory=2Gi
Adjust these values based on your specific cluster size and monitoring needs.
Combine multiple features
You can enable multiple features and configure scan intervals in a single Helm command:
helm upgrade castai castai-helm/castai -n castai-agent \
--reset-then-reuse-values \
--set autoscaler.castai-kvisor.castai.apiKey=<your-api-token> \
--set autoscaler.castai-kvisor.castai.clusterID=<your-cluster-id> \
--set autoscaler.castai-kvisor.controller.enabled=true \
--set autoscaler.castai-kvisor.agent.enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.ebpf-events-enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.file-hash-enricher-enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.netflow-enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.stats-enabled=true \
--set autoscaler.castai-kvisor.agent.extraArgs.storage-stats-enabled=true \
--set autoscaler.castai-kvisor.agent.gpu.enabled=true \
--set autoscaler.castai-kvisor.controller.extraArgs.image-concurrent-scans=6 \
--set autoscaler.castai-kvisor.controller.extraArgs.image-scan-interval=30s \
--set autoscaler.castai-kvisor.controller.extraArgs.kube-linter-scan-interval=60s
Verify your configuration
To verify that your configuration changes have been applied correctly:
helm get values castai -n castai-agent
This command displays the user-supplied values for the castai release. Kvisor settings appear under autoscaler.castai-kvisor.
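For a quick scripted check, the YAML output can be searched for a specific setting. The sketch below is a hypothetical helper, values_has_setting, that matches a leaf key and value on a single line. It ignores where the key sits in the hierarchy, so treat it as a convenience check rather than a full YAML parser.

```shell
# Succeeds if the YAML on stdin contains a line "key: value" at any indentation.
# values_has_setting is a hypothetical helper name; it matches leaf keys only.
values_has_setting() {
  grep -Eq "^[[:space:]]*$1:[[:space:]]*\"?$2\"?[[:space:]]*\$"
}

# Usage against a live cluster (requires helm and cluster access):
#   helm get values castai -n castai-agent \
#     | values_has_setting image-scan-interval 60s && echo "interval applied"
```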
Troubleshooting
If you encounter issues after changing Kvisor's configuration:
Check Controller Logs
kubectl logs -l app.kubernetes.io/name=castai-kvisor-controller -n castai-agent
Check Agent Logs
kubectl logs -l app.kubernetes.io/name=castai-kvisor-agent -n castai-agent
Restart the Pods
If configuration changes don't seem to be taking effect, you can restart the Kvisor pods:
kubectl rollout restart deployment castai-kvisor-controller -n castai-agent
kubectl rollout restart daemonset castai-kvisor-agent -n castai-agent
Next Steps
After configuring Kvisor features, you can:
- View security insights in the Security dashboard
- Explore the Vulnerabilities report to identify image security issues
- Check your compliance status in the Compliance report
- Analyze Runtime anomalies if you've enabled Runtime Security
