Labels, Annotations, and Events

This reference documents all labels, annotations, and Kubernetes events used by Container Live Migration. Use this page to configure workloads, troubleshoot migrations, or monitor migration status.

Labels

Labels either control or report how Cast AI's live controller and Evictor handle workloads during migration operations.

Node labels

These labels are applied by Cast AI to nodes to indicate live migration capability.

Label | Values | Applied by | Description
live.cast.ai/migration-enabled | true | Cast AI | Indicates the node supports container live migration. Applied to nodes provisioned from CLM-enabled node templates.
live.cast.ai/custom-network | true | Cast AI | Indicates the node uses ENIConfig for dedicated Pod subnets. Required for custom networking configurations.
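
To see both node labels side by side, kubectl's -L (label columns) flag can print them as extra columns. The sketch below wraps that in a small helper; the function name clm_node_labels is hypothetical and used only for illustration.

```shell
# Hypothetical helper: list every node with the two CLM node labels as columns.
# Nodes that lack a label show an empty cell in that column.
clm_node_labels() {
  kubectl get nodes \
    -L live.cast.ai/migration-enabled \
    -L live.cast.ai/custom-network
}
```

Call clm_node_labels with no arguments against the cluster in your current kubectl context.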

Pod labels

These labels are applied to pods either automatically by the live controller or manually to control migration behavior.

Label | Values | Applied by | Description
live.cast.ai/migration-enabled | true | Live controller | Indicates the pod is eligible for live migration. Applied automatically when a workload meets all eligibility requirements.
autoscaling.cast.ai/live-migration-disabled | true | User | Opt-out label. Prevents live migration for this pod. The pod will be evicted using traditional eviction instead.
autoscaling.cast.ai/removal-disabled | true | User | Prevents pod removal during optimization. If migration fails, the pod is recovered on the original node rather than evicted. Can be combined with live-migration-disabled.

Opting out of live migration

To disable live migration for specific workloads, add the opt-out label to your pod spec:

metadata:
  labels:
    autoscaling.cast.ai/live-migration-disabled: "true"

To prevent removal entirely (allowing only migration, with recovery on failure):

metadata:
  labels:
    autoscaling.cast.ai/removal-disabled: "true"

These labels can be combined. When both are set, the workload is excluded from both live migration and traditional eviction.
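
For pods managed by a Deployment, the label has to live in the pod template so that replacement pods inherit it. A minimal sketch using kubectl patch follows. The helper name clm_opt_out and its arguments are hypothetical, and changing the pod template triggers a rollout of the Deployment.

```shell
# Hypothetical helper: add the opt-out label to a Deployment's pod template.
# Usage: clm_opt_out <deployment-name> <namespace>
clm_opt_out() {
  local name="$1" namespace="$2"
  # A merge patch adds the label without touching other template fields.
  kubectl patch deployment "$name" -n "$namespace" --type merge \
    -p '{"spec":{"template":{"metadata":{"labels":{"autoscaling.cast.ai/live-migration-disabled":"true"}}}}}'
}
```

For example, clm_opt_out my-app default would label the pods of a Deployment named my-app in the default namespace. The same pattern applies to autoscaling.cast.ai/removal-disabled.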

Annotations

Annotations control specific behaviors during the migration restore phase. Apply the restore behavior annotations to the source pod before migration; system annotations are set by Cast AI.

Restore behavior annotations

By default, init containers, PostStart hooks, and startup probes are skipped on the restored pod because the container state is fully preserved. In rare cases, you may need to override this behavior.

Annotation | Value format | Description
cast.ai/restore-run-startup-probe | Comma-separated container names | Forces startup probes to run on the restored pod for the specified containers.
cast.ai/restore-run-post-start-exec | Comma-separated container names | Forces PostStart hooks to run on the restored pod for the specified containers. Only affects exec-based hooks.
cast.ai/restore-run-init-containers | Comma-separated init container names | Forces the specified init containers to run on the restored pod instead of being skipped.

Example: Forcing startup probe execution

metadata:
  annotations:
    cast.ai/restore-run-startup-probe: "container1,container2"

Example: Forcing init container execution

metadata:
  annotations:
    cast.ai/restore-run-init-containers: "init-config,init-deps"
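
Because these annotations take comma-separated container names, it can help to read the names straight from the pod before annotating it. The sketch below assumes the source pod already exists. Both helper names are hypothetical, and kubectl annotate is shown only as one possible way to set the annotation on the source pod.

```shell
# Hypothetical helper: print a pod's init container names as a comma-separated list.
# Usage: clm_init_container_names <pod-name> <namespace>
clm_init_container_names() {
  local pod="$1" namespace="$2"
  # jsonpath prints the names separated by spaces; tr turns them into commas.
  kubectl get pod "$pod" -n "$namespace" \
    -o jsonpath='{.spec.initContainers[*].name}' | tr ' ' ','
}

# Hypothetical helper: mark every init container of a pod to run again after restore.
# Usage: clm_rerun_init_containers <pod-name> <namespace>
clm_rerun_init_containers() {
  local pod="$1" namespace="$2"
  kubectl annotate pod "$pod" -n "$namespace" --overwrite \
    "cast.ai/restore-run-init-containers=$(clm_init_container_names "$pod" "$namespace")"
}
```

In practice you would usually trim the list to only the init containers that really need to run again, rather than annotating all of them.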

⚠️ Use with caution

These annotations should only be used when necessary:

  • Re-running PostStart hooks can cause duplicate side effects (duplicate service registrations, duplicate cache entries)
  • Re-running startup probes delays pod readiness
  • Re-running init containers increases migration time and can cause errors

For most applications, the default behavior (skipping these components) is correct and considered best practice.

System annotations

These annotations are applied by Cast AI during the migration process. Do not modify these manually.

Annotation | Applied by | Description
cast.ai/restore | Live controller | Indicates the pod was created via restore from a checkpoint. Applied to the clone pod on the destination node.

If you are unfamiliar with clone pods or destination nodes, please refer to the Overview document.
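
To find pods that were restored from a checkpoint, one option is to print the cast.ai/restore annotation as a column and drop the pods that lack it. This is a sketch that relies on kubectl's custom-columns output, which prints <none> for a missing field. The helper name clm_restored_pods is hypothetical.

```shell
# Hypothetical helper: list pods in all namespaces that carry the cast.ai/restore annotation.
clm_restored_pods() {
  # The dot inside the annotation key is escaped so it is not read as a path separator.
  # awk keeps the header row plus every row whose third column is not "<none>".
  kubectl get pods -A \
    -o 'custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,RESTORE:.metadata.annotations.cast\.ai/restore' |
    awk 'NR == 1 || $3 != "<none>"'
}
```

This only reads the annotation, which is consistent with the guidance not to modify system annotations manually.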

Migration events

Live migration progress is tracked through Kubernetes events on the migration custom resource. Use these events to monitor migration status and troubleshoot failures.

Viewing migration events

# List all migrations
kubectl get migrations -A

# Watch migrations in real time
kubectl get migrations -A -w

# Get detailed events for a specific migration
kubectl describe migration <migration-name> -n <namespace>

Event reference

Event | Description | Phase
PreDumpSkipped | Memory pre-dump iteration was skipped. | Pre-dump
PreDumpFinished | Memory pre-dump completed successfully. Dirty pages captured for incremental transfer. | Pre-dump
PreDumpFailed | Memory pre-dump operation failed. Check logs for details. | Pre-dump
MigrationReconfigured | Migration configuration was updated during the operation. | Any
PodCreateFinished | Pod successfully recreated on the destination node. | Restore
PodCreateFailed | Failed to create pod on destination node. Check destination node capacity and permissions. | Restore
MigrationFinished | Pod successfully migrated to the new node. Migration complete. | Complete
MigrationFailed | Migration failed. Check logs for details. Evictor will fall back to traditional eviction if configured. | Failed
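
Events can also be queried directly rather than through kubectl describe. The sketch below filters by event reason using field selectors. It assumes that the names in the Event column are recorded as the event's reason and that the custom resource kind is called Migration; neither is confirmed on this page. The helper name clm_events is hypothetical.

```shell
# Hypothetical helper: list migration events with a given reason across all namespaces.
# Usage: clm_events MigrationFailed
clm_events() {
  local reason="$1"
  # Assumption: the resource kind is "Migration" and the event name is stored in .reason.
  kubectl get events -A \
    --field-selector "involvedObject.kind=Migration,reason=$reason" \
    --sort-by=.lastTimestamp
}
```

If the command returns nothing for a migration you know has run, check the kind and reason values shown by kubectl describe and adjust the selector accordingly.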

Migration phases

Events occur during specific phases of the migration process:

  1. Pre-dump phase: Iterative memory snapshots while the container continues running. PreDumpFinished events indicate successful iterations. Multiple pre-dumps reduce final checkpoint time.

  2. Checkpoint phase: Final memory dump and state capture. No dedicated event is emitted because this phase is brief.

  3. Restore phase: Container state restoration on the destination node. PodCreateFinished indicates the pod was created successfully.

  4. Completion: MigrationFinished indicates the entire migration succeeded and the pod is running on the new node.
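
When scripting around migration events, the event-to-phase mapping from the event reference can be captured in a small lookup. The following is a sketch; the function name clm_event_phase is hypothetical, and the phase strings are copied from the Phase column of the event reference.

```shell
# Hypothetical helper: map an event name from the event reference to its migration phase.
# Usage: clm_event_phase PreDumpFinished
clm_event_phase() {
  case "$1" in
    PreDumpSkipped|PreDumpFinished|PreDumpFailed) echo "Pre-dump" ;;
    PodCreateFinished|PodCreateFailed)            echo "Restore" ;;
    MigrationFinished)                            echo "Complete" ;;
    MigrationFailed)                              echo "Failed" ;;
    MigrationReconfigured)                        echo "Any" ;;
    # Anything not listed in the event reference is reported as unknown.
    *)                                            echo "Unknown"; return 1 ;;
  esac
}
```

The checkpoint phase has no entry because no event is emitted for it.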

Pod naming

After migration, most pods receive modified names that mark them as restored clones:

Workload type | Naming pattern | Example
Deployments | <original-name>-clone-<N> | my-app-7d4b8c9-abc12 → my-app-7d4b8c9-abc12-clone-1
StatefulSets | Original name preserved | my-db-0 → my-db-0
Other workloads | <original-name>-clone-<N> | Follows the Deployment pattern

StatefulSets retain their original pod names because their identity (and associated PVCs) must remain stable.
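
When correlating logs or metrics across a migration, it can be useful to recover the original pod name from a clone name. The sketch below strips a trailing -clone-<N> suffix. It assumes <N> is a decimal number, as in the Deployments example, and the function name clm_original_pod_name is hypothetical.

```shell
# Hypothetical helper: strip a trailing "-clone-<N>" suffix to recover the original pod name.
# Names without the suffix (for example StatefulSet pods) are returned unchanged.
clm_original_pod_name() {
  printf '%s\n' "$1" | sed -E 's/-clone-[0-9]+$//'
}
```

A pod whose original name already happens to end in -clone-<N> would be shortened too, so treat the result as a hint rather than proof of origin.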

Quick reference

This section provides commonly used commands for working with Container Live Migration. Each command helps you verify configuration, check eligibility, or monitor migration activity.

Check if a node supports live migration

To verify which nodes in your cluster are configured for Container Live Migration, query for the migration-enabled label. Only nodes provisioned from CLM-enabled node templates will have this label applied.

kubectl get nodes -l live.cast.ai/migration-enabled=true

If no nodes are returned, either CLM is not enabled in your node templates, or you need to trigger a rebalancing operation to replace existing nodes with CLM-enabled ones.

Check if a pod is eligible for live migration

The live controller automatically labels pods that meet all eligibility requirements. Use this command to see which pods in your cluster are currently eligible for live migration.

kubectl get pods -A -l live.cast.ai/migration-enabled=true

Pods that do not appear in this list either fail to meet the technical requirements or are running on nodes where the live controller has not yet completed its assessment.
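
To list the pods that are missing from that result, the label selector can be inverted. The sketch below also shows a query for pods that a user opted out explicitly; both helper names are hypothetical.

```shell
# Hypothetical helper: list pods that do not carry live.cast.ai/migration-enabled=true.
# The != selector also matches pods that have no such label at all.
clm_ineligible_pods() {
  kubectl get pods -A -o wide -l 'live.cast.ai/migration-enabled!=true'
}

# Hypothetical helper: list pods that carry the user-applied opt-out label.
clm_opted_out_pods() {
  kubectl get pods -A -l 'autoscaling.cast.ai/live-migration-disabled=true'
}
```

The -o wide output includes the node name, which helps tell apart pods on nodes without live migration support from pods that fail the eligibility requirements.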

Opt a workload out of live migration

If you need to exclude a specific workload from live migration while still allowing traditional eviction, add the opt-out label to your pod specification. This is useful for workloads that have known compatibility issues or require additional testing before enabling migration.

metadata:
  labels:
    autoscaling.cast.ai/live-migration-disabled: "true"

Monitor active migrations

During a rebalancing operation or when Evictor triggers migrations, you can watch migration resources in real time to observe progress and catch any failures as they occur.

kubectl get migrations -A -w

The -w flag keeps the command running and displays updates as migration status changes. Look for MigrationFinished events to confirm successful migrations, or MigrationFailed events that may require investigation.

See also