🚧
Limited Availability Feature
This feature is currently available through feature flags. Contact us to enable access for your organization.

General

The Anomaly Rules Engine is a tool that Cast AI provides as part of our Kubernetes runtime security feature set. It lets you define custom rules that detect and classify events as anomalies based on specific criteria. This enables proactive monitoring and alerting for potential security threats or unusual behavior within your Kubernetes cluster.

Rule types

The Anomaly Rules Engine supports two types of rules:

Built-In Rules: These rules are pre-defined by Cast AI and cover common security scenarios. They are readily available and can be quickly enabled or disabled as needed.
User-Defined CEL Rules: You define these rules yourself using the Common Expression Language (CEL). CEL provides a flexible and expressive way to define complex matching criteria based on event properties and resource attributes.

Each rule consists of two main components:

Resource Selectors: Used to filter and select the relevant resources to which the rule should be applied.
Event Matching: Defines the conditions and criteria for identifying anomalous events.

📘
Note
Only the resource selectors can be modified for Built-In rules, while the event-matching logic is pre-defined by Cast AI. For User-Defined CEL rules, you have full control over both the resource selectors and event-matching logic.

Resource selectors

Resource selectors allow you to filter events based on specific resource attributes before applying the rule. This helps narrow down the scope of events to be analyzed.

Resource selectors are defined using CEL expressions. The CEL program has access to two variables:

cluster of type types.Cluster: Represents the Kubernetes cluster.
resource of type types.KubernetesObject: Represents the Kubernetes resource associated with the event.

These variables expose various properties that can be used in the CEL expressions to select the desired resources. Refer to the corresponding sections below for more details on the available properties.

Resource selectors examples

Include only some clusters and namespaces:

cluster.name in ["dev", "testing"] && resource.namespace == "apps"

Exclude pods with a given name prefix in a namespace:

!(resource.namespace == "castai-agent" && resource.pod.startsWith("castai-imgscan"))
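To make the selector semantics concrete, the first expression above can be mirrored in plain Python. This is only an illustrative model, not how the engine evaluates CEL: the dictionaries and the `selects` helper are hypothetical stand-ins for the `cluster` and `resource` variables.

```python
# Hypothetical stand-in for the CEL selector:
#   cluster.name in ["dev", "testing"] && resource.namespace == "apps"
def selects(cluster: dict, resource: dict) -> bool:
    return cluster["name"] in ["dev", "testing"] and resource["namespace"] == "apps"


# Made-up events carrying only the fields the selector reads.
events = [
    {"cluster": {"name": "dev"}, "resource": {"namespace": "apps", "pod": "api-0"}},
    {"cluster": {"name": "prod"}, "resource": {"namespace": "apps", "pod": "api-1"}},
    {"cluster": {"name": "testing"}, "resource": {"namespace": "kube-system", "pod": "dns-0"}},
]

# Only events whose resource passes the selector reach the event-matching stage.
selected = [e["resource"]["pod"] for e in events if selects(e["cluster"], e["resource"])]
print(selected)
```

Both conditions must hold, so an event from a listed cluster but a different namespace is filtered out, as is an event from the right namespace in an unlisted cluster.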

User-defined CEL rules

User-defined CEL rules provide the flexibility to define custom anomaly detection logic based on your specific requirements. You can write CEL expressions to match events and trigger anomalies based on various conditions.

CEL rules Examples

Detect OOM:

event.type == event_process_oom_killed

Detect a new executable binary being written in a container by a process whose name starts with nginx:

event.type == event_magic_write && event.process.name.startsWith("nginx")

Detect TCP connections to public IPs on non-standard ports:

event.type == event_tcp_connect &&
event.tcp.destination.ip.public() && 
!(event.tcp.destination.port in [80, 443])
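The three conditions of this rule can be modeled in Python to see which connections would be flagged. This is a sketch under assumptions: the flat `event` dictionary is hypothetical, and `public()` is approximated here by Python's "globally routable" check, which may not match the engine's exact classification.

```python
import ipaddress

STANDARD_PORTS = [80, 443]


def is_public(ip: str) -> bool:
    # Assumption: "public" is approximated as "globally routable".
    return ipaddress.ip_address(ip).is_global


def matches(event: dict) -> bool:
    # Mirrors: event.type == event_tcp_connect
    #          && event.tcp.destination.ip.public()
    #          && !(event.tcp.destination.port in [80, 443])
    return (
        event["type"] == "event_tcp_connect"
        and is_public(event["ip"])
        and event["port"] not in STANDARD_PORTS
    )


print(matches({"type": "event_tcp_connect", "ip": "8.8.8.8", "port": 8080}))
print(matches({"type": "event_tcp_connect", "ip": "10.0.0.5", "port": 8080}))
print(matches({"type": "event_tcp_connect", "ip": "8.8.8.8", "port": 443}))
```

Only the first connection satisfies all three conditions. The second targets a private address and the third uses a standard port.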

Detect exec arguments that contain known suspicious strings:

cel.bind(bad_args, ["tigervnc", "novnc", "--vnc", "rfbport"],
  event.type == event_exec &&
  bad_args.exists(bad_arg,
    event.exec.args.exists(arg, arg.lowerAscii().contains(bad_arg))
  )
)
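CEL offers two related macros: `exists` is true when at least one element satisfies the predicate, while `exists_one` requires exactly one. The difference matters for argument matching, because a single command line can contain several suspicious arguments. The Python sketch below mirrors both macros so the behavior can be compared. The helper functions and the sample argument list are illustrative assumptions, not part of the engine.

```python
def exists(items, pred):
    # CEL `exists`: at least one element satisfies the predicate.
    return any(pred(x) for x in items)


def exists_one(items, pred):
    # CEL `exists_one`: exactly one element satisfies the predicate.
    return sum(1 for x in items if pred(x)) == 1


bad_args = ["tigervnc", "novnc", "--vnc", "rfbport"]
# Hypothetical command line that contains two different suspicious arguments.
args = ["websockify", "--VNC", "localhost:5900", "-rfbport", "5900"]


def contains_bad(bad):
    # Mirrors arg.lowerAscii().contains(bad_arg) for ASCII input.
    return lambda arg: bad in arg.lower()


any_match = exists(bad_args, lambda bad: exists(args, contains_bad(bad)))
exactly_one = exists_one(bad_args, lambda bad: exists_one(args, contains_bad(bad)))
print(any_match, exactly_one)
```

With two suspicious arguments present, the `exists` variant matches while the `exists_one` variant does not, so choose the macro according to whether multiple hits should still count as a match.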

Allow dropped binary by hash:

event.type == event_exec && event.exec.is_upper_layer && 
!(hex(event.exec.sha256) in [
  "9f64a747e1b97f131fabb6b447296c9b6f0201e79fb3c5356e6c77e89b6a806a",
])
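Because `hex` produces a lowercase string, the hashes in the allowlist need to be lowercase as well, or the `in` check will never match. A small Python sketch of that comparison follows. The file contents are a made-up placeholder, and Python's `bytes.hex()` stands in for the engine's `hex` helper.

```python
import hashlib

binary = b"abc"                              # hypothetical file contents
digest = hashlib.sha256(binary).digest()     # raw 32 bytes, like event.exec.sha256

allowed_lower = [digest.hex()]               # lowercase, as a hex() helper produces
allowed_upper = [digest.hex().upper()]       # same hash written in uppercase

print(digest.hex() in allowed_lower)         # True
print(digest.hex() in allowed_upper)         # False: string comparison is case-sensitive
```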

Allow dropped binary by hash from a custom list:

event.type == event_exec && event.exec.is_upper_layer && 
  !customLists("allowed-binaries").contains(match_sha256, event.exec.sha256)

NOTE: The allowed-binaries list must be created beforehand. See the Custom Lists section for more details.

Detect SSH:

event.type == event_ssh

Exposed variables

In a User-defined CEL rule, the program has access to the event through the event variable of type types.Event. This variable represents the current event being evaluated and provides access to its properties.

Refer to the sections below for more information on the available exposed properties.

Helper Functions

The Anomaly Rules Engine provides several helper functions that can be used in CEL expressions to perform common operations:

Function	Description
IP(string) -> IP	Parses the given string into an IP address. If the string is not a valid IP address, it will fail with an error. Example: `IP("10.0.0.1") != IP("2345:425:2CA1:0000:0000:567:5673:23b5")`
CIDR(string) -> types.CIDR	Parses the given string formatted in prefix notation to a CIDR. Example: `CIDR("10.0.0.0/8") != CIDR("2001:1111:2222:3333::/64")`
hex(bytes) -> string	Encodes the given bytes as a hexadecimal string. The resulting string will be lowercase. Example: `hex(event.exec.sha256)`
fromHex(string) -> bytes	Inverse of `hex(bytes)` that takes a hex-encoded string and turns it into bytes. It will fail with an error if the given string is an invalid hex. Example: `fromHex('CAFE')`
UUID(string) -> bytes	Parses the given string as a UUID and returns the underlying bytes. Example: `event.cluster.id == UUID('ecb5cb1b-7e7f-4dad-b504-6d13b69ce62e')`
public(IP) -> bool	Checks if the given IP is a public IP. Example: `event.tcp.destination.ip.public()`
bitSet(int, int) -> bool	Probes if the bit specified by the second argument is set in the first argument.
bitMaskSet(int, int) -> bool	Probes if all bits from the second argument are set in the first argument.
customLists(string) -> customListMatcher	Loads the values from the specified custom lists into a matcher that can be used to probe whether values exist in those lists. For more details, see the Custom Lists section.
Standard library functions	https://github.com/google/cel-spec/blob/master/doc/langdef.md#list-of-standard-definitions
Other custom string functions	https://github.com/google/cel-go/blob/b66ac6c0896350d105d71bf1960eece62ebb0c3c/ext/strings.go#L41

These functions can be used in your CEL expressions to parse and manipulate data related to events and resources. They provide convenient ways to work with IP addresses, CIDR notations, hexadecimal representations, and UUIDs.
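The bit and hex helpers can be modeled in a few lines of Python to make their semantics easier to picture. This is a sketch, not the engine's implementation. In particular, numbering bits from 0 at the least significant bit is an assumption, since the table does not state how the second argument of `bitSet` is indexed.

```python
def bit_set(value: int, bit: int) -> bool:
    # Assumption: bits are numbered from 0 at the least significant bit.
    return (value >> bit) & 1 == 1


def bit_mask_set(value: int, mask: int) -> bool:
    # True only if every bit set in `mask` is also set in `value`.
    return value & mask == mask


# hex / fromHex round trip: decoding accepts upper case, encoding yields lower case.
raw = bytes.fromhex("CAFE")
encoded = raw.hex()

print(bit_set(0b0101, 0), bit_set(0b0101, 1))
print(bit_mask_set(0b0111, 0b0101), bit_mask_set(0b0100, 0b0101))
print(raw, encoded)
```

`bit_mask_set` is the stricter check: a value that carries only some of the mask's bits does not pass.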

Common Types

The Anomaly Rules Engine uses various types to represent events, resources, and their associated properties. These types expose properties that can be accessed and used in CEL expressions to define resource selectors and event-matching logic. Here are the commonly used types.

types.Event

The types.Event type represents an event observed in the Kubernetes cluster. It contains information about the event type, timestamp, associated cluster, resource, and other event-specific details.

Property	Type	Description
type	types.EventType	Type of the event. See the `types.EventType` table below for available types.
timestamp	Timestamp (RFC3339 string)	Timestamp when the event was observed.
cluster	types.Cluster	Cluster in which the event was observed.
resource	types.KubernetesObject	The Kubernetes resource for which the event was observed.
process	types.Process	Process details.
dns	types.DNSPayload	Additional information about the observed DNS request if the event is of type `event_dns`.
exec	types.ExecPayload	Additional information about the observed command execution if the event is of type `event_exec`.
file	types.FilePayload	Additional information about the file if the event is of type `event_magic_write`.
socks5	types.SOCKS5Payload	Additional information about the observed SOCKS5 actions if the event is of type `event_socks5_detected`.
stdio_via_socket	types.StdioViaSocketPayload	Additional information regarding the potential reverse shell if the event is of type `event_stdio_via_socket`.
tcp	types.TCPPayload	Additional information about the observed TCP actions if the event is any of the `event_tcp_*` types.
ssh	types.SSHPayload	Additional information about an observed SSH connection, if the event is of type `event_ssh`.
git_clone	types.GitClone	Additional information about an observed Git Clone event, if the event is of type `event_git_clone`.
payload_digest	uint64	The field is used to group related events together.

types.EventType

The types.EventType type represents the different types of events that can be observed by the Anomaly Rules Engine. Each event type corresponds to a specific action or occurrence in the Kubernetes cluster.

Name	Description
event_exec	Triggered when the execution of a binary is observed.
event_dns	Triggered when a DNS-related request is observed.
event_tcp_connect	Triggered when a new TCP connection is observed.
event_tcp_listen	Triggered when a new TCP socket starts listening.
event_process_oom_killed	Triggered when a process is observed to have been killed due to an out-of-memory (OOM) condition.
event_magic_write	Triggered when a write operation on an ELF binary is observed at runtime.
event_stdio_via_socket	Triggered when the binding of any standard input/output (STDIO) file descriptors to a network socket is observed, which might indicate a reverse shell.
event_tty_detected	Triggered when the allocation of a new pseudo-terminal (PTY) device is detected.
event_socks5_detected	Triggered when SOCKS5-related network traffic is observed.
event_ssh	Triggered when an SSH connection is observed.
event_git_clone	Triggered when a git clone is observed.

types.Cluster

The types.Cluster type represents a Kubernetes cluster and contains information about the cluster's identity and organization.

Property	Type	Description
id	UUID	Unique identifier of the cluster.
name	string	Name of the cluster.
organization_id	UUID	Identifier of the organization to which the cluster belongs.

types.KubernetesObject

The types.KubernetesObject type represents a Kubernetes resource associated with an event. It provides details about the container, pod, namespace, and workload related to the event.

Property	Type	Description
container	string	The container in which the event was observed.
container_id	string	ID of the container for which the event was observed.
namespace	string	The namespace in which the container is running.
pod	string	Pod for which the event was observed.
pod_annotations	map[string]string	Annotations set on the Pod that the event was observed for.
pod_labels	map[string]string	Labels set on the Pod that the event was observed for.
workload_id	UUID	ID of the workload related to the event.
workload_kind	string	Kind of workload the Pod belongs to (e.g., Deployment, StatefulSet).
workload_name	string	Name of the workload to which the Pod belongs.

types.Process

Property	Type	Description
name	string	Process name.
pid	int	Process ID as seen on the container.
host_pid	int	Process ID as seen on the host.
start_time	int	The time when the process was started in seconds.

types.DNSPayload

The types.DNSPayload type represents details about an observed DNS query. If the observed event was triggered by a DNS server, flow_direction will be set to flow_egress.

Property	Type	Description
question	string	Domain name to be resolved by the server.
answers	[]types.DNSAnswer	List of resolved answers for the given question.
flow_direction	types.FlowDirection	Direction of flow of the observed request.
network_details	types.NetworkDetails	Details about the observed DNS request.
remote	types.AddrPort	Address details about the remote end of the observed request.

types.DNSAnswer

The types.DNSAnswer type represents the answer received for a DNS query. It contains information about the type of the answer and the associated data.

Property	Type	Description
type	types.DNSAnswerType	Type of the DNS answer, indicating whether it is a public IP, private IP, or CNAME.
cname	string	Domain name returned if the DNS answer type is `dns_cname`.
ip	IP	IP address returned if the DNS answer type is either `dns_public_ip` or `dns_private_ip`.

types.DNSAnswerType

The types.DNSAnswerType type represents the different types of DNS answers that can be received.

Name	Description
dns_unknown	DNS type could not be determined.
dns_public_ip	DNS answer was classified as a public IP address.
dns_private_ip	DNS answer was classified as a private IP address.
dns_cname	DNS answer was classified as CNAME.

types.ExecPayload

The types.ExecPayload type contains additional information about an observed command execution event.

Property	Type	Description
args	[]string	List of arguments observed in the execute command.
file_details	types.FileDetails	Additional information about the executed file.
path	string	Path to the executed file.
sha256	types.SHA256Hash	SHA256 hash of the executed file.
is_upper_layer	bool	Execution from upperdir writable layer. Works only for overlays.
is_memfd	bool	Execution of a binary in `memfd`.
is_tmpfs	bool	Execution of a binary in `tmpfs`.
is_dropped_binary	bool	Execution of a binary that was observed to be dropped (this should also have triggered an `event_magic_write` event).

types.FilePayload

The types.FilePayload type represents additional information about a file related to the event.

Property	Type	Description
path	string	Path to the file related to the event.
sha256	types.SHA256Hash	SHA256 hash of the corresponding file (currently only supported by `event_magic_write` events).

types.TCPPayload

The types.TCPPayload type contains additional information about observed TCP actions.

Property	Type	Description
destination	types.AddrPort	Destination of the TCP-related packets.
network_details	types.NetworkDetails	Additional information about the destination based on IP set data.
ip_details	types.IPDetails	Additional information about the destination IP from third-party services (like AbuseIPDB).

types.StdioViaSocketPayload

The types.StdioViaSocketPayload type represents additional information about a potential reverse shell event.

Property	Type	Description
destination	types.AddrPort	The destination of the socket to which the standard input/output (STDIO) file descriptor is bound.
fd	uint32	The file descriptor bound to the socket (`0` = STDIN, `1` = STDOUT, `2` = STDERR).

types.SSHPayload

The types.SSHPayload type contains additional information about an observed SSH connection.

Property	Type	Description
flow_direction	types.FlowDirection	Observed SSH connection flow.
remote	types.AddrPort	Address details of the remote part of the connection.

types.AddrPort

The types.AddrPort type represents an IP address and port combination.

Property	Type	Description
ip	IP	IP address related to the observed event (depends on the event type).
port	uint16	Port number related to the observed event (depends on the event type).

types.NetworkDetails

The types.NetworkDetails type contains additional information about a network.

Property	Type	Description
category	types.Category	The category under which the network has been classified.

types.FlowDirection

The types.FlowDirection type represents the direction of network flow.

Name	Description
flow_unknown	Network flow direction is unknown.
flow_ingress	Network was classified as incoming.
flow_egress	Network was classified as outgoing.

types.Category

The types.Category type represents different categories to which events can be classified.

Name	Description
category_malware	Event was classified as being related to malware.
category_crypto	Event was classified as being related to cryptocurrency.

types.FileDetails

The types.FileDetails type contains additional details about a file.

Property	Type	Description
category	types.Category	Category to which the file has been classified.
malware_name	string	Name of the malware identified, if the file is related to malware.
malware_version	string	Version of the malware detected, if the file is related to malware.

types.IPDetails

The types.IPDetails type represents additional information about an IP address.

Property	Type	Description
abuse_confidence_score	int	A score from 0-100 indicating the confidence level of classifying the IP address as malicious.
country_code	string	Country code from which the IP address originates. In ISO 3166-1 alpha-2 format.
domain	string	Domain name related to the IP address.
hostnames	[]string	Host names associated with the IP address.
ip_address	string	IP address of the event as a string.
ip_version	int	Version of the IP address (`4`= IPv4, `6` = IPv6).
is_tor	bool	Flag indicating whether the IP address is related to the Tor network.
isp	string	Name of the Internet Service Provider (ISP) to which the IP belongs.

types.SOCKS5Payload

The types.SOCKS5Payload type contains additional information about observed SOCKS5 actions.

Property	Type	Description
destination	types.AddrPort	Destination details of the SOCKS5 communication. If the address type is `socks5_address_domain_name`, only the port field is populated.
flow_direction	types.FlowDirection	Direction of the observed SOCKS5 communication.
address_type	types.SOCKS5AddressType	Address type used in the SOCKS5 command. If the command or reply does not contain an address type, this field might be set to unknown.
command_or_reply	uint8	Command or reply identifier as specified by RFC1928.
destination_domain	string	Destination domain if the `address_type` is set to `socks5_address_domain_name`.
role	types.SOCKS5Role	Role of the observed process in the SOCKS5 communication.

types.SOCKS5AddressType

The types.SOCKS5AddressType type represents the different address types used in SOCKS5 commands or replies.

Name	Description
socks5_address_domain_name	A domain name was observed to be used.
socks5_address_ipv6	An IPv6 address was observed to be used.
socks5_address_unknown	The address type could not be determined.
socks5_address_ipv4	An IPv4 address was observed to be used.

types.SOCKS5Role

The types.SOCKS5Role type represents the role of the observed process in SOCKS5 communication.

Name	Description
socks5_role_unknown	Role could not be identified.
socks5_role_client	Event was triggered by a SOCKS5 client.
socks5_role_server	Event was triggered by a SOCKS5 server.

types.GitClone

Property	Type	Description
remote_type	types.GitCloneRemoteType	Type of the observed clone.
full_repo	string	Raw unparsed string of the cloned repo (e.g., `https://github.com/castai/kvisor.git`).
server	string	Detected server portion of the full repo string. (e.g., `github.com`).
repo	string	Detected repository portion of the full repo string (e.g.,`castai/kvisor`).
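The relationship between `full_repo`, `server`, and `repo` can be illustrated with a short Python sketch. It is not the parser the agent uses, and it only handles URL-style remotes such as the HTTPS example in the table. The `split_repo` function is a hypothetical helper.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_repo(full_repo: str) -> Tuple[str, str]:
    """Return (server, repo) for a URL-style remote like https://host/owner/name.git."""
    parsed = urlparse(full_repo)
    server = parsed.hostname or ""
    repo = parsed.path.lstrip("/")
    # Drop the conventional ".git" suffix so only "owner/name" remains.
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return server, repo


print(split_repo("https://github.com/castai/kvisor.git"))
```

For the example in the table, this yields `github.com` as the server and `castai/kvisor` as the repository.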

types.GitCloneRemoteType

Name	Description
git_remote_unknown	Unknown remote type.
git_remote_ssh	Git clone observed via SSH.
git_remote_git	Git clone observed via GIT protocol.
git_remote_http	Git clone observed via HTTP.
git_remote_https	Git clone observed via HTTPS.
git_remote_ftp	Git clone observed via FTP.
git_remote_ftps	Git clone observed via FTPS.
git_remote_local	Git clone observed from the local folder.

Custom Lists

It is often useful to have a list of values that can be probed quickly, e.g., to test whether a given binary hash is malicious. In the Cast AI security product, this is what Custom Lists are for.

Currently, Custom Lists can only be managed via the corresponding REST API endpoints. To create a custom list, use the /v1/security/runtime/list endpoint. It expects only the list name, which will also be used to reference it from within CEL rules. On that note, list names must be unique (the API will return an error if one tries to create a list with an existing name).

Once the list exists, you can add entries to it, which can later be probed from within a CEL rule. To add items, call the /v1/security/runtime/list/{id}/add endpoint. Currently, the following types of entries are supported:

Type	Description
LIST_ENTRY_KIND_SHA256	The value has to be a hex-encoded SHA256 hash.
LIST_ENTRY_KIND_IP	IPv4 or IPv6 address in string format (e.g., `1.2.3.4`, `fe80::42:bdff:feda:a32e`).
LIST_ENTRY_KIND_CIDR	IP address with a prefix length in CIDR notation (e.g., `1.0.0.0/8`, `fe80::42:bdff:feda:a32e/64`).
LIST_ENTRY_KIND_STRING	Simple string value.
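Before sending entries to the API, it can help to check locally that each value has the shape its kind requires. The sketch below does this in Python using only the formats listed in the table. The `is_valid_entry` helper is hypothetical, and the API's own validation may be stricter or more lenient, for example regarding uppercase hex digits.

```python
import ipaddress
import re


def is_valid_entry(kind: str, value: str) -> bool:
    if kind == "LIST_ENTRY_KIND_SHA256":
        # A SHA256 hash is 32 bytes, i.e. 64 hex characters.
        return re.fullmatch(r"[0-9a-fA-F]{64}", value) is not None
    if kind == "LIST_ENTRY_KIND_IP":
        try:
            ipaddress.ip_address(value)
            return True
        except ValueError:
            return False
    if kind == "LIST_ENTRY_KIND_CIDR":
        try:
            # strict=False accepts host bits, as in fe80::42:bdff:feda:a32e/64.
            ipaddress.ip_network(value, strict=False)
            return True
        except ValueError:
            return False
    if kind == "LIST_ENTRY_KIND_STRING":
        return True
    return False


print(is_valid_entry("LIST_ENTRY_KIND_IP", "fe80::42:bdff:feda:a32e"))
print(is_valid_entry("LIST_ENTRY_KIND_CIDR", "fe80::42:bdff:feda:a32e/64"))
print(is_valid_entry("LIST_ENTRY_KIND_SHA256", "not-a-hash"))
```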

Items from lists can be removed via the /v1/security/runtime/list/{id}/remove endpoint. All fields of the entry you want to delete must be specified and match.

To delete a whole custom list, use the /v1/security/runtime/list/delete endpoint. The ID used by the list-related endpoints can be obtained either by storing it when the list is created or by querying the /v1/security/runtime/list endpoint. NOTE: Lists that are referenced in CEL rules cannot be deleted.

Now that we have created a custom list, let's have a look at how to use it as part of a CEL rule:

event.type == event_exec &&
  customLists("known-malware", "likely-malware").contains(match_sha256, event.exec.sha256)

Custom lists are loaded with the customLists function, which accepts one or more list names and returns a custom list matcher that offers a contains method. The first argument to contains is the matcher, which specifies what type of value to match; the second is the value to look up. Here is a list of all currently available matchers:

Matcher	Argument Type	Description
match_sha256	bytes (SHA256 hash)	Checks if any of the specified lists contain a matching SHA256 hash.
match_ip	IP	Checks if any of the specified lists contain the given IP.
match_cidr	IP	Checks if any of the CIDRs from the specified lists contain the given IP.
match_string	string	Checks if any of the specified lists contain the given string.

You can feed any value into the contains function, as long as it matches the argument type required by the matcher.
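The difference between the matchers, in particular between `match_ip` and `match_cidr`, can be pictured with a small Python model. Everything here is a sketch under assumptions: the `CUSTOM_LISTS` contents and the `blocked-networks` and `blocked-ips` list names are invented for illustration, and the functions only imitate the described behavior of checking whether any of the named lists contains the value.

```python
import ipaddress

# Hypothetical list contents; entries are stored as strings.
CUSTOM_LISTS = {
    "known-malware": ["ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"],
    "blocked-networks": ["1.0.0.0/8", "fe80::42:bdff:feda:a32e/64"],
    "blocked-ips": ["1.2.3.4"],
}


def match_sha256(entries, digest: bytes) -> bool:
    # Compares raw bytes against hex-encoded list entries.
    return any(bytes.fromhex(e) == digest for e in entries)


def match_ip(entries, ip: str) -> bool:
    # Exact address equality.
    target = ipaddress.ip_address(ip)
    return any(ipaddress.ip_address(e) == target for e in entries)


def match_cidr(entries, ip: str) -> bool:
    # Network containment; strict=False tolerates host bits in the entry.
    target = ipaddress.ip_address(ip)
    return any(target in ipaddress.ip_network(e, strict=False) for e in entries)


def contains(list_names, matcher, value) -> bool:
    # True if any of the named lists yields a match.
    return any(matcher(CUSTOM_LISTS[name], value) for name in list_names)


digest = bytes.fromhex(CUSTOM_LISTS["known-malware"][0])
print(contains(["known-malware"], match_sha256, digest))
print(contains(["blocked-ips"], match_ip, "1.2.3.5"))
print(contains(["blocked-networks"], match_cidr, "1.2.3.5"))
```

The address `1.2.3.5` is not listed as an individual IP, so the exact matcher misses it, but it falls inside `1.0.0.0/8`, so the CIDR matcher finds it.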

Manage using Terraform

If you manage your infrastructure as code, you can enable runtime security and manage anomaly rules using our Terraform modules for GKE, EKS, and AKS. To do this:

Enable Runtime Security Agent
Manage Runtime Security Rules via Terraform. Rules can be defined as Terraform resources using the Cast AI provider:

resource "castai_security_runtime_rule" "example_rule_dns_to_crypto_mining" {
  name              = "Example rule: DNS to crypto mining"
  severity          = "SEVERITY_LOW"
  enabled           = false
  rule_text         = <<EOT
event.type == event_dns && event.dns.network_details.category == category_crypto
EOT
  resource_selector = <<EOT
resource.namespace == "default"
EOT
  labels = {
    environment = "dev"
    team        = "security"
  }
}

This example creates a rule that detects DNS requests to cryptocurrency-related domains, but only for resources in the default namespace.

For complete examples and helpful scripts to import existing Runtime Security Rules, refer to our provider's detailed Terraform examples.
