Set Up Notification Webhooks

How to set up notification webhooks

To send notifications from the CAST AI Console to an external system, select the organization for which you want to configure the webhook, then follow these steps:

  1. Click on the Notifications Icon -> View All

  2. Click on Webhooks

  3. Click on Add webhooks

  4. Create the Webhook

| Field | Description |
|---|---|
| Name | The name of the Webhook configuration |
| Callback Url | The callback URL to send the requests to |
| Severity Triggers | The severity levels that will trigger that notification |
| Template | The template of the request that will be sent to the callback URL |

The Request Template must be valid JSON. The next section explains how to customize the payload.

Request Template Configuration

The request sent to external systems is fully customizable, which makes it possible to integrate with almost any application. The Request Template is the payload sent in the webhook call. The following notification variables are available:

| Variable | Description | Usage |
|---|---|---|
| NotificationID | The unique UUID of the notification | {{ .NotificationID }} |
| OrganizationID | The organization that owns the notification | {{ .OrganizationID }} |
| Severity | The severity of the impact on the affected system | {{ .Severity }} |
| Name | Name of the notification | {{ .Name }} |
| Message | A high-level text summary of the event | {{ .Message}} |
| Details | Free-form details from the event; can be rendered as JSON | {{ toJSON .Details }} |
| Timestamp | When the notification was created by CAST AI | {{ toISO8601 .Timestamp }} |
| Cluster | Cluster information; may be empty if the notification isn't cluster-specific | {{ toJSON .Cluster }} |
| Cluster.ID | The unique identifier of the cluster on CAST AI | {{ .Cluster.ID }} |
| Cluster.Name | Name of the cluster on CAST AI | {{ .Cluster.Name }} |
| Cluster.ProviderType | Cloud provider of the cluster (eks, gke, aks, kops) | {{ .Cluster.ProviderType }} |
| Cluster.ProjectNamespaceID | Location where the cloud provider organizes the cluster's resources, e.g. GCP project ID or AWS account ID | {{ .Cluster.ProjectNamespaceID }} |

The variables use Go template syntax, and you can place them anywhere in your Request Template.
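Before saving a Request Template, it can help to substitute sample values for the variables and confirm that the result parses as JSON. The sketch below is a rough local approximation in Python, not CAST AI's renderer: `preview`, `lookup`, and `SAMPLE` are hypothetical names, the sample values are made up, and only the three usage forms from the table above are recognized.

```python
import json
import re
from datetime import datetime, timezone

# Made-up sample values standing in for a real notification (assumption).
SAMPLE = {
    "NotificationID": "00000000-0000-0000-0000-000000000001",
    "OrganizationID": "00000000-0000-0000-0000-000000000002",
    "Severity": "critical",
    "Name": "Sample notification",
    "Message": "Sample message",
    "Details": {"key": "value"},
    "Timestamp": datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
    "Cluster": {"ID": "c-1", "Name": "demo", "ProviderType": "eks",
                "ProjectNamespaceID": "123456789012"},
}

# Matches {{ .A.B }}, {{ toJSON .A }} and {{ toISO8601 .A }}.
PLACEHOLDER = re.compile(r"\{\{\s*(?:(toJSON|toISO8601)\s+)?\.([A-Za-z.]+)\s*\}\}")


def lookup(path, values):
    """Resolve a dotted path such as 'Cluster.Name' in nested dicts."""
    current = values
    for part in path.split("."):
        current = current[part]
    return current


def preview(template, values=SAMPLE):
    """Substitute sample values and return the parsed JSON payload."""
    def replace(match):
        func, path = match.group(1), match.group(2)
        value = lookup(path, values)
        if func == "toJSON":
            return json.dumps(value)
        if func == "toISO8601":
            return value.isoformat()
        return str(value)

    rendered = PLACEHOLDER.sub(replace, template)
    # json.loads raises ValueError if the rendered text is not valid JSON.
    return json.loads(rendered)


template = ('{"title": "{{ .Name }}", "cluster": {{ toJSON .Cluster }}, '
            '"at": "{{ toISO8601 .Timestamp }}"}')
payload = preview(template)
print(payload["title"], payload["cluster"]["Name"])
```

A template that fails this check (for example, because of a trailing comma) is likely to be rejected by the receiving system as well, so the check is a cheap first filter rather than a guarantee.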

Example of Request Template Slack

To send a notification to Slack, the request body is a simple JSON payload:

{
    "text": "CAST AI - {{ .Name }}",
    "blocks": [
     {
      "type": "section",
      "text": {
       "type": "mrkdwn",
       "text": "{{ .Cluster.Name }}\n{{ .Message}}"
      }
     }
    ]
}

Creating the Slack webhook URL is outside the scope of this how-to. You can find more information at https://api.slack.com/messaging/webhooks.
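To try out a Slack webhook URL before configuring it in CAST AI, you could post a payload of the same shape yourself. The following Python sketch only builds the request; the URL is a placeholder, `build_slack_request` is a hypothetical helper, and the two text lines are separated with a newline character as an assumption about the desired layout.

```python
import json
import urllib.request

# Placeholder URL (assumption): replace with your own Slack incoming-webhook URL.
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/T000/B000/XXXX"


def build_slack_request(cluster_name, name, message, url=SLACK_WEBHOOK_URL):
    """Build (but do not send) a POST shaped like the Slack template above."""
    body = {
        "text": f"CAST AI - {name}",
        "blocks": [
            {
                "type": "section",
                "text": {"type": "mrkdwn", "text": f"{cluster_name}\n{message}"},
            }
        ],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_slack_request("demo-cluster", "Sample notification", "Sample message")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```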

Example of Request Template PagerDuty

PagerDuty accepts alert events at the endpoint https://events.pagerduty.com/v2/enqueue. The request body is a simple JSON payload. Below is an example request template that uses the available variables:

{
    "payload": {
        "summary": "{{ .Message }}",
        "timestamp": "{{ toISO8601 .Timestamp }}",
        "severity": "critical",
        "source": "CAST AI",
        "component": "{{ .Cluster.Name}}-{{ .Cluster.ProviderType}}-{{ .Cluster.ProjectNamespaceID }}",
        "group": "{{ .Name }}",
        "class": "kubernetes",
        "custom_details": {
            "details": "{{ toJSON .Details }}"
        }
    },
    "routing_key": "--routing_key--",
    "dedup_key": "{{ .NotificationID }}",
    "event_action": "trigger",
    "client": "CAST AI",
    "client_url": "https://console.cast.ai/external-clusters/{{ .Cluster.ID}}?org={{ .OrganizationID }}"
}

Note that dedup_key is set to the NotificationID. This value is unique in CAST AI, which ensures that the same notification won't produce more than one alert.

Creating the routing_key is outside the scope of this how-to. You can find more information at https://developer.pagerduty.com/docs/ZG9jOjExMDI5NTgx-send-an-alert-event.

Anomaly Detection Webhook

You can configure a Webhook to receive notifications about newly detected Anomalies. To do so, select the category Security and the operation Anomalies. Provide a URL to your endpoint and configure the JSON request template as follows:

{
    "details": {{ toJSON .Details }},
    "cluster": {{ toJSON .Cluster }}
}

The structure of the .Details JSON object is as follows:

{
  "anomalyID": "<UUID of the detected anomaly>",
  "ruleMetadata": {
    "name": "<name of the rule>",
    "type": "<type of the rule>"
  },
  "events": [
    {
      "ts": "<event timestamp in RFC3339 format>",
      "organizationID": "<ID of the organization the event was recorded in>",
      "clusterID": "<ID of the cluster the event was recorded in>",
      "type": "<type of the event>",

      "process": "<name of the process the event was recorded for>",
      "namespace": "<kubernetes namespace the event was recorded in>",
      "workloadName": "<name of the workload the event was recorded in>",
      "podName": "<name of the pod the event was recorded in>",
      "containerName": "<name of the container the event was recorded in>",

      "dstIP": "<destination IP of packets (only filled for network related events)>",
      "dstPort": <destination port of packets (only filled for network related events)>,
      "dstIsPublic": <boolean flag to indicate if destination IP is a public address (only filled for network related events)>,

      "dnsQuestionDomain": "<question domain for a DNS request (only filled for DNS related events)>",
      "dnsAnswerIPPublic": [
        "<list of public IPs in DNS answer (only filled for DNS related events)>"
      ],
      "dnsAnswerIPPrivate": [
        "<list of private IPs in DNS answer (only filled for DNS related events)>"
      ],
      "dnsAnswerCname": [
        "<list of CNAMEs in DNS answer (only filled for DNS related events)>"
      ],

      "filePath": "<path to file related to event (only filled for exec/write related events)>",
      "args": [
        "<array of arguments used in exec (only filled in exec events)>"
      ],
      "ipDetails": {
        <optional object containing detailed information about the dstIP in the event (see ipDetails description below)>
      }
    },
    <up to 10 events related to the anomaly>
  ]
}
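On the receiving side, an endpoint needs to parse this body and pull out the fields it cares about. Below is a minimal Python sketch using only the standard library. It assumes the request template shown above (top-level `details` and `cluster` keys) and assumes the webhook is delivered as an HTTP POST; `summarize_anomaly`, `AnomalyHandler`, and `serve` are hypothetical names, and the sample values are invented.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def summarize_anomaly(body):
    """Extract a few documented fields from a rendered webhook body."""
    payload = json.loads(body)
    if not isinstance(payload, dict):
        raise ValueError("expected a JSON object")
    details = payload.get("details") or {}
    rule = details.get("ruleMetadata") or {}
    events = details.get("events") or []
    return {
        "anomaly_id": details.get("anomalyID"),
        "rule_name": rule.get("name"),
        "rule_type": rule.get("type"),
        "event_count": len(events),
        # Unique "namespace/pod" pairs touched by the anomaly.
        "pods": sorted({f'{e.get("namespace")}/{e.get("podName")}' for e in events}),
    }


class AnomalyHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            summary = summarize_anomaly(self.rfile.read(length))
        except ValueError:
            # Covers malformed JSON and unexpected top-level types.
            self.send_response(400)
            self.end_headers()
            return
        print(summary)
        self.send_response(204)
        self.end_headers()


def serve(port=8080):
    """Start the receiver; blocks until interrupted."""
    HTTPServer(("", port), AnomalyHandler).serve_forever()


# Invented sample body for a quick local check.
sample = json.dumps({
    "details": {
        "anomalyID": "11111111-1111-1111-1111-111111111111",
        "ruleMetadata": {"name": "DNS to crypto mining",
                         "type": "crypto_mining:dns_lookup"},
        "events": [{"type": "dns", "namespace": "default", "podName": "web-0"}],
    },
    "cluster": {},
})
summary = summarize_anomaly(sample)
print(summary)
```

Calling `serve()` starts the listener; the parsing logic is kept in a separate function so it can be tested without opening a socket.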

Rule Types

The following rule types are currently supported:

| Type | Name | Description |
|---|---|---|
| crypto_mining:binary_executed | Crypto mining command line arguments | Checks for EXEC events and tries to identify whether they are related to crypto miners. The check is based on matching the binary file name as well as the arguments. |
| crypto_mining:dns_lookup | DNS to crypto mining | Checks for DNS events that try to resolve well-known crypto-related domains. |
| crypto_mining:tcp_connect | TCP connection to crypto mining | Checks for TCP connections to crypto-related IPs. |
| network:tcp_public_non_standard_port | Suspicious Internet connection | Checks for TCP connections to public IPs on non-HTTP ports (neither 80 nor 443). |
| network:suspicious_destination_ip | Suspicious Destination IP | Checks for network-related events that have a suspicious IP as their destination. |
| suspicious_binary:nezha_server | Process related to Nezha server | Checks EXEC events for execution of the Nezha Monitoring Tool. |
| suspicious_binary:vnc_server | Process related to VNC server | Checks EXEC events for execution of VNC servers. |
| general:dropped_binary_executed | Dropped new binary (container drift) | Checks for MAGIC_WRITE events (fires if ELF headers are written to any filesystem). |
| general:oom_killed | Process OOM killed | Checks if a process was OOM killed. |
| ml:suspicious_container_stats | Suspicious container stats | Leverages Machine Learning to detect suspicious resource usage patterns of containers. |
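Each rule type listed above has the form `category:rule`, so a receiver can decide how to react based on the category alone. The Python sketch below illustrates the idea; the `ACTIONS` mapping and the action names (`page`, `ticket`, `log`) are purely hypothetical and not part of CAST AI.

```python
# Hypothetical routing table: which action to take per rule category.
ACTIONS = {
    "crypto_mining": "page",
    "suspicious_binary": "page",
    "network": "ticket",
}


def split_rule_type(rule_type):
    """Split 'category:rule' into its two parts."""
    category, _, rule = rule_type.partition(":")
    return category, rule


def action_for(rule_type, default="log"):
    """Pick an action from the rule category, falling back to a default."""
    category, _ = split_rule_type(rule_type)
    return ACTIONS.get(category, default)


for rule_type in ("crypto_mining:dns_lookup", "general:oom_killed"):
    print(rule_type, "->", action_for(rule_type))
```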

ipDetails

| Field | Description |
|---|---|
| ipAddress | IP Address that triggered this anomaly. |
| ipVersion | Version of the IP address (either 4 or 6). |
| countryCode | Country code of the IP address (e.g. US). |
| isp | Name of the Internet Service Provider that owns the IP address. |
| domain | Correlated domain name to the IP. |
| hostname | Additional hostnames for this IP. |
| isTor | Flag indicating if the given IP is a Tor node. |

Example:

{
  "ipAddress": "118.25.6.39",
  "ipVersion": 4,
  "countryCode": "CN",
  "isp": "Tencent Cloud Computing (Beijing) Co. Ltd",
  "domain": "tencent.com",
  "hostnames": [],
  "isTor": false
}
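Because `ipDetails` is optional, code that consumes events should tolerate its absence. The Python sketch below turns the object into a short label for logs or alerts; `describe_destination` is a hypothetical helper, and the sample event reuses the example values above.

```python
def describe_destination(event):
    """Return a short label for the event's destination IP, or None."""
    ip = event.get("ipDetails") or {}
    if not ip.get("ipAddress"):
        return None  # ipDetails missing or empty
    parts = [ip["ipAddress"]]
    if ip.get("domain"):
        parts.append(ip["domain"])
    if ip.get("countryCode"):
        parts.append(ip["countryCode"])
    if ip.get("isTor"):
        parts.append("tor")
    return " / ".join(parts)


event = {
    "type": "tcp_connect",
    "ipDetails": {
        "ipAddress": "118.25.6.39",
        "ipVersion": 4,
        "countryCode": "CN",
        "isp": "Tencent Cloud Computing (Beijing) Co. Ltd",
        "domain": "tencent.com",
        "hostnames": [],
        "isTor": False,
    },
}
label = describe_destination(event)
print(label)
```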

Event Types

| Type | Description |
|---|---|
| exec | Triggered by any executed process in a pod. |
| dns | Triggered by any DNS lookup in a pod. |
| file_change | Triggered by any write to a file in a pod. |
| tcp_connect | Triggered by any TCP connection. |
| tcp_listen | Triggered by any TCP socket listening in a pod. |
| tcp_connect_error | Triggered by any connection error when trying to open a TCP connection. |
| process_oom_killed | Triggered by any process that got OOM killed. |
| magic_write | Triggered by any write event that writes an ELF binary header. |
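Since different event types fill different fields, a consumer will usually branch on `type` when formatting an event. The following Python sketch shows one way to do this. The mapping from event types to field groups is an assumption inferred from the "only filled for ..." notes in the `.Details` structure, and `describe_event` is a hypothetical helper.

```python
def describe_event(event):
    """Format one anomaly event as a single log line, based on its type."""
    kind = event.get("type", "unknown")
    where = f'{event.get("namespace", "?")}/{event.get("podName", "?")}'
    if kind == "dns":
        detail = event.get("dnsQuestionDomain", "")
    elif kind in ("tcp_connect", "tcp_connect_error"):
        # Assumed to be the "network related" events carrying dstIP/dstPort.
        detail = f'{event.get("dstIP", "")}:{event.get("dstPort", "")}'
    elif kind in ("exec", "file_change", "magic_write"):
        # Assumed to be the "exec/write related" events carrying filePath/args.
        args = " ".join(event.get("args") or [])
        detail = f'{event.get("filePath", "")} {args}'.strip()
    else:
        # tcp_listen, process_oom_killed, and anything unrecognized.
        detail = event.get("process", "")
    return f"[{kind}] {where}: {detail}"


events = [
    {"type": "dns", "namespace": "default", "podName": "web-0",
     "dnsQuestionDomain": "example.com"},
    {"type": "tcp_connect", "namespace": "default", "podName": "web-0",
     "dstIP": "118.25.6.39", "dstPort": 3333},
]
lines = [describe_event(e) for e in events]
print("\n".join(lines))
```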