Alerting in Astronomer Software
You can use two built-in alerting solutions for monitoring the health of Astronomer:
- Deployment-level alerts, which notify you when the health of an Airflow Deployment is low or if any of Airflow’s underlying components are underperforming, including the Airflow scheduler.
- Platform-level alerts, which notify you when a component of your Software installation is unhealthy, such as Elasticsearch, Astronomer’s Houston API, or your Docker Registry.
These alerts fire based on metrics collected by Prometheus. If the conditions of an alert are met, Prometheus Alertmanager handles the process of sending the alert to the appropriate communication channel.
Astronomer offers built-in Deployment and platform alerts, as well as the ability to create custom alerts in Helm using the PromQL query language. This guide explains how to configure Prometheus Alertmanager, subscribe to built-in alerts, and create custom alerts.
In addition to configuring platform and Deployment-level alerts, you can also set email alerts that trigger on DAG and task-based events. For more information on configuring Airflow alerts, read Airflow alerts.
Anatomy of an alert
Platform and Deployment alerts are defined in YAML and use PromQL queries for alerting conditions. Each alert YAML object contains the following key-value pairs:
- expr: The logic that determines when the alert will fire, written in PromQL.
- for: The length of time that the expr logic has to be true for the alert to fire. This can be defined in minutes or hours (e.g. 5m or 2h).
- labels.tier: The level of your platform that the alert should operate at. Deployment alerts have a tier of airflow, while platform alerts have a tier of platform.
- labels.severity: The severity of the alert. Can be info, warning, high, or critical.
- annotations.summary: The text for the alert that’s sent by Alertmanager.
- annotations.description: A human-readable description of what the alert does.
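To make the fields above concrete, here is a minimal sketch of a single alert object. It is illustrative only: the alert name, the metric example_task_failures_total, and the threshold are hypothetical and are not one of Astronomer's built-in alerts. The alert key is the standard Prometheus field that names a rule.

```yaml
# Illustrative alert object; names and metric are hypothetical.
- alert: ExampleDeploymentAlert
  # PromQL condition: more than 5 failures counted over the last 10 minutes.
  expr: sum(increase(example_task_failures_total[10m])) > 5
  # The condition must hold for 5 minutes before the alert fires.
  for: 5m
  labels:
    tier: airflow        # Deployment-level alert
    severity: warning    # one of info, warning, high, critical
  annotations:
    summary: "Example Deployment alert"
    description: "Fires when the hypothetical failure counter rises by more than 5 in 10 minutes."
```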
By default, Astronomer checks for all alerts defined in the Prometheus configmap.
Subscribe to alerts
Astronomer uses Prometheus Alertmanager to manage alerts. This includes silencing, inhibiting, aggregating, and sending out notifications using methods such as email, on-call notification systems, and chat platforms.
You can configure Alertmanager to send built-in Astronomer alerts to email, HipChat, PagerDuty, Pushover, Slack, OpsGenie, and more by defining alert receivers in the Alertmanager Helm chart and modifying the Alertmanager email-config parameter.
Create alert receivers
Alertmanager uses receivers to integrate with different messaging platforms. To begin sending notifications for alerts, you first need to define receivers in YAML using the Alertmanager Helm chart.
This Helm chart contains groups for each possible alert type based on labels.tier and labels.severity. Each receiver must be defined within at least one alert type in order to receive notifications.
For example, adding a Slack receiver to receivers.platformCritical causes platform alerts with critical severity to appear in the specified Slack channel.
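A Slack receiver of this kind might look like the following sketch. It assumes the receiver groups live under a top-level alertmanager key in values.yaml. The webhook URL and channel name are placeholders. The slack_configs fields (api_url, channel, title, text) are standard Alertmanager settings.

```yaml
# Sketch only: assumes receivers are nested under alertmanager in values.yaml.
alertmanager:
  receivers:
    platformCritical:
      slack_configs:
        # Placeholder webhook URL; replace with your own Slack incoming webhook.
        - api_url: https://hooks.slack.com/services/<your-webhook-path>
          channel: "#<your-slack-channel-name>"
          # Use the alert's summary annotation as the message title.
          title: "{{ .CommonAnnotations.summary }}"
          # List the description annotation of every alert in the notification.
          text: |-
            {{ range .Alerts }}{{ .Annotations.description }}
            {{ end }}
```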
By default, the Alertmanager Helm chart includes alert objects for platform, critical platform, and Deployment alerts. To configure a receiver for a non-default alert type, such as Deployment alerts with high severity, add that receiver to the customRoutes list with the appropriate match_re and receiver configuration values.
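As a sketch, a route for Deployment alerts with high severity could look like the following. The receiver name deployment-high-receiver is hypothetical and must match a receiver that you define elsewhere in your Alertmanager configuration. The nesting under alertmanager is an assumption about the values.yaml layout.

```yaml
# Sketch only: the receiver name is hypothetical.
alertmanager:
  customRoutes:
    - receiver: deployment-high-receiver
      # Match alerts whose labels satisfy both patterns.
      match_re:
        tier: airflow
        severity: high
```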
Note that if you have a platform, platformCritical, or airflow receiver defined in the prior section, you do not need a customRoute to route to them. Alerts are routed to these receivers automatically based on the tier label.
For more information on building and configuring receivers, refer to Prometheus documentation.
Push alert receivers to Astronomer
To add a new receiver to Astronomer, add your receiver configuration to your values.yaml file and push the changes to your installation as described in Apply a config change. The receivers you add must be specified in the same order and format as they appear in the Alertmanager Helm chart. Once you push the receivers to Astronomer, they are automatically added to the Alertmanager ConfigMap.
Create custom alerts
In addition to subscribing to Astronomer’s built-in alerts, you can also create custom alerts and push them to Astronomer.
Platform and Deployment alerts are defined in YAML and pushed to Astronomer with the Prometheus Helm chart. For example, you can define an alert that fires if more than 2 Airflow schedulers across the platform are not heartbeating for more than 5 minutes.
To push custom alerts to Astronomer, add them to the AdditionalAlerts section of your values.yaml file and push the file with Helm as described in Apply a config change.
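The scheduler heartbeat alert described above might be expressed as in the following sketch. Several details are assumptions and should be checked against your chart version: the nesting under prometheus, the lowercase additionalAlerts key, the block-string form of the platform group, and the metric name airflow_scheduler_heartbeat. The alert name is hypothetical.

```yaml
# Sketch only: key names and metric name are assumptions.
prometheus:
  additionalAlerts:
    # Extra rules for the platform alert group, given as a YAML block string.
    platform: |
      - alert: ExamplePlatformAlert
        # Count schedulers whose heartbeat rate over 1 minute is zero or less,
        # and fire when more than 2 are in that state.
        expr: count(rate(airflow_scheduler_heartbeat{}[1m]) <= 0) > 2
        for: 5m
        labels:
          tier: platform
          severity: critical
        annotations:
          summary: "{{ $value }} Airflow schedulers are not heartbeating"
          description: "Fires if more than 2 Airflow schedulers are not heartbeating for more than 5 minutes."
```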
After you’ve pushed the alert to Astronomer, make sure that you’ve configured a receiver to subscribe to the alert. For more information, read Subscribe to alerts.
Reference: Deployment alerts
The following table lists some of the most common Deployment alerts that you might receive from Astronomer.
For a complete list of built-in Airflow alerts, see the Prometheus configmap.