Configure metrics

Astro Private Cloud (APC) provides multiple options for collecting and exporting Airflow metrics, including StatsD, OpenTelemetry (OTEL), and Prometheus.

StatsD configuration (default)

StatsD resource requests and limits are managed at the API level through componentsConfig. The values apply to all components and can't be configured independently per component.

resources:
  requests:
    cpu: "100m"
    memory: "384Mi"
  limits:
    cpu: "100m"
    memory: "384Mi"

Airflow configuration

[metrics]
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
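To check that these settings point at a reachable StatsD listener, it can help to send a single test counter by hand. The following is a minimal sketch, not part of APC: it parses a copy of the [metrics] section with Python's standard configparser and builds a counter line in the plain StatsD wire format (name:value|c). The metric name connectivity_check and the helper names are hypothetical.

```python
import configparser
import socket

# Copy of the [metrics] section above; in practice this text would come
# from the deployment's airflow.cfg.
SAMPLE_CFG = """
[metrics]
statsd_on = True
statsd_host = localhost
statsd_port = 8125
statsd_prefix = airflow
"""


def load_statsd_settings(cfg_text):
    """Parse the [metrics] section into typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    section = parser["metrics"]
    return {
        "enabled": section.getboolean("statsd_on"),
        "host": section.get("statsd_host"),
        "port": section.getint("statsd_port"),
        "prefix": section.get("statsd_prefix"),
    }


def format_counter(prefix, name, value=1):
    """Build a StatsD counter line: <prefix>.<name>:<value>|c"""
    return f"{prefix}.{name}:{value}|c"


def send_test_counter(settings, name="connectivity_check"):
    """Send one counter datagram over UDP (hypothetical metric name)."""
    payload = format_counter(settings["prefix"], name).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (settings["host"], settings["port"]))
    return payload


settings = load_statsd_settings(SAMPLE_CFG)
print(settings)
print(format_counter(settings["prefix"], "connectivity_check"))
```

Calling send_test_counter(settings) from inside an Airflow component sends the datagram. UDP gives no delivery confirmation, so success has to be verified on the receiving side, for example by looking for the counter in Prometheus.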

Prometheus integration

prometheus:
  enabled: true
  retention: 15d
  persistence:
    enabled: true
    size: 100Gi
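Retention and volume size are best chosen together. The Prometheus documentation gives a rough capacity formula: needed disk space = retention time in seconds × ingested samples per second × bytes per sample, with roughly 1 to 2 bytes per sample on average. The sketch below applies that formula to the values above. The ingestion rate of 10,000 samples per second is purely an assumption for illustration; measure the real rate on your own installation.

```python
SECONDS_PER_DAY = 86_400


def parse_days(retention):
    """Parse a day-based retention string such as '15d'."""
    if not retention.endswith("d"):
        raise ValueError("this sketch only handles day-based retention")
    return int(retention[:-1])


def parse_gi(size):
    """Parse a Gi-suffixed size such as '100Gi' into bytes."""
    if not size.endswith("Gi"):
        raise ValueError("this sketch only handles Gi-suffixed sizes")
    return int(size[:-2]) * 2**30


def estimated_bytes(retention_days, samples_per_second, bytes_per_sample=2.0):
    """Rough disk estimate: seconds retained * samples/s * bytes/sample."""
    return retention_days * SECONDS_PER_DAY * samples_per_second * bytes_per_sample


# Assumed ingestion rate, not a measured or documented value.
ASSUMED_SAMPLES_PER_SECOND = 10_000

needed = estimated_bytes(parse_days("15d"), ASSUMED_SAMPLES_PER_SECOND)
capacity = parse_gi("100Gi")
print(f"estimated {needed / 2**30:.1f} GiB of {capacity / 2**30:.0f} GiB")
print("fits" if needed <= capacity else "does not fit")
```

The estimate ignores write-ahead log and compaction overhead, so leave headroom rather than sizing the volume to the exact result.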

Grafana dashboards

Access Grafana at:

https://grafana.<platform-domain>

Pre-built dashboards include:

  • Airflow Dag performance
  • Task execution metrics
  • Scheduler health
  • Worker utilization

Alerting

alertmanager:
  enabled: true
  config:
    route:
      receiver: 'platform'
    receivers:
      - name: 'platform'
        webhook_configs:
          - url: 'http://houston:8871/v1/alerts'
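A webhook receiver gets a JSON document from Alertmanager that groups one or more alerts. To illustrate what arrives at a webhook URL, the sketch below parses a hand-written sample in the standard Alertmanager webhook shape and lists the alerts that are currently firing. It says nothing about how Houston itself processes the request; the deployment label and the sample values are hypothetical.

```python
import json

# Hand-written sample in the Alertmanager webhook format (version "4").
# Label values are illustrative, not taken from a real installation.
SAMPLE_BODY = json.dumps({
    "version": "4",
    "status": "firing",
    "receiver": "platform",
    "groupLabels": {"alertname": "AirflowSchedulerUnhealthy"},
    "alerts": [
        {
            "status": "firing",
            "labels": {
                "alertname": "AirflowSchedulerUnhealthy",
                "deployment": "example-deployment",  # hypothetical label
            },
            "annotations": {"summary": "Scheduler heartbeat missing"},
            "startsAt": "2024-01-01T00:00:00Z",
        },
        {
            "status": "resolved",
            "labels": {"alertname": "AirflowDeploymentUnhealthy"},
            "annotations": {},
            "startsAt": "2024-01-01T00:00:00Z",
        },
    ],
})


def firing_alerts(raw_body):
    """Return (alertname, startsAt) for every alert still firing."""
    payload = json.loads(raw_body)
    return [
        (alert["labels"].get("alertname", "unknown"), alert.get("startsAt"))
        for alert in payload.get("alerts", [])
        if alert.get("status") == "firing"
    ]


for name, started in firing_alerts(SAMPLE_BODY):
    print(f"{name} firing since {started}")
```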

Built-in alerts

  • AirflowDeploymentUnhealthy
  • AirflowSchedulerUnhealthy
  • AirflowTasksPendingIncreasing

Key metrics

Metric                             Description
airflow_dagrun_duration_seconds    Dag run duration
airflow_ti_successes               Successful task instances
airflow_ti_failures                Failed task instances
airflow_scheduler_heartbeat        Scheduler health
airflow_executor_queued_tasks      Queued task count
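These metrics can also be read programmatically through the Prometheus HTTP API, which serves instant queries at /api/v1/query. The sketch below builds a query URL and parses a response in the documented instant-vector shape. The base URL prometheus.example:9090 is an assumption (this page does not state where Prometheus is exposed), and the sample response is hand-written rather than real output.

```python
import json
from urllib.parse import urlencode


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def extract_values(response_body):
    """Return (labels, value) pairs from an instant-vector response."""
    body = json.loads(response_body)
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    return [
        (series["metric"], float(series["value"][1]))
        for series in body["data"]["result"]
    ]


# Hand-written sample response; the label and value are illustrative.
SAMPLE_RESPONSE = json.dumps({
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"__name__": "airflow_executor_queued_tasks"},
                "value": [1704067200, "7"],
            }
        ],
    },
})

# Assumed base URL; replace with the address of your Prometheus service.
url = build_query_url("http://prometheus.example:9090", "airflow_executor_queued_tasks")
print(url)
print(extract_values(SAMPLE_RESPONSE))
```

In real use, the URL would be fetched with urllib.request.urlopen or an HTTP client of your choice, and the response body passed to extract_values.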

Best practices

  • Set appropriate retention based on storage capacity.
  • Use OTEL for multi-backend export.
  • Configure alerts for critical health metrics.
  • Monitor task queue depth for scaling needs.
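For the last point, one simple signal is whether queue depth has grown across several consecutive samples. The sketch below shows one possible heuristic over a list of airflow_executor_queued_tasks readings. The threshold of three consecutive increases is an arbitrary assumption, and this is not how the built-in AirflowTasksPendingIncreasing alert is defined.

```python
def sustained_growth(samples, min_run=3):
    """True if the last `min_run` sample-to-sample changes are all increases."""
    if len(samples) < min_run + 1:
        return False  # not enough data to judge
    tail = samples[-(min_run + 1):]
    return all(later > earlier for earlier, later in zip(tail, tail[1:]))


# Illustrative readings, oldest first.
print(sustained_growth([2, 2, 3, 5, 8]))
print(sustained_growth([5, 4, 6, 6, 7]))
```

A steadily growing queue suggests workers can't keep up with scheduled tasks, while a queue that merely fluctuates usually needs no action.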