Universal Metrics Exporter metrics reference

This document lists every metric that Astro exports through the Universal Metrics Exporter. Use this reference to identify which metrics are available, the Prometheus labels you can query against, and how each Astro metric name maps to its upstream Apache Airflow name.

Astro exports two categories of metrics:

  • Airflow application metrics describe the health, success, and performance of Dag execution. Astro normalizes these from the StatsD format that Airflow emits before exporting them to your Prometheus endpoint.
  • Infrastructure metrics describe the resource usage and lifecycle health of the Kubernetes Pods that run each Airflow component.

Astro exports only the metrics listed in this document. The mapping configuration drops any Airflow metric that doesn’t match a mapping rule.

How Astro normalizes Airflow metrics

Astro applies the following transformations to Airflow metrics before they reach your Prometheus endpoint. For the source-of-truth mapping rules, see the Astro StatsD mappings file.

  • StatsD names become Prometheus names. Astro replaces dots with underscores. For example, airflow.dag_processing.import_errors becomes airflow_dag_processing_import_errors.
  • Variable name parts become Prometheus labels. High-cardinality identifiers move out of the metric name and into labels so that one metric name covers many dimensions. For example, Astro exports the Airflow metric airflow.dag.<dag_id>.<task_id>.duration as airflow_task_duration with dag_id and task_id labels.
  • Legacy and current Airflow names both flow through. When Airflow renamed a metric between versions, Astro maps both the legacy name and the current name, so a Deployment exports whichever name its Astro Runtime version emits. For example, both zombies_killed (Airflow 2.x) and task_instances_without_heartbeats_killed (Airflow 3 and later) export when Airflow emits them.
  • Astro adds default metadata labels. Every exported metric carries the standard Astro labels documented in Export metrics, such as deploymentId, organizationId, and workspaceId.
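
The first two transformations above can be sketched in a few lines of Python. This is an illustrative sketch only, not Astro's mapping implementation: the names RULES, compile_rule, and normalize are hypothetical, the rule list is a small subset based on metrics named in this document, and the airflow. prefix on the pool rule is an assumption. The default metadata labels are left out.

```python
import re
from typing import Dict, List, Optional, Tuple

# Hypothetical rule table: (StatsD template, Prometheus name).
# The real rules live in the Astro StatsD mappings file.
RULES: List[Tuple[str, str]] = [
    ("airflow.dag.<dag_id>.<task_id>.duration", "airflow_task_duration"),
    ("airflow.dag_processing.import_errors", "airflow_dag_processing_import_errors"),
    # Assumption: pool metrics arrive with the same "airflow." prefix.
    ("airflow.pool.open_slots.<pool>", "airflow_pool_open_slots"),
]


def compile_rule(template: str) -> re.Pattern:
    """Turn a dotted template into a regex; <name> segments become named groups."""
    parts = []
    for segment in template.split("."):
        placeholder = re.fullmatch(r"<(\w+)>", segment)
        if placeholder:
            # A placeholder matches one dot-free name part and is captured as a label.
            parts.append(f"(?P<{placeholder.group(1)}>[^.]+)")
        else:
            parts.append(re.escape(segment))
    return re.compile(r"\.".join(parts))


COMPILED = [(compile_rule(template), name) for template, name in RULES]


def normalize(statsd_name: str) -> Optional[Tuple[str, Dict[str, str]]]:
    """Return (prometheus_name, labels), or None when no rule matches."""
    for pattern, prometheus_name in COMPILED:
        match = pattern.fullmatch(statsd_name)
        if match:
            return prometheus_name, match.groupdict()
    return None  # no rule matched, so the metric is dropped
```

With these rules, normalize("airflow.dag.etl.load.duration") returns ("airflow_task_duration", {"dag_id": "etl", "task_id": "load"}), and a name that matches no rule returns None, mirroring the drop behavior described earlier.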

Airflow application metrics

Apache Airflow classifies metrics into three types based on how the value behaves over time: counters, gauges, and timers. The following tables use Airflow’s classification. For background on each type, see the Apache Airflow metrics reference.

Counters

A counter records the cumulative count of events that occur over time, such as task failures or scheduler heartbeats.

| Name | Airflow name | Labels | Description |
| airflow_job_start | <job_name>_start | job_name | Started jobs, such as SchedulerJob or LocalTaskJob. |
| airflow_job_end | <job_name>_end | job_name | Completed jobs. |
| airflow_job_heartbeat_failure | <job_name>_heartbeat_failure | job_name | Heartbeat failures for a given job type. |
| airflow_operator_successes | operator_successes_<operator> | operator | Successful executions of a given operator type. |
| airflow_operator_failures | operator_failures_<operator> | operator | Failures of a given operator type. |
| airflow_scheduler_heartbeat | scheduler_heartbeat | type | Scheduler heartbeat occurrences. Astro sets the type label to counter so dashboards can distinguish counter-typed values from older gauge-typed emissions. |
| airflow_dag_processor_heartbeat | dag_processor_heartbeat | None | Standalone Dag processor heartbeat occurrences. Available on Airflow 3 and later. |
| airflow_triggerer_heartbeat | triggerer_heartbeat | None | Triggerer heartbeat occurrences. |
| airflow_ti_start | ti.start.<dag_id>.<task_id> | dag_id, task_id | Task instance initiations within a Dag. |
| airflow_ti_finish | ti.finish.<dag_id>.<task_id>.<state> | dag_id, task_id, state | Task instance completions, broken out by terminal state. |
| airflow_ti_failures | ti_failures | None | Total task instance failures across all Dags. |
| airflow_ti_successes | ti_successes | None | Total task instance successes across all Dags. |
| airflow_task_instance_created | task_instance_created_<task_type> | task_type | Task instances created, broken out by operator type. |
| airflow_scheduler_tasks_killed_externally | scheduler.tasks.killed_externally | None | Tasks terminated by external processes. |
| airflow_zombies_killed | zombies_killed | None | Zombie task instances terminated by the scheduler. Replaced by airflow_task_instances_without_heartbeats_killed on Airflow 3 and later. |
| airflow_task_instances_without_heartbeats_killed | task_instances_without_heartbeats_killed | None | Task instances terminated due to missing heartbeats. Replaces airflow_zombies_killed on Airflow 3 and later. |
| airflow_triggers_succeeded | triggers.succeeded | None | Triggers that successfully fired at least one event. |
| airflow_triggers_failed | triggers.failed | None | Triggers that failed before firing. |
| airflow_dataset_updates | dataset.updates | None | Dataset updates. Replaced by airflow_asset_updates on Airflow 3 and later. |
| airflow_dataset_triggered_dagruns | dataset.triggered_dagruns | None | Dag runs triggered by dataset updates. Replaced by airflow_asset_triggered_dagruns on Airflow 3 and later. |
| airflow_asset_updates | asset.updates | None | Asset modifications. Available on Airflow 3 and later; replaces airflow_dataset_updates. |
| airflow_asset_triggered_dagruns | asset.triggered_dagruns | None | Dag runs initiated by asset updates. Available on Airflow 3 and later; replaces airflow_dataset_triggered_dagruns. |
| airflow_ol_emit_failed | ol.emit.failed | None | Failed attempts to emit OpenLineage events. |
| airflow_astro_logging_write_failed | airflow.astro_logging.<provider>.write.failed | provider | Log-write failures from the astronomer-providers-logging package, broken out by provider. |
| astro_bundle_backend_refresh_success | Astronomer only | instance, mount_path | Successful refreshes of an Astro bundle backend mount. |
| astro_bundle_backend_refresh_failure | Astronomer only | instance, mount_path | Failed refreshes of an Astro bundle backend mount. |
| astro_bundle_backend_download_urls_success | Astronomer only | instance, mount_path | Successful download URL fetches by the Astro bundle backend. |
| astro_bundle_backend_download_urls_failure | Astronomer only | instance, mount_path | Failed download URL fetches by the Astro bundle backend. |
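
Because a counter is cumulative, the useful quantity is usually how fast it grows rather than its raw value. The following Python sketch shows one way to turn a series of (timestamp, value) samples into an increase and a per-second rate, treating any drop in value as a restart from zero. It is similar in spirit to the PromQL rate() function but does no extrapolation. The function names and the sample data are hypothetical.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (unix timestamp in seconds, counter value)


def counter_increase(samples: Sequence[Sample]) -> float:
    """Total growth across the samples, tolerating counter resets."""
    total = 0.0
    for (_, previous), (_, current) in zip(samples, samples[1:]):
        if current >= previous:
            total += current - previous
        else:
            # A drop means the counter restarted from zero,
            # so the whole current value counts as new growth.
            total += current
    return total


def counter_rate(samples: Sequence[Sample]) -> float:
    """Average per-second growth between the first and last sample."""
    if len(samples) < 2:
        return 0.0
    elapsed = samples[-1][0] - samples[0][0]
    return counter_increase(samples) / elapsed if elapsed > 0 else 0.0


# Hypothetical airflow_ti_failures samples taken 60 seconds apart,
# with a reset between the second and third sample.
samples = [(0.0, 10.0), (60.0, 13.0), (120.0, 2.0), (180.0, 4.0)]
print(counter_increase(samples))  # 7.0
```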

Gauges

A gauge measures a point-in-time value that can rise and fall, such as the number of running tasks or open executor slots.

| Name | Airflow name | Labels | Description |
| airflow_dagbag_size | dagbag_size | None | Number of Dags found during the last scheduler scan. |
| airflow_dag_processing_import_errors | dag_processing.import_errors | None | Number of errors encountered when parsing Dag files. |
| airflow_dag_processing_total_parse_time | dag_processing.total_parse_time | None | Total seconds spent scanning and importing Dag files in the most recent cycle. |
| airflow_dag_processing_last_run_seconds_ago | dag_processing.last_run.seconds_ago.<dag_file> | dag_file | Seconds elapsed since the named Dag file was last evaluated. |
| airflow_executor_open_slots | executor.open_slots | None | Available execution slots on the executor. |
| airflow_executor_queued_tasks | executor.queued_tasks | None | Tasks awaiting execution on the executor. |
| airflow_executor_running_tasks | executor.running_tasks | None | Tasks currently executing on the executor. |
| airflow_pool_open_slots | pool.open_slots.<pool> | pool | Open slots in a named pool. |
| airflow_pool_used_slots | pool.used_slots.<pool> | pool | Slots currently in use in a named pool. Available on Airflow 2.x. |
| airflow_pool_queued_slots | pool.queued_slots.<pool> | pool | Slots held by queued tasks in a named pool. |
| airflow_pool_running_slots | pool.running_slots.<pool> | pool | Slots held by running tasks in a named pool. |
| airflow_pool_deferred_slots | pool.deferred_slots.<pool> | pool | Slots held by deferred tasks in a named pool. |
| airflow_pool_scheduled_slots | pool.scheduled_slots.<pool> | pool | Slots held by scheduled tasks in a named pool. |
| airflow_pool_starving_tasks | pool.starving_tasks.<pool> | pool | Tasks in a named pool that can’t proceed because pool resources are exhausted. |
| airflow_scheduler_tasks_running | scheduler.tasks.running | None | Tasks currently running according to the scheduler. Available on Airflow 2.x. |
| airflow_scheduler_tasks_starving | scheduler.tasks.starving | None | Tasks the scheduler can’t run because pool resources are exhausted. |
| airflow_triggers_running | triggers.running | None | Triggers currently executing on a triggerer host. |
| airflow_dataset_orphaned | dataset.orphaned | None | Datasets no longer referenced by any Dag. Replaced by airflow_asset_orphaned on Airflow 3 and later. |
| airflow_asset_orphaned | asset.orphaned | None | Assets no longer referenced by any Dag schedule or task output. Available on Airflow 3 and later. |
| airflow_runner_resources | airflow.executor.runner_resources.<resource> | resource | Percentage of a resource in use on an executor runner. Values for the resource label include slots, cpu, and memory. |
| airflow_executor_task_resources | airflow.executor.task_resources.<resource_stat> | resource_stat | Task-level resource statistics. Values for the resource_stat label include memory_rss and cpu_times_system. |
| astro_bundle_backend_num_files | Astronomer only | instance, mount_path, le | Histogram of file counts in Astro bundle backend mounts. |
| astro_bundle_backend_tarball_size | Astronomer only | instance, mount_path, le | Histogram of tarball sizes downloaded by the Astro bundle backend. |

Timers

A timer measures the duration of an event, such as how long a task or Dag run takes to complete. Astro exports timer values in milliseconds.

| Name | Airflow name | Labels | Description |
| airflow_task_duration | dag.<dag_id>.<task_id>.duration | dag_id, task_id | Total duration of a task instance. |
| airflow_dagrun_duration | dagrun.duration.success.<dag_id> | dag_id | Duration of a successful Dag run. |
| airflow_dagrun_failed | dagrun.duration.failed.<dag_id> | dag_id | Duration of a failed Dag run. |
| airflow_dagrun_schedule_delay | dagrun.schedule_delay.<dag_id> | dag_id | Delay between the scheduled and actual start of a Dag run. |
| airflow_dagrun_first_task_scheduling_delay | dagrun.<dag_id>.first_task_scheduling_delay | dag_id | Delay between a Dag run’s start and the scheduling of its first task. |
| airflow_dagrun_dependency_check | dagrun.dependency-check, dagrun.dependency-check.<dag_id> | dag_id | Time required to evaluate Dag run dependencies. The dag_id label is present only when Airflow emits the Dag-scoped form. |
| airflow_dag_processing_last_duration | dag_processing.last_duration.<dag_file> | dag_file | Time required to parse the named Dag file in the most recent cycle. |
| airflow_dag_processing_last_runtime | dag_processing.last_runtime.<dag_file> | dag_file | Legacy name for airflow_dag_processing_last_duration. Astro retains this mapping so that the metric still exports from older Airflow versions. |
| airflow_collect_db_dags | collect_db_dags | None | Time spent fetching serialized Dags from the metadata database. |
| airflow_ol_emit_attempts | ol.emit.attempts | None | Time consumed by OpenLineage event emission attempts. |
| astro_bundle_backend_download_time | Astronomer only | instance, mount_path, le | Histogram of download durations for Astro bundle backend tarballs. |
| astro_bundle_backend_extract_time | Astronomer only | instance, mount_path, le | Histogram of extract durations for Astro bundle backend tarballs. |
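
The histogram metrics in this table carry an le label. Assuming that label follows the usual Prometheus histogram convention, where each bucket holds the cumulative count of observations at or below its upper bound, a percentile can be estimated by linear interpolation inside the bucket that contains the target rank. The Python sketch below illustrates the idea. The function name and bucket values are hypothetical, and treating the bounds as milliseconds is an assumption based on the timer unit stated above.

```python
import math
from typing import Sequence, Tuple

Bucket = Tuple[float, float]  # (le upper bound, cumulative observation count)


def estimate_quantile(q: float, buckets: Sequence[Bucket]) -> float:
    """Estimate the q-quantile (0 <= q <= 1) from cumulative histogram buckets."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    ordered = sorted(buckets)
    if not ordered or ordered[-1][1] == 0:
        return math.nan  # no observations to summarize
    rank = q * ordered[-1][1]
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in ordered:
        if count >= rank:
            if math.isinf(upper_bound) or count == lower_count:
                # Nothing to interpolate over: fall back to the last finite bound.
                return lower_bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# Hypothetical download-time buckets, with bounds assumed to be milliseconds.
buckets = [(100.0, 2.0), (500.0, 6.0), (1000.0, 10.0), (math.inf, 10.0)]
print(estimate_quantile(0.5, buckets))  # 400.0
```

The estimate is only as precise as the bucket boundaries, so a result of 400.0 here means "somewhere between 100 and 500", weighted by how many observations fell in that bucket.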

Astro event scheduler metrics

The Astro event scheduler emits metrics under the airflow.astro_event_scheduler.* namespace. Astro strips the airflow. prefix and exports each metric as astro_event_scheduler_<rest>. Because this is a catch-all mapping, the specific metric names emitted depend on the version of Astro Runtime running in your Deployment.
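
As an illustration of the catch-all behavior, the following Python sketch renames any metric in the namespace and ignores everything else. The function name and the example metric name are hypothetical, and converting the remaining dots to underscores is an assumption carried over from the general normalization rules.

```python
from typing import Optional

SOURCE_PREFIX = "airflow.astro_event_scheduler."
EXPORT_PREFIX = "astro_event_scheduler_"


def event_scheduler_export_name(statsd_name: str) -> Optional[str]:
    """Map an event scheduler StatsD name to its exported name, or None if out of scope."""
    if not statsd_name.startswith(SOURCE_PREFIX):
        return None  # not an event scheduler metric
    rest = statsd_name[len(SOURCE_PREFIX):]
    if not rest:
        return None  # the bare namespace carries no metric name
    # Assumption: dots in the remainder become underscores.
    return EXPORT_PREFIX + rest.replace(".", "_")


# "events.processed" is a made-up metric name used only for illustration.
print(event_scheduler_export_name("airflow.astro_event_scheduler.events.processed"))
# astro_event_scheduler_events_processed
```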

Infrastructure metrics

Infrastructure metrics describe the Kubernetes Pods that run each Airflow component. Use them to track CPU, memory, storage, and lifecycle health.

| Name | Description |
| container_cpu_usage_seconds_total | CPU usage of each container. |
| container_memory_working_set_bytes | Memory usage of each container. |
| kubelet_stats_ephemeral_storage_pod_usage | Ephemeral storage usage of each Pod. |
| kube_pod_status_* | Kubernetes Pod status. |
| kube_pod_labels | Kubernetes Pod labels. |
| kube_pod_container_resource_limits | CPU, memory, and storage limits for Celery workers, Kubernetes executors, KubernetesPodOperator Pods, the scheduler, and the Dag processor. |
| kube_pod_container_status_terminated_reason | Reason a Kubernetes container terminated. |
| kube_resourcequota | Resource quota usage and limits for namespaces in the cluster. |
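
One common use of these metrics is comparing usage against limits. The Python sketch below divides memory usage by the memory limit for each container, assuming you have already fetched container_memory_working_set_bytes and the memory series of kube_pod_container_resource_limits and keyed both by (pod, container). That keying, the function name, and the sample values are assumptions made for illustration; check which labels your endpoint actually exposes.

```python
from typing import Dict, Tuple

Key = Tuple[str, str]  # (pod name, container name), an assumed join key


def memory_utilization(
    usage_bytes: Dict[Key, float], limit_bytes: Dict[Key, float]
) -> Dict[Key, float]:
    """Return usage / limit per container, skipping containers without a limit."""
    ratios: Dict[Key, float] = {}
    for key, used in usage_bytes.items():
        limit = limit_bytes.get(key)
        if limit:  # skips both missing and zero limits
            ratios[key] = used / limit
    return ratios


# Hypothetical values: a scheduler using 512 MiB of a 1 GiB limit,
# and a worker that reports usage but has no limit series.
usage = {("scheduler-0", "scheduler"): 512 * 2**20, ("worker-0", "worker"): 256 * 2**20}
limits = {("scheduler-0", "scheduler"): 1024 * 2**20}
print(memory_utilization(usage, limits))  # {('scheduler-0', 'scheduler'): 0.5}
```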