Enable Sub-Second Pipelines

Labs

This feature is in Labs and is only available for Airflow 3.2+ Deployments.

Sub-Second Pipelines are an Astro feature that improves the latency and throughput of triggered Dag runs. They pair the Astro executor with a dedicated event-driven scheduler, so that Astro picks up and dispatches Dag runs in under a second and sustains a higher rate of concurrent triggers than the standard Airflow scheduler. Use them for any workload that fires many API-triggered runs in a short window, not only workloads where individual run latency matters.

This document explains how to enable Sub-Second Pipelines on a Deployment, configure a sub-second worker queue, and route Dags to it.

Sub-Second Pipelines support API-triggered Dag runs only. Runs triggered by the Airflow standard scheduler (cron schedules and timetables), asset and data-aware scheduling, and message queue triggers continue to use the standard scheduler and don’t benefit from sub-second dispatch. Astronomer plans to support additional trigger types in a future release.

When to use Sub-Second Pipelines

Sub-Second Pipelines are designed for API-triggered workloads where throughput, startup latency, or both are critical. Common use cases include:

  • High-throughput triggered workflows: Applications, agents, or upstream systems that fire hundreds of Dag runs per minute through the Airflow REST API. Sub-Second Pipelines sustain approximately 1,000 Dag runs per minute, whereas the Celery executor starts queuing runs and falls behind before reaching that rate.
  • On-demand inference: Machine learning pipelines triggered by an application or service through the Airflow REST API, where end-to-end latency directly affects the user experience.
  • Programmatic workflow invocation: Backend services that call the Airflow REST API to start a pipeline and need it to start immediately.
  • Reverse ETL and operational pipelines: API-driven workflows where freshness budgets are measured in seconds rather than minutes.

For scheduled batch workloads, such as cron schedules and timetables, or for asset and event-driven runs, the default scheduling behavior applies and you don’t need this feature.

How it works

Sub-Second Pipelines introduce an Event Scheduler that runs alongside the Airflow standard scheduler in your Deployment. When you trigger a Dag run through the Airflow REST API, the Event Scheduler picks up the request from an internal event bus and immediately spawns the Dag run, bypassing the polling interval that the standard scheduler relies on.

Because the Event Scheduler is event-driven rather than poll-driven, it scales linearly with trigger volume instead of being bottlenecked by a fixed scheduling loop. Combined with the Astro executor’s centralized task assignment, this is what enables approximately 1,000 Dag runs per minute. Sizing the Event Scheduler with additional replicas lets it sustain higher throughput under bursty load.

A Dag uses the faster path only when both of the following are true:

  • The run was triggered through the Airflow REST API or the Trigger Dag button in the Airflow UI.
  • The Dag’s tasks are routed to a worker queue that has the Sub-Second toggle enabled.

Runs triggered any other way, such as a cron schedule, timetable, asset update, or message queue, and Dags routed to non-sub-second queues, continue to use the standard scheduler. You can mix sub-second Dags and standard Dags in the same Deployment without affecting your existing workloads.
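The two conditions above can be read as a single predicate. The following is a minimal sketch for illustration only, not Astro's implementation: the function name, the trigger-source labels, and the queue names are all hypothetical.

```python
# Illustrative model of the routing rule described above; not Astro's code.
# The trigger-source labels and queue names are hypothetical.
FAST_TRIGGER_SOURCES = {"rest_api", "ui_trigger_button"}


def uses_sub_second_path(trigger_source, task_queues, sub_second_queues):
    """Return True only when both documented conditions hold."""
    triggered_ok = trigger_source in FAST_TRIGGER_SOURCES
    # Every task must be routed to a queue with the Sub-Second toggle enabled.
    queues_ok = bool(task_queues) and set(task_queues) <= set(sub_second_queues)
    return triggered_ok and queues_ok


sub_second = {"fast-high-priority"}
print(uses_sub_second_path("rest_api", ["fast-high-priority"], sub_second))  # True
print(uses_sub_second_path("cron", ["fast-high-priority"], sub_second))  # False
print(uses_sub_second_path("rest_api", ["fast-high-priority", "default"], sub_second))  # False
```

The third call returns False because one task is routed to a queue without the Sub-Second toggle, so the Dag as a whole stays on the standard scheduler.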

How to tell whether a Dag qualifies

Astro automatically tags each Dag based on whether it qualifies for sub-second scheduling, so that you can confirm a Dag’s status from its tags in the Airflow UI:

  • A qualifying Dag is tagged sub_second.
  • A Dag that doesn’t qualify is tagged sub_second_excluded, plus a second tag that names the reason, such as sub_second_excluded_mixed_queues.

Several conditions can disqualify a Dag, such as mixing sub-second and non-sub-second queues or setting depends_on_past=True. The Dag warning on the Dag’s page in the Airflow UI names the specific reason. A Dag that doesn’t qualify still runs; it uses the standard scheduler instead of the sub-second path.
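If you collect Dag tags programmatically, a small helper can turn them into a status and a list of reasons. This is a sketch that relies only on the tag names given in this section; the helper name and return shape are illustrative and not part of any Astro API.

```python
# Sketch: interpret a Dag's tags using the tag names given in this section.
# The helper name and return shape are illustrative, not part of any Astro API.
EXCLUDED_TAG = "sub_second_excluded"


def sub_second_status(tags):
    """Return (qualifies, reasons) for a list of Dag tag names."""
    tag_set = set(tags)
    if EXCLUDED_TAG in tag_set:
        prefix = EXCLUDED_TAG + "_"
        # Reason tags extend the excluded tag, e.g. sub_second_excluded_mixed_queues.
        reasons = sorted(t[len(prefix):] for t in tag_set if t.startswith(prefix))
        return False, reasons
    return "sub_second" in tag_set, []


print(sub_second_status(["sub_second"]))
# (True, [])
print(sub_second_status(["sub_second_excluded", "sub_second_excluded_mixed_queues"]))
# (False, ['mixed_queues'])
```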

Prerequisites

  • A Deployment running Astro Runtime 3.2 or later, which is based on Airflow 3.2 or later.
  • A Deployment that uses the Astro executor. Sub-Second Pipelines don’t support the Celery or Kubernetes executor.
  • Permission to edit Deployment settings.

Enable Sub-Second Pipelines on the Deployment

1. Open the Deployment

In the Astro UI, open your Deployment and click the Details tab.

2. Edit the execution settings

In the Execution section, click Edit.

3. Confirm the executor

Confirm that Executor is set to Astro Executor.

4. Enable the toggle

Set the Sub-Second Pipelines toggle to On.

[Image: Sub-Second Pipelines toggle in the Deployment Execution settings]

Enabling the toggle provisions the Event Scheduler component for your Deployment but doesn’t change the behavior of any Dags. Dags use the fast path only after you route them to a sub-second worker queue, which you create next.

Create a sub-second worker queue

Sub-second behavior is opt-in for each worker queue. Astronomer recommends creating a dedicated queue for your latency-sensitive Dags instead of enabling it on the default queue, so that you can size and scale the queue independently.

1. Add a worker queue

In the Execution section, scroll to Worker Queues and click Add Queue.

2. Configure the queue

Configure the following settings:

  • Queue Name: Enter a short, descriptive name, such as fast-high-priority. You reference this name from your Dag code.
  • Worker Type: Choose a worker size appropriate for your tasks. For low-latency workloads, a smaller worker type with more workers is often a better fit than a large worker type with few workers.
  • Storage: Keep the default of 10 GiB unless your tasks need more ephemeral storage.
  • Concurrency: Set the number of tasks that a single worker can run in parallel.
  • Min # Workers: Set this to at least 1. Sub-second startup depends on having a worker already warm and ready, so scale-to-zero defeats the purpose of the feature.
  • Max # Workers: Size this for your expected peak concurrency.

3. Enable the Sub-Second toggle

Set the Sub-Second toggle in the queue row to On.

[Image: Sub-Second toggle on a worker queue]

4. Save the configuration

Click Update Deployment.
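Because Concurrency is the number of tasks a single worker runs in parallel, the queue settings above bound how many tasks can run at once: Concurrency × Min # Workers on warm workers, and Concurrency × Max # Workers at full scale. The sketch below does this arithmetic; the function name and the example numbers are hypothetical.

```python
# Back-of-the-envelope sizing for a worker queue. The example values are
# hypothetical; substitute the settings you chose for your own queue.
def queue_capacity(concurrency, min_workers, max_workers):
    """Return warm and peak parallel task slots for a worker queue."""
    if max_workers < min_workers:
        raise ValueError("max_workers must be >= min_workers")
    return {
        # Slots available immediately, without waiting for a scale-up.
        "warm_slots": concurrency * min_workers,
        # Slots available once the queue has scaled to its maximum.
        "peak_slots": concurrency * max_workers,
    }


print(queue_capacity(concurrency=5, min_workers=2, max_workers=10))
# {'warm_slots': 10, 'peak_slots': 50}
```

If your expected burst exceeds the warm slot count, tasks beyond that number wait for new workers to start, which works against sub-second startup.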

Size the Event Scheduler

The Event Scheduler runs as one or more replicas in your Deployment. For production workloads, run at least two replicas so that the scheduler stays available during restarts and can absorb bursts.

1. Open the Advanced section

In the Deployment edit view, expand the Advanced section.

2. Set the replica count

Set Scheduler Replicas to 2, or higher if you expect heavy concurrent trigger volume.

[Image: Scheduler Replicas in the Advanced settings]

For high trigger rates, also turn on API Server Autoscaling in the same section so that the API server can keep up with incoming triggers. See API server autoscaling.

3. Save the configuration

Click Update Deployment.

Autoscaling for the Event Scheduler is on the near-term roadmap. Until it ships, you set the replica count manually. If you expect highly variable load, choose a replica count that covers your peak rather than your average.
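To size for peak rather than average, it helps to measure both from your own trigger history. The following sketch computes the average and peak triggers per minute from a list of trigger timestamps. The function name is hypothetical, and the sketch deliberately stops short of converting a rate into a replica count, because this document doesn't state a per-replica throughput.

```python
from collections import Counter


def trigger_rates_per_minute(trigger_times_seconds):
    """Return (average, peak) triggers per minute from epoch-second timestamps."""
    if not trigger_times_seconds:
        return 0.0, 0
    # Bucket each trigger into the calendar minute it falls in.
    per_minute = Counter(int(t // 60) for t in trigger_times_seconds)
    minutes_spanned = max(per_minute) - min(per_minute) + 1
    average = len(trigger_times_seconds) / minutes_spanned
    peak = max(per_minute.values())
    return average, peak


# Four triggers in the first minute, then two stragglers.
print(trigger_rates_per_minute([0, 1, 2, 3, 61, 185]))
# (1.5, 4)
```

In this example the peak minute carries more than twice the average rate, which is the gap that a replica count based on the average would miss.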

Route your Dag to the sub-second queue

A Dag runs on the sub-second path only if its tasks are assigned to the sub-second worker queue. Set the queue in default_args so that every task in the Dag inherits it. The queue value must exactly match the Queue Name you set in Create a sub-second worker queue. If a task references a queue name that doesn’t exist, for example because of a typo, the task might remain in a queued state and never execute.

The following example assigns every task in the Dag to the fast-high-priority queue:

from airflow.sdk import dag, task

@dag(default_args={"queue": "fast-high-priority"})
def payment_risk_check_on_demand():
    @task
    def score_transaction():
        ...

    score_transaction()

payment_risk_check_on_demand()

To route specific tasks to different sub-second queues, pass a queue argument to the @task decorator instead:

@task(queue="fast-high-priority")
def score_transaction():
    ...

Sub-Second Pipelines don’t support using sub-second and standard worker queues in the same Dag. If you assign tasks from the same Dag to both types of queues, the Dag doesn’t qualify for the sub-second path and the standard scheduler handles all of its runs.

Deploy the Dag to your Deployment as you normally would, using astro deploy, a Git-based deploy, or the Astro IDE.

Verify sub-second behavior

After you deploy, trigger a run of your Dag through the Airflow REST API or the Trigger Dag button in the Airflow UI:

curl -X POST \
  "https://<your-deployment-url>/api/v2/dags/payment_risk_check_on_demand/dagRuns" \
  -H "Authorization: Bearer $ASTRO_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{}'

Replace <your-deployment-url> with your Deployment’s Airflow API URL and $ASTRO_API_TOKEN with a valid Deployment API token.
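If a backend service triggers the Dag from Python rather than from a shell, the same request can be built with the standard library. This is a minimal sketch that mirrors the curl command above: the helper name, the placeholder URL, and the placeholder token are assumptions for illustration. The sketch builds the request without sending it.

```python
import json
import urllib.request


def build_trigger_request(deployment_url, dag_id, token):
    """Build (but do not send) the POST request that triggers a Dag run."""
    url = f"{deployment_url.rstrip('/')}/api/v2/dags/{dag_id}/dagRuns"
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # same empty JSON body as the curl example
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_trigger_request(
    "https://example.invalid",  # placeholder for <your-deployment-url>
    "payment_risk_check_on_demand",
    "placeholder-token",  # placeholder for a Deployment API token
)
print(request.get_method(), request.full_url)
# To send the request: urllib.request.urlopen(request)
```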

You can confirm that a run took the sub-second path in several ways. The Airflow UI is the quickest place to check, and the SLA Metrics API is available for programmatic access.

Check the Dag’s tags

In the Airflow UI, open the Dag and check its tags. A Dag on the sub-second path is tagged sub_second. If it’s tagged sub_second_excluded instead, the run uses the standard scheduler. See How to tell whether a Dag qualifies for the reason tags.

Check the Sub-Second Metrics tab in the Airflow UI

Each sub-second Dag has a Sub-Second Metrics tab in the Airflow UI that shows time to first task and per-task lag.

Open the tab from a Dag’s page for a windowed aggregate across recent runs, which is useful for spotting regressions. Select the time window with the Window menu.

[Image: Dag-level Sub-Second Metrics tab in the Airflow UI]

Open the tab from an individual Dag run to see that run’s time to first task, average task lag, and a per-task lag breakdown.

[Image: Per-run Sub-Second Metrics tab in the Airflow UI]

Check the Event Scheduler logs

In the Astro UI, open the Deployment’s Logs tab and set the Source filter to Event Scheduler. After you trigger a run, watch for log lines from the astronomer.event_scheduler component that confirm the run was received and scheduled, with timestamps milliseconds apart.

[Image: Event Scheduler logs in the Astro UI Logs tab]

A successful sub-second run produces a sequence similar to the following:

Event received from Redis ... event_type: schedule_dagruns
Discovered DagRuns ... count: 1
DagRun found via Redis event ... dag_id: payment_risk_check_on_demand
DagRun scheduler added ... active_count: 1
DagRun scheduler started
DagRun transitioned to running
DagRun completed ... state: success
DagRun scheduler stopped ... duration_seconds: ~2

The duration_seconds value on the final line shows the total run time.
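If you export these logs, you can check for the sequence automatically. The sketch below assumes only that each log line contains one of the message strings shown above; the real log format, including timestamps and field layout, isn't specified here, and the helper name is hypothetical.

```python
# Message strings copied from the example sequence above.
EXPECTED_MESSAGES = [
    "Event received from Redis",
    "Discovered DagRuns",
    "DagRun found via Redis event",
    "DagRun scheduler added",
    "DagRun scheduler started",
    "DagRun transitioned to running",
    "DagRun completed",
    "DagRun scheduler stopped",
]

SAMPLE_LINES = [
    "Event received from Redis ... event_type: schedule_dagruns",
    "Discovered DagRuns ... count: 1",
    "DagRun found via Redis event ... dag_id: payment_risk_check_on_demand",
    "DagRun scheduler added ... active_count: 1",
    "DagRun scheduler started",
    "DagRun transitioned to running",
    "DagRun completed ... state: success",
    "DagRun scheduler stopped ... duration_seconds: ~2",
]


def saw_full_sequence(log_lines, expected=EXPECTED_MESSAGES):
    """Return True if every expected message appears, in order, in log_lines."""
    remaining = iter(log_lines)
    # Each any() call resumes where the previous match stopped, so order matters.
    return all(any(message in line for line in remaining) for message in expected)


print(saw_full_sequence(SAMPLE_LINES))  # True
print(saw_full_sequence(SAMPLE_LINES[:-1]))  # False
```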

Query the SLA Metrics API

For programmatic access, such as monitoring dashboards or latency alerts, the Event Scheduler exposes two endpoints with the same metrics shown in the Sub-Second Metrics tab.

Use the per-run endpoint to retrieve metrics for a specific Dag run:

curl -sS -X GET --location \
  "https://<your-deployment-url>/astro-event-scheduler/sla_metrics/dag_runs/<dag_id>/<run_id>" \
  --header "Authorization: Bearer $ASTRO_API_TOKEN" \
  --header "Accept: application/json"

The endpoint returns a response similar to the following:

{
  "dag_id": "payment_risk_check_on_demand",
  "run_id": "manual__2026-05-22T17:36:50.487342+00:00",
  "time_to_first_task": 0.691809,
  "task_lag": [
    {
      "task_id": "load_payment_attempt",
      "map_index": -1,
      "task_start_lag": 0.691809
    },
    {
      "task_id": "lookup_customer_history",
      "map_index": -1,
      "task_start_lag": 0.748826
    },
    {
      "task_id": "score_transaction",
      "map_index": -1,
      "task_start_lag": 0.743206
    }
  ]
}

The per-run endpoint returns scalar measurements, in seconds, for a single Dag run: one time_to_first_task value for the run, and one task_lag entry for each task instance (except for the root task), including each mapped task instance. In each task_lag entry, map_index is the dynamic map index, which is -1 for non-mapped tasks, and task_start_lag is the per-task scheduling lag.
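A monitoring script might reduce the per-run response to a few headline numbers. The following sketch parses a trimmed copy of the example response above and reports the slowest-starting task. The function name, the output keys, and the one-second threshold are illustrative choices, not part of the API.

```python
import json

# Trimmed copy of the example response shown above.
SAMPLE_RESPONSE = """
{
  "dag_id": "payment_risk_check_on_demand",
  "time_to_first_task": 0.691809,
  "task_lag": [
    {"task_id": "load_payment_attempt", "map_index": -1, "task_start_lag": 0.691809},
    {"task_id": "lookup_customer_history", "map_index": -1, "task_start_lag": 0.748826},
    {"task_id": "score_transaction", "map_index": -1, "task_start_lag": 0.743206}
  ]
}
"""


def summarize_run(payload, threshold_seconds=1.0):
    """Summarize a per-run SLA metrics payload (all values are in seconds)."""
    lags = payload.get("task_lag", [])
    slowest = max(lags, key=lambda entry: entry["task_start_lag"], default=None)
    return {
        "time_to_first_task": payload["time_to_first_task"],
        "slowest_task": slowest["task_id"] if slowest else None,
        "max_task_start_lag": slowest["task_start_lag"] if slowest else None,
        "all_under_threshold": all(
            entry["task_start_lag"] < threshold_seconds for entry in lags
        ),
    }


summary = summarize_run(json.loads(SAMPLE_RESPONSE))
print(summary["slowest_task"], summary["max_task_start_lag"])
# lookup_customer_history 0.748826
```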

Use the aggregate endpoint to retrieve metrics across a time window:

curl -sS -X GET --location \
  "https://<your-deployment-url>/astro-event-scheduler/sla_metrics/aggregate?since=2026-05-21T00:00:00Z&dag_id=payment_risk_check_on_demand" \
  --header "Authorization: Bearer $ASTRO_API_TOKEN" \
  --header "Accept: application/json"

The endpoint returns a response similar to the following:

{
  "since": "2026-05-21T00:00:00Z",
  "until": null,
  "dag_id": "payment_risk_check_on_demand",
  "is_sub_second": true,
  "dag_runs_considered": 5,
  "time_to_first_task": {
    "count": 5,
    "min": 0.678138,
    "max": 0.714617,
    "mean": 0.688747,
    "p50": 0.680133,
    "p95": 0.710055,
    "p99": 0.713705
  },
  "task_lag": {
    "count": 15,
    "min": 0.688975,
    "max": 0.748826,
    "mean": 0.720769,
    "p50": 0.722849,
    "p95": 0.744892,
    "p99": 0.748039
  }
}

The response echoes the filter parameters and returns aggregated distributions, in seconds, across all Dag runs in the window:

  • time_to_first_task: The distribution of time_to_first_task values across all Dag runs considered.
  • task_lag: The distribution of task_start_lag values across all task instances in those runs.
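For latency alerts, one option is to compare the p95 of each distribution against a budget. The sketch below uses the p95 values from the example response above. The function name and the budget values are hypothetical, so choose thresholds that match your own freshness requirements.

```python
# p95 values copied from the example aggregate response above (seconds).
SAMPLE_AGGREGATE = {
    "dag_runs_considered": 5,
    "time_to_first_task": {"count": 5, "p95": 0.710055},
    "task_lag": {"count": 15, "p95": 0.744892},
}


def latency_breaches(aggregate, p95_budget_seconds=1.0):
    """Return (metric, p95) pairs whose p95 exceeds the budget."""
    breaches = []
    for metric in ("time_to_first_task", "task_lag"):
        stats = aggregate.get(metric) or {}
        p95 = stats.get("p95")
        if p95 is not None and p95 > p95_budget_seconds:
            breaches.append((metric, p95))
    return breaches


print(latency_breaches(SAMPLE_AGGREGATE))  # []
print(latency_breaches(SAMPLE_AGGREGATE, p95_budget_seconds=0.72))
# [('task_lag', 0.744892)]
```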

The aggregate endpoint accepts the following query parameters:

| Parameter | Required | Description |
| --- | --- | --- |
| since | Yes | A UTC ISO-8601 timestamp. Includes only Dag runs with queued_at >= since. |
| until | No | An exclusive upper bound as a UTC ISO-8601 timestamp. |
| dag_id | No | Restricts results to a single Dag. |
| is_sub_second | No | Filters by sub-second-tagged Dags. Defaults to true. Omit the parameter to include all runs. |

Troubleshoot Sub-Second Pipelines

If the run doesn’t appear in the Sub-Second Metrics tab, the Event Scheduler logs, or the SLA Metrics API, confirm the following:

  • You triggered the run through the Airflow REST API or the Airflow UI Trigger Dag button, not through a cron schedule, timetable, or asset update.
  • The Dag’s queue value matches a worker queue that has Sub-Second enabled.
  • The Deployment-level Sub-Second Pipelines toggle is on.
  • The Dag isn’t tagged sub_second_excluded. Check the tag and the Dag warning for the reason.
  • The Deployment uses the Astro executor on Astro Runtime 3.2 or later.

Best practices

  • Don’t enable sub-second on every queue. The Event Scheduler is most efficient when it focuses on the workloads that need it. Keep batch and scheduled Dags on the default queue.
  • Choose workers for throughput, not size. Many smaller workers usually outperform a few large workers for latency-sensitive workloads. Choose a worker type that matches the per-task resource needs of your Dag, not the largest type available.
  • Test under realistic load. Trigger your Dag through the Airflow REST API at the rate you expect in production and watch the Event Scheduler metrics. If you see queue buildup, increase Max # Workers on the queue or add more Event Scheduler replicas.
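To test at a controlled rate, a load script needs evenly spaced send times. The following sketch computes those offsets for a target rate. The function name and the example rate are hypothetical, and the script that sends each trigger request is left to you.

```python
def trigger_offsets(rate_per_minute, duration_seconds):
    """Return evenly spaced send offsets (seconds from start) for a load test."""
    if rate_per_minute <= 0:
        raise ValueError("rate_per_minute must be positive")
    interval = 60 / rate_per_minute
    count = int(duration_seconds * rate_per_minute / 60)
    return [i * interval for i in range(count)]


# 120 triggers per minute for 2 seconds: one trigger every half second.
print(trigger_offsets(rate_per_minute=120, duration_seconds=2))
# [0.0, 0.5, 1.0, 1.5]
```

A load script would sleep until each offset and then send one trigger request, so that the observed rate matches the rate you expect in production.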
