WEBINAR

Datasets and Data-Aware Scheduling in Airflow

Recorded on March 12, 2025

  • Tamara Fingerlin
  • Constance Martineau

Datasets in Airflow let you define explicit dependencies between DAGs and the data they rely on. With datasets, you can make the relationships between workflows clearer, optimize scheduling for DAGs that access the same data, and improve coordination across teams. The result is pipelines that are more efficient and easier to inspect.
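
To make the idea of data-aware scheduling concrete, here is a minimal sketch in plain Python. It is a toy model, not Airflow's implementation: `ToyScheduler`, its methods, the consumer names, and the `s3://example/...` URIs are all hypothetical. It assumes the behavior Airflow 2.4 and later apply when a DAG is scheduled on a list of datasets: the DAG runs once every dataset in the list has been updated at least once since its last run. In Airflow 2.x itself, the same relationship is declared with `Dataset` objects passed to a task's `outlets` and to a DAG's `schedule`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyScheduler:
    """Toy model of data-aware scheduling (illustrative only, not Airflow code)."""

    # Consumer name -> the set of dataset URIs it is scheduled on.
    consumers: dict[str, frozenset[str]] = field(default_factory=dict)
    # Consumer name -> datasets updated since that consumer last ran.
    pending: dict[str, set[str]] = field(default_factory=dict)

    def register(self, consumer: str, datasets: list[str]) -> None:
        """Declare that `consumer` depends on every URI in `datasets`."""
        self.consumers[consumer] = frozenset(datasets)
        self.pending[consumer] = set()

    def dataset_updated(self, uri: str) -> list[str]:
        """Record an update to `uri` and return the consumers that should run now."""
        triggered = []
        for consumer, needed in self.consumers.items():
            if uri not in needed:
                continue
            self.pending[consumer].add(uri)
            # Run only when all required datasets have been updated.
            if self.pending[consumer] == needed:
                triggered.append(consumer)
                self.pending[consumer] = set()  # Start waiting for fresh updates.
        return triggered


scheduler = ToyScheduler()
scheduler.register(
    "daily_report",
    ["s3://example/orders.csv", "s3://example/customers.csv"],
)
scheduler.register("orders_audit", ["s3://example/orders.csv"])

# A producer finishes writing orders: only the single-dataset consumer is ready.
first = scheduler.dataset_updated("s3://example/orders.csv")
# A second producer updates customers: the two-dataset consumer is now ready.
second = scheduler.dataset_updated("s3://example/customers.csv")
print(first, second)
```

The point of the sketch is that consumers never reference producers directly. Each side only names the data it writes or reads, which is what lets teams own their DAGs separately while still sharing a schedule driven by the data.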

In this webinar, we’ll explore how to take full advantage of datasets and data-aware scheduling, including:

  • Implementing datasets effectively, from pipeline design to advanced scheduling techniques
  • Monitoring dataset-driven pipelines using the Airflow UI and Astro Observe
  • Upcoming enhancements in Airflow 3.0, including updates to datasets and additional functionality
