WEBINAR

Data-Aware Scheduling with the Astro Python SDK

Recorded On December 6, 2022

  • Tamara Fingerlin
  • Benji Lampel

Airflow 2.4 introduced the Datasets feature, which enables data-aware scheduling:

  • The DAG author (you) can tell Airflow that a task updates a Dataset: outlets=[Dataset("s3://my_bucket")]
  • DAGs can be scheduled to run whenever those Datasets are updated: schedule=[Dataset("s3://my_bucket")]

You can find all the needed resources in this GitHub repository.

