WEBINAR

Data-Aware Scheduling with the Astro Python SDK

Recorded On December 6, 2022

  • Tamara Fingerlin
  • Benji Lampel

Airflow 2.4 introduced the Datasets feature.

This enables data-aware scheduling:

  • The DAG author (you) can tell Airflow that a task updates a Dataset: outlets=[Dataset("s3://my_bucket")]
  • DAGs can be scheduled to run whenever those Datasets are updated: schedule=[Dataset("s3://my_bucket")]

You can find all the needed resources in this GitHub repository.

