Manage Dependencies Between Airflow Deployments, DAGs, and Tasks


Summary:

More often than not, your Airflow components will have a desired order of execution, particularly if you are performing a traditional ETL process. For example, before the Transform step in ETL can run, Extraction has to have happened upstream. In this webinar we discuss how to properly set up dependencies and define an order of execution for your pipelines.
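As a minimal sketch of that idea (the DAG name, schedule, and task callables are illustrative assumptions, not from the webinar), the ETL ordering can be declared like this:

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    with DAG(
        dag_id="etl_example",  # hypothetical DAG name
        start_date=datetime(2022, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract = PythonOperator(task_id="extract", python_callable=lambda: print("extract"))
        transform = PythonOperator(task_id="transform", python_callable=lambda: print("transform"))
        load = PythonOperator(task_id="load", python_callable=lambda: print("load"))

        # Transform runs only after Extract succeeds, and Load only after Transform.
        extract >> transform >> load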

What We Will Cover:

  • Cross Deployment Dependencies
  • Cross DAG Dependencies
  • Basic Task Dependencies
  • Trigger Rules (sketched, along with branching, after this list)
  • Branching
  • Dependencies within task groups
  • Dependencies within dynamically generated tasks
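Two of these topics fit in a few lines. Here is a hedged sketch of branching combined with a trigger rule, assuming Airflow 2.2+ (the DAG name, task names, and branch logic are illustrative, not taken from the webinar):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.dummy import DummyOperator
    from airflow.operators.python import BranchPythonOperator
    from airflow.utils.trigger_rule import TriggerRule

    def _choose_path():
        # Return the task_id of the branch to follow; the other branch is skipped.
        return "path_a"

    with DAG(dag_id="branching_example", start_date=datetime(2022, 1, 1), schedule_interval=None) as dag:
        branch = BranchPythonOperator(task_id="branch", python_callable=_choose_path)
        path_a = DummyOperator(task_id="path_a")
        path_b = DummyOperator(task_id="path_b")
        # The default trigger rule (all_success) would leave "join" skipped, because
        # one upstream branch is always skipped; this rule lets it run anyway.
        join = DummyOperator(task_id="join", trigger_rule=TriggerRule.NONE_FAILED_MIN_ONE_SUCCESS)

        branch >> [path_a, path_b] >> join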


Recap Preview

Task-Level Dependencies

Simple dependencies

In Airflow, you can set a task's dependencies either with the set_upstream/set_downstream methods or with bit-shift operators (both are shown below). Best practice is not to mix the two styles in your code.

First, define your tasks:

    from airflow.operators.dummy import DummyOperator

    d1 = DummyOperator(task_id="first_task")
    d2 = DummyOperator(task_id="second_task")
    d3 = DummyOperator(task_id="third_task")
    d4 = DummyOperator(task_id="fourth_task")

And then create a dependency chain.

This (the set_downstream method):

    d1.set_downstream(d2)
    d2.set_downstream(d3)
    d3.set_downstream(d4)

Is the same as (bit-shift operators):

    d1 >> d2 >> d3 >> d4

And this (the set_upstream method):

    d4.set_upstream(d3)
    d3.set_upstream(d2)
    d2.set_upstream(d1)

Is the same as:

    d4 << d3 << d2 << d1
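Airflow also ships a chain() helper that expresses the same linear dependency, which can read better when the task list is long or built in a loop (a small usage sketch, equivalent to the chains above):

    from airflow.models.baseoperator import chain

    # Equivalent to d1 >> d2 >> d3 >> d4
    chain(d1, d2, d3, d4)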

All of these create the same chain of dependencies, shown in the webinar as the graph view of a DAG called simple_scheduling: first_task → second_task → third_task → fourth_task.
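Putting the pieces together, a minimal version of that DAG might look like the following (the start date and schedule are assumptions; the recap does not state them):

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.dummy import DummyOperator

    with DAG(
        dag_id="simple_scheduling",
        start_date=datetime(2022, 1, 1),  # assumed; not stated in the recap
        schedule_interval=None,
    ) as dag:
        d1 = DummyOperator(task_id="first_task")
        d2 = DummyOperator(task_id="second_task")
        d3 = DummyOperator(task_id="third_task")
        d4 = DummyOperator(task_id="fourth_task")

        # Each task runs only after the previous one succeeds.
        d1 >> d2 >> d3 >> d4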

Hosted By

Kenten Danas

Field Engineer

Kenten is a Field Engineer at Astronomer with a background in data engineering, data science, and consulting. She has first-hand experience adopting and running Airflow as a consultant, and enjoys helping other data engineers scale and get the most out of their Airflow experience. When she isn't working with data, she's typically outside trail running, skiing, and advocating for climate action.

Chris Hronek

Data Engineer

Worked as a Data Engineer at a Fintech start-up before joining Astronomer where I now work as a Data Engineer on the Customer Success Team. I’m obsessed with making Airflow the industry de facto standard for Data Orchestration. Outside of work, I also enjoy the outdoors by Skiing, Mountain Biking, Backpacking, and Camping.