Webinar Recap

Using Airflow with Azure Data Factory

Welcome to the recap of a webinar on using Airflow with Azure Data Factory!

During the webinar we covered:

  1. What is ADF?
  2. What does ADF do?
  3. Adding Airflow to Azure Data Factory
  4. Demo
  5. Q&A

1. What is Azure Data Factory?

In short, ADF is an orchestration tool for data pipelines, similar to Airflow. Unlike Airflow, it lives within the Azure ecosystem, and it integrates with many data stores both inside and outside Azure. ADF also powers many of the visualization and analysis tools in Azure.

2. What does Azure Data Factory do?

  1. It lets you build ETL and ELT processes code-free in a drag-and-drop UI. You can open each drag-and-drop piece to select data sources, or use the pieces as building blocks for operations on datasets on their way to the destination.
  2. It has more than 90 built-in connectors.

3. Adding Airflow to Azure Data Factory

You can orchestrate ADF pipelines with Airflow, and there are plenty of use cases for doing so, as listed on the webinar slide for this section. It’s orchestrating the orchestrator.
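
To make "orchestrating the orchestrator" concrete, here is a minimal sketch of the underlying pattern: the outer orchestrator starts a pipeline run, then polls its status until the run reaches a terminal state. It is not the code from the webinar. `run_and_wait` and `FakeFactory` are hypothetical names, and the fake stands in for ADF so the sketch runs without Azure credentials. In a real DAG this role is typically played by the Data Factory operator and sensor in Airflow's Microsoft Azure provider package.

```python
import time
from typing import Callable

# Terminal run states reported by Azure Data Factory.
TERMINAL_STATES = {"Succeeded", "Failed", "Cancelled"}


def run_and_wait(
    trigger: Callable[[], str],
    get_status: Callable[[str], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Start a pipeline run, then poll until it reaches a terminal state."""
    run_id = trigger()
    waited = 0.0
    while True:
        status = get_status(run_id)
        if status in TERMINAL_STATES:
            break
        if waited >= timeout_seconds:
            raise TimeoutError(f"run {run_id} still {status} after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
    if status != "Succeeded":
        raise RuntimeError(f"run {run_id} ended with status {status}")
    return run_id


class FakeFactory:
    """Hypothetical stand-in for ADF: replays a fixed sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = iter(statuses)

    def trigger(self) -> str:
        return "run-001"

    def get_status(self, run_id: str) -> str:
        return next(self._statuses)


fake = FakeFactory(["Queued", "InProgress", "Succeeded"])
# Skip the real sleep so the example finishes immediately.
result = run_and_wait(fake.trigger, fake.get_status, sleep=lambda s: None)
print(result)
```

Passing `trigger`, `get_status`, and `sleep` as callables keeps the polling logic independent of any particular Azure client, which also makes it easy to exercise without network access.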

4. Demo: the pipelines in action!

During the demo, we orchestrated multiple Azure Data Factory (ADF) pipelines with Airflow to perform classic ELT operations, using data on various currencies as the example.
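
As a rough illustration of what ordering several pipelines in an ELT flow involves, the sketch below derives a run order in which the transform step follows every extract-and-load step. The pipeline names and the dependency map are hypothetical and are not taken from the demo. In Airflow itself, this ordering is expressed as task dependencies inside a DAG.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline names; each maps to the pipelines it depends on.
dependencies = {
    "extract_load_usd": set(),
    "extract_load_eur": set(),
    "transform_rates": {"extract_load_usd", "extract_load_eur"},
}


def execution_order(deps):
    """Return pipeline names so each one follows everything it depends on."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

The two extract-and-load pipelines do not depend on each other, so an orchestrator is free to run them in parallel before starting the transform.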

To see:

  • the entire pipeline,
  • the corresponding Docker files,
  • the requirements,
  • the code,
  • the Airflow version we used,

visit the page dedicated to Orchestrating Multiple Azure Data Factory Pipelines in Airflow on our Astronomer Registry or GitHub.

5. Q&A

In the Q&A, we answered:

  • Can Airflow replace ADF completely?
  • How are you actually checking the status of the ADF pipeline?
  • And other technical questions - watch the video to find out.

Getting Apache Airflow Certified

Join the thousands of other data engineers who have received the Astronomer Certification for Apache Airflow Fundamentals. This exam assesses your understanding of the basics of the Airflow architecture and your ability to create simple data pipelines for scheduling and monitoring tasks.