The Airflow API

Hosted By

  • Kenten Danas
  • Viraj Parekh

Note: There is a newer version of this webinar available: Programmatic workflow management with the Airflow and Astro APIs

Agenda:

  1. What is Apache Airflow®?
  2. Apache Airflow® 2
  3. Airflow API: How it Used to Be
  4. Airflow API
  5. Using the API
  6. Some Common Use Cases
  7. Event-Based DAGs with Remote Triggering
  8. Demo
  9. Appendix: API Calls Used

What is Apache Airflow®?

Apache Airflow® is one of the world’s most popular open-source data orchestrators — a platform that lets you programmatically author, schedule, and monitor your data pipelines.

Apache Airflow® was created by Maxime Beauchemin in late 2014. It was brought into the Apache Software Foundation’s Incubator Program in March 2016, and has seen growing success since. In 2019, Airflow was announced as a Top-Level Apache Project, and it is now considered the industry’s leading workflow orchestration solution.

Key benefits of Airflow include its active open-source community, its extensibility through providers and plugins, and the ability to define pipelines as Python code.

Apache Airflow® 2

Airflow 2 was released in December 2020. It is faster, more reliable, and more performant at scale, opening the door to adoption across a wider range of use cases. New features include:

  • Stable REST API: A new, fully stable REST API with increased functionality and a robust authorization and permissions framework.

  • TaskFlow API: An API that makes data sharing between tasks via XCom easier and provides decorators for writing your DAGs cleanly (see the sketch after this list).

  • Deferrable Operators: Operators that reduce infrastructure costs by releasing their worker slots during long-running tasks.

  • HA Scheduler: A highly available scheduler that eliminates a single point of failure, reduces latency, and allows for horizontal scalability.

  • Improved UI/UX: A cleaner, more functional UI with additional views and capabilities. Check out the Calendar and DAG Dependencies views!

  • Timetables: Support for defining custom schedules that go beyond what cron expressions allow.
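
As a rough illustration of the TaskFlow API, here is a minimal sketch of a decorated DAG. The DAG name, schedule, and task logic are placeholders for illustration, not code from the webinar:

    from datetime import datetime

    from airflow.decorators import dag, task


    @dag(schedule_interval="@daily", start_date=datetime(2022, 1, 1), catchup=False)
    def taskflow_example():
        @task
        def extract():
            # Return values are passed to downstream tasks via XCom automatically.
            return {"value": 42}

        @task
        def load(payload: dict):
            print(payload["value"])

        # Calling one decorated task on another's output sets the dependency.
        load(extract())


    example_dag = taskflow_example()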

Airflow API

How it Used to Be:

Before Airflow 2.0, the Airflow API was experimental: it was neither officially supported nor well documented, and in practice it was used only to trigger DAG runs programmatically.

How It Is Now:

The new full REST API is stable, officially supported, and well documented, with a robust authorization and permissions framework.

What Can You Do With It?

The REST API generally supports almost anything you can do in the Airflow UI, such as managing DAG runs, connections, variables, and pools. Almost none of this was possible with the old API.

Using the API

Authentication is required with the Airflow REST API. Airflow supports several pluggable auth backends, including basic auth.
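
For example, with basic auth enabled, listing DAGs from a local instance might look like the following sketch. The localhost:8080 URL and admin:admin credentials assume a local dev environment (such as a default Astronomer CLI project); substitute your own deployment's URL and credentials:

    import requests

    # List DAGs over the stable REST API using Basic Auth.
    # Assumes a local Airflow with the basic auth backend enabled and an
    # admin:admin user, as in a default local Astronomer CLI project.
    response = requests.get(
        "http://localhost:8080/api/v1/dags",
        auth=("admin", "admin"),
    )
    response.raise_for_status()

    for dag in response.json()["dags"]:
        print(dag["dag_id"], "- paused:", dag["is_paused"])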

Common Use Cases

The API is commonly used for things like the event-based triggering scenarios described in the next section.

Event-Based DAGs with Remote Triggering

The API can be used to trigger DAGs on an ad-hoc basis. At Astronomer, we often see use cases such as:

  1. Your website has a form page for potential customers to fill out. After the form is submitted, you have a DAG that processes the data. Building a POST request to the dagRuns endpoint into your website backend will trigger that DAG as soon as it’s needed (see the sketch after this list).
  2. Your company’s data ecosystem includes many AWS services and Airflow for orchestration. Your DAG should run when a particular AWS state is reached, but sensors don’t exist for every service you need to monitor. Rather than writing your own sensors, you use AWS Lambda to call the Airflow API.
  3. Your team has analysts who need to run SQL jobs in an ad-hoc manner, but don’t know Python or how to write DAGs. You’ve built an Airflow plugin that allows the analysts to input their SQL, and a templated DAG is run for them behind the scenes. The API allows them to trigger the DAG run through the plugin.
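
As a sketch of the first scenario, a backend service could trigger the processing DAG with a POST to the stable API's dagRuns endpoint. The form_processing DAG name, local URL, and conf payload below are hypothetical placeholders:

    import requests

    # Trigger a DAG run by POSTing to the dagRuns endpoint.
    # "form_processing" and the conf payload are hypothetical examples.
    response = requests.post(
        "http://localhost:8080/api/v1/dags/form_processing/dagRuns",
        auth=("admin", "admin"),
        json={"conf": {"form_id": "12345"}},  # optional data passed to the run
    )
    response.raise_for_status()
    print("Triggered run:", response.json()["dag_run_id"])

The conf dictionary is made available to the triggered DAG run, so event data like the form contents can flow straight into the pipeline.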

Demo

The demo starts at minute 10 of the video.

All the code from the demo is available in this webinar’s GitHub repository.

Appendix: API Calls Used

During this webinar, we made Airflow API calls using Postman against an Airflow instance running locally on localhost:8080 with the Astronomer CLI.

Note: when running locally with the Astronomer CLI, all these requests can use Basic Auth with admin:admin credentials.
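
As an illustration (not a transcript of the exact Postman calls from the webinar), an equivalent request from Python could unpause a DAG like this; example_dag is a placeholder DAG name:

    import requests

    # Unpause a DAG via PATCH on the stable REST API.
    # "example_dag" is a placeholder; Basic Auth with admin:admin works
    # when running locally with the Astronomer CLI.
    response = requests.patch(
        "http://localhost:8080/api/v1/dags/example_dag",
        auth=("admin", "admin"),
        json={"is_paused": False},
    )
    response.raise_for_status()
    print("is_paused is now:", response.json()["is_paused"])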
