One of the best parts of the Apache Airflow project is the providers. As of today, Airflow has over 80 providers that contain more than 880 modules, giving users an easy, out-of-the-box way to connect with almost any external system.
Today, we’re excited to be adding to that ecosystem with Astronomer Providers, a set of Apache 2-licensed providers created and maintained by Astronomer and available for general use. They support workloads and use cases that benefit from running asynchronously.
A Brief Intro to and History of Airflow Providers
An Airflow provider typically consists of three types of modules: hooks, operators, and sensors.
- Hook - a high-level interface to an external platform that lets you quickly and easily talk to it without having to write low-level code that hits its API or uses special libraries.
- Operator - a single, ideally idempotent, task. An operator determines what actually executes when your DAG runs.
- Sensor - a special kind of operator that waits until a condition, or set of conditions, is met.
There’s room for additional modules (XCom backends, UI plugins, etc.), but most providers consist of just the three above. There are providers for everything from the major clouds to more niche data tools (all conveniently searchable on the Astronomer Registry).
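To make the division of labor between the three module types concrete, here is a minimal sketch in plain Python. It does not import Airflow: the class names (`KeyValueHook`, `UppercaseOperator`, `KeyExistsSensor`) and the dict standing in for an external system are hypothetical. Only the method names `execute` and `poke` are chosen to mirror the ones Airflow's base operator and sensor classes use.

```python
class KeyValueHook:
    """Hypothetical hook: hides how we talk to the 'external system' (a dict here)."""

    def __init__(self, store):
        self._store = store

    def exists(self, key):
        return key in self._store

    def get(self, key):
        return self._store[key]

    def put(self, key, value):
        self._store[key] = value


class KeyExistsSensor:
    """Hypothetical sensor: reports whether the awaited condition holds yet."""

    def __init__(self, hook, key):
        self.hook = hook
        self.key = key

    def poke(self, context=None):
        # True means the condition is met and downstream work can start.
        return self.hook.exists(self.key)


class UppercaseOperator:
    """Hypothetical operator: one unit of work that is safe to re-run."""

    def __init__(self, hook, source_key, target_key):
        self.hook = hook
        self.source_key = source_key
        self.target_key = target_key

    def execute(self, context=None):
        # Overwrites the target on every run, so repeated runs give the same state.
        self.hook.put(self.target_key, self.hook.get(self.source_key).upper())
        return self.target_key
```

In this sketch the sensor and the operator never touch the dict directly. Both go through the hook, which is roughly the role a hook plays for a real external platform.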
Airflow 2.0 allowed providers to be versioned independently of each other and of the core Apache Airflow project. Users no longer had to upgrade their entire Airflow deployment just to get a new version of a hook or operator. It also let users pip install exactly the providers and versions they needed.
For users with long-running tasks, Airflow 2.2 introduced async functionality, in the form of deferrable operators, as a generalization of Airbnb’s smart sensors. Standard operators and sensors take up a full worker slot for the entire duration of the task, even when the task is only waiting on an external system. Operators written to take advantage of the new asynchronous functionality can vacate the worker slot when they know they need to wait, ultimately saving resources.
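The resource saving can be illustrated with a small simulation. This is a conceptual sketch using only Python's standard `asyncio` module, not Airflow's actual deferral mechanism: the semaphore stands in for a pool of worker slots, and the function names and default parameters (`blocking_task`, `deferrable_task`, `run_all`, five tasks, one slot) are assumptions made for the example.

```python
import asyncio
import time


async def blocking_task(slots, wait_s, done):
    """Holds a worker slot for the whole task, including the idle wait."""
    async with slots:
        await asyncio.sleep(wait_s)  # waiting on the external system, slot still held
        done.append("blocking")


async def deferrable_task(slots, wait_s, done):
    """Vacates the slot while waiting, then takes one again to finish."""
    async with slots:
        pass  # e.g. submit a job to the external system
    await asyncio.sleep(wait_s)  # waiting happens without occupying a slot
    async with slots:
        done.append("deferrable")


async def run_all(task_fn, n_tasks=5, n_slots=1, wait_s=0.05):
    """Runs n_tasks copies of task_fn against n_slots simulated worker slots."""
    slots = asyncio.Semaphore(n_slots)
    done = []
    start = time.perf_counter()
    await asyncio.gather(*(task_fn(slots, wait_s, done) for _ in range(n_tasks)))
    return time.perf_counter() - start, len(done)


if __name__ == "__main__":
    blocking_s, _ = asyncio.run(run_all(blocking_task))
    deferred_s, _ = asyncio.run(run_all(deferrable_task))
    print(f"blocking: {blocking_s:.2f}s, deferrable: {deferred_s:.2f}s")
```

With a single slot, the blocking tasks have to wait one after another, while the deferrable tasks overlap their waits because none of them holds the slot while idle.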
We are introducing Astronomer Providers to help the community take advantage of this new asynchronous functionality and to continue building out the provider ecosystem. These providers are Apache 2-licensed, built for compatibility with OSS Airflow, and will be supported and maintained long-term by Astronomer. For the next few months, we’ll be focused on iteration speed and on working with our customers and the broader community to ensure a great experience for anyone using these providers.
To get started, check out this doc, and in the spirit of open source, always feel free to open and contribute to the discussion.