Apache Airflow ™
The open source standard for workflow orchestration.
Apache Airflow is a way to programmatically author, schedule and monitor your data pipelines using Python and SQL.
Created at Airbnb as an open-source project in 2014, Airflow was brought into the Apache Software Foundation’s Incubator Program in 2016 and announced as a Top-Level Apache Project in 2019. Now, it’s widely recognized as the industry’s leading data orchestration solution.
With over 140 integrations and more added each month, the power of the community ensures that Airflow has the most comprehensive coverage of data sources and other providers, whatever your use case.
Apache Airflow Core Principles
Airflow is built on a set of core ideals that allow you to leverage the most popular open source workflow orchestrator on the market while maintaining enterprise-ready flexibility and reliability.
Fully programmatic workflow authoring allows you to maintain full control of the logic you wish to execute.
Leverage a robust ecosystem of open source integrations to connect natively to any third party datastore or API.
Get the optionality of an open-source codebase while tapping into a buzzing and action-packed community.
Scale your Airflow environment horizontally with a modular, highly available architecture across a variety of execution frameworks.
Integrate with your internal authentication systems and secrets managers for a platform ops experience that your security team will love.
Plug into your internal logging and monitoring systems to keep all of the metrics you care about in one place.
Discover new features of Apache Airflow 2
A New, Highly Available Scheduler
Expect faster performance with near-zero task latency. Launch Scheduler replicas to increase task throughput and ensure high availability.
Full REST API
Build programmatic services around your Airflow environment with Airflow's new API, now featuring a robust permissions framework.
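As a rough illustration of building a service around the API, the sketch below prepares a request that would trigger a DAG run through the Airflow 2 stable REST API (`POST /api/v1/dags/{dag_id}/dagRuns`). The base URL, DAG id, and credentials are placeholders, and HTTP basic auth is an assumption: it only works if your deployment has a basic-auth API backend enabled.

```python
import base64
import json
from typing import Optional
from urllib.request import Request


def build_trigger_request(
    base_url: str,
    dag_id: str,
    username: str,
    password: str,
    conf: Optional[dict] = None,
) -> Request:
    """Build (but do not send) a request that triggers a new run of `dag_id`."""
    url = f"{base_url.rstrip('/')}/api/v1/dags/{dag_id}/dagRuns"
    # The run-level configuration is passed under the "conf" key.
    body = json.dumps({"conf": conf or {}}).encode("utf-8")
    # Assumption: the deployment accepts HTTP basic auth for API calls.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


# Placeholder values; replace them with your own environment's details.
request = build_trigger_request(
    "http://localhost:8080/", "example_dag", "admin", "admin", {"source": "api"}
)
```

Passing `request` to `urllib.request.urlopen` would send it. Building and sending are kept separate here so the request can be inspected without a running Airflow webserver.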
Easily accommodate long-running tasks with deferrable operators and triggers that run tasks asynchronously, freeing up worker slots and making efficient use of resources.
Spin up as many parallel tasks as you need at runtime in response to the outputs of upstream tasks. Chain dynamic tasks together to simplify and accelerate ETL and ELT processing.
Leverage dynamic tasks, sensors, and deferrable operators to create robust, event-driven workflows.
TaskFlow API
Pass information between tasks with clean, efficient code that's abstracted from the task dependency layer. Includes support for custom XCom backends.
Replace SubDAGs with a new way to group tasks in the Airflow UI. Task Groups don't affect task execution behavior and do not limit parallelism.
Make the most of the pod_override parameter for easy 1:1 overrides and the new YAML pod_template_file, which replaces configs set in airflow.cfg.
Pull the latest version of any Airflow Provider at any time, or follow an easy contribution process to build your own and install it as a Python package.
Understand at a glance what is happening with your DAGs and tasks, quickly pinpoint task failures, and drill down into root causes with Airflow’s intuitive grid view.