Apache Airflow

The open source standard for workflow orchestration.

Apache Airflow is a platform for programmatically authoring, scheduling, and monitoring your data pipelines using Python and SQL.
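
For example, a complete pipeline is a single Python file. The sketch below shows a minimal three-task DAG; the DAG id and bash commands are placeholders rather than part of any real project.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_sales_pipeline",  # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        # Three placeholder tasks chained with Airflow's >> dependency syntax.
        extract = BashOperator(task_id="extract", bash_command="echo extracting")
        transform = BashOperator(task_id="transform", bash_command="echo transforming")
        load = BashOperator(task_id="load", bash_command="echo loading")

        extract >> transform >> load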

Created at Airbnb as an open-source project in 2014, Airflow was brought into the Apache Software Foundation’s Incubator Program in 2016 and announced as a Top-Level Apache Project in 2019. Now, it’s widely recognized as the industry’s leading data orchestration solution.

With over 140 integrations and more added each month, the community ensures that Airflow has the most comprehensive coverage of data sources and services, whatever your use case.

Built on a strong and growing community.

Apache Airflow Core Principles

Airflow is built on a set of core principles that let you leverage the most popular open source workflow orchestrator on the market while maintaining enterprise-ready flexibility and reliability.

Flexible

Fully programmatic workflow authoring allows you to maintain full control of the logic you wish to execute.

Extensible

Leverage a robust ecosystem of open source integrations to connect natively to any third-party datastore or API.

Open Source

Get the optionality of an open source codebase while tapping into a large and active community.

Scalable

Scale your Airflow environment as far as you need with a modular, highly available architecture across a variety of execution frameworks.

Secure

Integrate with your internal authentication systems and secrets managers for a platform ops experience that your security team will love.

Modular

Plug into your internal logging and monitoring systems to keep all of the metrics you care about in one place.

Discover new features of Apache Airflow 2

  • A New, Highly Available Scheduler

    Expect faster performance with near-zero task latency. Launch scheduler replicas to increase task throughput and ensure high availability. Read more about the Airflow 2 scheduler.

  • Full REST API

    Build programmatic services around your Airflow environment with Airflow's new REST API, now featuring a robust permissions framework; a minimal client sketch appears after this list. Airflow Docs

  • Deferrable Operators

    Easily accommodate long-running tasks with deferrable operators and triggers that run tasks asynchronously, freeing up worker slots and making efficient use of resources; see the sketch after this list. Airflow Docs

  • Dynamic Tasks

    Spin up as many parallel tasks as you need at runtime in response to the outputs of upstream tasks. Chain dynamic tasks together to simplify and accelerate ETL and ELT processing; see the sketch after this list. Airflow Docs

  • Event-Driven Workflows

    Leverage dynamic tasks, sensors, and deferrable operators to create robust, event-driven workflows. Airflow Docs 

  • TaskFlow API

    Pass information between tasks with clean, efficient code that's abstracted from the task dependency layer. Includes support for custom XCom backends; see the sketch after this list. Airflow Docs

  • Task Groups

    Replace SubDAGs with a new way to group tasks in the Airflow UI. Task Groups don't affect task execution behavior and do not limit parallelism; see the sketch after this list. Airflow Docs

  • Simplified KubernetesExecutor

    Make the most of the pod_override parameter for easy 1:1 overrides and the new YAML pod_template_file, which replaces configs set in airflow.cfg; see the sketch after this list. Airflow Docs

  • Independent Providers

    Pull the latest version of any Airflow Provider at any time, or follow an easy contribution process to build your own and install it as a Python package. Airflow Docs

  • UI/UX Improvements

    Understand at a glance what is happening with your DAGs and tasks, quickly pinpoint task failures, and drill down into root causes with Airflow’s intuitive grid view. Airflow Docs 
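
Full REST API, sketched with the requests library: a minimal client that lists DAGs and triggers a run. The base URL and admin/admin credentials are placeholder assumptions for a local deployment with the basic-auth backend enabled.

    import requests

    AIRFLOW_API = "http://localhost:8080/api/v1"  # hypothetical local webserver
    AUTH = ("admin", "admin")  # placeholder credentials

    # List the DAGs known to this Airflow environment.
    resp = requests.get(f"{AIRFLOW_API}/dags", auth=AUTH)
    resp.raise_for_status()
    for dag in resp.json()["dags"]:
        print(dag["dag_id"], "paused:", dag["is_paused"])

    # Trigger a new run of a DAG (the dag id is a placeholder).
    requests.post(
        f"{AIRFLOW_API}/dags/daily_sales_pipeline/dagRuns",
        auth=AUTH,
        json={"conf": {}},
    ).raise_for_status()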
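
Deferrable operators, shown here with TimeDeltaSensorAsync from Airflow core (a running triggerer process is required); the DAG id and delay are illustrative only.

    from datetime import datetime, timedelta

    from airflow import DAG
    from airflow.sensors.time_delta import TimeDeltaSensorAsync

    with DAG(
        dag_id="deferrable_example",  # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ):
        # While waiting, the task defers to the triggerer and frees its worker slot.
        wait = TimeDeltaSensorAsync(task_id="wait_one_hour", delta=timedelta(hours=1))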
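
Dynamic tasks, as a minimal task-mapping sketch: one mapped task instance is created at runtime per element returned by the upstream task. All names are placeholders.

    from datetime import datetime

    from airflow.decorators import dag, task


    @dag(start_date=datetime(2024, 1, 1), schedule_interval=None, catchup=False)
    def mapped_etl():  # hypothetical DAG
        @task
        def list_files():
            # In a real pipeline this list would come from an upstream system.
            return ["a.csv", "b.csv", "c.csv"]

        @task
        def process(path: str):
            print(f"processing {path}")

        # Expands into one parallel task instance per file at runtime.
        process.expand(path=list_files())


    mapped_etl()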
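
TaskFlow API, as a minimal sketch: plain Python functions become tasks, and return values are passed between them via XCom without explicit plumbing. Names are placeholders.

    from datetime import datetime

    from airflow.decorators import dag, task


    @dag(start_date=datetime(2024, 1, 1), schedule_interval="@daily", catchup=False)
    def taskflow_example():  # hypothetical DAG
        @task
        def extract():
            return {"rows": 42}

        @task
        def load(stats: dict):
            # The dict travels from extract to load as an XCom behind the scenes.
            print(f"loaded {stats['rows']} rows")

        load(extract())


    taskflow_example()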
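
Task Groups, as a minimal sketch: the grouped tasks collapse into a single node in the UI but still run as ordinary, fully parallelizable tasks. Names are placeholders.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from airflow.utils.task_group import TaskGroup

    with DAG(
        dag_id="task_group_example",  # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval=None,
        catchup=False,
    ):
        start = BashOperator(task_id="start", bash_command="echo start")

        # Grouped tasks appear as a single collapsible box in the graph and grid views.
        with TaskGroup(group_id="transforms") as transforms:
            for table in ("orders", "customers"):  # placeholder table names
                BashOperator(task_id=f"transform_{table}", bash_command=f"echo {table}")

        start >> transforms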
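
Simplified KubernetesExecutor, showing a per-task pod_override: only this task's resource requests change, while everything else comes from the cluster-wide pod_template_file. The DAG, task, and resource values are placeholders, and the kubernetes Python client must be installed.

    from datetime import datetime

    from airflow import DAG
    from airflow.operators.bash import BashOperator
    from kubernetes.client import models as k8s

    with DAG(
        dag_id="pod_override_example",  # hypothetical DAG name
        start_date=datetime(2024, 1, 1),
        schedule_interval=None,
        catchup=False,
    ):
        heavy_task = BashOperator(
            task_id="heavy_task",
            bash_command="echo crunching",
            # Override only this task's pod; the main container is named "base".
            executor_config={
                "pod_override": k8s.V1Pod(
                    spec=k8s.V1PodSpec(
                        containers=[
                            k8s.V1Container(
                                name="base",
                                resources=k8s.V1ResourceRequirements(
                                    requests={"memory": "4Gi", "cpu": "2"}
                                ),
                            )
                        ]
                    )
                )
            },
        )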

Screenshot of Apache Airflow 2.0 UI