We’re excited to present Data Pipelines with Apache Airflow, a comprehensive guide to building, maintaining, and managing data pipelines with Apache Airflow. This 455-page eBook focuses on the practical use of Airflow and gives a thorough overview of its core concepts and best practices.
Inside, you’ll learn how to:
- Get started with Airflow and set it up in production environments
- Build, test, and deploy Airflow DAGs
- Automate data transformations
- Build custom Airflow components that coordinate data flows across systems
- And more!
About the Authors:
Bas Harenslak is a Solutions Data Architect at Astronomer and an Apache Airflow committer.
Julian de Ruiter is a data engineer with extensive experience in using Airflow at different companies.
Apache Airflow is the open-source standard for workflow orchestration, offering a flexible and scalable way to programmatically author, schedule, and monitor your data pipelines using Python and SQL. Created at Airbnb as an open-source project in 2014, Airflow entered the Apache Software Foundation’s Incubator Program in 2016 and was announced as an Apache Top-Level Project in 2019. Today, it’s widely recognized as the industry’s leading data orchestration solution.
Data Pipelines with Apache Airflow is divided into four parts:
- Part 1: Getting Started
- Part 2: Beyond the Basics
- Part 3: Airflow in Practice
- Part 4: In the Clouds