Apache Airflow™ Guides
Topics
- Airflow UI
- Astro
- AWS
- Azure
- Basics
- Best Practices
- Components
- Concurrency
- Connections
- DAGs
- Data Quality
- Database
- Dependencies
- ETL
- Executors
- Hooks
- Infrastructure
- Integrations
- Kubernetes
- Lineage
- Logging
- Machine Learning
- Operators
- Parallelism
- Plugins
- Resources
- Secrets
- Sensors
- SQL
- Subdags
- Task Groups
- Tasks
- Templating
- Testing
- Windows
- Workers
- XCom
Dynamic Tasks in Airflow
How to dynamically create tasks at runtime in your Airflow DAGs.
- Tasks
OpenLineage and Airflow
Using OpenLineage and Marquez to get lineage data from your Airflow DAGs.
- Lineage
Orchestrating Redshift Operations from Airflow
Setting up a connection to Redshift and using available Redshift modules.
- Database
- SQL
- DAGs
- Integrations
- AWS
Astro for ETL
Using the astro library to implement ETL use cases in Airflow.
- Astro
- ETL
- SQL
Introduction to Airflow Decorators
An overview of Airflow decorators and how they can improve the DAG authoring experience.
- DAGs
- Basics
- Astro
Deferrable Operators
How to implement deferrable operators to save cost and resources with Airflow.
- Operators
- Concurrency
- Resources
- Sensors
- Workers
Debugging DAGs
A beginner's guide to figuring out what's going wrong with your Airflow DAGs.
- DAGs
- Basics
Rerunning Airflow DAGs
How to use catchup, backfill, and cleared task instances in Airflow.
- DAGs
Scheduling and Timetables in Airflow
Everything you need to know about scheduling your Airflow DAGs.
- DAGs
Airflow Data Quality Checks with SQL Operators
Executing queries in Apache Airflow DAGs to ensure data quality.
- Database
- SQL
- DAGs
- Data Quality
Airflow Pools
Using pools to control task parallelism in Airflow.
- Parallelism
- Tasks
Integrating Airflow and dbt
Running dbt models in your Airflow DAGs.
- DAGs
- Integrations
Using Airflow with SageMaker
Methods for orchestrating SageMaker machine learning pipelines with Airflow.
- DAGs
- Integrations
- Machine Learning
Executing Notebooks with Airflow
Methods for orchestrating commonly used notebooks with Airflow.
- DAGs
- Integrations
- Machine Learning
Cross-DAG Dependencies
How to implement dependencies between your Airflow DAGs.
- DAGs
- Subdags
- Dependencies
- Sensors
Testing Airflow DAGs
How to apply test-driven development practices to your Airflow DAGs.
- DAGs
- Best Practices
- Testing
Using Task Groups in Airflow
Using Task Groups to build modular workflows in Airflow.
- DAGs
- Subdags
- Task Groups
- Best Practices
Custom XCom Backends
Creating a custom XCom backend with Airflow 2.0.
- Plugins
- XCom
Passing Data Between Airflow Tasks
Methods for sharing metadata and information between tasks in your Apache Airflow DAGs, including XCom.
- DAGs
- XCom
- Tasks
- Dependencies
Deploying Kedro Pipelines to Apache Airflow
How to use the kedro-airflow plugin to convert your Kedro pipelines into Apache Airflow DAGs and deploy them to a production environment.
- Plugins
- Integrations
Orchestrating Databricks Jobs with Airflow
Orchestrating Databricks Jobs from your Apache Airflow DAGs.
- Integrations
- DAGs
Executing Azure Data Factory Pipelines with Airflow
Triggering remote jobs in Azure Data Factory from your Apache Airflow DAGs.
- Integrations
- Azure
Executing Azure Data Explorer Queries with Airflow
Executing Azure Data Explorer queries from your Apache Airflow DAGs.
- Integrations
- Azure
- DAGs
Orchestrating Azure Container Instances with Airflow
Orchestrating containers with Azure Container Instances from your Apache Airflow DAGs.
- Integrations
- Azure
- DAGs
Get Started with Apache Airflow 2.0
Test Apache Airflow 2.0 on your local machine with the Astro CLI.
- Resources
- Basics
Using Airflow to Execute SQL
Executing queries, parameterizing queries, and embedding SQL-driven ETL in Apache Airflow DAGs.
- Database
- SQL
- DAGs
Integrating Airflow and Great Expectations
Using the Great Expectations provider natively in your Airflow DAGs.
- DAGs
- Integrations
Understanding the Airflow Metadata Database
A structural walkthrough of Apache Airflow's metadata database, with a full ERD.
- Database
- SQL
- Components
Executing Talend Jobs with Airflow
Triggering remote jobs in Talend from your Apache Airflow DAGs.
- Integrations
Integrating Airflow and HashiCorp Vault
Pull connection information from your HashiCorp Vault to use in your Airflow DAGs.
- DAGs
- Secrets
- Integrations
Importing Custom Hooks & Operators
How to correctly import custom hooks and operators.
- Hooks
- Operators
- Plugins
- Basics
Scaling Out Airflow
How to tune your Airflow environment so it scales with your DAGs.
- Workers
- Concurrency
- Parallelism
- DAGs
Airflow Executors Explained
A thorough breakdown of Apache Airflow's Executors: Celery, Local and Kubernetes.
- Executors
- Basics
- Kubernetes
- Concurrency
- Parallelism
Logging in Airflow
Demystifying Airflow's logging configuration.
- Logging
- Best Practices
- Basics
Introduction to Kubernetes
High-level overview of introductory concepts in Kubernetes.
- Kubernetes
- Infrastructure
Best Practices for Calling AWS Lambda from Airflow
A few tips, guidelines, and best practices for calling Lambda from Airflow.
- Best Practices
- Integrations
Using Kerberos in Apache Airflow
How to use Kerberos and Kerberized hooks in Airflow.
- Integrations
- Connections
KubernetesPodOperator on Astronomer
Use the KubernetesPodOperator on Astronomer.
- Kubernetes
- Operators
Running scripts using the BashOperator
Learn how to run and troubleshoot shell scripts using the BashOperator in Airflow.
- DAGs
- Operators
Using SubDAGs in Airflow
Using SubDAGs to build modular workflows in Airflow.
- DAGs
- Subdags
Templating in Airflow
How to leverage the power of Jinja templating when writing your DAGs.
- Templating
- Best Practices
- Basics
Branching in Airflow
Use Apache Airflow's BranchPythonOperator and ShortCircuitOperator to execute conditional branches in your workflow.
- DAGs
- Operators
- Basics
- Tasks
Airflow's Components
Learn about the core components of Apache Airflow's infrastructure.
- Components
- Executors
- Database
- Basics
Useful SQL queries for Apache Airflow
A home for SQL queries that we frequently run on our Airflow Postgres database.
- Database
- SQL
- DAGs
- Tasks
The Airflow UI
A high-level overview of the Airflow UI.
- DAGs
- Airflow UI
- Basics
- XCom
- Tasks
- Connections
Running Airflow on Windows 10 & WSL
How to spin up Airflow on your Windows system.
- Windows
Managing your Connections in Apache Airflow
An overview of how connections work in the Airflow UI.
- Connections
- Basics
- Hooks
- Operators
DAG Writing Best Practices in Apache Airflow
How to create effective, clean, and functional DAGs.
- DAGs
- Best Practices
- Basics
- Templating
- Tasks
Intro to Apache Airflow DAGs
What DAGs are and how they are constructed in Apache Airflow.
- Airflow UI
- DAGs
- Basics
Dynamically Generating DAGs in Airflow
Using a base DAG template to create multiple DAGs.
- DAGs
- Best Practices
Editing Task and DAG Metadata
What are DAGs and how they are constructed in Apache Airflow?
- DAGs
- Database
- Tasks
Error Notifications in Airflow
Methods for managing notifications in your Airflow DAGs.
- DAGs
- Integrations
- Operators
From Operators to DagRuns
From Operators to DagRuns
- Hooks
- Operators
- Tasks
- DAGs
Introduction to Apache Airflow
Everything you need to know to get started with Apache Airflow.
- Basics
- DAGs
Managing Airflow Code
Guidelines for working with multiple Airflow projects.
- DAGs
- Best Practices
- Basics
Managing Dependencies in Apache Airflow
An overview of dependencies and triggers in Airflow.
- Best Practices
- Dependencies
- Basics
Using Apache Airflow Plugins
A crash-course in using Airflow Plugins.
- Best Practices
- Plugins
- Basics
Hooks 101
An introduction to Hooks in Apache Airflow.
- Hooks
- Operators
- Tasks
- Basics
Sensors 101
An introduction to Sensors in Apache Airflow.
- Operators
- Tasks
- Basics
- Sensors
Operators 101
An introduction to Operators in Apache Airflow.
- Hooks
- Operators
- Tasks
- Basics