WEBINAR

Intro to Airflow

Recorded On May 3, 2021

  • Kenten Danas
  • Viraj Parekh

Note: There is a newer version of this webinar available: Airflow 101: How to get started writing data pipelines with Apache Airflow®.

Topics to be discussed:

  • What is Airflow
  • Core Components
  • Core Concepts
  • Flexibility of Pipelines as Code
  • Getting Airflow up and Running
  • Demo DAGs

What is Airflow?

Definition: Apache Airflow® is a way to programmatically author, schedule and monitor data pipelines.

  • Airflow is the de facto standard for data orchestration
  • Born inside Airbnb, open-sourced, and graduated to a top-level Apache Software Foundation project
  • Leveraged by 1M+ data engineers around the globe to programmatically author, schedule, and monitor data pipelines
  • Deployed by thousands of companies as the unbiased data control plane, translating business rules to power their data processing fabric

Core Components

[Image: intro-airflow-1]

Executors

  • Local: good for local and development environments
  • Celery: good for a high volume of short tasks
  • Kubernetes: good for autoscaling and task-level configuration

There is also the Sequential executor, which runs one task at a time and offers no parallelism.
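
As a rough illustration of the parallelism difference between executors, the toy sketch below uses only Python's standard library. It is not Airflow's executor code: `run_task`, `run_sequentially`, `run_in_parallel`, and `timed` are hypothetical names, and the thread pool merely stands in for an executor that can run several tasks at once.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_task(name: str, seconds: float = 0.1) -> str:
    """Stand-in for a unit of work: sleep, then report completion."""
    time.sleep(seconds)
    return f"{name} done"


def run_sequentially(names):
    """Run one task at a time (similar in spirit to a sequential executor)."""
    return [run_task(name) for name in names]


def run_in_parallel(names, workers: int = 3):
    """Run several tasks at once (similar in spirit to a parallel executor)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the inputs.
        return list(pool.map(run_task, names))


def timed(fn, names):
    """Return (results, elapsed_seconds) for fn(names)."""
    start = time.perf_counter()
    results = fn(names)
    return results, time.perf_counter() - start


# Three independent, hypothetical tasks with no dependencies between them.
TASKS = ["task_a", "task_b", "task_c"]

if __name__ == "__main__":
    _, sequential_seconds = timed(run_sequentially, TASKS)
    _, parallel_seconds = timed(run_in_parallel, TASKS)
    print(f"sequential: {sequential_seconds:.2f}s, parallel: {parallel_seconds:.2f}s")
```

Both functions return the same results; only the wall-clock time differs, which is the trade-off the executor choice controls.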

[Image: intro-airflow-2]

[Image: intro-airflow-3]

Task

  • Instance of an Operator

Task Instance

  • Represents a specific run of a task: DAG + task + point in time
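
The "DAG + task + point in time" identity can be sketched with a small frozen dataclass. This is an illustrative model only, not Airflow's own class: `TaskRunKey` and its field names are hypothetical, chosen to mirror the three parts listed above.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TaskRunKey:
    """Hypothetical identity of one task run: DAG + task + point in time."""

    dag_id: str
    task_id: str
    logical_date: datetime


# The same task in the same DAG, scheduled for two different days.
run_may_3 = TaskRunKey("example_dag", "extract", datetime(2021, 5, 3))
run_may_4 = TaskRunKey("example_dag", "extract", datetime(2021, 5, 4))
run_may_3_again = TaskRunKey("example_dag", "extract", datetime(2021, 5, 3))

# Different point in time: two distinct task instances.
assert run_may_3 != run_may_4
# Same DAG, task, and point in time: the same task instance.
assert run_may_3 == run_may_3_again

# Frozen dataclasses are hashable, so state can be tracked per instance.
states = {run_may_3: "success", run_may_4: "running"}
```

Modeling the key this way shows why rerunning a task for a past date does not disturb the record for any other date: each date maps to its own entry.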

Flexibility of Data Pipelines-as-Code

[Image: intro-airflow-4]

[Image: intro-airflow-5]
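
The slide content for this section is not recoverable here, so the following is only a sketch of one common reading of "pipelines as code": because a pipeline is ordinary Python, tasks and their dependencies can be generated with loops instead of being declared one by one. The sketch uses the standard library's `graphlib` (Python 3.9+) rather than Airflow itself, and `SOURCES`, `build_pipeline`, and the task names are all hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical source names; adding one here adds two tasks to the pipeline.
SOURCES = ["orders", "customers", "payments"]


def build_pipeline(sources):
    """Return a mapping of task name -> set of upstream task names."""
    deps = {"start": set()}
    for source in sources:
        extract = f"extract_{source}"
        load = f"load_{source}"
        deps[extract] = {"start"}   # every extract waits for start
        deps[load] = {extract}      # each load waits for its own extract
    # The final report waits for every load task.
    deps["report"] = {f"load_{source}" for source in sources}
    return deps


pipeline = build_pipeline(SOURCES)

# A valid execution order: every task appears after all of its upstreams.
order = list(TopologicalSorter(pipeline).static_order())

if __name__ == "__main__":
    print(order)
```

Adding a fourth entry to `SOURCES` extends the graph without touching the rest of the code, which is the kind of flexibility a code-defined pipeline offers over a hand-maintained static definition.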

Getting Started with Apache Airflow®

The easiest way to get started with providers and Apache Airflow® 2.0 is to use the Astronomer CLI. Our Quickstart Guide walks through getting Airflow up and running.

Join the thousands of other data engineers who have received the Astronomer Certification for Apache Airflow® Fundamentals. This exam assesses your understanding of the basics of the Airflow architecture and your ability to create basic data pipelines for scheduling and monitoring tasks.

To access the demo DAGs used in the Intro to Airflow webinar, visit this GitHub repository.

