Debunking myths about Airflow’s architecture and performance

Airflow fact vs fiction, Part 2

  • Kenten Danas

Introduction

Apache Airflow has evolved significantly since its origins at Airbnb in 2014, growing from a tool for orchestrating ETL pipelines into the industry standard for complex data workflows, powering everything from machine learning, GenAI, and infrastructure management to mission-critical analytics. Yet outdated narratives and misconceptions persist, despite major advancements, including in the recent Airflow 3 release. In this series, we’re separating fact from fiction to clarify what’s true, what’s changed, and what’s simply misunderstood.

In Part 1 of this series, we looked at some of the most persistent myths about Airflow’s user experience. We unpacked claims that DAGs are hard to write, that local development is painful, and that Airflow can’t support dynamic pipelines. As we saw, many of these critiques are misleading—or simply no longer true. Thanks to features like the TaskFlow API, Assets, and dynamic task mapping, along with the capabilities of managed services like Astro, working with Airflow is more Pythonic and flexible than ever.

In this second post, we’ll turn our attention to another common theme in discussions about Airflow: its architecture and performance. If you’ve spent any time in community forums, you’ve likely heard concerns that “Airflow doesn’t scale,” that it’s difficult to manage, and that it should be used purely as a job scheduler. These narratives have stuck around for years—and while some stem from real pain points in earlier versions of Airflow, others are misconceptions or simplifications that miss the nuance of how Airflow works today.

The truth is that Airflow’s architecture has evolved just as much as its user experience. From its high availability to its flexible deployment options, modern Airflow is capable of powering orchestration at massive scale, and it is used daily by some of the largest data teams in the world.

In this post, we’ll break down the most common architectural and performance myths we hear about Airflow, explain where the grain of truth lies, and show how the current state of the project addresses them. Let’s get into it.

Statement: The Airflow scheduler is unreliable

Verdict: Fiction

It’s hard to talk about Airflow myths without bringing up the scheduler. For years, “the scheduler is unreliable” has been one of the most commonly repeated critiques. And it’s not hard to see why—anyone who used Airflow prior to 2.0 likely encountered at least some version of this. Manually restarting the Airflow scheduler periodically to maintain performance was commonplace (a brief personal anecdote: I used to do this all the time with Airflow 1.9, before working at Astronomer). When something went wrong, visibility was limited, and it wasn’t uncommon to see community posts where the answer boiled down to "try turning it off and on again."

But here’s the thing: that version of the scheduler doesn’t exist anymore.

With the release of Airflow 2.0 in 2020, the scheduler was completely rewritten as part of a broader initiative to improve scalability and reliability. It became highly available, allowing multiple scheduler replicas to run in an active/active model, and was optimized for scalability. With these changes, performance and reliability increased significantly (check out this blog post for more stats). Other changes in later versions (like fixing the issue of tasks getting stuck in a queued state) have improved the situation further.

Now, this isn’t to say that every Airflow scheduler instance is flawless. Like any distributed system, misconfiguration, resource constraints, or architectural anti-patterns can still lead to issues. Writing DAGs with best practices in mind helps keep your scheduler healthy. Even more so, using a managed service like Astro, which lets you easily choose your scheduler size and replica count and offers >99.9% uptime, means you never have to worry about the scheduler impacting your pipelines. Generally, with these improvements and the availability of managed services, scheduler reliability has not been a major concern since the end of 2020.
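
As a concrete example of those best practices, one of the most impactful is keeping expensive work out of top-level DAG code, since Airflow re-parses DAG files frequently. Below is a minimal sketch; the "expensive" lookup is a stand-in for any slow call, such as a database query or API request:

```python
from pendulum import datetime

from airflow.decorators import dag, task


def expensive_lookup():
    # Stand-in for a slow call (database query, API request, etc.).
    return list(range(100))


# Anti-pattern: calling expensive_lookup() here, at the top level of the file,
# would run it on every DAG parse, which happens frequently. Keep that work
# inside tasks instead, as below.

@dag(start_date=datetime(2025, 1, 1), schedule="@daily", catchup=False)
def scheduler_friendly_dag():

    @task
    def extract():
        # Runs only when the task executes, not at parse time.
        return expensive_lookup()

    @task
    def load(rows: list):
        print(f"Loaded {len(rows)} rows")

    load(extract())


scheduler_friendly_dag()
```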

Statement: Airflow is difficult to scale

Verdict: Misleading

Related to the complaints about the unreliability of the scheduler in Airflow 1.X is the critique that Airflow can’t handle large workloads or is too hard to scale in production. There’s a grain of truth here (ok, maybe even a couple grains)—scaling Airflow effectively does require a deep understanding of its architecture and the systems that support it. But the idea that it’s fundamentally hard to scale? That’s more about operational burden than technical limitation.

The truth is that Airflow has been a distributed system designed for scalability from the beginning. Its core components—the scheduler, API server (or the webserver in Airflow 1 and 2), and metadata database—are all decoupled and independently scalable. Through your Airflow configuration, there are a huge number of knobs you can tune to ensure good performance, regardless of how many DAGs and tasks you are running. With the right executor and infrastructure, teams regularly run Airflow at impressive scale, orchestrating thousands or even tens of thousands of tasks per day (see 200,000 data pipelines at Uber, PBs of weekly data at Ford, or 1 million monthly deploys at LinkedIn).
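
To give a flavor of those knobs: some are deployment-wide settings in airflow.cfg (core.parallelism, for example, caps how many task instances can run at once), while others can be set per DAG or per task directly in your code. Here is a minimal sketch of the code-level ones; the specific values, and the "api_calls" pool it references, are illustrative assumptions rather than recommendations:

```python
from pendulum import datetime

from airflow.decorators import dag, task


@dag(
    start_date=datetime(2025, 1, 1),
    schedule="@hourly",
    catchup=False,
    max_active_runs=3,    # cap concurrent runs of this DAG
    max_active_tasks=16,  # cap concurrent task instances across those runs
)
def concurrency_tuned_dag():

    # Assumes a pool named "api_calls" has been created to throttle
    # how many tasks hit the downstream API at once.
    @task(pool="api_calls", retries=2)
    def call_api():
        return "ok"

    call_api()


concurrency_tuned_dag()
```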

What makes this feel hard isn’t the software—it’s the responsibility. Managing that scale on your own means handling Airflow configuration and performance tuning, autoscaling infrastructure used for Airflow workers, provisioning executor infrastructure like Kubernetes or Celery queues, maintaining metadata database performance, handling log storage, monitoring the health of Airflow worker nodes, accounting for disaster recovery, and more. None of this is impossible—but it is a lot to own. For small teams or companies without dedicated platform engineers, that operational complexity can be overwhelming, and it’s where most “Airflow doesn’t scale” stories tend to originate.

That’s why DataOps platforms like Astro exist. With Astro, all of the operational concerns around scaling Airflow are handled for you. You still write and manage your DAGs the same way, but the underlying infrastructure scales up and down automatically based on workload. And when something does go wrong, built-in observability makes it easier to pinpoint issues without needing to dig through raw logs or restart services manually.

So yes, if you're deploying and managing Airflow entirely on your own, scaling can be a challenge—but that’s true of any distributed system. What matters is that Airflow is purpose-built with scalability in mind, and the ecosystem now includes services that make it easier than ever to scale it reliably.

Statement: You can’t process data in Airflow tasks

Verdict: Misleading

“Don’t process data in Airflow” is a statement that has been repeated for years. In fact, I myself (and the rest of Astronomer’s DevRel team) have preached this as a best practice in the past. And to be fair, it made sense—at least historically. Airflow was designed as an orchestrator, not a compute engine. The design philosophy was clear: delegate the heavy lifting, keep tasks lightweight, and let specialized systems handle the actual processing.

But that guidance—while still useful in some cases—no longer tells the full story.

Airflow today is much more capable than it used to be when it comes to processing data, and Astronomer no longer recommends “don’t do it, ever” as a best practice. A few features make this the case:

  • Custom XCom backends, which allow you to pass larger amounts of data between Airflow tasks without overloading your metadata database.
  • Dynamic task mapping, which allows you to break up monolithic tasks into multiple task instances at runtime (see the sketch after this list). While this feature alone doesn’t make processing data within Airflow safe, it can give you better visibility and more control over your resource utilization when dealing with large amounts of data.
  • Remote execution using the Edge executor or the Astro executor, which allows you to distribute tasks to workers in different locations, so you can choose infrastructure appropriate for resource-intensive tasks.
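
To make the dynamic task mapping point above concrete, here is a minimal sketch that fans a processing step out over a list of files determined at runtime; the file names are hard-coded purely so the example is self-contained:

```python
from pendulum import datetime

from airflow.decorators import dag, task


@dag(start_date=datetime(2025, 1, 1), schedule="@daily", catchup=False)
def mapped_processing_dag():

    @task
    def list_files():
        # In a real pipeline this might list objects in cloud storage.
        return ["orders_1.csv", "orders_2.csv", "orders_3.csv"]

    @task
    def process(file_name: str):
        # Each file gets its own task instance, so retries and resource
        # usage are isolated per file.
        print(f"processing {file_name}")

    # expand() creates one mapped task instance per element returned above.
    process.expand(file_name=list_files())


mapped_processing_dag()
```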

In addition, Astro users can leverage worker queues and Remote Execution Agents to further customize the infrastructure used to run particular tasks. So you can run tasks that process more data on larger machines or on purpose-built hardware (like GPUs/TPUs for GenAI workloads), without worrying about the increased cost of running your entire Airflow deployment that way.
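
For instance, routing a task to a particular worker queue is just a parameter on the task. Here is a minimal sketch, where "heavy-compute" is a hypothetical queue name that would need to match a worker queue configured in your Astro deployment (or Celery setup):

```python
from pendulum import datetime

from airflow.decorators import dag, task


@dag(start_date=datetime(2025, 1, 1), schedule="@daily", catchup=False)
def gpu_queue_dag():

    @task
    def prepare():
        return "model-input"

    # The queue parameter sends this task to a specific worker queue,
    # for example one backed by GPU machines. "heavy-compute" is a
    # hypothetical queue name that must exist in your deployment.
    @task(queue="heavy-compute")
    def train(data: str):
        print(f"training on {data}")

    train(prepare())


gpu_queue_dag()
```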

All that said, Airflow still isn’t a Spark or Ray replacement—you probably shouldn’t use it to batch-transform terabytes of raw data in memory. But saying “you can’t process data in Airflow” is no longer accurate. As always, the key is using the right tool for the job. Airflow’s role as the orchestrator hasn’t changed—but its flexibility has improved. And for many teams, that flexibility is what allows them to keep orchestration and lightweight processing tightly integrated within the same platform.

Statement: Airflow lacks native pipeline versioning

Verdict: Fiction

In Airflow’s history, one of the most common and community-requested features was native DAG versioning. Developers wanted a way to track how their pipelines evolved over time, understand what code was responsible for a given DAG run, and confidently make changes without losing visibility into historical behavior.

This makes sense: the lack of DAG versioning caused some significant pain points. Since Airflow only considered the current version of your DAG code, debugging past DAG runs could be a real challenge. For example, if you removed a task from a DAG, it was completely removed from the Airflow UI, and you lost all history of that task. And if you deployed a change to your DAG code while that DAG was running, some tasks could run with the old code and some with the new.

But this has been completely resolved in Airflow 3, which delivered DAG versioning as a first-class feature. No setup is required to use the most basic implementation of this feature, and all structural changes to your DAG code will be tracked in the Airflow UI. You can also configure a Git DAG bundle, which gives you more functionality like rerunning previous versions of the DAG.

With this feature, auditability is greatly improved: no more guessing which code ran last week, or worrying that a minor change might retroactively affect the visibility of past runs. For more on using this feature, check out our DAG versioning guide.

Conclusion

Many of the most frequently cited issues with Airflow’s architecture and performance are outdated. A lot of these critiques were rooted in real challenges from early versions of the project, especially Airflow 1. But over the past several years, the Airflow community has addressed them head-on, delivering a more stable scheduler, support for DAG versioning, and a flexible architecture capable of supporting serious workloads at scale.

Today, whether you’re running Airflow yourself or using a managed platform like Astro, you have the tools to run dynamic, resource-intensive, and high-throughput pipelines with confidence. This post covered the following statements:

| Statement | Verdict | Modern features |
| --- | --- | --- |
| The Airflow scheduler is unreliable | ❌ Fiction | High availability in Airflow 2.0+, Astro managed Airflow infrastructure |
| Airflow is difficult to scale | ⚠️ Misleading | Decoupled and independently scalable core components in Airflow, Hypervisor and auto-scaling in Astro |
| You can’t process data in Airflow tasks | ⚠️ Misleading | Remote execution and Edge executor in Airflow 3+, Remote Execution Agents and task-optimized worker queues on Astro, custom XCom backends |
| Airflow lacks native pipeline versioning | ❌ Fiction | DAG versioning in Airflow 3+ |

By now in this series, we’ve fully or partially debunked many of the common Airflow critiques and misconceptions, focusing heavily on Airflow’s architecture and features. But we haven’t touched on the higher-level question of what use cases Airflow is suited for in the first place. In the final post of this series, we’ll tackle a misconception that may not often be stated publicly, but is still attached to the Airflow narrative: that Airflow is only for batch ETL workflows. We’ll take a closer look at how teams today are using Airflow for machine learning, AI, event-driven orchestration, and more—and how the platform is evolving to support those use cases even better in the future.

Stay tuned.
