Airflow in Action: Orchestrating AI At Duolingo To Unlock 99% Lower Costs, 10x User Growth

In her session at the Airflow Summit, Belle Romea, Software Engineer at Duolingo, describes how the company built DuoFactory, an orchestration ecosystem that powers large-scale generative AI (GenAI) workloads using Apache Airflow®.

The talk explains why Duolingo needed a unified orchestration platform, how DuoFactory is architected, and what it enabled in real production use cases such as Duoradio. Viewers learn practical design patterns for orchestrating LLM pipelines at scale and see the engineering choices behind Duolingo’s AI content ecosystem.

Duolingo, its mission, and the content problem

Duolingo is one of the most widely used learning platforms in the world. It teaches languages through a gamified, highly interactive experience in which learners practice reading, writing, listening, and speaking. Over time, the company has expanded beyond language into subjects like math, music, and chess.

The mission is simple to state and hard to execute: deliver the best education in the world and make it accessible to everyone. The bottleneck is content. Each course contains a curriculum broken down into units, lessons, and associated exercises. Even creating basic sentence-level exercises for a single course required roughly 600 hours of work. Richer modalities, such as listening exercises or story-based content, are significantly more expensive to build.

This created tradeoffs that everyone recognized internally. Duolingo had to focus most of its production resources on flagship courses, and some combinations of language direction never shipped at all. Learners saw inconsistencies across courses and often asked why a language they cared about did not have the same features they saw elsewhere in the app. The challenge was not only how to teach well, but how to expand and maintain that experience across many languages and subjects.

LLMs accelerated creation, then exposed scaling limits

Around two years ago, LLM-generated content began to meet Duolingo’s quality bar. Teams experimented with using language models to generate course material, and the productivity improvements were immediate. Duoradio, a short radio-show style listening feature, is a good illustration. Fully manual production took about a month per episode. With LLMs and human review, Duolingo produced an episode in around six hours and scaled output to hundreds of episodes per year.

However, new challenges emerged:

  • Engineering teams independently stitched LLMs into their production pipelines, each using their own approach.
  • Fragmented tools increased tech debt.
  • Reuse was poor.
  • Engineering effort grew even as content generation costs dropped.

The company needed a common orchestration layer.

Figure 1: Airflow becomes the AI orchestration standard for Duolingo’s engineering teams. Image source.

Why Airflow became the foundation for AI workflows

DuoFactory is Duolingo’s orchestration layer for GenAI workloads. At its core, it is a wrapper around Apache Airflow. Standardizing on Airflow solved a series of hard platform problems at once:

  • Scheduling, dependency management, and visibility across complex AI workflows.
  • Retries and idempotency for safe regeneration.
  • Clean integration with the company’s existing infra, including AWS, Terraform, S3, and internal systems.

Today, Airflow tasks serve as a library of reusable building blocks. Teams share common operators for reading course data, interacting with Google Sheets, running LLM prompts, and writing output for downstream consumption. Because pipelines are modular, engineers can replace parts of the workflow as models improve or prompting strategies change without rebuilding everything from scratch. Idempotency is essential: because each step can be retried safely without corrupting outputs, pipelines can run fully automated instead of as fragile, human-curated jobs.
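To make the idempotency point concrete, here is a minimal sketch of a retry-safe step in plain Python. It is not DuoFactory code: the function names, the file layout, and the local directory standing in for S3 are all assumptions made for illustration. In an Airflow deployment, logic like this would typically live inside a task, with the scheduler handling the retries.

```python
import hashlib
import json
import os
import tempfile


def output_path(base_dir: str, step_name: str, params: dict) -> str:
    """Derive a deterministic file path from the step name and its inputs."""
    digest = hashlib.sha256(
        json.dumps(params, sort_keys=True).encode("utf-8")
    ).hexdigest()[:16]
    return os.path.join(base_dir, f"{step_name}-{digest}.json")


def run_idempotent(base_dir: str, step_name: str, params: dict, produce) -> dict:
    """Run produce(params) at most once per distinct input (hypothetical helper).

    A retry with the same inputs finds the existing file and returns it,
    so repeated attempts neither duplicate nor overwrite earlier output.
    """
    path = output_path(base_dir, step_name, params)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            return json.load(f)

    result = produce(params)

    # Write to a temporary file, then rename it into place. os.replace is
    # atomic on POSIX systems, so a crash mid-write cannot leave a
    # truncated file at the final path.
    fd, tmp_path = tempfile.mkstemp(dir=base_dir, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(result, f)
    os.replace(tmp_path, path)
    return result
```

Keying the output on a hash of the inputs is one common way to get this property; changing a prompt or a model name in `params` yields a new key, so regeneration happens only when something actually changed.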

This is the real value for AI workloads. Airflow provides structure and control around long-running, multi-step, evaluation-heavy pipelines that produce and refine content, instead of simply triggering a single model call.

Duoradio: Quantifying AI results

The Duoradio pipeline illustrates an AI workflow fully orchestrated by Airflow:

  1. Generate multiple script candidates from curriculum data and creative guidelines.
  2. Assess them with LLM-based evaluators.
  3. Select the best content.
  4. Repeat the same generate-and-evaluate loop for exercises.
  5. Format for app ingestion, trigger TTS and lip-sync generation, and publish to S3.
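The generate, evaluate, and select steps above can be sketched in a few lines of plain Python. This is an illustrative outline only, not the Duoradio implementation: `fake_generate` and `length_score` are hypothetical stand-ins for an LLM call and an LLM-based evaluator, and averaging evaluator scores is an assumed selection rule.

```python
from typing import Callable, List, Sequence, Tuple

GenerateFn = Callable[[str, int], str]  # (brief, attempt index) -> candidate text
EvaluateFn = Callable[[str], float]     # candidate text -> score, higher is better


def generate_candidates(generate: GenerateFn, brief: str, n: int) -> List[str]:
    """Step 1: produce n candidates for the same brief."""
    return [generate(brief, i) for i in range(n)]


def select_best(
    candidates: Sequence[str], evaluators: Sequence[EvaluateFn]
) -> Tuple[str, float]:
    """Steps 2 and 3: average the evaluator scores and keep the top candidate."""
    if not candidates or not evaluators:
        raise ValueError("need at least one candidate and one evaluator")
    scored = [
        (sum(ev(text) for ev in evaluators) / len(evaluators), text)
        for text in candidates
    ]
    best_score, best_text = max(scored, key=lambda pair: pair[0])
    return best_text, best_score


def run_stage(
    generate: GenerateFn, evaluators: Sequence[EvaluateFn], brief: str, n: int = 4
) -> Tuple[str, float]:
    """One generate-and-evaluate loop; reusable for scripts and for exercises."""
    return select_best(generate_candidates(generate, brief, n), evaluators)


# Deterministic stand-ins so the sketch runs without a model (assumptions).
def fake_generate(brief: str, attempt: int) -> str:
    return f"{brief}: " + " ".join(["line"] * (attempt + 1))


def length_score(text: str, target_words: int = 4) -> float:
    return 1.0 / (1 + abs(len(text.split()) - target_words))


# Step 4 reuses the same loop, feeding the chosen script into the exercise brief.
script, script_score = run_stage(fake_generate, [length_score], "cafe scene")
exercises, _ = run_stage(fake_generate, [length_score], f"exercises for {script}")
```

Splitting generation from selection mirrors how the stages would map onto separate tasks in a Dag, so either side can be swapped out as models or evaluation criteria change.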

Figure 2: Airflow Dags for script generation and evaluation. Image source.

The results achieved with AI workflows orchestrated by Airflow are material for the business:

  • Human time reduced from months → hours → zero-touch automation.
  • 99% cost reduction per episode.
  • Duoradio launched across 243 courses, totalling over 70,000 episodes.
  • Daily learners listening to Duoradio episodes grew 10x, from 500k → 5M in six months.

Expanding the ecosystem beyond engineers

To broaden adoption within the company, the engineering team has layered tools on top of Airflow:

  • Prompt Editor: versioned prompt management and testing without touching code.
  • Google Sheets inputs: editable Dag and prompt parameters for non-engineers.
  • Content management interface: Airflow API–backed UI for browsing outputs.
  • Workflow Builder: codeless, block-based Dag creation for one-time workflows.

These tools let non-engineers safely orchestrate and iterate on LLM pipelines without needing to know YAML or Python.
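As a rough illustration of what versioned prompt management can look like, the sketch below keeps every saved revision of a prompt and renders any of them with parameters. The `PromptStore` class and its methods are hypothetical and in-memory only; the talk does not describe how the Prompt Editor is implemented.

```python
from string import Template
from typing import Dict, List, Optional


class PromptStore:
    """Hypothetical in-memory store that keeps every revision of a prompt."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[str]] = {}

    def save(self, name: str, template: str) -> int:
        """Append a new revision and return its 1-based version number."""
        revisions = self._versions.setdefault(name, [])
        revisions.append(template)
        return len(revisions)

    def render(
        self, name: str, params: Dict[str, str], version: Optional[int] = None
    ) -> str:
        """Fill in a revision's $placeholders; defaults to the latest revision.

        Template.substitute raises KeyError when a placeholder has no value,
        which catches an incomplete parameter set before any model is called.
        """
        revisions = self._versions[name]
        text = revisions[-1] if version is None else revisions[version - 1]
        return Template(text).substitute(params)


store = PromptStore()
store.save("script", "Write a $level story about $topic.")
store.save("script", "Write a short $level radio script about $topic.")
latest = store.render("script", {"level": "beginner", "topic": "coffee"})
```

Because older revisions stay addressable by number, a pipeline run can pin the version it used, which makes outputs comparable across prompt changes.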

Editor’s note: For users less familiar with Airflow, the Astro IDE enables pipeline authoring using natural language. They can write, test, and release production-ready pipelines from the browser with context-aware AI that is trained on Airflow best practices, zero local setup, and one-click deploys to Astro, the fully managed Airflow service from Astronomer. Coming soon, Astro IDE will also feature low-code templates for Dag authoring for those who don’t want to interact with code at all.

For those comfortable with YAML, Astronomer maintains the open source Dag Factory, dynamically generating Airflow Dags from YAML.

Getting Started with AI and Airflow

Duolingo’s experience shows that LLMs unlock multimodal content generation, but orchestration unlocks scale and repeatability. Airflow provides a reliable, observable, and evolvable backbone for AI workloads, enabling a leap in Duolingo’s output volume and learner reach. To see the full architecture, examples, and demonstrations, watch the session replay, “Creating DuoFactory: An Orchestration Ecosystem with Airflow.”

If you’re building AI-driven applications, download our eBook “Orchestrate LLMs and Agents with Apache Airflow®” for actionable patterns and code examples on how to scale AI pipelines, event-driven inference, multi-agent workflows, and more.

Build, run, & observe your data workflows.
All in one place.

Try Astro today and get up to $20 in free credits during your 14-day trial.