Day 2 Operations for LLMs with Apache Airflow: Going Beyond the Prototype, Part 1

  • Michael Gregory

In late 2023, the world is undergoing a clear shift from an “AI Summer” of endless possibilities with Large Language Models (LLMs) to an “AI Autumn” marked by pragmatism and hard work as prototypes run into the challenges of enterprise adoption. A new crop of LLM application development frameworks is simplifying development, but these frameworks lack the features operational teams have come to expect for building reliable, sustainable, and auditable workflows.

Meanwhile, Apache Airflow is at the core of many of these teams’ technology stacks, and when combined with vector databases, LLMs, and LLM development frameworks, it enables the creation of enterprise-grade workflows that feed a new category of applications delivering real business value.

In this blog, we discuss the challenges of LLM application development beyond the prototype, and why Andreessen Horowitz’s Emerging Architectures for LLM Applications highlighted Apache Airflow for its ability to enable day-2 operations for LLM applications.

LLM Application Paradigms

A large and rapidly growing ecosystem of publicly available LLMs and LLM application development frameworks (e.g. LangChain, LlamaIndex, Unstructured, Haystack) enables organizations to build application prototypes at a fraction of the cost and time it would have taken even a year earlier. Much of the current application development effort can be bucketed into three high-level approaches: prompt engineering against an existing model, retrieval augmented generation (RAG) over private data, and fine-tuning or training a custom model.

This blog focuses on RAG-based LLM development as it allows enterprises to tap into their intellectual property and private data to add unique, value-added context to applications. RAG-based applications do, however, add operational complexity in the form of processes to load (i.e. extract, transform, vectorize, index, and import) and to maintain the unstructured data that forms the basis of this competitive advantage.
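To make those loading stages concrete, a prototype pipeline built with one of these frameworks might look like the following sketch. It uses LangChain with Chroma and OpenAI embeddings purely as an illustration; the document path, chunk sizes, and vector store choice are assumptions rather than a recommended stack.

```python
# A minimal prototype loading pipeline: extract, transform (chunk), vectorize,
# index, and import documents into a vector store. Paths, chunk sizes, and the
# Chroma/OpenAI choices are illustrative assumptions.
from langchain.document_loaders import DirectoryLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma

# Extract: read raw documents from a local folder of markdown files.
raw_docs = DirectoryLoader("./knowledge_base", glob="**/*.md").load()

# Transform: split documents into overlapping chunks sized for retrieval.
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_documents(raw_docs)

# Vectorize, index, and import: embed each chunk and persist it in the vector store.
vector_store = Chroma.from_documents(
    documents=chunks,
    embedding=OpenAIEmbeddings(),  # requires an OpenAI API key in the environment
    persist_directory="./chroma_db",
)
```

A handful of lines like these are enough for a demo, but they say nothing about scheduling, failure handling, or what happens when the source documents change.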

Day-2 Operations

LLM development frameworks, such as LangChain, make it simple to build extensible prototype pipelines for document processing and loading. However, as these applications become business-critical or as the prototype moves to production, additional capabilities are required to provide the reliability, availability, serviceability, scalability, explainability, and auditability expected of enterprise-grade applications.

To summarize, LLM application development frameworks are incredibly useful tools for building LLM prototypes but lack features for day-2 operations.

There’s an app for that…

It may be obvious at this point that these are all challenges the industry has seen before. In reality, document loading for LLM applications is almost identical to the extract, transform, and load (ETL/ELT) processes of other data pipelines and should be treated with the same rigor and engineering discipline. Important differences arise, however, in pipelines for RAG-based applications: document embeddings are “living” artifacts that change over time, models, standards, and techniques evolve rapidly, and feedback loops are central to quality. These factors make it even more important to build operational workflows on an enterprise-grade orchestration framework.
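To make the “living embeddings” point concrete, one common pattern is to fingerprint each document together with the embedding model version and re-embed only when either changes. The sketch below is illustrative; the helper names and metadata layout are assumptions, not any particular library’s API.

```python
import hashlib

EMBEDDING_MODEL_VERSION = "text-embedding-ada-002"  # illustrative model identifier


def document_fingerprint(content: str, model_version: str = EMBEDDING_MODEL_VERSION) -> str:
    """Hash a document's content together with the embedding model version."""
    return hashlib.sha256(f"{model_version}:{content}".encode("utf-8")).hexdigest()


def needs_reembedding(content: str, stored_fingerprint: str | None) -> bool:
    """Re-embed when the document changed or the embedding model was upgraded."""
    return stored_fingerprint != document_fingerprint(content)
```

Tracking this kind of state, and re-running only the affected parts of the pipeline on a schedule, is exactly the job of an orchestrator.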

Leading analysts highlight Apache Airflow for its ability to bring day-2 operations to LLM development frameworks. In addressing the challenges listed above, Apache Airflow provides the scheduling, dependency management, retries, alerting, monitoring, lineage, and backfill capabilities needed to turn prototype pipelines into reliable, auditable production workflows.

Apache Airflow has become the glue that holds the modern data stack together. This is as true for MLOps as for traditional DataOps. As a Python-based tool, Airflow integrates well with all of the most popular LLM development frameworks and enables enterprises to not only prototype LLM applications quickly but also to operationalize them and build production-quality workflows.
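As a rough sketch of what that operationalization can look like, the loading steps from the earlier prototype can be wrapped in an Airflow DAG with a schedule, retries, and per-task isolation. The DAG id, schedule, and retry settings below are illustrative assumptions, and the task bodies are left as stubs.

```python
from datetime import timedelta

import pendulum
from airflow.decorators import dag, task


@dag(
    dag_id="ingest_knowledge_base",  # illustrative name
    schedule="@daily",  # refresh embeddings daily so the index stays current
    start_date=pendulum.datetime(2023, 10, 1, tz="UTC"),
    catchup=False,
    default_args={"retries": 3, "retry_delay": timedelta(minutes=5)},
)
def ingest_knowledge_base():
    @task
    def extract() -> list:
        # Pull raw document text from the source system (wiki, file share, S3, ...).
        ...

    @task
    def chunk(raw_docs: list) -> list:
        # Split documents into retrieval-sized chunks, e.g. with a LangChain text splitter.
        ...

    @task
    def embed_and_load(chunks: list) -> None:
        # Embed each chunk and upsert it into the vector database.
        ...

    embed_and_load(chunk(extract()))


ingest_knowledge_base()
```

Because each step is its own task, Airflow handles retries, alerting, and backfills per step, and the Python inside each task can call LangChain, LlamaIndex, or any other framework directly.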

If you’re interested in getting started with Apache Airflow for MLOps, you can spin up Airflow in less than 5 minutes with a free trial of Astro.

