>
WEBINARS

ML in Production with Airflow

Watch Video On Demand

Hosted By

  • Kenten Danas
  • Santona Tuli

1. What is Machine Learning?

One way of thinking about machine learning is that it replaces a classical programming paradigm, where you would take rules plus data and get answers by combining the two. With machine learning, you invert the process, starting with the known answers plus the data, and discovering the rules. Those rules, in turn, can be used to predict answers that produce value.

2. What isMLOps?

Machine learning operations (or MLOps) can be thought of as getting your machine learning algorithm, which already solves a business need, to deliver its value at scale.

With MLOps, you need to:

Planning how to do each of these steps is what gets your machine learning model to produce its value both sustainably and at scale.

3. Example of an ML Workflow

ml-in-production-image2

4. Lifecycle of an ML Application

ml-in-production-image3

To break down the steps of developing a machine learning application:

You start by clarifying the business requirements — not just the purpose of the model but also how available the model needs to be. Common questions you need to ask: How effective does the model need to be? What are the acceptable ranges of its performance? To what degree do the steps need to be automated?

Next, you assess the available data and begin to develop the pipeline. This part usually takes the longest.

Then you deploy the model, and go on to ensure the model is auditable and that the entire cycle is reproducible.

It’s also essential that your machine learning models be explainable, so that you can understand all the steps leading to the predicted output.

5. How Do We Achieve an ML Application in Production?

There isn’t currently a standard way to productionize your machine learning application — you can achieve the same value in different ways. It’s all about your needs and what standards you have to meet.

Different approaches include:

You can leverage Airflow for all the approaches above. Airflow handles orchestration as part of machine learning operationalization at a level of reliability not always offered by other tools.

6. Typical End-to-End ML Pipeline

ml-in-production-image1

A typical end-to-end machine learning pipeline:

  1. Starts with data ingestion, which usually is the responsibility of data engineering.
  2. Advances to data exploration, feature engineering, and model building — all parts of the more experimental phase, which usually falls under the category of data science.
  3. Move on to MLOps, where the model is trained, retrained, deployed, served, and monitored.

In each of the three steps above, it’s important to work with your business counterparts to define the requirements and ensure that your pipeline adds value for users. Once you have settled on a model, the next step is featurization — going straight from data ingestion to model training, as illustrated in the example pipeline above.

7. Why Airflow?

Airflow has several features and qualities that make constructing and automating machine learning pipelines easier.

Ready to Get Started?

Get Started Free

Try Astro free for 14 days and power your next big data project.