Info
This page has not yet been updated for Airflow 3. The concepts shown are relevant, but some code may need to be updated. If you run any examples, take care to update import statements and watch for any other breaking changes.
Cohere is a natural language processing (NLP) platform that provides an API to access cutting-edge large language models (LLMs). The Cohere Airflow provider offers modules to easily integrate Cohere with Airflow.
In this tutorial, you use Airflow and the Cohere Airflow provider to generate recipe suggestions based on a list of ingredients and countries of recipe origin. Additionally, you create embeddings of the recipes and perform dimensionality reduction using principal component analysis (PCA) to plot recipe similarity in two dimensions.
Cohere provides highly specialized out-of-the-box and custom LLMs. Many applications use these models both for user-facing needs, such as moderating user-generated content, and for internal purposes, such as providing insight into customer support tickets.
Integrating Cohere with Airflow into one end-to-end machine learning pipeline allows you to:
This tutorial takes approximately 15 minutes to complete (cooking your recommended recipe not included).
To get the most out of this tutorial, make sure you have an understanding of:
Create a new Astro project:
Add the following lines to your requirements.txt file to install the Cohere Airflow provider and other supporting packages:
To create an Airflow connection to Cohere, add the following environment variables to your .env file. Make sure to provide <your-cohere-api-key>.
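As a rough illustration of what such a `.env` entry can look like, the sketch below builds one in Python. Airflow reads connections from environment variables named `AIRFLOW_CONN_<CONN_ID>` and accepts a JSON-serialized value. The connection ID `cohere_default`, the connection type `cohere`, and storing the API key in the `password` field are assumptions about the Cohere provider's defaults, and the helper name `cohere_connection_env_line` is hypothetical. Check the provider documentation for your version before relying on these values.

```python
import json


def cohere_connection_env_line(api_key: str, conn_id: str = "cohere_default") -> str:
    """Return a .env line that defines an Airflow connection as JSON.

    Assumptions (not confirmed by this page): the connection type is
    "cohere" and the API key belongs in the "password" field.
    """
    connection = {"conn_type": "cohere", "password": api_key}
    # Airflow looks for AIRFLOW_CONN_<CONN_ID>, with the ID in upper case.
    return f"AIRFLOW_CONN_{conn_id.upper()}='{json.dumps(connection)}'"


# Keep the placeholder in the generated line and fill in the real key by hand.
print(cohere_connection_env_line("<your-cohere-api-key>"))
```

Generating the line this way keeps the JSON quoting consistent, but writing the entry by hand works just as well.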
In your dags folder, create a file called recipe_suggestions.py.
Copy the following code into the file.
This DAG consists of five tasks that form a simple MLOps pipeline:
The get_ingredients task fetches the list of ingredients that the user found in their pantry and wants to use in their recipe. The input pantry_ingredients param is provided by Airflow params when you run the DAG.
The get_countries task uses Airflow params to retrieve the list of user-provided countries to get recipes from.
The get_a_recipe task uses the CohereHook to connect to the Cohere API and use the /generate endpoint to get a tasty recipe suggestion based on the user’s pantry ingredients and one of the countries they provided. This task is dynamically mapped over the list of countries to generate one task instance per country. The recipes are saved as .txt files in the include folder.
The get_embeddings task is defined using the CohereEmbeddingOperator to generate vector embeddings of the recipes generated by the upstream get_a_recipe task. This task is dynamically mapped over the list of recipes to retrieve one set of embeddings per recipe. This pattern allows for efficient parallelization of the vector embedding generation.
The plot_embeddings task takes the embeddings created by the upstream task and performs dimensionality reduction using PCA to plot the embeddings in two dimensions.
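To make the dimensionality reduction in the last task more concrete, here is a minimal, dependency-free sketch of PCA that projects vectors onto their first two principal components. It is not the code from the tutorial's DAG. All function names are hypothetical, and it uses power iteration with deflation purely for illustration on small toy vectors. Real embeddings have hundreds or thousands of dimensions, so a numerical library would normally do this step.

```python
import math
import random


def center(rows):
    """Subtract the per-column mean from every row."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of centered rows; needs >= 2 rows."""
    n, d = len(centered), len(centered[0])
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def top_eigenvector(matrix, iterations=500, seed=0):
    """Power iteration: approximate the dominant eigenvector and eigenvalue."""
    rng = random.Random(seed)
    d = len(matrix)
    v = [rng.random() for _ in range(d)]
    start_norm = math.sqrt(sum(x * x for x in v))
    v = [x / start_norm for x in v]
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance left in any direction
            break
        v = [x / norm for x in w]
    # Rayleigh quotient v^T M v gives the eigenvalue for the unit vector v.
    eigenvalue = sum(
        v[i] * sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)
    )
    return v, eigenvalue


def pca_2d(embeddings):
    """Project each embedding onto the first two principal components."""
    centered = center(embeddings)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(2):
        vec, val = top_eigenvector(cov)
        components.append(vec)
        # Deflation: remove the component just found so the next pass
        # converges to the direction with the next-largest variance.
        cov = [
            [cov[i][j] - val * vec[i] * vec[j] for j in range(d)]
            for i in range(d)
        ]
    return [
        [sum(r[j] * c[j] for j in range(d)) for c in components]
        for r in centered
    ]


# Toy three-dimensional "embeddings" standing in for real recipe embeddings.
toy_embeddings = [[2.0, 0.0, 0.0], [-2.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, -1.0, 0.0]]
for point in pca_2d(toy_embeddings):
    print(point)
```

Each output row is an (x, y) pair that can be passed to a plotting library. Recipes with similar embeddings end up close together in the plot.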
Run astro dev start in your Astro project to start Airflow and open the Airflow UI at localhost:8080.
In the Airflow UI, run the recipe_suggestions DAG by clicking the play button. Then, provide Airflow params for:
Countries of recipe origin: A list of the countries you want to get recipe suggestions from. Make sure to create one line per country and to provide at least two countries.
pantry_ingredients: A list of the ingredients you have in your pantry and want to use in the recipe. Make sure to create one line per ingredient.
type: Select your preferred recipe type.
max_tokens_recipe: The maximum number of tokens available for the recipe.
randomness_of_recipe: The randomness of the recipe. The value provided is divided by 10 and passed to the temperature parameter of the Cohere API. The scale for the param ranges from 0 to 50, with 0 being the most deterministic and 50 being the most random.
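The mapping from randomness_of_recipe to the Cohere temperature is simple arithmetic, sketched below. The function name `randomness_to_temperature` is hypothetical, and the range check is an added safeguard rather than something the DAG is known to perform.

```python
def randomness_to_temperature(randomness_of_recipe: int) -> float:
    """Convert the 0-50 DAG param into a temperature value (0.0-5.0)."""
    if not 0 <= randomness_of_recipe <= 50:
        raise ValueError("randomness_of_recipe must be between 0 and 50")
    # The param value is divided by 10 before it is used as the temperature.
    return randomness_of_recipe / 10


print(randomness_to_temperature(25))  # 2.5
```

A param value of 0 therefore yields a temperature of 0.0 (most deterministic), and 50 yields 5.0 (most random).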
Go to the include folder to view the image file created by the plot_embeddings task. The image should look similar to the one below.

Congratulations! You used Airflow and Cohere to get recipe suggestions based on your pantry items. You can now use Airflow to orchestrate Cohere operations in your own machine learning pipelines.