Configure OpenLineage for a Remote Execution Agent

Airflow 3

This feature is only available for Airflow 3.x Deployments.

OpenLineage collects data lineage and provenance from the Airflow workflows that run on your Remote Execution Agent. Features like Observe and Astro Alerts require OpenLineage to be enabled for your data pipelines.

When you create your Remote Execution Agent, Astro automatically generates a Helm values.yaml file with the OpenLineage configuration pre-filled. To finish setting up OpenLineage, configure an access credential, such as an Astro Deployment API token, to use as your OpenLineage API key. You can add the API key to your Helm values in one of three ways:

  • Method 1 - Configure the API key as plain text. This stores your API key in your values.yaml file as plaintext, which is the simplest but least secure option. It’s appropriate for development or testing environments.
  • Method 2 - Use a pre-created Kubernetes secret. This stores your API key separately from your values.yaml file, which is more secure than plain text and relies only on standard Kubernetes features.
  • Method 3 - Inject your API key with a secrets manager. This approach uses an init container to inject the API key into the Remote Execution Agent component Pods. This example uses the HashiCorp Vault Agent, but you can use your own secrets manager. This option provides enhanced security with the potential for secret rotation.

Prerequisites

Setup

Step 1: Retrieve your OpenLineage namespace and URL

Open the Remote Agent registration

In the Astro UI, go to the Deployment page and choose Agent. Then click Register Remote Agent.

Download the values file

Click Download values.yaml file.

The downloaded Helm values file includes the OpenLineage namespace and URL pre-filled, so you only need to configure the OpenLineage API key.

Step 2: Configure the OpenLineage API key

Method 1 stores an Astro Deployment API token in plain text in your values.yaml file as your OpenLineage API key. The Remote Execution Agent Helm chart uses this value to create a Kubernetes secret named openlineage-api-key-secret, in which the API key is base64-encoded.
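Base64 is an encoding, not encryption, so anyone who can read the Kubernetes secret can recover the API key. The following is a minimal shell sketch of that round trip. The value example-openlineage-key is a made-up placeholder, not a real token format.

```shell
# Hypothetical placeholder value; do not paste a real token into your shell history.
example_key='example-openlineage-key'

# Encode the value the way Kubernetes stores secret data (base64, no trailing newline).
encoded=$(printf '%s' "$example_key" | base64)

# Decoding recovers the original value without any key or password.
decoded=$(printf '%s' "$encoded" | base64 --decode)

echo "encoded: $encoded"
echo "decoded: $decoded"
```

Because decoding needs no credential, restrict read access to the secret the same way you would restrict access to the key itself.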

All Remote Execution Agent components — the worker, Dag processor, and triggerer — use this API key to authenticate with the OpenLineage endpoint.

Update your values file

Add the following OpenLineage configuration to your values.yaml file:

openLineage:
  # Enable OpenLineage integration
  enabled: true

  # Set your OpenLineage API key directly in the values file
  apiKey: "<OPENLINEAGE_API_KEY>"

  # Do NOT set apiKeySecret when using apiKey
  # apiKeySecret: ~

  # Astro OpenLineage URL endpoint (prefilled in the downloaded values.yaml from the Astro UI)
  url: "<OPENLINEAGE_ENDPOINT_URL>"

  # Astro Deployment's namespace (prefilled in the downloaded values.yaml from the Astro UI)
  namespace: "<ASTRO_DEPLOYMENT_NAMESPACE>"
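Before installing, it can help to confirm that no placeholder was left unfilled and that apiKey and apiKeySecret are not both set. The following is a rough sketch of such a check. The check_values function is a hypothetical helper written for illustration and is not part of the Helm chart or the Astro tooling. It uses simple text matching rather than a YAML parser, so treat it as a sanity check only.

```shell
# check_values FILE: report leftover <PLACEHOLDER> tokens and conflicting key settings.
# Hypothetical helper for illustration; it is not shipped with the Helm chart.
check_values() {
  file="$1"
  rc=0

  # Any uncommented line that still contains "<SOMETHING>" was not filled in.
  if grep -v '^[[:space:]]*#' "$file" | grep -q '<[A-Z_]*>'; then
    echo "unreplaced placeholder found in $file"
    rc=1
  fi

  # apiKey and apiKeySecret are alternatives; flag files that set both.
  if grep -q '^[[:space:]]*apiKey:' "$file" &&
     grep -q '^[[:space:]]*apiKeySecret:' "$file"; then
    echo "both apiKey and apiKeySecret are set in $file"
    rc=1
  fi

  return "$rc"
}
```

For example, running check_values values.yaml prints a message and returns a non-zero status if either problem is found, so it can gate the install command in a script.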

Install the Helm chart

Apply the chart using the values.yaml file with the following command:

helm install astro-agent astronomer/astro-remote-execution-agent -f values.yaml
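After the install completes, you can confirm that the chart created the openlineage-api-key-secret secret. The following is a minimal sketch, assuming kubectl is configured for the cluster that runs the agent. The verify_openlineage_secret function and its namespace argument are illustrative assumptions; pass whichever namespace you installed the chart into.

```shell
# verify_openlineage_secret NAMESPACE: succeed only if the secret exists.
# Hypothetical helper; the namespace depends on where you installed the chart.
verify_openlineage_secret() {
  namespace="$1"
  # kubectl exits non-zero when the secret is not found.
  kubectl get secret openlineage-api-key-secret --namespace "$namespace" >/dev/null 2>&1
}
```

For example, verify_openlineage_secret default && echo "secret present" checks the default namespace, which is where the install command above places the release when no namespace is specified.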

Step 3: Set OpenLineage environment variables on the orchestration plane

Open the Deployment

In the Astro UI, open your Deployment.

Open the Environment Variables tab

Go to the Environment Variables tab.

Add the OpenLineage variable

Add the following environment variable:

OPENLINEAGE_DISABLED=False

Apply changes

Save your changes to apply them.

Setting this variable ensures that all required OpenLineage events, including task and Dag run events, are collected from the scheduler, workers, Dag processor, and triggerer components. This provides complete lineage in Observe and Astro Alerts.