Configure DAG sources

Remote Execution Agents require configuration to access your DAG code. This guide covers configuring DAG bundles, which are collections of DAG files and supporting code introduced in Airflow 3.

This feature requires Airflow 3.x Deployments. Configuring multiple DAG bundles in a single Deployment is only supported in Remote Execution mode.

DAG bundle types

Choose between two types of DAG bundles:

  • GitDagBundle: Dags stored in a Git repository (recommended for production)
  • LocalDagBundle: Dags stored in the container image or persistent volume (default)

When to use each bundle type

Use GitDagBundle when:

  • Running production deployments
  • Tracking DAG versions with full rerun capabilities
  • Storing dags in version control systems
  • Managing multiple teams or DAG repositories

Use LocalDagBundle when:

  • Running development or testing environments
  • Building dags into container images
  • Using existing PVC-based DAG management
  • Preferring simpler configuration

See GitDagBundle compared to LocalDagBundle for functional differences.

Configure GitDagBundle

GitDagBundle fetches dags from Git repositories and provides automatic versioning capabilities.

GitDagBundle is recommended for production Remote Execution deployments.

Supported authentication methods

GitDagBundle supports the following authentication methods:

  • Access tokens (personal access tokens, OAuth tokens)
  • SSH keys
  • SSH agent

Choose the method that aligns with your security requirements and infrastructure.

Configure public repository

For public repositories, no authentication configuration is required. Configure only the repository URL and tracking reference:

```yaml
dagBundleConfigList: '[{"name": "public-dags", "classpath": "airflow.providers.git.bundles.git.GitDagBundle", "kwargs": {"repo_url": "https://github.com/your-org/public-dags", "tracking_ref": "main", "subdir": "dags"}}]'
```

Configure private repository

For private repositories, configure both the DAG bundle and an Airflow connection for authentication.

1. Create Git connection

Add an Airflow connection environment variable in values.yaml. The connection name suffix must match the git_conn_id value in your DAG bundle configuration.

```yaml
commonEnv:
  - name: AIRFLOW_CONN_GIT_REPO
    value: >-
      {
        "conn_type": "git",
        "login": "<git-username>",
        "password": "<personal-access-token>",
        "host": "github.com",
        "schema": "https",
        "extra": {
          "repo": "<your-org>/<private-repo>",
          "branch": "main"
        }
      }
```

The connection name AIRFLOW_CONN_GIT_REPO creates a connection with ID git_repo. This ID must match the git_conn_id value in your DAG bundle configuration.
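The naming rule can be illustrated with a short sketch: the `AIRFLOW_CONN_` prefix is stripped and the remaining suffix, lowercased, becomes the connection ID. The helper below is an illustration of this mapping, not Airflow's actual implementation.

```python
# Illustration of how an AIRFLOW_CONN_* environment variable name maps
# to a connection ID: strip the prefix, lowercase the suffix.
ENV_PREFIX = "AIRFLOW_CONN_"

def conn_id_from_env(var_name: str) -> str:
    """Return the connection ID implied by an AIRFLOW_CONN_* variable name."""
    if not var_name.startswith(ENV_PREFIX):
        raise ValueError(f"{var_name} is not an Airflow connection variable")
    return var_name[len(ENV_PREFIX):].lower()

print(conn_id_from_env("AIRFLOW_CONN_GIT_REPO"))  # git_repo
```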

For production environments, store connection credentials in a secrets backend instead of values.yaml.

2. Configure DAG bundle

Configure the DAG bundle with matching git_conn_id:

```yaml
dagBundleConfigList: '[{"name": "private-dags", "classpath": "airflow.providers.git.bundles.git.GitDagBundle", "kwargs": {"tracking_ref": "main", "subdir": "dags", "repo_url": "https://github.com/<your-org>/<private-repo>.git", "git_conn_id": "git_repo"}}]'
```

Note that git_conn_id: "git_repo" matches the connection ID from the AIRFLOW_CONN_GIT_REPO environment variable.
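Because `dagBundleConfigList` is a JSON string embedded in YAML, a malformed value fails only at deploy time. A quick local sanity check (a sketch, not part of the Helm chart; the repository URL here is a placeholder) is to parse the string and confirm the `git_conn_id` matches the expected connection ID:

```python
import json

# Parse the dagBundleConfigList value and confirm it is valid JSON
# with the expected fields before putting it in values.yaml.
bundle_config = (
    '[{"name": "private-dags", '
    '"classpath": "airflow.providers.git.bundles.git.GitDagBundle", '
    '"kwargs": {"tracking_ref": "main", "subdir": "dags", '
    '"repo_url": "https://github.com/your-org/private-dags.git", '
    '"git_conn_id": "git_repo"}}]'
)

bundles = json.loads(bundle_config)
for bundle in bundles:
    assert "name" in bundle and "classpath" in bundle and "kwargs" in bundle
    print(bundle["name"], "->", bundle["kwargs"].get("git_conn_id"))
```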

3. Update Helm release

Apply the configuration:

```shell
helm upgrade astro-agent astronomer/astro-remote-execution-agent -f values.yaml
```

Configure refresh interval

Control how frequently agents check for repository updates using the refresh_interval parameter:

```yaml
dagBundleConfigList: '[{"name": "private-dags", "classpath": "airflow.providers.git.bundles.git.GitDagBundle", "kwargs": {"repo_url": "https://github.com/your-org/private-dags", "tracking_ref": "main", "subdir": "dags", "git_conn_id": "git_repo", "refresh_interval": 300}}]'
```

The default refresh interval is 300 seconds. Reducing this value across many bundles may increase the risk of hitting Git provider rate limits.
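The rate-limit trade-off is simple arithmetic: each bundle triggers one fetch per interval, so the total fetch rate scales with the number of bundles. A back-of-the-envelope estimate (an illustration, not an Airflow API):

```python
# Estimate Git fetch frequency: each bundle is refreshed once per
# refresh_interval, so fetches/hour = bundles * 3600 / interval.
def fetches_per_hour(num_bundles: int, refresh_interval_s: int) -> float:
    return num_bundles * 3600 / refresh_interval_s

print(fetches_per_hour(1, 300))  # 12.0 fetches/hour at the default interval
print(fetches_per_hour(10, 60))  # 600.0 fetches/hour -- may approach provider rate limits
```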

Configure LocalDagBundle

LocalDagBundle reads dags from the local filesystem and is the default DAG bundle type.

DAG storage options

Choose one of two methods to provide dags to agents:

Option 1: Include dags in container image

Build a custom agent image that includes your DAG files. Copy dags into the /dags folder during image build.

Option 2: Mount Persistent Volume Claim

Create a PVC containing your dags and mount it into all agent components (Dag Processor, Worker, and Triggerer) at the same path.

Configure DAG path

LocalDagBundle looks for dags in /dags by default. Specify a different path using the path parameter:

```yaml
dagBundleConfigList: '[{"name": "local-dags", "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle", "kwargs": {"path": "/opt/airflow/dags"}}]'
```
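Before pointing LocalDagBundle at a custom path, it can help to confirm the directory actually contains Python files. The helper below is a local sanity-check sketch, not how the DAG processor itself discovers files:

```python
from pathlib import Path
import tempfile

def list_dag_files(path: str) -> list[str]:
    """Return .py files under a directory -- the kind of files a DAG bundle path should contain."""
    return sorted(p.name for p in Path(path).rglob("*.py"))

# Demo with a temporary directory standing in for /opt/airflow/dags
with tempfile.TemporaryDirectory() as d:
    (Path(d) / "example_dag.py").write_text("# dag file placeholder\n")
    print(list_dag_files(d))  # ['example_dag.py']
```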

Configure with container image

1. Build custom image

Create a Dockerfile extending the base agent image:

```dockerfile
FROM images.astronomer.cloud/baseimages/astro-remote-execution-agent:3.1-3-python-3.12-astro-agent-1.2.0

# Copy dags into the image
COPY dags/ /dags/

# Install additional dependencies if needed
COPY requirements.txt /tmp/requirements.txt
RUN pip install -r /tmp/requirements.txt
```
2. Update values file

Reference your custom image in values.yaml:

```yaml
workers:
  - name: default-worker
    image: your-registry.example.com/custom-agent:1.0.0

dagProcessor:
  image: your-registry.example.com/custom-agent:1.0.0

triggerer:
  image: your-registry.example.com/custom-agent:1.0.0

dagBundleConfigList: '[{"name": "local-dags", "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle", "kwargs": {"path": "/dags"}}]'
```
3. Update Helm release

Apply the configuration:

```shell
helm upgrade astro-agent astronomer/astro-remote-execution-agent -f values.yaml
```

Configure with Persistent Volume Claim

1. Create PVC

Create a PersistentVolumeClaim in your Kubernetes namespace:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dags-pvc
  namespace: <your-namespace>
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 20Gi
  storageClassName: <your-storage-class>
```

Apply the PVC:

```shell
kubectl apply -f pvc.yaml
```
2. Configure volume mounts

Update values.yaml to mount the PVC into all components:

```yaml
workers:
  - name: default-worker
    volumes:
      - name: dags-volume
        persistentVolumeClaim:
          claimName: dags-pvc
    volumeMounts:
      - name: dags-volume
        mountPath: /opt/airflow/dags
        readOnly: true

dagProcessor:
  volumes:
    - name: dags-volume
      persistentVolumeClaim:
        claimName: dags-pvc
  volumeMounts:
    - name: dags-volume
      mountPath: /opt/airflow/dags
      readOnly: true

triggerer:
  volumes:
    - name: dags-volume
      persistentVolumeClaim:
        claimName: dags-pvc
  volumeMounts:
    - name: dags-volume
      mountPath: /opt/airflow/dags
      readOnly: true

dagBundleConfigList: '[{"name": "local-dags", "classpath": "airflow.dag_processing.bundles.local.LocalDagBundle", "kwargs": {"path": "/opt/airflow/dags"}}]'
```
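Since the same volume and mount definition must be repeated for the worker, DAG processor, and triggerer, a short script can generate the shared stanza once and keep the three copies consistent. This is an optional convenience sketch, not part of the Helm chart; the names mirror the example above:

```python
# Build the shared volume/mount stanza once and reuse it for each component.
def dag_volume_config(claim_name: str = "dags-pvc",
                      mount_path: str = "/opt/airflow/dags") -> dict:
    return {
        "volumes": [{
            "name": "dags-volume",
            "persistentVolumeClaim": {"claimName": claim_name},
        }],
        "volumeMounts": [{
            "name": "dags-volume",
            "mountPath": mount_path,
            "readOnly": True,
        }],
    }

values = {
    "workers": [{"name": "default-worker", **dag_volume_config()}],
    "dagProcessor": dag_volume_config(),
    "triggerer": dag_volume_config(),
}
print(values["triggerer"]["volumeMounts"][0]["mountPath"])  # /opt/airflow/dags
```

Dumping `values` with a YAML library would produce the same structure as the hand-written example above.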
3. Update Helm release

Apply the configuration:

```shell
helm upgrade astro-agent astronomer/astro-remote-execution-agent -f values.yaml
```

GitDagBundle compared to LocalDagBundle

Both bundle types support DAG versioning in the Airflow UI, but GitDagBundle provides additional capabilities:

| Scenario | LocalDagBundle | GitDagBundle |
|---|---|---|
| Viewing previous DAG runs | Displays the DAG as it existed at run time | Displays the DAG as it existed at run time |
| Creating new DAG runs | Uses current DAG code | Uses current DAG code |
| Rerunning a whole previous DAG run | Uses current DAG code | Uses the DAG version from the original run time |
| Rerunning individual tasks | Uses the latest version for rerun tasks | Uses task code from the original run time |
| Code changes during a DAG run | Uses current DAG code at task start time | Completes using the bundle version from run start |
| Running backfills | Uses current DAG code | Uses the latest bundle version |
| Version creation | Every structural DAG change creates a new version | Every committed structural change creates a new version |

DAG versioning

Airflow 3 automatically tracks DAG versions when you use DAG bundles. Each DAG run associates with a specific DAG version visible in the Airflow UI.

Key behaviors:

  • New versions are created for structural changes (tasks, dependencies, schedules)
  • The scheduler uses the latest DAG version to create new runs
  • You can view code for any previous DAG version in the UI
  • GitDagBundle allows rerunning tasks with their original code version

See Airflow DAG versioning for detailed information about versioning behavior.

Next steps

After configuring DAG sources: