Deploy DAGs with NFS

You can use an external Network File System (NFS) volume to deploy Dags to an Airflow Deployment on Astro Private Cloud (APC).

Unlike deploying Dags with the Astro CLI, deploying Dags to an NFS volume doesn’t require rebuilding a Docker image or restarting your underlying Airflow service. When a Dag is added to an NFS volume, it automatically appears in the Airflow UI without requiring additional action or causing downtime.

How NFS deploys work

When you configure an NFS volume for a Deployment:

  1. APC validates the NFS location format (SERVER_IP:PATH).
  2. APC creates a Kubernetes PersistentVolume (PV) pointing to your NFS server.
  3. APC creates a PersistentVolumeClaim (PVC) bound to the PV.
  4. The NFS volume is mounted read-only to the scheduler and workers at /usr/local/airflow/dags.
  5. Dags are synced by writing files directly to your NFS server.

NFS Server (/dags)
        │
┌───────┴──────┐
│  Kubernetes  │
│   PV + PVC   │
└───────┬──────┘
        ├──► Scheduler (/usr/local/airflow/dags)
        └──► Workers  (/usr/local/airflow/dags)
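
The PersistentVolume and PersistentVolumeClaim are generated and managed by APC, but they resemble the following sketch (the names, namespace, and storage size here are illustrative, not the exact resources APC creates):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: my-deployment-dags        # illustrative name
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadOnlyMany                # matches the read-only mount in Airflow pods
  nfs:
    server: 10.0.0.1              # SERVER_IP from the NFS location
    path: /dags                   # PATH from the NFS location
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: my-deployment-dags
  namespace: my-deployment-namespace
spec:
  accessModes:
    - ReadOnlyMany
  storageClassName: ""            # skip dynamic provisioning, bind to the PV above
  volumeName: my-deployment-dags
  resources:
    requests:
      storage: 1Gi
```

Because the PV is a cluster-scoped resource, this is also why NFS deploys require cluster-level permissions (see the namespace pools limitation below).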

Implementation considerations

If you configure NFS for a Deployment, you can’t use the Astro CLI or service accounts to deploy Dags to that Deployment. NFS becomes the exclusive deploy mechanism.

Before configuring NFS deploys:

  • Namespace pools limitation: NFS deploys will not work if you use namespace pools and set global.clusterRoles to false. The NFS deploy feature requires creating PersistentVolumes, which are cluster-scoped resources.
  • Dags only: NFS volumes deploy only Dags. To add Python dependencies or system packages, update your requirements.txt and packages.txt files and deploy using the CLI or CI/CD.
  • Airflow version: NFS volumes require Airflow 2.0 or later.
  • Read-only mount: The NFS volume is mounted read-only to Airflow components. Write operations must happen directly on the NFS server.

Prerequisites

  • APC 1.0 or later installed
  • An NFS server accessible from your Kubernetes cluster
  • Network connectivity between cluster nodes and the NFS server
  • Read access configured for UID/GID 50000 on the NFS share
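
On a self-managed Linux NFS server, the read-access prerequisite can be expressed as an export entry. This is a sketch; the path and client subnet are placeholders for your environment:

```
# /etc/exports — illustrative entry; replace the path and subnet with your own.
# 'ro' is sufficient for the cluster: Airflow mounts the share read-only, and
# Dag writes happen on the server side or from a separate read-write client.
/srv/airflow-dags  10.0.0.0/16(ro,no_subtree_check)
```

After editing the exports file, run `exportfs -ra` on the server to reload it.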

Enable NFS volume storage

A System Admin must enable NFS deploys on the platform. Update your values.yaml:

houston:
  config:
    deployments:
      configureDagDeployment: true
      nfsMountDagDeployment: true

Apply the configuration change:

$helm upgrade astronomer astronomer/astronomer \
> -f values.yaml \
> --namespace astronomer

Provision an NFS volume

The exact steps depend on your storage provider. For example, to use AWS EFS:

  1. Create an EFS file system.
  2. Configure security groups to allow NFS traffic (port 2049) from your EKS nodes.
  3. Create an access point or use the root directory.
  4. Note the file system DNS name: fs-xxxxxxxx.efs.region.amazonaws.com.
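
As a sketch, the same EFS steps with the AWS CLI might look like the following (the security group IDs, subnet ID, and creation token are placeholders, not values APC expects):

```shell
# 1. Create the file system; note the FileSystemId (fs-xxxxxxxx) in the output
aws efs create-file-system --creation-token airflow-dags

# 2. Allow NFS (TCP 2049) from the EKS node security group to the
#    security group attached to the EFS mount targets
aws ec2 authorize-security-group-ingress \
  --group-id <efs-mount-target-sg> \
  --protocol tcp --port 2049 \
  --source-group <eks-node-sg>

# 3. Create a mount target in each subnet where your nodes run
aws efs create-mount-target \
  --file-system-id fs-xxxxxxxx \
  --subnet-id <node-subnet-id> \
  --security-groups <efs-mount-target-sg>
```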

Configure NFS for a Deployment

  1. In the APC UI, create a new Deployment or open an existing one.
  2. Go to DAG Deployment in the Deployment settings.
  3. Select NFS Volume Mount as the mechanism.
  4. Enter the NFS location in SERVER:PATH format, where SERVER is an IP address or DNS name:
    • AWS EFS: 10.0.0.1:/
    • GCP Filestore: 10.0.0.1:/dags
    • Azure Files: storage-account.file.core.windows.net:/storage-account/share-name
  5. Click Save or Deploy Changes.

Deploy Dags to NFS volume

Once configured, deploy Dags by copying files to your NFS server. The method depends on your infrastructure:

Direct copy

$# Copy DAGs to NFS mount point
$cp -r dags/* /mnt/nfs/dags/
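
For repeatable deploys, the copy can be wrapped in a small helper that syntax-checks Dag files before they reach the shared volume. This is an illustrative sketch, not part of APC or the Astro CLI; the function name and paths are hypothetical:

```shell
# deploy_dags SRC DST — illustrative helper.
# Compiles each Dag file first so broken Python never lands on the NFS share,
# then copies the tree to the mount point.
deploy_dags() {
  src="$1"
  dst="$2"
  for f in "$src"/*.py; do
    # py_compile exits non-zero on a syntax error, aborting the deploy
    python3 -m py_compile "$f" || return 1
  done
  mkdir -p "$dst"
  cp -r "$src"/. "$dst"/
}
```

Run it as `deploy_dags ./dags /mnt/nfs/dags` from a host that mounts the NFS share read-write. If you also need deletions propagated, swap the `cp` for `rsync -av --delete`.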

Using kubectl

If you have a pod with NFS access:

$kubectl cp dags/ deployment-namespace/nfs-sync-pod:/dags/

CI/CD integration

Example GitHub Actions workflow:

name: Deploy DAGs to NFS
on:
  push:
    branches: [main]
    paths: ['dags/**']

jobs:
  deploy:
    runs-on: self-hosted  # Runner with NFS access
    steps:
      - uses: actions/checkout@v4
      - name: Sync DAGs
        run: |
          rsync -av --delete dags/ /mnt/nfs/airflow-dags/

Sync from cloud storage

For cloud-native workflows, sync from object storage:

$# AWS S3 to EFS
$aws s3 sync s3://my-bucket/dags/ /mnt/efs/dags/
$
$# GCS to Filestore
$gsutil -m rsync -r gs://my-bucket/dags/ /mnt/filestore/dags/
$
$# Azure Blob to Azure Files
$azcopy sync "https://account.blob.core.windows.net/dags" "/mnt/azure/dags"

Verify NFS configuration

Check that the PV and PVC were created:

$# List PersistentVolumes
$kubectl get pv | grep dags
$
$# List PersistentVolumeClaims in the deployment namespace
$kubectl get pvc -n <deployment-namespace> | grep dags

Verify the volume is mounted in Airflow pods:

$kubectl exec -n <deployment-namespace> <scheduler-pod> -- \
> ls -la /usr/local/airflow/dags

Troubleshooting

Dags aren’t appearing

  1. Check NFS connectivity:

    $kubectl exec -n <namespace> <pod> -- \
    > showmount -e <nfs-server-ip>
  2. Verify mount permissions:

    $kubectl exec -n <namespace> <scheduler-pod> -- \
    > ls -la /usr/local/airflow/dags
  3. Check PV/PVC status:

    $kubectl describe pv <deployment>-dags-<hash>
    $kubectl describe pvc -n <namespace> <deployment>-dags-<hash>

Permission denied errors

Ensure your NFS export allows access for UID/GID 50000:

$# On NFS server
$chown -R 50000:50000 /srv/airflow-dags
$chmod -R 755 /srv/airflow-dags

Stale file handle

If pods report stale NFS handles after server restart:

$# Restart affected pods
$kubectl rollout restart deployment -n <namespace> <scheduler>
$kubectl rollout restart statefulset -n <namespace> <workers>

Network connectivity issues

Verify NFS port (2049) is accessible:

$kubectl run nfs-test --rm -it --image=busybox -- \
> nc -zv <nfs-server-ip> 2049

Security considerations

  • Network isolation: Use network policies to restrict which pods can access the NFS server.
  • Access control: Configure NFS exports to allow only Kubernetes node IPs.
  • Read-only mounts: APC mounts NFS volumes read-only to prevent accidental modifications from Airflow.
  • Audit logging: Enable NFS server audit logging for compliance requirements.
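
The network-isolation point can be sketched as a Kubernetes NetworkPolicy that allows egress to the NFS server only on port 2049. The name, namespace, and CIDR are illustrative, and note that an egress policy is deny-by-default for the pods it selects, so a real Deployment also needs egress rules for DNS, the metadata database, and other dependencies:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-nfs-egress           # illustrative
  namespace: my-deployment-namespace
spec:
  podSelector: {}                  # applies to all pods in the namespace
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 10.0.0.1/32      # your NFS server
      ports:
        - protocol: TCP
          port: 2049
```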

Alternative: Git-sync deploys

If NFS infrastructure isn’t available, consider git-sync deploys, which pull Dags from a Git repository. Git-sync provides:

  • Version control for Dags.
  • No external storage infrastructure required.
  • Webhook-based or polling synchronization.
  • Branch-based deployment strategies.