Launch a Pod in a GKE cluster on GCP

If some of your tasks require specific resources such as a GPU, you might want to run them in a different cluster than your Airflow instance. In setups where both clusters belong to the same Google Cloud project, you can manage separate clusters with roles and permissions.

This document shows how to configure a Google Kubernetes Engine (GKE) cluster on Google Cloud and run a Pod on it from an Airflow instance where cross-project access isn’t available.

To launch Pods in external clusters from a local Airflow environment, you must have valid authentication for the external cluster. For managed Kubernetes services from public cloud providers, authentication is federated through the provider's native IAM service. To authorize the Astro role to launch Pods on your cluster, you can either include static credentials or use workload identity.

Prerequisites

  • Network connectivity between your Airflow execution environment and the external Kubernetes cluster:
    • Hosted execution mode: A network connection between your Astro Deployment and the external cluster.
    • Remote execution mode: Network connectivity between the environment where your Remote Execution Agent runs and the external cluster. You are responsible for managing this connectivity. A direct network connection between Astro and the external cluster isn’t required.
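One quick way to check the connectivity prerequisite is to attempt a TCP connection to the cluster's API endpoint from the environment that runs your tasks. The following is a minimal sketch, not an official Astro tool. The helper name `can_reach` is hypothetical, and the default port 443 assumes the standard HTTPS port for the Kubernetes API server.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host and opens a TCP socket.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

For example, `can_reach("<cluster-endpoint>")` run from a task or from the Remote Execution Agent host indicates whether the endpoint is reachable at the network level. It does not verify authentication or authorization.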

Setup

1

Set up your external GKE cluster

Follow Google Cloud’s documentation to prepare a GKE cluster that your Astro Deployment can authenticate to:

  1. Create a GKE cluster if you don’t already have one.

  2. Authorize your Astro Deployment to Google Cloud by following the Deployment workload identity setup.

  3. Grant the service account IAM and Kubernetes RBAC permissions in the namespace where your KubernetesPodOperator tasks run.

    At a minimum, provision the following permissions for your service account in your specified namespace:

    • container.clusters.get
    • container.events.list
    • container.pods.get
    • container.pods.getLogs
    • container.pods.list
    • container.pods.create
    • container.pods.delete
    • container.pods.update

    If your Dag uses do_xcom_push=True, also grant the container.pods.exec permission.
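The permission list above can be captured in a small check, which can be handy when you review a custom IAM role. This is a sketch under stated assumptions: the helper names `required_permissions` and `missing_permissions` are hypothetical, and you supply the granted permissions yourself (for example, copied from the role definition).

```python
from __future__ import annotations

# Minimum permissions listed in this step.
BASE_PERMISSIONS = frozenset(
    {
        "container.clusters.get",
        "container.events.list",
        "container.pods.get",
        "container.pods.getLogs",
        "container.pods.list",
        "container.pods.create",
        "container.pods.delete",
        "container.pods.update",
    }
)

# Additional permission needed when a task sets do_xcom_push=True.
XCOM_PERMISSION = "container.pods.exec"


def required_permissions(do_xcom_push: bool = False) -> frozenset[str]:
    """Return the permissions this guide asks for, given the XCom setting."""
    if do_xcom_push:
        return BASE_PERMISSIONS | {XCOM_PERMISSION}
    return BASE_PERMISSIONS


def missing_permissions(granted, do_xcom_push: bool = False) -> list[str]:
    """Return the required permissions that are absent from `granted`, sorted."""
    return sorted(required_permissions(do_xcom_push) - set(granted))
```

An empty list from `missing_permissions(...)` means the role covers the minimum set described here. It does not validate Kubernetes RBAC bindings in the namespace.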

2

Install dependencies in your Astro Runtime Docker image

To connect to your external GKE cluster, the gcloud CLI and the gke-gcloud-auth-plugin must be available inside your Astro Runtime image.

Add the following to your Dockerfile:

USER root

RUN apt-get update && apt-get install -y apt-transport-https ca-certificates gnupg curl \
    && curl -sL https://packages.cloud.google.com/apt/doc/apt-key.gpg \
    | gpg --dearmor -o /usr/share/keyrings/cloud.google.gpg \
    && echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" \
    > /etc/apt/sources.list.d/google-cloud-sdk.list \
    && apt-get update \
    && apt-get install -y google-cloud-cli google-cloud-cli-gke-gcloud-auth-plugin \
    && rm -rf /var/lib/apt/lists/*

USER astro

For production deployments, consider pinning the google-cloud-cli-gke-gcloud-auth-plugin version for build reproducibility, or using a multi-stage build with the google/cloud-sdk:slim image to copy only the plugin binary into your final image and reduce its size.

Add the following line to your requirements.txt to include the CNCF Kubernetes provider:

apache-airflow-providers-cncf-kubernetes

3

Configure your kubeconfig file

The following sample Kubernetes kubeconfig file allows the Kubernetes command-line tool, kubectl, or other clients to connect to a remote Kubernetes cluster using Google Cloud workload identity for authentication.

# Specifies the version of the Kubernetes API for this configuration file.
# v1 is the standard version used for kubeconfig files.
apiVersion: v1
# List of Kubernetes clusters that the configuration can connect to.
clusters:
- cluster:
    # base64-encoded certificate for the Kubernetes API server to verify SSL communication.
    certificate-authority-data: <base64-public-certificate>
    # Endpoint of the remote cluster you want to interact with.
    server: https://<cluster-endpoint>
  # Name of the cluster, which is referenced in the contexts section.
  name: <GKE cluster>
# List of contexts that define which cluster and user combination to use when interacting with Kubernetes.
contexts:
# Describes the context for connecting to the cluster.
- context:
    # References the cluster from the clusters section.
    cluster: <GKE cluster>
    # Associates the user configuration to be used for authentication with the cluster.
    user: <user>
  # The name of the context, which is referenced by current-context.
  name: <GKE cluster>
# Specifies the active context that will be used by default when running kubectl commands.
current-context: <GKE cluster>
# Identifies the file type as a Kubernetes Config.
kind: Config
preferences: {}
# List of users and the method they use for authentication.
users:
# Defines the user that is being used in the context.
# This user is responsible for authenticating with the Kubernetes cluster.
- name: <user>
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1beta1
      command: gke-gcloud-auth-plugin
      provideClusterInfo: true

Get the values for certificate-authority-data and <cluster-endpoint> from the GKE cluster details page or with the Google Cloud SDK.
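If you prefer to script this step, the following sketch reads both values with the gcloud CLI and fills in the sample kubeconfig. It assumes an authenticated gcloud CLI on your machine and that the JSON output of `gcloud container clusters describe` exposes the `endpoint` and `masterAuth.clusterCaCertificate` fields. The function names and arguments are hypothetical, and clusters that are reached over a private endpoint might need a different address.

```python
from __future__ import annotations

import json
import subprocess


def describe_cluster(name: str, location: str, project: str) -> dict:
    """Return the cluster description as parsed JSON (needs an authenticated gcloud CLI)."""
    result = subprocess.run(
        [
            "gcloud", "container", "clusters", "describe", name,
            f"--location={location}",
            f"--project={project}",
            "--format=json",
        ],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def build_kubeconfig(cluster: dict, cluster_name: str, user: str) -> dict:
    """Fill the sample kubeconfig with values from the cluster description."""
    return {
        "apiVersion": "v1",
        "kind": "Config",
        "preferences": {},
        "current-context": cluster_name,
        "clusters": [
            {
                "name": cluster_name,
                "cluster": {
                    # Already base64-encoded in the gcloud output.
                    "certificate-authority-data": cluster["masterAuth"]["clusterCaCertificate"],
                    "server": f"https://{cluster['endpoint']}",
                },
            }
        ],
        "contexts": [
            {"name": cluster_name, "context": {"cluster": cluster_name, "user": user}}
        ],
        "users": [
            {
                "name": user,
                "user": {
                    "exec": {
                        "apiVersion": "client.authentication.k8s.io/v1beta1",
                        "command": "gke-gcloud-auth-plugin",
                        "provideClusterInfo": True,
                    }
                },
            }
        ],
    }
```

Because JSON is a subset of YAML, the output of `json.dumps(build_kubeconfig(...))` can be saved directly as a kubeconfig file.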

4

Create an Airflow connection to use the kubeconfig file

To use the kubeconfig file, create a new Kubernetes Airflow connection.

There are multiple ways to pass the kubeconfig file to your Airflow connection. If your kubeconfig file contains any sensitive information, Astronomer recommends storing it as JSON inside the connection, as described in the JSON format tab.
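As one illustration of the JSON approach, the sketch below serializes a kubeconfig dictionary into a JSON connection definition. It assumes that your version of the CNCF Kubernetes provider reads the kubeconfig from an extra field named `kube_config` and that your Airflow version accepts JSON-formatted connections, so verify both against the versions you run. The function name is hypothetical.

```python
import json


def kubernetes_connection_json(kubeconfig: dict) -> str:
    """Serialize a kubeconfig dict into a JSON Airflow connection definition."""
    connection = {
        "conn_type": "kubernetes",
        "extra": {
            # Assumption: the provider expects the kubeconfig as a JSON string
            # in the `kube_config` extra field.
            "kube_config": json.dumps(kubeconfig),
        },
    }
    return json.dumps(connection)
```

The resulting string could be stored in a secrets backend or in an environment variable such as `AIRFLOW_CONN_MY_GKE_CONNECTION` (a hypothetical connection ID), which keeps the kubeconfig out of your project files.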

5

Configure your task

In your KubernetesPodOperator task, set kubernetes_conn_id to the connection you created, namespace to the namespace in your GKE cluster where the Pod should run, and in_cluster=False so that the operator uses the connection’s kubeconfig instead of looking for an in-cluster service account.

from airflow.decorators import dag
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator
from pendulum import datetime


@dag(
    dag_id="remote_kpo",
    start_date=datetime(2024, 1, 1),
    schedule=None,
    catchup=False,
    tags=["kubernetes", "gke"],
)
def remote_kpo():
    KubernetesPodOperator(
        task_id="run_on_gke",
        kubernetes_conn_id="<my-gke-connection>",
        namespace="<my-namespace>",
        image="ubuntu:latest",
        cmds=["echo"],
        arguments=["External KPO is working!"],
        name="example-pod",
        get_logs=True,
        in_cluster=False,
    )


remote_kpo()