Configure tasks to run with the Kubernetes executor
The Kubernetes executor runs each Airflow task in a dedicated Kubernetes Pod. On Astro, you can customize these Pods on a per-task basis using a `pod_override` configuration. If a task doesn’t contain a `pod_override` configuration, it runs using the default Pod as configured in your Deployment resource settings.
Prerequisites
- An Astro Deployment using Astro Runtime version 8.1.0 or later.
Deployments on Astro don’t support customizing worker Pods through airflow_local_settings.py and will fail to start up new Pods if this file is included in your Astro project.

Customize a task’s Kubernetes Pod
For each task running with the Kubernetes executor, you can customize its individual worker Pod and override the defaults used in Astro by configuring a `pod_override` file.
- Add the following import to your dag file (shown after this list):
- Add a `pod_override` configuration to the dag file containing the task. See the kubernetes-client GitHub repository for a list of all possible settings you can include in the configuration.
- Specify the `pod_override` in the task’s parameters.
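The import referenced in the first item is typically the Kubernetes Python client’s models module, which provides the `V1Pod` classes used in a `pod_override` configuration (the `k8s` alias is a common convention, not a requirement):

```python
from kubernetes.client import models as k8s
```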
See the following example of a `pod_override` configuration.
Example: Set CPU or memory limits and requests
You can request a specific amount of resources for a Kubernetes worker Pod so that a task always has enough resources to run successfully. When requesting resources, make sure that your requests don’t exceed the resource limits in your Deployment’s max pod size.
The following example shows how you can use a `pod_override` configuration in your dag code to request custom resources for a task:
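This sketch assumes a `PythonOperator` task; the dag ID, task ID, and callable names are illustrative placeholders rather than Astro defaults:

```python
import pendulum
from airflow import DAG
from airflow.operators.python import PythonOperator
from kubernetes.client import models as k8s

# Request and limit 0.5 CPUs and 1024Mi of memory for this task's worker Pod.
custom_resources = k8s.V1ResourceRequirements(
    requests={"cpu": "0.5", "memory": "1024Mi"},
    limits={"cpu": "0.5", "memory": "1024Mi"},
)


def resource_intensive_work():
    print("Running in a worker Pod with custom resource requests and limits")


with DAG(
    dag_id="example_pod_override_resources",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    schedule=None,
):
    PythonOperator(
        task_id="resource_intensive_task",
        python_callable=resource_intensive_work,
        executor_config={
            "pod_override": k8s.V1Pod(
                spec=k8s.V1PodSpec(
                    containers=[
                        # "base" is the name of the worker container in the default Pod template.
                        k8s.V1Container(name="base", resources=custom_resources)
                    ]
                )
            )
        },
    )
```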
When this dag runs, it launches a Kubernetes Pod with exactly 0.5 CPUs and 1024Mi of memory, as long as that infrastructure is available in your Deployment. After the task finishes, the Pod terminates gracefully.
For Astro environments, if you set resource requests to be less than the maximum limit, Astro automatically requests the maximum limit that you set. This means that you might consume more resources than you expected if you set the limit much higher than the resource request you need. Check your Billing and usage to view your resource use and associated charges.
Use secret environment variables in worker Pods
On Astro Deployments, secret environment variable values are stored in a Kubernetes secret called `env-secrets`. These environment variables are available to your worker Pods, and you can access them in your tasks just like any other environment variable. For example, you can use `os.environ["<your-secret-env-var-key>"]` or `os.getenv("<your-secret-env-var-key>", None)` in your dag code to access the variable value.
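For example, a task’s callable might read a secret value like this (`MY_SECRET` is a placeholder for your own variable key):

```python
import os


def read_secret():
    # Read a secret environment variable set in your Deployment.
    secret_value = os.getenv("MY_SECRET", None)
    print("MY_SECRET is set:", secret_value is not None)
```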
However, if you can’t use Python, or you’re working with predefined code that expects specific environment variable keys, you must pull the secret value from `env-secrets` and mount it to the Pod running your task as a new Kubernetes Secret.
- Add the following import to your dag file (shown in the snippet after this list):
- Define a Kubernetes `Secret` in your dag instantiation using the format shown in the snippet after this list.
- Specify the `Secret` in the `secret_key_ref` section of your `pod_override` configuration, as shown in the full example at the end of this section.
- In the task where you want to use the secret value, add a task-level argument that passes the environment variable name to the task. See the full example at the end of this section.
- In the executable for the task, call the secret value using `os.environ[env_name]`.
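The following snippet sketches the import and the `Secret` definition referenced above, assuming the `Secret` helper from the Airflow Kubernetes provider; `<VARIABLE_KEY>` is a placeholder for your secret environment variable’s key:

```python
from airflow.providers.cncf.kubernetes.secret import Secret

# Pull the value of <VARIABLE_KEY> from the env-secrets Kubernetes secret and
# expose it to the worker Pod as an environment variable with the same name.
MY_K8S_SECRET = Secret(
    deploy_type="env",
    deploy_target="<VARIABLE_KEY>",
    secret="env-secrets",
    key="<VARIABLE_KEY>",
)
```

The task-level argument and the `secret_key_ref` section appear in context in the full example below, which passes the environment variable name to the task’s callable through `op_kwargs`.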
In the following example, a secret named `MY_SECRET` is pulled from `env-secrets` and printed to logs.
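A runnable sketch of that example follows, assuming a `PythonOperator` task; the dag ID and task ID are illustrative, while `env-secrets` and the worker container name `base` come from Astro’s defaults:

```python
import os

import pendulum
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.cncf.kubernetes.secret import Secret
from kubernetes.client import models as k8s

# The Secret records which key to pull from env-secrets; its deploy_target is
# reused below so the variable key only needs to be defined in one place.
MY_K8S_SECRET = Secret(
    deploy_type="env",
    deploy_target="MY_SECRET",
    secret="env-secrets",
    key="MY_SECRET",
)


def print_env(env_name):
    # The secret value is available as a regular environment variable in the Pod.
    print(os.environ[env_name])


with DAG(
    dag_id="example_secret_env_var",
    start_date=pendulum.datetime(2023, 1, 1, tz="UTC"),
    schedule=None,
):
    PythonOperator(
        task_id="print_env_secret",
        python_callable=print_env,
        # Task-level argument: pass the environment variable name to the callable.
        op_kwargs={"env_name": MY_K8S_SECRET.deploy_target},
        executor_config={
            "pod_override": k8s.V1Pod(
                spec=k8s.V1PodSpec(
                    containers=[
                        k8s.V1Container(
                            name="base",
                            env=[
                                # Mount the secret value into the worker Pod as MY_SECRET.
                                k8s.V1EnvVar(
                                    name="MY_SECRET",
                                    value_from=k8s.V1EnvVarSource(
                                        secret_key_ref=k8s.V1SecretKeySelector(
                                            name="env-secrets", key="MY_SECRET"
                                        )
                                    ),
                                )
                            ],
                        )
                    ]
                )
            )
        },
    )
```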