Run images from Amazon Elastic Container Registry (ECR)

Policy-based setup is available only on Astro dedicated clusters. To run images from a private registry on Astro standard clusters, follow the steps in Private Registry.

By default, the KubernetesPodOperator expects to pull container images that are hosted publicly. If your images are hosted on the container registry native to your cloud provider, you can grant access to the images directly.

Prerequisites

Setup

1

Add Amazon ECR repository permissions

If your Docker image is hosted in an Amazon ECR repository, add a permissions policy to the repository so that the KubernetesPodOperator can pull the image. You don’t need to create a Kubernetes secret or reference one in your dag. Docker images hosted in Amazon ECR repositories can only be pulled from AWS clusters.

  1. Log in to the Amazon ECR Dashboard and then select Menu > Repositories.

  2. Click the Private tab and then click the name of the repository that hosts the Docker image.

  3. Click Permissions in the left menu.

  4. Click Edit policy JSON.

  5. Copy and paste the following policy into the Edit JSON pane:

    {
      "Version": "2008-10-17",
      "Statement": [
        {
          "Sid": "AllowImagePullAstro",
          "Effect": "Allow",
          "Principal": {
            "AWS": "arn:aws:iam::<AstroAccountID>:role/EKS-NodeInstanceRole-<ClusterID>"
          },
          "Action": [
            "ecr:GetDownloadUrlForLayer",
            "ecr:BatchGetImage"
          ]
        }
      ]
    }
    • Replace <AstroAccountID> with your Astro AWS account ID.
    • Replace <ClusterID> with your cluster ID. To find it, click Organization Settings in the Astro UI, click Clusters in the Organization section, and then select the cluster. The ID is displayed at the top, along with other information about the cluster.
  6. Click Save to create a new permissions policy named AllowImagePullAstro.
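If you prefer to generate the policy text rather than edit the placeholders by hand, the following is a minimal sketch that fills in the two values and prints JSON you can paste into the Edit JSON pane. The function name `build_ecr_pull_policy` and the sample IDs are hypothetical and used only for illustration; the policy body mirrors the one shown above.

```python
import json


def build_ecr_pull_policy(astro_account_id: str, cluster_id: str) -> str:
    """Render the AllowImagePullAstro repository policy as a JSON string."""
    # Principal ARN format taken from the policy in the steps above.
    principal = (
        f"arn:aws:iam::{astro_account_id}:role/EKS-NodeInstanceRole-{cluster_id}"
    )
    policy = {
        "Version": "2008-10-17",
        "Statement": [
            {
                "Sid": "AllowImagePullAstro",
                "Effect": "Allow",
                "Principal": {"AWS": principal},
                "Action": [
                    "ecr:GetDownloadUrlForLayer",
                    "ecr:BatchGetImage",
                ],
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    # Placeholder values for illustration only; substitute your own IDs.
    print(build_ecr_pull_policy("123456789012", "example-cluster-id"))
```

Generating the text this way only removes the risk of a typo in the principal ARN; you still save the policy through the Amazon ECR Dashboard as described in the steps.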

2

Set up the KubernetesPodOperator

The following snippet is the minimum configuration you’ll need to create a KubernetesPodOperator task on Astro:

from airflow.configuration import conf
# In newer releases of the cncf.kubernetes provider, the operator is imported
# from airflow.providers.cncf.kubernetes.operators.pod instead.
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import KubernetesPodOperator

# Read the Deployment's Kubernetes namespace from the Airflow configuration.
namespace = conf.get("kubernetes", "NAMESPACE")

KubernetesPodOperator(
    namespace=namespace,
    image="<your-docker-image>",
    cmds=["<commands-for-image>"],
    arguments=["<arguments-for-image>"],
    labels={"<pod-label>": "<label-name>"},
    name="<pod-name>",
    task_id="<task-name>",
    get_logs=True,
    in_cluster=True,  # run the Pod in the Deployment's own cluster
)

For each instantiation of the KubernetesPodOperator, you must specify the following values:

  • namespace = conf.get("kubernetes", "NAMESPACE"): Every Deployment runs in its own Kubernetes namespace within a cluster. Reading this value from the Airflow configuration lets the task pick up the Deployment's namespace programmatically instead of hard-coding it.
  • image: This is the Docker image that the operator will use to run its defined task, commands, and arguments. Astro assumes that this value is an image tag that’s publicly available on Docker Hub. To pull an image from a private registry, see Pull images from a Private Registry.
  • in_cluster: If a Connection object is not passed to the KubernetesPodOperator’s kubernetes_conn_id parameter, specify in_cluster=True to run the task in the Deployment’s Astro cluster.

3

Add Amazon ECR repository URI

Replace <your-docker-image> in the instantiation of the KubernetesPodOperator with the Amazon ECR repository URI that hosts the Docker image. To locate the URI:

  • In the Amazon ECR Dashboard, click Repositories in the left menu.
  • Open the Private tab and then copy the URI of the repository that hosts the Docker image.
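As a quick sanity check on the value you copy, private Amazon ECR repository URIs in the standard AWS partition generally take the form `<account-id>.dkr.ecr.<region>.amazonaws.com/<repository>`. The sketch below assembles such a URI and optionally appends an image tag. The helper name `ecr_image_reference` and all sample values are hypothetical; the tag suffix is an assumption about how you version your images and is not required by the steps above.

```python
from typing import Optional


def ecr_image_reference(
    account_id: str,
    region: str,
    repository: str,
    tag: Optional[str] = None,
) -> str:
    """Build a private ECR repository URI, optionally suffixed with a tag."""
    uri = f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}"
    # Without a tag, return the bare repository URI as shown in the console.
    return f"{uri}:{tag}" if tag else uri


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(ecr_image_reference("123456789012", "us-east-1", "my-team/my-image"))
    print(ecr_image_reference("123456789012", "us-east-1", "my-team/my-image", "1.0"))
```

The returned string is what you would pass as the `image` argument of the KubernetesPodOperator in place of `<your-docker-image>`.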