Using Airflow with Multiple AWS Accounts

Hosted By

  • Tony Huinker
  • Viraj Parekh

In this webinar video we cover:

Why Would We Need Multiple AWS Accounts?

In AWS, organizations commonly use multiple accounts for a range of reasons, from separate Dev, Stage, and Prod accounts to accounts dedicated to individual lines of business (LOBs). What do you do when your data pipeline needs to span AWS accounts? This webinar shows how to run a single DAG across multiple AWS accounts securely.

DAG Overview

  1. Astronomer Airflow running on an EKS cluster in the AWS account for shared services (referred to as “AWS Account 3”)
  2. EMR job running in the AWS account dedicated to raw data processing (“AWS Account 1”)
  3. Athena query run in the AWS account for data queries (“AWS Account 2”)
  4. AWS permissions granted to Airflow using an IAM cross-account role, so no access keys/secret access keys are needed! (The same setup can also be completed using an IAM user access key/secret access key if preferred.)

Why are companies writing DAGs that span multiple AWS Accounts?

airflow-aws-2

How to write DAGs that span multiple AWS Accounts

  1. In both AWS-Account-1 and AWS-Account-2
    • Create a cross-account role that trusts the account ID of the DS Shared Account (AWS-Account-3, where Airflow runs)
    • Attach an IAM policy to the role granting the appropriate permissions
    • Make a note of the role ARN
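
The trust relationship on each cross-account role can be sketched as a policy document. This is a minimal sketch rather than the exact configuration from the webinar: `build_trust_policy` is a hypothetical helper, and the account ID `333333333333` is a placeholder standing in for AWS-Account-3.

```python
import json


def build_trust_policy(airflow_account_id: str) -> dict:
    """Return a trust policy that lets principals in the Airflow account assume the role.

    Hypothetical helper for illustration only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Trusting the account root delegates the decision to IAM policies
                # in that account; a specific role ARN could be used to narrow it.
                "Principal": {"AWS": f"arn:aws:iam::{airflow_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder account ID standing in for AWS-Account-3.
trust_policy = build_trust_policy("333333333333")
print(json.dumps(trust_policy, indent=2))
```

The permissions policy attached to the role (for EMR in one account, Athena in the other) is separate from this trust policy and depends on what the tasks need to do.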

  1. In AWS-Account-3 (the Airflow account)
    • Find the IAM role that Airflow will be running as
    • Attach an IAM policy granting permission to assume roles (sts:AssumeRole)
    • In the Resource field, put the ARNs of the cross-account roles created in AWS-Account-1 and AWS-Account-2
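
The policy attached to the Airflow role might look like the following. This is a sketch under assumptions: `build_assume_role_policy` is a hypothetical helper, and the role ARNs (account IDs and role names) are placeholders, not values from the webinar.

```python
import json
from typing import Sequence


def build_assume_role_policy(cross_account_role_arns: Sequence[str]) -> dict:
    """Return an IAM policy allowing sts:AssumeRole on the given role ARNs.

    Hypothetical helper for illustration only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sts:AssumeRole",
                # Only the listed cross-account roles may be assumed.
                "Resource": list(cross_account_role_arns),
            }
        ],
    }


# Placeholder ARNs for the roles created in AWS-Account-1 and AWS-Account-2.
role_arns = [
    "arn:aws:iam::111111111111:role/airflow-emr-access",
    "arn:aws:iam::222222222222:role/airflow-athena-access",
]
assume_role_policy = build_assume_role_policy(role_arns)
print(json.dumps(assume_role_policy, indent=2))
```

Listing the role ARNs explicitly, rather than using a wildcard, keeps the Airflow role limited to the two accounts it actually needs.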

Which IAM Role does Airflow Run as before assuming a role?

Depends on a few factors:

Astronomer Enterprise - Additional options for IAM on EKS

  1. In Airflow
    • Under Admin, create a new connection
    • Under Extras, provide the role ARN of the cross-account role created in AWS-Account-1
    • Repeat the two steps above for AWS-Account-2
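
The webinar creates these connections in the Airflow UI. As an alternative illustration, the same connection can be expressed as an environment variable. This sketch assumes Airflow 2.3 or later (which accepts JSON-formatted connection environment variables) and the Amazon provider's `role_arn` extra; the connection ID, role ARN, and region are placeholders, and `aws_role_connection` is a hypothetical helper.

```python
import json
from typing import Tuple


def aws_role_connection(conn_id: str, role_arn: str, region_name: str) -> Tuple[str, str]:
    """Return (environment variable name, JSON value) for an AWS connection
    that assumes the given role. Hypothetical helper for illustration only."""
    env_name = f"AIRFLOW_CONN_{conn_id.upper()}"
    env_value = json.dumps(
        {
            "conn_type": "aws",
            # No login/password: Airflow starts from its own IAM role
            # and assumes role_arn from there.
            "extra": {"role_arn": role_arn, "region_name": region_name},
        }
    )
    return env_name, env_value


# Placeholder values standing in for the role created in AWS-Account-1.
env_name, env_value = aws_role_connection(
    "aws_account_1",
    "arn:aws:iam::111111111111:role/airflow-emr-access",
    "us-east-1",
)
print(f"{env_name}='{env_value}'")
```

A second call with the AWS-Account-2 role ARN and a different connection ID would cover the other account.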

  1. In DAGs
    • Reference appropriate AWS accounts in tasks
    • For most AWS operators, this is done using aws_conn_id
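
A DAG along these lines could point each task at a different account. This is a sketch, not the DAG shown in the webinar: it assumes Airflow 2.4 or later with the Amazon provider installed, and the connection IDs, cluster ID, query, database, and S3 location are all placeholders. The Airflow imports are deferred into the function so the module can be loaded without Airflow present.

```python
from datetime import datetime

# Hypothetical connection IDs, one per AWS account.
EMR_CONN_ID = "aws_account_1"
ATHENA_CONN_ID = "aws_account_2"


def build_dag():
    """Build a two-task DAG whose tasks run against different AWS accounts."""
    from airflow import DAG
    from airflow.providers.amazon.aws.operators.athena import AthenaOperator
    from airflow.providers.amazon.aws.operators.emr import EmrAddStepsOperator

    with DAG(
        dag_id="multi_account_example",
        start_date=datetime(2024, 1, 1),
        schedule=None,  # trigger manually
        catchup=False,
    ) as dag:
        process_raw = EmrAddStepsOperator(
            task_id="process_raw_data",
            job_flow_id="j-XXXXXXXXXXXXX",  # placeholder EMR cluster ID
            steps=[],  # EMR step definitions omitted in this sketch
            aws_conn_id=EMR_CONN_ID,  # uses the role in AWS Account 1
        )
        query = AthenaOperator(
            task_id="query_processed_data",
            query="SELECT 1",  # placeholder query
            database="example_db",
            output_location="s3://example-athena-results/",
            aws_conn_id=ATHENA_CONN_ID,  # uses the role in AWS Account 2
        )
        process_raw >> query
    return dag
```

In an actual DAG file, the result of `build_dag()` would need to be assigned to a module-level variable so that Airflow discovers it.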

What about copying data between S3 buckets in two different AWS accounts?

airflow-aws-9

What if we’d prefer to use Access Keys instead of Cross Account Roles?

Access Key Method

  1. In both DSC (account 1) and Pre-Prod (account 2)

    • Get an access key/secret access key for a particular user
    • Ensure the IAM policy assigned to the user grants the appropriate permissions in AWS
  2. In Airflow

    • Under Admin, create a new connection
    • Paste the access key in the Login field
    • Paste the secret access key in the Password field
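
For comparison with the role-based setup, the mapping of the key pair onto the connection's Login and Password fields can be sketched as an environment variable. This assumes Airflow 2.3 or later (JSON-formatted connection environment variables); `aws_key_connection` is a hypothetical helper, and the keys are the example values from AWS documentation, not real credentials.

```python
import json
from typing import Tuple


def aws_key_connection(conn_id: str, access_key_id: str, secret_access_key: str) -> Tuple[str, str]:
    """Return (environment variable name, JSON value) for a key-based AWS connection.

    Hypothetical helper for illustration only.
    """
    env_name = f"AIRFLOW_CONN_{conn_id.upper()}"
    env_value = json.dumps(
        {
            "conn_type": "aws",
            "login": access_key_id,         # the Login field in the UI
            "password": secret_access_key,  # the Password field in the UI
        }
    )
    return env_name, env_value


# Example (non-functional) credentials from AWS documentation.
key_env_name, key_env_value = aws_key_connection(
    "aws_account_1_keys",
    "AKIAIOSFODNN7EXAMPLE",
    "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
)
print(key_env_name)
```

Real keys should come from a secrets store or the deployment environment rather than being written into code.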
