Google Analytics API to Redshift with Airflow

In this guide, we’ll explore how you can use Apache Airflow to move your data from Google Analytics to Redshift. Note that this is an effective and flexible alternative to point-and-click ETL tools like Segment, Alooma, Xplenty, Stitch, and ETLeap.

Before we get started, be sure you have the following on hand:

  • A Google Analytics account
  • An S3 bucket with a valid aws_access_key_id and aws_secret_access_key
  • A Redshift instance with a valid host IP and login information
  • An instance of Apache Airflow. You can either set this up yourself if you have devops resources, or sign up and get going immediately with Astronomer’s managed Airflow service. Note that this guide uses Astronomer CLI commands to push DAGs into production and assumes you’ve spun up an Airflow instance via Astronomer, but the core code should work the same regardless of how you’re hosting Airflow
  • Docker running on your machine

This DAG generates a report using v4 of the Google Analytics Core Reporting API. The dimensions and metrics are as follows. Note that while these can be modified, a maximum of 10 metrics and 7 dimensions can be requested at once.


Metrics:

  • pageView
  • bounces
  • users
  • newUsers
  • goal1starts
  • goal1completions

Dimensions:

  • dateHourMinute
  • keyword
  • referralPath
  • campaign
  • sourceMedium
Not all metrics and dimensions are compatible with each other. When forming your request, refer to the official Google Analytics API Reference docs.
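As an illustration, the metrics and dimensions above can be assembled into a v4 `reports.batchGet` request body along these lines. This is a minimal sketch: the view ID is a placeholder, and the `ga:`-prefixed names follow the API's own naming convention (e.g. `ga:pageviews`, `ga:goal1Starts`), so check the casing against the reference docs.

```python
# Sketch of a Google Analytics Reporting API v4 request body using the
# metrics and dimensions listed above. VIEW_ID is a placeholder.
VIEW_ID = "XXXXXXXX"

METRICS = ["pageviews", "bounces", "users", "newUsers",
           "goal1Starts", "goal1Completions"]
DIMENSIONS = ["dateHourMinute", "keyword", "referralPath",
              "campaign", "sourceMedium"]

request_body = {
    "reportRequests": [{
        "viewId": VIEW_ID,
        "dateRanges": [{"startDate": "7daysAgo", "endDate": "today"}],
        "metrics": [{"expression": f"ga:{m}"} for m in METRICS],
        "dimensions": [{"name": f"ga:{d}"} for d in DIMENSIONS],
    }]
}

# The API's per-request limits noted above: 10 metrics, 7 dimensions.
assert len(METRICS) <= 10 and len(DIMENSIONS) <= 7
```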

1. Add Connections in Airflow UI

Begin by creating all of the necessary connections in your Airflow UI. To do this, log into your Airflow dashboard and navigate to Admin --> Connections. To build this pipeline, you’ll need to create a connection to your Google Analytics account, your S3 bucket, and your Redshift instance. For more info on how to fill out the fields within your connections, check out our documentation here.
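Alongside the UI, Airflow can also pick up connections from environment variables named `AIRFLOW_CONN_<CONN_ID>`, whose value is a connection URI. A minimal sketch of building such a URI for the Redshift connection (all hostnames and credentials below are placeholders; note that special characters in passwords must be percent-encoded):

```python
from urllib.parse import quote

def make_conn_uri(conn_type, login, password, host, port=None, schema=None):
    """Build an Airflow connection URI, percent-encoding the credentials."""
    uri = f"{conn_type}://{quote(login, safe='')}:{quote(password, safe='')}@{host}"
    if port:
        uri += f":{port}"
    if schema:
        uri += f"/{schema}"
    return uri

# Redshift speaks the Postgres wire protocol; values are placeholders.
redshift_uri = make_conn_uri(
    "postgres",
    "awsuser", "p@ss/word",
    "example-cluster.redshift.amazonaws.com",
    port=5439, schema="analytics",
)
# Then, e.g.:  export AIRFLOW_CONN_REDSHIFT='<the URI above>'
```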

2. Clone the plugin

If you haven't done so already, navigate into your project directory and create a plugins folder by running mkdir plugins in your terminal. Navigate into this folder by running cd plugins and clone the Google Analytics Plugin using the following command:

git clone

This will allow you to use the Google Analytics hook to establish a connection to Google Analytics and extract data into a file. You will also be able to use the appropriate operators to transfer the Google Analytics data to S3 and then from S3 to Redshift.
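Under the hood, the S3-to-Redshift step comes down to a Redshift COPY statement. As a rough sketch of what gets issued (the table, bucket, key, and credential values here are hypothetical placeholders, and the exact options depend on the file format the operator writes):

```python
# Sketch of the COPY statement an S3-to-Redshift transfer ultimately runs.
# Table, bucket, key, and credentials below are hypothetical placeholders.
def build_copy_statement(table, bucket, key, access_key_id, secret_access_key):
    """Return a Redshift COPY statement loading a JSON file from S3."""
    return (
        f"COPY {table} "
        f"FROM 's3://{bucket}/{key}' "
        f"CREDENTIALS 'aws_access_key_id={access_key_id};"
        f"aws_secret_access_key={secret_access_key}' "
        "JSON 'auto' TIMEFORMAT 'auto';"
    )

sql = build_copy_statement(
    "analytics.ga_report",                    # hypothetical target table
    "my-etl-bucket", "google_analytics/report.json",
    "AKIA_PLACEHOLDER", "SECRET_PLACEHOLDER", # placeholders
)
```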

3. Copy the DAG file

Navigate back into your project directory and create a dags folder by running mkdir dags. Copy the Google Analytics to Redshift DAG file into this folder.

4. Customize

Open up the file that you just copied in a text editor of your choice and input the following credentials into lines 46-50:

S3_CONN_ID = ''
S3_BUCKET = ''
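If you'd rather not hard-code credentials in the DAG file, one option is to read them from environment variables with sensible fallbacks. A sketch (the variable names GA_S3_CONN_ID and GA_S3_BUCKET are made up for illustration; use whatever convention your team prefers):

```python
import os

# Sketch: pull the connection ID and bucket name from environment
# variables instead of hard-coding them in the DAG file.
# GA_S3_CONN_ID / GA_S3_BUCKET are hypothetical variable names.
S3_CONN_ID = os.environ.get("GA_S3_CONN_ID", "astronomer-s3")
S3_BUCKET = os.environ.get("GA_S3_BUCKET", "my-etl-bucket")
```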

5. Test + Deploy

Once you have those credentials plugged into your DAG, test and deploy it!

If you don't have Airflow already set up in your production environment, head over to our getting started guide to get spun up with your own managed instance!

Ready to run production-grade Airflow?

Astronomer is the easiest way to run Apache Airflow. Choose from a fully hosted Cloud option or an in-house Enterprise option and run a production-grade Airflow stack, including monitoring, logging, and first-class support.