Join us for Astro Days: NYC on Sept 27!

Join us for upcoming online events!

Past Events

A Deep Dive into the Airflow UI

In this webinar, we’ll take an in-depth tour of the Airflow UI and cover the many features that users may not be aware of.

Data Transformations with the Astro Python SDK

On September 13, Live with Astronomer will dive into implementing data transformations with the Astro Python SDK. The Astro Python SDK is an open source Python package that allows for clean and rapid development on ELT workflows. We’ll show how you can use the transform and dataframe functions to easily transform your data using Python or SQL and seamlessly transition between the two.

Implementing Data Quality Checks in Airflow

Executing SQL queries — one of the most common use cases for data pipelines — is a simple way to implement data quality checks. In this webinar, we’ll cover everything you need to know about using SQL for data quality checks.
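
As a rough illustration of the idea, the sketch below runs a few SQL checks against an in-memory SQLite table using only the Python standard library. It does not use Airflow; the `orders` table, the check names, and the `run_quality_checks` helper are all hypothetical and chosen for illustration. Each check is a query whose first returned value is truthy when the data passes.

```python
import sqlite3


def run_quality_checks(conn, checks):
    """Run each SQL check; a check passes when its first returned value is truthy."""
    results = {}
    for name, sql in checks.items():
        row = conn.execute(sql).fetchone()
        results[name] = bool(row and row[0])
    return results


# Hypothetical sample data: one order has a missing amount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 25.5), (3, None)],
)

# Each check is a plain SQL query that evaluates to 1 (pass) or 0 (fail).
checks = {
    "table_not_empty": "SELECT COUNT(*) > 0 FROM orders",
    "id_is_unique": "SELECT COUNT(DISTINCT id) = COUNT(*) FROM orders",
    "amount_not_null": "SELECT COUNT(*) = 0 FROM orders WHERE amount IS NULL",
}

results = run_quality_checks(conn, checks)
failed = [name for name, passed in results.items() if not passed]
print(failed)
```

In a pipeline, a non-empty `failed` list would typically be turned into a task failure so that downstream steps do not run on bad data.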

The Astro Python SDK Load File Function

The next Live with Astronomer will dive into the Astro Python SDK load_file function. The Astro Python SDK is an open source Python package that allows for clean and rapid development on ELT workflows. We’ll show how you can use load_file for the ‘Extract’ step of your pipeline to easily get data from your filesystems into your data warehouse, without any operator-specific knowledge.

The Astro Python SDK

Astronomer is excited to announce the release of the Astro Python SDK version 1.0. The Astro Python SDK is an open source tool, powered by Airflow and maintained by Astronomer, that allows for rapid and clean development of ETL workflows using Python.

The SQL Table Check Operator

In this session we’ll dive into the new Common SQL provider package and show how to use the SQLTableCheckOperator. We’ll show how you can easily use this operator to implement data quality checks in your DAGs, ensuring that errant data never makes it to production.

Airflow 101: Essential Tips For Beginners

What is Airflow? Apache Airflow is a platform used to programmatically author, schedule, and monitor data pipelines.

The SQL Column Check Operator

In this session we’ll show how you can easily use the SQLColumnCheckOperator to implement data quality checks in your DAGs, ensuring that errant data never makes it to production.

Using Airflow with TensorFlow and MLflow

We’ll delve further into how Airflow can be integrated with TensorFlow and MLflow specifically to manage ML pipelines in production, using a worked example to demonstrate.

Reusable DAG Patterns with TaskGroups

In this session we’ll show how Astronomer’s data and intelligence team uses TaskGroups to reduce the amount of code the team has to write while adhering to DAG authoring best practices.

Anatomy of an Operator

Operators are the building blocks of Apache Airflow. In this webinar we’ll look under the hood, covering everything you need to know about operators to tailor them for your use cases.

Using the Snowflake Deferrable Operator

Live with Astronomer will dive into using the Snowflake Deferrable Operator. We’ll show how, with a very small update to your DAGs, you can start saving money when orchestrating your Snowflake queries with Airflow.

Writing Functional DAGs with Decorators

In this webinar, we’ll demystify decorators and show you everything you need to know to start using decorators in your DAGs.
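
For readers new to the concept, here is a minimal sketch of a decorator in plain Python, with no Airflow import. The `log_calls` decorator and the `add` function are hypothetical names used only for illustration; the point is that a decorator is a function that takes a function and returns a wrapped replacement.

```python
import functools


def log_calls(func):
    """Wrap func so each call is recorded before delegating to the original."""

    @functools.wraps(func)  # keep the original function's name and docstring
    def wrapper(*args, **kwargs):
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)

    wrapper.calls = []  # call history lives on the wrapper itself
    return wrapper


@log_calls  # equivalent to: add = log_calls(add)
def add(x, y):
    return x + y


total = add(2, 3)
print(total, add.calls)
```

The `@log_calls` line is only shorthand for reassigning `add` to the wrapped version, which is why the decorated function can carry extra behavior without any change to its body.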

The Python Task Decorator

Live with Astronomer will dive into the Python task decorator. We’ll show how to easily turn your Python functions into tasks in your DAG using functional programming, and how using the Python task decorator can limit the boilerplate code needed in your DAGs.

ML in Production with Airflow

Although often regarded as a data engineering and pipelining tool, Airflow is also wildly popular among machine learning teams. In this webinar, we’ll dive into how Airflow can consolidate various ML tools into dependable production systems.

Intro: Getting Started with Airflow

What is Airflow? Apache Airflow is a platform used to programmatically author, schedule, and monitor data pipelines.

Astronomer Providers

This webinar will dive into the Astronomer Providers repository, which includes Airflow Providers containing Deferrable Operators and Sensors created by Astronomer. We’ll go beyond the basics to look at key implementation details and best practices.

What’s New in Airflow 2.3

The Airflow project is rapidly evolving, with frequent releases bringing advancements in DAG authoring, observability, and project stability. We’re super excited for the release of Airflow 2.3, which comes with big changes in the flexibility of DAG creation, improvements to the Airflow UI, and much more.

Using Airflow as a Data Analyst

Airflow is sometimes thought of as primarily a data engineering tool, but its use cases are really much broader. A data analyst’s workflow typically involves ingesting and transforming data to extract insights, then presenting the insights in a manner that allows business stakeholders to easily interpret trends and take appropriate action. Airflow’s ease of use and extensive provider ecosystem make it an ideal tool for orchestrating such analytics workflows.

OpenLineage and Airflow: A Deeper Dive

Data lineage is the complex set of relationships between your jobs and datasets. Using OpenLineage with Apache Airflow, you can observe and analyze these relationships, allowing you to find and fix issues more quickly. This webinar will provide a deeper dive on OpenLineage, extending beyond the basics into key implementation details and best practices.

Improve Your DAGs with Hidden Airflow Features

Apache Airflow is flexible and powerful. It has a rich ecosystem and an incredibly active community. But are you sure you haven’t missed anything? A new feature or concept that could take your DAGs to another level? It can be challenging to keep up with the latest Airflow features, and sometimes we miss the most useful ones. For this webinar, I'd like to introduce you to a couple of lesser-known features of Apache Airflow that can dramatically improve your data pipelines.

Scaling Out Airflow

Airflow is purpose-built for high-scale workloads and high availability on a distributed platform. Since the advent of Airflow 2.0, there are even more tools and features to ensure that Airflow can be scaled to accommodate high-throughput, data-intensive workloads. In this webinar, Alex Kennedy will discuss the process of scaling out Airflow using the Celery and Kubernetes Executors, including the parameters that need to be tuned when adding nodes to Airflow and how to decide when it’s a good idea to scale Airflow horizontally or vertically. Consistent, aggregated logging is key when scaling Airflow, so we will also briefly discuss best practices for logging on a distributed Airflow platform, as well as the pitfalls that many Airflow users experience when designing and building such a platform.

Data Quality Use Cases with Airflow and Great Expectations

At this webinar, Benji Lampel (Enterprise Platform Architect @ Astronomer) and Tal Gluck (Software Engineer @ Superconductive) will present several Airflow DAGs using Great Expectations that cover more advanced DAG patterns and data quality checking cases.

The Airflow API

Did you know that Airflow has a fully stable REST API? In this webinar, we’ll cover how to use the API, and why it’s a great tool in your Airflow toolbox for managing and monitoring your data pipelines.

Introducing Astro: Data Centric DAG Authoring

With Airflow 2.0, we introduced the concept of providers. We’re taking that to the next level with Astro, a new DAG writing experience, brought to you by Astronomer.

Data Lineage with OpenLineage and Airflow

If one out of your hundreds of DAGs fails, how do you know which downstream datasets have become out-of-date? The answer is data lineage. Data lineage is the complex set of relationships between your jobs and datasets. In this webinar, you'll learn how to use OpenLineage to collect lineage metadata from Airflow and assemble a lineage graph: a picture of your pipeline worth way more than a thousand words.
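
To make the "which downstream datasets are out-of-date?" question concrete, the sketch below walks a small lineage graph in plain Python. It does not use OpenLineage or Airflow; the graph contents, the `job.` and `dataset.` naming convention, and the `downstream_datasets` helper are assumptions made up for illustration.

```python
from collections import deque

# Hypothetical lineage edges: each job or dataset maps to what it feeds directly.
lineage = {
    "job.load_orders": ["dataset.raw_orders"],
    "dataset.raw_orders": ["job.clean_orders"],
    "job.clean_orders": ["dataset.orders"],
    "dataset.orders": ["job.daily_report", "job.forecast"],
    "job.daily_report": ["dataset.report"],
    "job.forecast": ["dataset.forecast"],
}


def downstream_datasets(graph, failed_node):
    """Breadth-first walk from the failed node, returning every dataset reached."""
    seen = set()
    queue = deque([failed_node])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(n for n in seen if n.startswith("dataset."))


stale = downstream_datasets(lineage, "job.clean_orders")
print(stale)
```

With lineage metadata collected in this kind of graph form, a single failure can be mapped to the set of datasets that may need to be refreshed.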

Best Practices for Writing DAGs in Airflow 2

Because Airflow is 100% code, knowing the basics of Python is all it takes to get started writing DAGs. However, writing DAGs that are efficient, secure, and scalable requires some Airflow-specific finesse. In this webinar, you’ll learn the best practices for writing DAGs that will ensure you get the most out of Airflow. We’ll include a reference repo with DAGs you can run yourself with the Astro CLI.

Iterative Data Quality in Airflow DAGs

Data quality is an often overlooked component of data pipelines. Learn why it is a valuable part of data systems and how to get started integrating data quality checks into existing pipelines with a variety of tools.

Intro To Data Orchestration With Airflow

What is Airflow? Definition: Apache Airflow is a way to programmatically author, schedule and monitor data pipelines.

Scheduling In Airflow

The flexibility and freedom that Airflow offers you is incredible, but to really take advantage of it you need to master some concepts first, one of which has just been released in Airflow 2.2. By the end of the webinar, you will be able to define schedule intervals that you thought were impossible before.

Everything you Need to Know About Airflow 2.2

In this informative webinar we will cover everything you need to know about Airflow 2.2. We'll go through all of the new features, large and small, and show you how to use them to write cleaner and more efficient DAGs.

Testing Airflow to Bullet Proof Your Code

Airflow, by nature, is an orchestration framework, not a data processing framework. At first sight it can be unclear how to test Airflow code. Are you triggering DAGs in the UI to validate your Airflow code? In this webinar we'll demonstrate various examples of how to test Airflow code and integrate tests in a CI/CD pipeline, so that you're certain your code works before deploying to production.

Manage Dependencies Between Airflow Deployments, DAGs, and Tasks

More often than not, your Airflow components will have a desired order of execution, particularly if you are performing a traditional ETL process. For example, before the Transform step in ETL can run, the Extract step must have completed in an upstream pipeline. In this webinar we will discuss how to properly set up dependencies and define an order of execution for your pipelines.

Create Powerful Data Pipelines by Mastering Sensors

Do you use Sensors in your data pipelines? Do you need to wait for a file before executing the next step? Are you looking to execute your task after a task completes in another DAG? Would you like to wait for an import in your SQL table before executing the next task? The answer...

Using Airflow with Azure Data Factory

While Airflow and ADF (Azure Data Factory) have pros and cons, they can be used in tandem for data pipelines across your organization. In this webinar, we’ll cover how using the two together can really get you the best of both worlds!

Monitor Your DAGs with Airflow Notifications

Anytime you’re running business-critical pipelines, you need to know when something goes wrong. Airflow has a built-in notification system that can be used to send alerts when your DAGs fail, succeed, or anything in between. In this webinar, we’ll do a deep dive into how you can customize your notifications in Airflow to meet your needs.

Intro to Airflow for ETL With Snowflake

ETL is one of the most common data engineering use cases, and it's one where Airflow really shines. In this webinar, we'll cover everything you need to get started as a new Airflow user, and dive into how to implement ETL pipelines as Airflow DAGs.

Getting Started With the Official Airflow Helm Chart

The official Helm chart for Apache Airflow is out! The days of wondering which Helm chart to use in production are over. Now there is a single chart, maintained and tested by Airflow PMC members as well as the community. It’s time to get your hands on it and take it for a spin! At the end of the webinar, you will have a fully functional Airflow instance deployed with the official Helm chart and running within a local Kubernetes cluster.

Dynamic DAGs

In this webinar, we'll talk about when you might want to dynamically generate your DAGs, show a couple of methods for doing so, and discuss problems that can arise when implementing dynamic generation at scale.

Using Airflow with Multiple AWS Accounts

In AWS, it's common for organizations to use multiple accounts for various reasons, from separate Dev, Stage, and Prod accounts to accounts dedicated to lines of business (LOBs). What do you do when your data pipeline needs to span AWS accounts? This webinar will show how you can run a single DAG across multiple AWS accounts in a secure manner.

Intro to Airflow

Learn about the core concepts, components, and benefits of working with Airflow. Watch this Intro to Airflow webinar today!

Airflow 2.0 + Kubernetes

Learn more about using Airflow 2.0 with Kubernetes.

Airflow 2.0 Providers

Learn everything about Airflow 2.0 providers including what defines a provider, how to create your own provider, and customizing provider packages.

TaskFlow API in Airflow 2.0

Watch the webinar recap and learn how the TaskFlow API can help simplify DAGs that make heavy use of Python tasks and XComs.

Secrets Management in Airflow 2.0

Watch the webinar recording to learn the best practices for managing secrets with various backends in Apache Airflow 2.0.

DAG Writing Best Practices in Apache Airflow

Learn the best practices for writing DAGs in Apache Airflow with a repo of example DAGs that you can run with the Astro CLI.