Astronomer Webinars

Join us for upcoming online events!

Streamlining Data Pipelines with Sophi.io: An Airflow Journey

Sophi.io, a leading AI-powered content optimization platform, faced scaling challenges as its Apache Airflow data pipelines grew. Learn how Sophi optimized those pipelines after migrating from Amazon Managed Workflows for Apache Airflow (MWAA) to Astronomer, and how the move dramatically improved pipeline scalability and reliability.

Past Webinars

Improve Your DAGs with Hidden Airflow Features

Apache Airflow is flexible and powerful, with a rich ecosystem and an incredibly active community. But are you sure you haven’t missed a new feature or concept that could take your DAGs to another level? It can be challenging to keep up with the latest Airflow features, and sometimes we miss the most useful ones. In this webinar, I'd like to introduce you to a couple of lesser-known Apache Airflow features that can dramatically improve your data pipelines.

Scaling Out Airflow

Airflow is purpose-built for high-scale workloads and high availability on a distributed platform. Since Airflow 2.0, there are even more tools and features for scaling Airflow to accommodate high-throughput, data-intensive workloads. In this webinar, Alex Kennedy will discuss scaling out Airflow with the Celery and Kubernetes executors, including the parameters that need tuning when you add nodes and how to decide when to scale Airflow horizontally or vertically. Consistent, aggregated logging is key when scaling Airflow, so we will also briefly cover best practices for logging on a distributed Airflow platform, along with the pitfalls many users encounter when designing and building one.

The Airflow API

Did you know that Airflow has a fully stable REST API? In this webinar, we’ll cover how to use the API, and why it’s a great tool in your Airflow toolbox for managing and monitoring your data pipelines.

Data Lineage with OpenLineage and Airflow

If one of your hundreds of DAGs fails, how do you know which downstream datasets have become out of date? The answer is data lineage: the complex set of relationships between your jobs and datasets. In this webinar, you'll learn how to use OpenLineage to collect lineage metadata from Airflow and assemble a lineage graph, a picture of your pipeline worth way more than a thousand words.

Best Practices for Writing DAGs in Airflow 2

Because Airflow is 100% code, knowing the basics of Python is all it takes to get started writing DAGs. However, writing DAGs that are efficient, secure, and scalable requires some Airflow-specific finesse. In this webinar, you’ll learn the best practices for writing DAGs that will ensure you get the most out of Airflow. We’ll include a reference repo with DAGs you can run yourself with the Astro CLI.

Iterative Data Quality in Airflow DAGs

Data quality is an often-overlooked component of data pipelines. Learn why it is a valuable part of data systems and how to start integrating data quality checks into existing pipelines with a variety of tools.
