Apache Airflow 2.3 — Everything You Need to Know
Dynamic task mapping, a new local executor, an improved grid view… Find out what’s new in Apache Airflow 2.3.
Apache Airflow for Data Scientists
What are the common challenges data scientists face, and how can Apache Airflow help? Today, we explore the role of a data scientist.
10 Best Practices for Modern Data Orchestration with Airflow
Today, we’re identifying best practices that will allow you to stand up, scale, and grow Airflow in support of both operational data integration and modern data orchestration.
Airflow and dbt, Hand in Hand
Whether it’s Cloud or Core, using Airflow and dbt together makes life better for everyone. Learn about all the ways they can be combined, as well as a new dbt Cloud provider.
What Is Data Lineage and Why Does It Matter?
To operate in today’s distributed data ecosystems, you need a complete and up-to-date picture of your environment at all times. Learn how data lineage can help you make sense of your data.
Letter from the CEO: Our Story So Far
On the heels of our Series C and our acquisition of Datakin, Joe Otto reflects on Astronomer’s history, and looks to a future powered by the combination of orchestration, lineage, and observability.
Astronomer Acquires Datakin, the Data Lineage Tool
Joining forces will accelerate our shared goal: To help organizations build and manage reliable data ecosystems that deliver trusted data and drive business-critical decisions.
TechCrunch on Astronomer’s Big News
The site covered our recent acquisition of Datakin and our Series C round.
Apache Airflow for Data Leaders — How to Empower Data Teams
What are the common mistakes data leaders make? What goals should they prioritize? How does Astronomer help them overcome these challenges? We asked expert Steven Hillion, VP of Data at Astronomer.
Airflow Summit 2022 — Join the Airflow Event of the Year!
The biggest community-driven event around Apache Airflow returns May 23–27, 2022.
Apache Airflow at Astronomer—Taking Data Orchestration to the Next Level
Learn how Astronomer drives the Apache Airflow project together with the community.
10 Best Practices for Airflow Users
Discover ten best practices that will help all Apache Airflow users ensure their data pipelines run smoothly and efficiently.
Top Data Management Trends for 2022
We talked to our Airflow experts about what data management trends to look out for in 2022.
Adding Data Quality to DAGs ft. Great Expectations
Adding data quality to DAGs is an iterative process, and Great Expectations is a preferred tool to use for that process.
Astronomer and Uturn Partner to Drive Innovation and Better Business Outcomes
We're excited to announce our partnership with Uturn!
Apache Airflow for Data Engineers—How to Leverage Data Orchestration
How has the role of a data engineer evolved over the years? What are their main responsibilities and how can Airflow help?
How to Select the Best ETL Tool to Integrate With Airflow? Our 3 Picks
Find out if choosing the best ETL tool is easy and which three ETL tools we like to combine with Airflow.
Every Company Nowadays Becomes a Data Company—Interview with Bolke de Bruin
An interview with the VP of Enterprise Data Services at Astronomer on everything data and Airflow.
Machine Learning Pipeline Orchestration
Everything you need to know about MLO, told by the expert Santona Tuli—Staff Data Scientist at Astronomer.
How to Build a Modern Data Stack
Breaking down what a modern data stack means in practice. We discuss four core components, five reasons to set it up, and how to orchestrate it.
Apache Airflow vs. Apache Beam
Apache Airflow or Apache Beam? Or both, working together? Let's have a closer look at two popular data management open source tools.
Democratizing the Data Stack—Airflow for Business Workflows
Learn how Hightouch drives action in marketing & sales teams with Reverse ETL, SQL, and Apache Airflow.
Machine Learning Pipelines: Everything You Need to Know
Learn how to build an ML pipeline, what the steps are, and how to do it with Airflow and Astronomer.
What is Reverse ETL and How Can It Improve Data Flow?
Find out what reverse ETL is and how to use Census and Airflow together to improve data orchestration.
Airflow at BBC—Data Orchestration Solution in Media
A conversation with the BBC's Principal Data Engineer about how Apache Airflow helps them deliver personalized experiences to the audience.
Everything You Need to Know About Apache Airflow 2.2.0
It's alive! Discover the major Airflow 2.2.0 features including customizable timetables, deferrable tasks, Airflow standalone CLI command, and many more.
Big Data Architecture: Core Components, Use Cases, and Limitations
Is Big Data Architecture the answer to major business problems, or just a crucial piece of a bigger puzzle? Discover our insights on the topic in this short blog post!
The Future of Banking: How Can Apache Airflow Help?
Learn what challenges the banking industry faces today, and how Apache Airflow can help with digital transformation.
Apache NiFi vs. Apache Airflow
Overview and comparison study of two popular ETL tools for managing the golden asset of most organizations: data. Can these two be compared at all?
Airflow at Wise: Data Orchestrator in Machine Learning
A talk with Alexandra Abbas—a Machine Learning Engineer at Wise—about how they leverage Apache Airflow in their ML initiatives.
How to Build an ETL Process?
Extract, transform, load. Discover the vital steps and methods of building an ETL process for your business.
Data Silos: What Are They and How to Fix Them?
Everything you need to know about data silos – how they influence your business, where they come from, and how to fix them.
Airflow at Societe Generale: Data Orchestration Solution in Banking
A conversation with Societe Generale about their Airflow implementation and development of the data orchestration solution.
Data Pipeline: Components, Types, and Best Practices
What is all the fuss about data pipelines? Learn the basics and follow our best practices.
Building a Scalable Analytics Architecture With Airflow and dbt: Part 3
Learn how to build a scalable analytics architecture with Apache Airflow and dbt – in the third and final part of our series.
What Is Data Orchestration and Why Is It Essential for Business
Discover what data orchestration is, learn the most significant pain points it addresses, and find out how to help your business grow.
Airflow Summit 2021 Highlights
Learn about the biggest community-driven event around Apache Airflow in 2021 and the power of the Airflow community.
How Data Pipelines Drive Improved Sales in E-commerce
Our Field CTO, Viraj Parekh, shares insights on how sales and marketing operations in e-commerce can benefit from running functional data pipelines.
Airflow and Ray: a Data Science Story
We're pleased to announce a Ray provider for Apache Airflow that allows users to transform their Airflow DAGs into scalable machine learning pipelines.
Everything You Need to Know About the Airflow Summit 2021
Join Airflow Summit 2021 – a free online conference for the worldwide community of developers and users of Apache Airflow.
Validate Your Apache Airflow Skills With the Astronomer Certification
Boost your career and learn to run a data pipeline by getting Apache Airflow certified with the Astronomer Certification for Apache Airflow Fundamentals.
The New KubernetesExecutor
We give you a tour of the new features in the KubernetesExecutor 2.0. Spoiler alert – it's faster, more flexible, and easier to understand.
Announcing the Astronomer Registry
Today, we're excited to release our discovery and distribution hub for Apache Airflow integrations.
Airflow 2.0 TaskFlow API and Its Features
Learn how the TaskFlow API in Airflow 2.0 enables a better DAG authoring experience.
Secrets Management in Airflow 2.0
Secrets are sensitive pieces of information used as part of your DAG. Here are some best practices for managing them in Apache Airflow 2.0.
Change Data Capture With Apache Airflow: Part 1
Implementing production-grade change data capture in near real-time on Google CloudSQL with Apache Airflow.
Building a Scalable Analytics Architecture With Airflow and dbt: Part 2
Now that we have these DAGs running locally and built from our dbt `manifest.json` file, the natural next step is to evaluate how these should look in a production context.
Building a Scalable Analytics Architecture With Airflow and dbt
Implementing an ideal development experience at the intersection of two popular open-source tools, written in collaboration with our friends at Updater.
The Airflow 2.0 Scheduler
A technical deep-dive into Apache Airflow's refactored Scheduler, now significantly faster and ready for scale.
A Great Expectations Provider for Apache Airflow
We're pleased to announce an official integration that allows users to leverage Great Expectations natively in their DAGs.
Introducing Airflow 2.0
A breakdown of the major features incorporated in Apache Airflow 2.0, including a refactored, highly-available Scheduler, over 30 UI/UX improvements, a new REST API and much more.
Crash Course: Remote Work for Parents
Self-help guide to working from home with kids.
Introducing KEDA for Airflow
Using KEDA (Kubernetes Event-Driven Autoscaler), we've developed a robust method to scale Apache Airflow workers to be faster and more versatile than any previous architecture.
Profiling the Airflow Scheduler
Ash explains how he's been benchmarking and profiling the Airflow scheduler using py-spy and Flame Graphs.
Airflow continues to win due to an active and expanding community, and very deep, proven functionality.
The Next Generation of Astronomer Cloud
A new release of Astronomer Cloud built to support our latest features and designed to be a first step towards multi-cloud and multi-region support.
Announcing v0.10 of the Astronomer platform.
7 Common Errors to Check When Debugging Airflow DAGs
Tasks not running? DAG stuck? Logs nowhere to be found? We’ve been there. Here’s a list of common snags and some corresponding fixes to consider when you’re debugging your Airflow deployment.
Astronomer v0.8.0 Release Notes
Release notes for v0.8 of the Astronomer Platform.
Airflow Design Principles: Multi-tenant vs. Monolithic Architecture
Why we decided that a multi-tenant Airflow architecture would be the most efficient and reliable way to run our DAGs.
Astronomer on Astronomer: Loading Thousands of Files Into Redshift With Apache Airflow
Here's the story of why we chose Airflow, how we use it, what we've learned, and what we're building to make it better.
Astronomer v0.7.0 Release Notes
Release notes covering the features released with v0.7.0 of the Astronomer platform.
Astronomer v0.6.0 Release
Release notes for v0.6.0 of the Astronomer platform.
Astronomer v0.5.0 Release
Release notes from our recent platform update to v0.5.0.
Astronomer v0.4.1 Release
Release notes on v0.4.1 of the Astronomer platform.
The Future of Apache Airflow
Discussing the potential future direction of the Apache Airflow project.
Astronomer v0.3.2 Release
A rundown of features and product improvements since our v0.3.0 release.
Announcing Astronomer v0.3
Announcing the latest iteration of our Airflow offering.
Astronomer Is the Airflow Company
Moving forward, Astronomer will do just one thing: help organizations adopt Apache Airflow. Our entire company will rally around this objective.
Astronomer Enterprise Edition 0.2.0
Announcing the latest iteration of our Airflow offering.
Announcing the Astronomer Platform, a Managed Service for Apache Airflow
Managed Apache Airflow for complex ETL orchestration.
Announcing Astronomer Enterprise Edition
Our latest platform deployable in your cloud.
Announcing Astronomer SpaceCamp
The fastest way to ramp up your data team.
Live Blogging Through a Migration
A live play-by-play of our company Slack during a recent product update.
Announcing The Airflow Podcast
A podcast focused on sharing the open-source community's knowledge about Apache Airflow.
An Airflow Story: Cleaning and Visualizing Our GitHub Data
How we used Airflow to clean up a GitHub mess.
Improving Government Services With Apache Airflow: a Q&A With San Diego’s Chief Data Officer
Applying Airflow in the public sector to operationalize public data.
From Behavioral Analytics to Data Science With Astronomer
Being a stand-out company requires elevating product innovation within the organization and making sure that innovation isn’t reactive, but predictive.
Using Apache Airflow to Create Data Infrastructure in the Public Sector
When ARGO began exploring the technology required to build, operate, and maintain data infrastructure in the public sector, it’s no surprise they landed on Apache Airflow.
Automating Salesforce Reports in Slack With Airflow: Part 3
We realized we could replicate a second reporting mechanism that is also fairly manual but covers essentially the same data.
Automating Salesforce Reports in Slack With Airflow: Part 2
In this post, we'll cover how we process that data and what tools we use to build and publish the reports to Slack.
5 Ways to Make Sure Your Analytics Spark Growth
Through their collection of war stories and experiences, Mr Wolf has gained insider insight into the deliberate preparation it takes to do analytics in a way that is meaningful.
Automating Salesforce Reports in Slack with Airflow: Part 1
At Astronomer, we compile and share a daily report on our Solutions Directors and marketing channels. We decided to automate it.
3 Reasons Why Astronomer Is Betting On GraphQL
There are many things that set GraphQL apart, but the main difference is that the requester has more control over exactly which parts of the response they want.
Data Formats 101
Business analysts generally encounter four main formats of data: JSON, XML, CSV, and TSV. So what are these types and why would we use them?
5 Sci-Fi Movies That Overlap With Modern Technology
Let's explore some of the ways science fiction and real life overlap (as far as we can tell) and how these “fantastic” ideas can play out in business.
Why Every Data Scientist Needs a Data Engineer
The data scientist, the sexiest role of the 21st century, isn't actually very sexy. But it could be.
What Exactly Is a DAG?
What exactly is a DAG and what does it tell us that the term “data pipeline” can't?
Data Engineering Platform Astronomer Closes $3.5M Financing
The Astronomer platform collects, processes and unifies data, allowing organizations to scale analytics, data science and insights.
Normalizing Data for Warehouse Centralization
Knowing the options for storing data will help you make the right decisions for your company when you’re ready to take this step.
6 Dashboards Every Marketer Needs
With Astronomer and Chartio, marketers can consolidate all their data sources and visualize their data.
Why In-Depth Analytics Are Easier Than You Think
Ready to get product analytics started? Download Astronomer’s free guide to product analytics that discusses what’s possible and how to map next steps.
Every Great Startup's Secret Sauce: Design
There’s a major trend happening in tech companies today: design teams are growing like mad. And there's a very good reason.
Data Wrangling 101: Using Python to Fetch, Manipulate & Visualize NBA Data
This is meant to be used as a general tutorial for beginners with some experience in Python or R.
Remote Working Guide: Reykjavik
If you’re looking for a new place to work remotely (and explore a little on nights and weekends), you should seriously consider Reykjavik.
Is Your Organization Insane?
Every organization is at a slightly different place with how they use data. See where you are.
Five Ways to Lead Well
Leading is hard, even if you know all the right ways to do it. Here are five ways to lead in your organization.
Building An Org Chart That Scales
At the beginning of February, my co-founder and I attended the SaaStr conference in San Francisco...
Apache Airflow and the Future of Data Engineering: A Q&A with Maxime Beauchemin
I reached out to Max about doing an interview post, and to my delight, he agreed. Here are thoughtful answers to questions about Airflow and data engineering.
Translating Real-World Randomness to Create Digital Security
Every year, our data becomes more accessible as the world is increasingly interconnected and more services are available online. Our best defense? Randomness.
Scaling off AWS: Exploring Go for High Performance Services
To handle an arbitrarily large number of requests, we need a language built for maximum concurrency and performance. We think that language could be Go.
Six Principles for Sending Surveys
Here are some basic principles to keep in mind to collect data that is actually usable.
Remote Working Guide: Nashville
Our team hit the road in November for our first-ever Astronomer remote-working week. Here is our collective guide to working remotely in Nashville.
Ask RBK: Tell Me About Google Analytics
In this video-edition of 'Ask RBK,' we answer some of the most common questions we hear about Google Analytics.
Remote Working Guide: Louisville
This edition of Astronomer’s Remote Working Guide takes you to Louisville, Kentucky. Or Loo-a-vul. Or Loo-ee-ville?
Our Open Source Philosophy
The world where technology is open sourced is the world we want to live in. Our CTO explains why.
Why Is My Data Playing Hard to Get?
When we talk about "hard to reach" data, what kind of data are we talking about, and why exactly is it so hard to access, organize, and store?
Approach The Next Data Initiative Like A Data Analyst
I’ve experienced great successes ... and epic failures (big data can be a big challenge). From those failures, I’ve developed a few guiding principles.
Astronomer Takes PLOTCON: Day 3
Day 3 of PLOTCON's wrap-up turned into an open fanboy letter to the analytics community...
Astronomer Takes PLOTCON: Day 2
Day 2 of PLOTCON has wrapped, so it’s time for the Astronomer team to debrief from the day.
Astronomer Takes PLOTCON: Day 1
Ben and Viraj just got done with day 1 of PLOTCON 2016. After all the presentations and exhibitions, they sat down to go over their thoughts.
Astronomer Takes PLOTCON 2016: Why We're Excited
Plotly is one of Astronomer’s favorite open source tools and always gets great reactions from our clients.
Ask RBK: Why Do I Need Astronomer?
The question I’ve been asked most frequently over the past few years is this: Why do I need Astronomer?
Airflow at Astronomer
To extract and monitor all types of data pipelines, we needed a unified scheduling system. Airflow was our answer, and a whole lot more.
From Superstar Culture to Moneyball: How Data is Changing the NBA
In honor of the tip-off of the 2016-2017 NBA season, let’s take a look at how data is revolutionizing basketball.
Remote Working Guide: Denver
This is the first of a series of Remote Working Guides in our favorite cities to work in, outside of Astronomer’s hometown, Cincinnati.
Press Release: Astronomer Announces Seed Financing
Press Release: Astronomer Closes $1.9M in Seed Financing
Our Unique Path to Raising $2M Seed in the Midwest
Our path to raising $2M is a series of short stories with some amazing protagonists.
A New Way to Look at NPS
Astronomer wouldn’t be where it is today without the NPS (Net Promoter Score, a customer satisfaction metric).
Lessons Learned Writing Data Pipelines
I know first-hand how challenging data pipelines can be. Here's a peek under the hood of Astronomer at what makes our growing platform unique.
4 Pillars to Becoming Data Driven
When I wanted to improve the health of my company, I took this approach: just get started.
Why We Chose “Fun” as a Core Value
From the outside, startups can seem like a lot of fun. The reality, though, is that an early-stage startup is mostly grueling work.
Why We Built Our Data Platform on AWS, and Why We Rebuilt It With Open Source
As Astronomer’s CTO, I’m going to chronicle our journey, from a technical perspective, as we grow our platform and home in on how to meet our users’ real needs.
An Almost Acquisition Story
Coming out of AngelPad’s 2015 Demo Day, we found ourselves vacillating between an acquisition and Series A, though we were arguably too early for either.
6 Open Source Dashboards to Organize Your Data
At Astronomer, we believe every organization benefits from having data properly centralized, organized, and cleaned. We’re building a company to do just that.
How to Succeed in the Data Revolution
Organizations of all sizes in every industry are uncovering groundbreaking, data-driven insights and are putting data to extraordinary uses.
Announcing Astronomer v0.9
Release notes for v0.9 of the Astronomer Platform.
A Culture of Customer
Astronomer is a customer-first organization. But what does that mean, really?
How We Quadrupled Our Blog Traffic in A Month
While we don’t pretend to be experts, last month, our analytics exceeded our goals.
Does Your Logo Really Matter?
Two weeks ago, Instagram changed their logo. The Internet blew up.
What if data science could be done in one ubiquitous language – one that is already often being used in the current data stack for data visualization?
Branding is a Relationship
If there’s one consistent thing about a startup, it’s change.
What We Learned After We Discovered Our Target
"I’ve learned that I still have a lot to learn." - Maya Angelou
What I Learned From Analyzing 1700 Blog Posts: Part 2
Part 2 of a series where we use common data extraction, analysis, and machine learning techniques to make our business smarter.
A Logo Story
Astronomer's Head of Design, Chris Hendrixson, explains how he created the design aesthetic to encompass data, futurism, and a little bit of fun.
Syncing MongoDB Collections With Amazon Redshift
When it came time to scale up our reporting, we realized we were missing some crucial data. Good thing we built a connector between MongoDB and Redshift.
How Astronomer Found Its Target Customer
How did Astronomer discover who its true target is? How did they get on calls with early targets?
Setting Up Your Redshift Cluster
Redshift is popular, but you still need to know what you're doing when spinning up your first cluster. In this tutorial, we walk you through the process.
What I Learned From Analyzing 1700 Blog Posts: Part 1
We analyzed over 1700 blog posts from our competitors to uncover the elements that make a great data blog.
The Growing Data Opportunity
We’re entering a new Internet era—“a rise of the machines”—due to a confluence of trends.
When Should You Start to Warehouse Your Data?
These days, startups want to be data-driven, and web and mobile apps can generate quite a bit of data.
Having passed the tipping point of too much data, businesses will begin to feel the effects of the slow yet unstoppable force of liberated analytics.
The Email Wall
Product emails tend to fall into one of a few categories... What happens when you start making connections between them?
Why We Drove to NY and Back Over the Past 48 Hours for a 15-Minute Meeting
The Astronomer team drove from Cincinnati, OH to New York, NY for a fifteen minute meeting with the top accelerator in the world. Now... Why did we do that?