
Astronomer Blog


Airflow and the Future of Data Engineering: A Q&A with Maxime Beauchemin


Estimated reading time: 8 minutes, 42 seconds

Every once in a while I read a post about the future of tech that resonates with clarity.

A few weeks ago it was The Rise of the Data Engineer by Maxime Beauchemin, a data engineer at Airbnb and creator of their data pipeline framework, Airflow. At Astronomer, Airflow is at the very core of our tech stack: our integration workflows are defined by data pipelines built in Airflow as directed acyclic graphs (DAGs). A post like that validates why right now is the best time for a company like Astronomer to exist.

After reading the post, I reached out to Max about doing an interview post, and to my delight he entertained the request with thoughtful answers to our questions about Airflow and the future of data engineering. You’ll find his answers below, but first I’d like to add a little context.

Topics: dev

Scaling off AWS: Exploring Go for High Performance Services

Estimated reading time: 14 minutes, 22 seconds

Several months ago, our CTO wrote about our transition from AWS services to open source counterparts. In that post, he discussed the reasons for building on AWS in the first place, as well as why we now had to move off it. In essence, Astronomer needs the ability to run in any cloud, so breaking our dependence on any one cloud provider is a must. In addition to the SaaS version of Astronomer, we have an enterprise edition that gives enterprises full control of running the Astronomer platform in their own cloud deployments.

Topics: dev

Our Open Source Philosophy

Estimated reading time: 3 minutes, 10 seconds

Last year, I wrote a blog post about why we built our platform on AWS and rebuilt it with open source. It chronicles our journey to build our ideal unified system, one that checks all our boxes (like cross-infrastructure, secure, efficient, highly available, self-healing and able to execute long-running processes as well as spin up one-off processes and specialized clusters of machines on the side).

But this post is different.

Topics: culture dev

Airflow at Astronomer

Estimated reading time: 8 minutes, 47 seconds

At Astronomer, we're creating a modern platform for organizations to build data pipelines to power their analytical and machine learning workloads. Our goal is to make it extremely simple for anyone to set up a data pipeline without having to worry about everything involved with keeping that pipeline running. We pride ourselves on being adaptable enough to extract data from anywhere and get it into your data lake or data warehouse.

Topics: dev

Our Deep Roots with Meteor.js

Where do we come from? Who were our ancestors? What was important to them?

Machines don’t ask these types of questions, but exploring our lineage is a concept that’s deeply ingrained in humans. That sense of longing to know who and what we are is one of the main reasons we explore our universe.

At Astronomer, the machines go about their daily lives without regard to where they come from, but when building the future, our humans have to consider our product lineage, and Meteor.js has been central to our product since day one. Actually, since before day one.

Topics: dev

Lessons Learned Writing Data Pipelines

TL;DR It’s hard.

We are in the midst of a data revolution. The sheer volume of data being generated daily is staggering and organizations are trying desperately to take advantage of it. However, there are a number of barriers stopping them from being able to successfully gain insights from their data.

Topics: dev

Why We Built Our Data Platform on AWS, and Why We Rebuilt It with Open Source

Astronomer is a modern platform built to outfit organizations with a solid data infrastructure to support machine learning and analytical workloads. In other words, we help you organize, centralize and clean your data through a personalized data engineering experience. We exist because we believe the internet age is just a precursor to something much larger, something with the potential to push the world forward in the same ways the Agricultural and Industrial Revolutions did: the Data Revolution.

Topics: big data startups dev

Aries: A Source for Your Data Pipeline

Few things have been as challenging and rewarding as learning the ropes here at Astronomer. I’ve been a process engineer for nearly a decade, following a very nonlinear career progression that ultimately led to a fun fusion of electrical/mechanical aptitude, problem solving and software development. Contracting with a team that delights in reverse engineering solutions for customers was the perfect fit!

Topics: dev

Sneak Peek at our New Backend

Now that Astronomer has expanded outside of clickstream data processing, we run a lot of batch jobs. Our "minimum viable product" solution was to use Amazon Simple Workflow Service (SWF) to keep track of the workflows, but we felt a bit uneasy being so tied into Amazon, and people experienced with existing big data tools told us it was "weird."

Topics: big data dev

5 JavaScript Tools to go from Developer to Data Scientist

In 2011, the consulting firm McKinsey & Co made headlines when they predicted that in a mere seven years the newly minted "Data Scientist" role would have a 200,000-person talent deficit. This prediction gave enormous credibility to the idea that the economy as a whole was moving toward becoming more data driven, and the study continues to be quoted even today, appearing as recently as last December in TechCrunch. Although different research firms and publications may give slightly different deficit predictions and timelines, there is a consistent message that can be found throughout all of them. Namely, we don’t have enough people with the skills to do the job. And we won’t anytime soon.

Topics: data science dev