Airflow in Action: From Steel City to Data City. Pittsburgh, Astro and Open Government Data

At Airflow Summit, Alida Laney, Data Engineer for the City of Pittsburgh, showed how a two-person engineering team uses Apache Airflow® and Astro to modernize government data infrastructure, deliver open datasets to the public, and drive real operational impact across city departments. Watch the session to see how a small public-sector team architected a scalable, cloud-based data platform under real government constraints, and the concrete outcomes that followed.

Small Team Working with Open Government Data

Pittsburgh is a mid-sized post-industrial city of 300,000 people in the middle of a tech-driven resurgence. The city's Data Services team sits within the Innovation and Performance department and has a clear mandate: fulfill the city's 2014 Open Data Ordinance, which established that government data should be open by default. The team has eight people total, two of whom are data engineers. Those two engineers are responsible for the data needs of an entire city.

The challenges the team faces are largely unique to government. With around 20 departments, each heads-down in their own work, relationship-building is as much a part of the job as writing code. Getting departments to share data requires education, trust, and persistence. Vendors add another layer of friction. Most are not built for the kind of structured, accessible data the team needs. Some offer APIs that are shaky at best. Others provide direct database access so restricted that only a single engineer's laptop is whitelisted. And because nearly every department manages its own vendor relationships, there is no standardization across the data the team ingests. Government procurement makes all of this harder, slowing down access and adding process overhead at every step.

From Google Cloud Composer to Astro: A Critical Upgrade

When the team stood up its cloud data infrastructure in 2020, it started with Google Cloud Composer running Airflow 1. Managing that environment proved challenging, and when the time to upgrade to Airflow 2 came, the team had a decision to make. They found Astronomer's documentation and resources instrumental in evaluating the path forward, and proposed a switch to Astro, the fully managed Airflow service.

The payoff was immediate. The switch saved an estimated 15% of engineering time. Across a two-person team, that works out to about 0.3 of a full-time engineer, or roughly a third of one person's time. Time that had been absorbed by infrastructure management could now go toward building and maintaining the pipelines that actually serve the city and its citizens.

Data Ecosystem at the City of Pittsburgh

Pittsburgh's data universe now spans parks and recreation, human resources, criminal justice, permits, tax data, and more. The team manages dozens of pipelines through Airflow and Astro. That number might seem modest compared to larger Airflow deployments, but each pipeline represents its own cross-departmental project, its own champion, and its own negotiation with a city department.

Figure 1: City of Pittsburgh data universe, orchestrated by Airflow and Astro. Image source.

Data flows to multiple destinations. The Western Pennsylvania Regional Data Center (WPRDC), housed at the University of Pittsburgh, publishes over 100 open datasets maintained by Pittsburgh's Data Services and GIS teams. Those datasets are used by Carnegie Mellon and University of Pittsburgh classes, community organizations like the Black Equity Coalition, and neighborhood groups proving infrastructure needs to secure city funding. One historically underserved neighborhood used WPRDC data to secure funding for two simultaneous traffic calming projects, an outcome that would have been nearly impossible without accessible, reliable data.

The team also built OneStopPGH Insights, an internal and public-facing tool that maps parcels and street segments across the city and surfaces permit data from multiple departments. Over a dozen Dags power it. One department alone contributes 80,000 permits that normalize into roughly 4 million rows, with about a million rows added each month. Without Airflow and Astro, that scale would not be achievable.

Figure 2: City of Pittsburgh OneStopPGH Insights tool. Image sourced from the City of Pittsburgh and Astronomer case study.

From Weeks to Minutes: Astro Delivers

The internal impact is just as significant. The city’s finance department was manually cross-checking property purchase applicants against permit violations and tax delinquency records, a process that took anywhere from a few days to several weeks. The team had already built pipelines on Astro that pulled permit violation and tax delinquency data into BigQuery as part of their broader data infrastructure. When the finance department described its manual checking process, the team realized the data it needed was already there, reliable and up to date. Automating the workflow was a matter of querying those tables and returning the results; no new pipeline was required. The same check now takes 10 minutes.
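To make the idea concrete, here is a minimal sketch of that kind of cross-check. It is not the city's actual implementation: the table names, columns, applicant IDs, and helper functions are all hypothetical, and an in-memory SQLite database stands in for BigQuery so the example runs anywhere. The point is only that once both datasets are already loaded and current, the check reduces to one query per applicant.

```python
import sqlite3

# Hypothetical schema: the real BigQuery tables are not described in the talk.
SCHEMA = """
CREATE TABLE permit_violations (applicant_id TEXT, violation TEXT);
CREATE TABLE tax_delinquency (applicant_id TEXT, amount_due REAL);
"""

# One query returns both the violation count and the total taxes owed.
CHECK_QUERY = """
SELECT
    (SELECT COUNT(*) FROM permit_violations
      WHERE applicant_id = :id) AS open_violations,
    (SELECT COALESCE(SUM(amount_due), 0) FROM tax_delinquency
      WHERE applicant_id = :id) AS taxes_owed
"""


def build_demo_db():
    """Create an in-memory database with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO permit_violations VALUES (?, ?)",
        [("A-100", "unsafe structure"), ("A-100", "overgrowth")],
    )
    conn.executemany(
        "INSERT INTO tax_delinquency VALUES (?, ?)",
        [("A-200", 1250.0)],
    )
    conn.commit()
    return conn


def check_applicant(conn, applicant_id):
    """Look up one applicant in both tables and summarize the result."""
    open_violations, taxes_owed = conn.execute(
        CHECK_QUERY, {"id": applicant_id}
    ).fetchone()
    return {
        "applicant_id": applicant_id,
        "open_violations": open_violations,
        "taxes_owed": taxes_owed,
        # An applicant is clear only if both lookups come back empty.
        "clear": open_violations == 0 and taxes_owed == 0,
    }


if __name__ == "__main__":
    demo = build_demo_db()
    for applicant in ("A-100", "A-200", "A-300"):
        print(check_applicant(demo, applicant))
```

In a real deployment the same query shape would run against the warehouse tables the existing pipelines already keep fresh, which is why no additional pipeline was needed.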

That kind of impact compounds at a city level too. The What Works Cities Certification is a national standard for data-driven local government in the United States, recognizing cities that use data and evidence effectively to improve services and deliver results for residents. Pittsburgh recently earned Gold-level certification, placing it alongside Boston, Los Angeles, and Seattle despite being a fraction of the size and operating with significantly fewer resources. Airflow and Astro directly support a third of the criteria required to achieve that certification, spanning data accessibility, reliability, and policy compliance.

What Comes Next

The team is building an internal data mart using Google Dataplex, designed to put curated datasets directly in the hands of analysts across city departments, orchestrated by Airflow and Astro. More open datasets are on the roadmap, shaped by direct community input.

Alida wrapped up her talk by emphasizing the value of Airflow:

“Airflow does more for us than manage pipelines. It helps us democratize data, improve government efficiency, and move our city forward.”

Watch the full session, Pittsburgh Goes With The Flow - Use Cases In Local Government, to see how a two-person team is using Airflow and Astro to redefine what data-driven government looks like.

For a deeper look at how Pittsburgh made the case internally for migrating to Astro, the specific pipeline failures that forced the decision, and how the team is planning to extend Astro's reach into public safety and finance, read the case study. You can assess the impact of a managed Airflow service by trying out Astro for free today.
