Astronomer’s push-button data infrastructure connects data from any source and sends it to any destination in real time, including marketing and sales analytics tools, data warehouses, dashboards or anywhere else.
Companies get automatic access to any data they need in a centralized location and a consumable format.
Actually, we’re not that different from traditional astronomers, who draw connections among seemingly disjointed white dots to reveal insights.
You might be wondering if your internal team could just pull this together. Technically, with the right APIs, applications, departmental databases and even third parties, it's possible to move data into a data warehouse, analytics tools or other destinations. But the time, money and frustration involved in getting access to what you need can cost your organization dearly. Here is what Astronomer offers:
Most often, we see to it that your data gets securely stored in the destination of your choice—and monitor it regularly to ensure that it continues to flow efficiently. That allows you to maintain full control.
We have many partners to whom we can refer you for this. What we've found is that most of our customers need to get data into a preexisting dashboard or analytics tool. It's extracting the data that's the pain point. We solve that problem and automatically transfer any data you need to any tool you want.
Machines are what we affectionately call our platform and products, while humans are our engineers and the people responsible for customer success. We believe a great platform is best implemented when a team of humans is on standby to help.
That’s the best question you could have asked!
We use the following rubric to determine how difficult data is to access:
Criteria for Difficult-to-Access Data:
By these criteria, we’ve identified 15 potential opportunities to take advantage of data that is otherwise difficult to access. For more information, take a look at the next question.
We'll lay it out here, but if you know you want to dig in, read this blog post.
Difficult by Inherent Property
Difficult by Human Interaction/Choice
Difficult by Process (Lost in Translation)
The unique part of working with Astronomer is that we get everything set up for you. From the moment you get started, our Customer Success team will work with you to collect the credentials and permissions needed for setup. If you’re using existing integrations, you can typically get set up within a few hours.
Our platform is built on Apache Airflow and a host of Apache software, including Mesos, Kafka, Beam and Spark.
Using Docker, we package everything up in containers, which allows us to break data pipelines into distinct tasks within a workflow management system and ensure not just complete adaptability within data pipelines, but also total portability to new environments.
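To make the idea of breaking a pipeline into distinct tasks concrete, here is a minimal sketch in plain Python. It is not Airflow code and not our production setup: the task names, the sample rows and the `DEPENDS_ON` mapping are all hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for a real workflow manager.

```python
from graphlib import TopologicalSorter

# Hypothetical tasks: each one does a single job and hands its result on.
def extract(_):
    # Stand-in for pulling rows from a source system.
    return [{"user": "a", "clicks": "3"}, {"user": "b", "clicks": "5"}]

def transform(rows):
    # Cast the click counts from strings to integers.
    return [{**row, "clicks": int(row["clicks"])} for row in rows]

def load(rows):
    # Stand-in for writing to a destination; here we just total the clicks.
    return sum(row["clicks"] for row in rows)

TASKS = {"extract": extract, "transform": transform, "load": load}

# Each task maps to the set of tasks that must finish before it runs.
DEPENDS_ON = {"extract": set(), "transform": {"extract"}, "load": {"transform"}}

def run_pipeline():
    # Resolve a valid execution order from the dependency graph.
    order = list(TopologicalSorter(DEPENDS_ON).static_order())
    result = None
    for name in order:
        # Passing the previous result along only works for a linear chain.
        result = TASKS[name](result)
    return order, result
```

Because each task is a separate unit with declared dependencies, any one of them can be swapped out or rerun without touching the others, which is the property the container-per-task approach relies on.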
We are in the process of bundling up all of our services to run inside Docker containers, giving us a layer of portability. We've also migrated away from AWS managed services, in favor of open source alternatives (we chose Airflow). By migrating everything to open source technologies, we become very portable. We run on our cloud, or yours.
An integration refers to the data from a very specific location (like Google Analytics, MixPanel, Facebook ads and hundreds more).
Think about it this way: We’re all collecting data like crazy, but it’s not doing us much good. Most of it is sprawled all over the place and totally disconnected.
It’s like discovering oil for the first time. We can buy land, dig wells, build refineries, even construct power plants. But it takes pipes to transport it.
Data is a commodity. Data pipelines are the vehicle to centralize all of your data, even the hardest to reach.
Currently, our platform does not support native web scraping. However, we have an integration with Import.io that can build a web scraper without needing to write any code. Import.io can be run daily for free and the Astronomer team is happy to assist in setting anything up.
Import.io's platform is robust and stable, and is used by companies as large as Accenture and PwC.
Currently, our platform cannot scrape data from PDFs. We have some ideas for how to do this (interesting existing project here: https://www.npmjs.com/package/pdf-text-extract) but we're looking for a customer with a clear use case and need so we can raise it to the top of our roadmap.
Aries is an open source collection library to pull, push, and transform data from any source (e.g. API, DB, FTP) and to any destination.
To sum it up in one sentence: Webhooks are a way for third-party systems to notify other systems of things that happen on their end.
Let us explain: A webhook lets one system notify other systems of events as they occur. A good example is Stripe. Suppose a customer uses Stripe to process monthly subscription payments. Say a few months down the line, that customer's credit card expires and causes an error with Stripe.
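A minimal sketch of what the receiving side of a webhook might look like, assuming the sender delivers a JSON body with a `type` field. The event name `payment_failed`, the `customer_id` field and the handler are hypothetical and do not reflect Stripe's actual payload format.

```python
import json

def notify_billing_team(event):
    # Hypothetical action taken when a payment fails.
    return f"billing alert for customer {event['customer_id']}"

# Map each event type we care about to the function that handles it.
HANDLERS = {"payment_failed": notify_billing_team}

def handle_webhook(body: bytes) -> str:
    # Parse the JSON body that the third-party system sent us.
    event = json.loads(body)
    handler = HANDLERS.get(event.get("type"))
    if handler is None:
        # Event types we have not subscribed to are acknowledged and skipped.
        return "ignored"
    return handler(event)
```

In a real deployment this function would sit behind an HTTP endpoint whose URL is registered with the sending system, and the request would be verified before it is trusted.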
We can be set up in the same way to accept webhooks (it’s as easy as copy/pasting
For a few major reasons:
Astronomer categorizes data into two types: real-time and batch.
Excellent question! We at Astronomer believe that your data is only as valuable as the questions you can answer with it and that’s why we’ve partnered with a number of business intelligence and visualization tools, including Chartio, Mode Analytics, Periscope, and SimplyInsight. We also have a number of data science consultancies we’re happy to refer you to, or, if you prefer, we also provide custom data science solutions for $125/hour.
Our customers are the most important humans around. So we have a Customer Success team devoted to easing you from discovery to full implementation to what we call "crushing it" (i.e., getting to immediate value). If you want to know more about what the Astronomer experience is like, check out our humans page.
Clickstream data tracks every “click” a customer makes across a website or app. Many tools measure these clicks, but there are over 3,500 marketing tools to choose from. Not only are the data sets disjointed, but data is often imprisoned within each tool, so it’s really hard to change tools or combine them. To see a full picture of how a customer interacts with a website or app, it’s important to see all of these clicks as one cohesive body of information. With Astronomer, an organization can broadcast any data to its analytics tools or dashboard, or connect it all in one central location like a data warehouse or other destination. A simple interface lets users press a button to add or change any number of services, or test out multiple tools at once, without losing any data.
An event is typically anything that would happen within someone’s application. Some examples include:
Tracking pixels are a common but old-school way to track events. All you do is place an
Analytics.js, on the other hand, allows for richer data to be collected. It can provide more detailed event snapshots since you can set multiple triggers on the same page to describe different actions the end user is taking. Companies use analytics.js when they need specific details, like how long someone is on the page or what elements they interact with on that page.
When using analytics.js, however, pixel integrations can easily be enabled; analytics.js simply acts as a tag manager. In the case of a pixel integration, like Facebook, the pixel parameters are already bundled inside the library.
You can definitely have a one-to-one replica of your database where the destination table (in Redshift) is dropped and rebuilt at each scheduled run of the workflow. However, this isn’t the most efficient approach in terms of either time or processing power, so if time is a factor or you’re often running a large number of resource-intensive queries, we don’t recommend this option.
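A minimal sketch of the drop-and-rebuild approach, using Python's built-in `sqlite3` as a stand-in for Redshift. The `users` table and its columns are hypothetical, and this is an illustration of the pattern rather than our actual integration code.

```python
import sqlite3

def full_refresh(conn, source_rows):
    cur = conn.cursor()
    # Throw away whatever the previous run left in the destination table.
    cur.execute("DROP TABLE IF EXISTS users")
    cur.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    # Reload every row from the source, whether or not it changed.
    cur.executemany("INSERT INTO users (id, email) VALUES (?, ?)", source_rows)
    conn.commit()
```

Every run rewrites the whole table, which is why the cost grows with the size of the source rather than with the number of rows that actually changed.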
Amazon Redshift, unfortunately, does not offer a traditional method of upserting (where records are updated if they already exist or added to a table if they don’t yet exist). An alternative we offer is what we call a “pseudo-upsert,” in which we temporarily create a staging table within your database and use it to update existing records.
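One common way to implement a staging-table upsert can be sketched as follows, again with `sqlite3` standing in for Redshift. The table names, the columns and the delete-then-insert strategy are assumptions made for illustration; the exact statements used by our Redshift integration may differ.

```python
import sqlite3

def pseudo_upsert(conn, incoming_rows):
    cur = conn.cursor()
    # 1. Create a temporary staging table and load the incoming batch into it.
    cur.execute("CREATE TEMP TABLE users_staging (id INTEGER PRIMARY KEY, email TEXT)")
    cur.executemany("INSERT INTO users_staging (id, email) VALUES (?, ?)", incoming_rows)
    # 2. Remove target rows that the batch is about to replace.
    cur.execute("DELETE FROM users WHERE id IN (SELECT id FROM users_staging)")
    # 3. Copy the whole batch over: replaced rows and brand-new rows alike.
    cur.execute("INSERT INTO users (id, email) SELECT id, email FROM users_staging")
    # 4. Drop the staging table so nothing is left behind.
    cur.execute("DROP TABLE users_staging")
    conn.commit()
```

Rows that are not part of the incoming batch are left untouched, so only changed and new records cost any work.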
For more information on how the Redshift integration works, please visit GitHub to see our open source repo.
We have a full stack of integrations in development (or planned) for the next 6 months and beyond. Our roadmap for future integrations is driven by what we expect to have the greatest impact, based on the trends and demands we see among our prospects and customers. That said, we are able to pivot our integration development based on the needs of a current or prospective customer.