Are You Ready for a Data Breach?

***Disclaimer: The content, data, and claims in this article are for informational purposes only and not for the purpose of providing legal advice. You should contact your attorney to obtain advice with respect to any particular issue or problem you may be facing.***

We at Astronomer have an optimistic—even idealistic—view about the future of data, its handling, and the role it will play in both technological advancement and our personal lives for years to come. We believe that data will aid in everything from revolutionizing the healthcare system to automating toilet paper orders based on your wiping habits (in fairness, Amazon already did this with data!). We even think it could save us from unwittingly becoming a society where a Kimye robot is president, Pokemon never fades back into obscurity, Nickelback and Creed are considered classic rock, and gaining access to everyone’s data is as easy as trolling on Twitter. (Here’s hoping!)

However, our optimism is tempered by a few important things. First, we have deep industry knowledge, so we understand both the fundamental weaknesses and the best practices. Second, we’re a startup, which means while we are starry-eyed dreamers, we’re also a little salty and cynical (a byproduct of working in such a volatile environment day-in and day-out). Third, and lastly, we are observant: we stay up to date with the latest tech news, we follow industry chatter, and we talk with other experts. Through our observations, it is abundantly clear that data—especially when not handled properly—can be a liability, no matter how careful or small you think you are.

Like most things in life, though, there are ways to mitigate and manage risk. In this post, we hope to make you a little wiser when it comes to risks associated with modern data practices through some cautionary tales and common sense tips.

The Nature of Data

The thing is, with big data comes (pretty big) risk, especially if you are a small-to-midsize business (SMB). We all know how Big Data is (or should be) revolutionizing every business on Earth, but at the same time, the risks inherent in advanced data practices have the ability to sink a company completely. Nobody wants that. Am I being overly alarmist? Maybe. But the parlous nature of data is real. The fact remains that over 90% of all data breaches occur in the SMB space, with the average breach costing more than $36,000 per incident (and that cost doesn’t even take into account the lost revenue many SMBs experience due to an average 31% loss of their customer base post-breach). Simply put, the direct financial costs coupled with the residual effects on reputation and customer loss make data breaches potentially business-ending events.

Yet, while data is risky, market forces are demanding the use of it. Even on a small scale, it is becoming more and more difficult to compete in already highly saturated markets without at least reasonably up-to-date data practices. Why is that? Customers are demanding more customized and targeted offerings more often and faster than ever before. They want you to know what they want, how they want it, and where they want to find it before they even know what it is they want. Bottom line is this: data usage has hit a tipping point, and it is only a matter of time until every business out there is powered by data. So, we think it's best to be prepared for impending risks, whether you welcome the change or not.

Why is data a liability?

There is a plethora of legally nuanced and ridiculous ways in which data can become a liability to your organization. However, this is neither an administrative law article nor a tort law class, so we will avoid the absurdity of addressing every possible permutation of liability that could potentially arise. Rather, let’s focus on some big themes and simple solutions. We all love lists, so here’s one.

1. Federal Regulations are Serious

This one is pretty simple: either comply with federal law or don’t. The latter is an ill-advised course of action, but hey, the choice is yours. If your organization is handling any data that is covered by federal or state regulations, complying with the requirements of those regulations is not optional. If you do not comply, you are opening your organization up to civil lawsuits, regulatory audits, fines, criminal liability, and a bad reputation. In short, bad shit that costs money. This is why every legitimate enterprise out there has a compliance desk, a legal desk, or both. And no, this is not a shameless plug for you to hire an attorney; rather, it is a sobering admission that having someone who knows how to navigate these waters can, and will, save you a lot of gut-wrenching pain down the road. That person doesn’t need to be a lawyer, and they don’t need to be in-house. I learn important legal considerations and concepts about the data industry from “lay people” every day. Just have a person.

2. You Signed a Contract - Do What You Said You Would

Terms of Use. Terms of Service. Privacy Policy. Services Agreements. These are all different contracts that your organization enters into with its customers, users, and clients. As such, if you don’t do what you say you are going to do (or you do something you said you wouldn’t) with regard to the data and information you collect from your clients and users, you open yourself up to liability. Again, this is pretty straightforward and logical thinking, but things can get hairy if you don’t know what your agreements say or what your policies are. For example, I’ve heard of many companies wholesale copying privacy policies from other websites. While this is ostensibly clever and efficient, what if the entity from which you took the policy has some wacky and restrictive data sharing clause in its privacy policy that directly precludes you from doing something that is core to your business or standard practice for your organization? Not so efficient now, eh? Simply put, know what you are agreeing to or promising to do (or not do).

3. Just by Existing You Are a Liability - Know What To Do

As mentioned above, market forces are demanding the use of data to drive better business. Yet the mere existence of data makes you a target. Literally (and I do mean “literally”), there are billions of cyber attack attempts per day. So, statistically speaking, it isn’t a matter of “if” but “when” you will experience some sort of attack event on your data. We all face this omnipresent threat, so it is best to prepare accordingly.

First, two words: plausible deniability. We recently had a cyber security audit done (and we are happy to report that we are on top of our security and that our CTO’s product roadmap falls in line with, and even goes beyond, security best practices, but I digress), and one of the main points the auditor made was to always ensure plausible deniability. Simply put, do not open your organization up to unnecessary liability on the off chance something goes wrong. In practice, this means: 1) not storing any personally identifying information or payment information unless absolutely necessary; 2) double encrypting data coming in and out of your organization so you never actually see it, even if you wanted to; and 3) not archiving data or records unless otherwise required by your business or by law. In short, maintain your organization’s ability to deny liability by not having access in the first place.
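To make points 1 and 3 above a little more concrete, here is a minimal Python sketch of what "don't store it" and "don't archive it" might look like in code. The field names, the `minimize` and `is_expired` helpers, and the retention window are all hypothetical, chosen only for illustration; what counts as sensitive and how long you must keep records depends on your business and the law.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical field names; replace with whatever your own records contain.
SENSITIVE_FIELDS = {"name", "email", "phone", "address", "card_number"}


def minimize(record: dict) -> dict:
    """Return a copy of the record with identifying and payment fields dropped."""
    return {key: value for key, value in record.items() if key not in SENSITIVE_FIELDS}


def is_expired(stored_at: datetime, retention_days: int, now: datetime) -> bool:
    """True when a record is older than the retention window and can be purged."""
    return now - stored_at > timedelta(days=retention_days)


# Usage: strip sensitive fields before anything is written to storage.
incoming = {"order_id": 17, "email": "a@example.com", "total": 9.5}
safe_to_store = minimize(incoming)
```

The idea is simply that data you never wrote down, or already purged, cannot be stolen from you.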

Next, have protocols in place to deal with security breaches. Our auditor suggested that we go so far as role-playing a critical situation so we know exactly who would be doing what and when. This practice and preparedness will help mitigate any damage that may be done. Recently, a company you may have heard of, called Target, had a data breach in which over 40 million credit card numbers and 70 million addresses and phone numbers were successfully stolen. Old news. What’s interesting, though, is how the breach was dealt with internally. Target did in fact have controls and tools in place to prevent and warn against this exact sort of breach. However, when it came time for the controls to do their jobs, many of the warning signs and alerts went unheeded, arguably because the protocol wasn’t solidified. Could the breach have been prevented? Maybe not. But damage could have been mitigated, and sometimes mitigation is the name of the game.
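One way to keep alerts from going unheeded is to write down, ahead of time, who gets paged and when. The following Python sketch is only an illustration of that idea: the escalation chain, the 15-minute acknowledgement window, and the `who_to_page` function are assumptions made up for this example, not a description of any particular company's process.

```python
from datetime import datetime, timedelta

# Hypothetical escalation chain, in the order people are paged.
ESCALATION_CHAIN = ["on-call engineer", "security lead", "CTO"]
# Hypothetical window each person has to acknowledge before escalation.
ACK_WINDOW = timedelta(minutes=15)


def who_to_page(raised_at: datetime, now: datetime, acknowledged: bool):
    """Return who should be paged for an alert, or None once it is acknowledged."""
    if acknowledged:
        return None
    # Number of full acknowledgement windows that have passed without a response.
    steps = max(0, int((now - raised_at) / ACK_WINDOW))
    # Stay on the last person in the chain once it is exhausted.
    return ESCALATION_CHAIN[min(steps, len(ESCALATION_CHAIN) - 1)]
```

An unacknowledged alert raised at noon would go to the on-call engineer first and to the security lead by 12:15, so nobody has to decide in the moment whose problem it is.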

Lastly, make sure your CTO or technical director is savvy enough to build a system that isn’t completely interdependent. I’m a technical layperson, so this seems logical to me and may be blindingly obvious to others, but ensure that your systems can operate discretely, and that access to one entry point doesn’t necessarily mean access to all entry points.
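As a rough illustration of "access to one entry point doesn’t mean access to all," here is a small Python sketch in which each credential is scoped to the specific systems it needs. The `Credential` class, the service names, and the system names are all hypothetical; a real setup would lean on your cloud provider's or database's own access controls.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    owner: str
    scopes: frozenset  # the only systems this credential is allowed to reach


def can_access(credential: Credential, system: str) -> bool:
    """A credential reaches a system only if that system is in its scopes."""
    return system in credential.scopes


# Hypothetical services, each holding a credential for its own system only.
billing = Credential("billing-service", frozenset({"payments_db"}))
analytics = Credential("analytics-service", frozenset({"events_warehouse"}))
```

If the analytics credential leaks, `can_access(analytics, "payments_db")` is still false, so the damage stays contained to one system.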

A Final, Positive Note

Admittedly, this has been a bit of a scare-filled post, but this exercise is critically important. Even in the face of overwhelming statistics and evidence, many of us collectively suffer from one of the worst cases of optimism bias (or valence effect) in recent history: “Data breaches happen, but not to us.” We’ve all willfully turned a blind eye to a threat lurking in the not-too-distant shadows of the internet. I’m as guilty as the next. So, it serves us all best to be candid about these shadows and the challenges they present to cutting-edge and bullish businesses.

Now, all of that being said, data is nothing short of amazing. It continues to open incredible new economic opportunities. It’s why we do what we do. We wake up every day breathing it, loving it, and nurturing it. We help others grow and tend their own data. We want everyone to experience the ecstasy of truly understanding why something is happening the way it is and how manipulating single levers affects the outcomes. We live for the “HOLY SHIT!” look on our data scientist’s face when he finds something mind-bogglingly interesting or obscure. We are betting our livelihood, our reputation, and our damned hard work on the importance of data. We know that it is only with data that we avoid a world where Nickelback replaces Zeppelin. Not on our watch.
