Why we built Teleskope

As we wrap up our trip to RSA 2024, it’s clearer than ever that the data security space is flooded with vendors listing dozens of acronyms, trying to be everything for every customer. There is so much posturing and marketing that it can be hard to cut through the noise to figure out who is truly innovating and building products that solve real problems and deliver real value. So we thought it would be the perfect time to introduce everyone to Teleskope. We’re creating a different kind of data protection platform, one built by security and engineering practitioners, for security and engineering practitioners, and we’re excited to share our vision with you all.

Our Story

Julie and I first met while working together on the Airbnb security team, where we faced firsthand the challenges associated with collecting massive amounts of data, from both security and privacy perspectives. I was on the data security team, responsible for protecting our customers' data and complying with privacy regulations like GDPR and CCPA.

The Data Maximization Problem

Many organizations struggle to protect their data. As companies amass petabytes of data in pursuit of business value, that data sprawls across their entire ecosystem, spanning internal data stores, warehouses, and third-party vendors. This data maximization poses both business and security concerns. As valuable data gets buried among billions of files that mean little to the business, companies lose the ability to make use of what they’ve collected. Risk also increases, because it becomes impossible to pinpoint where sensitive data lives, where it’s going, who has access to it, and whether it’s protected or at risk. If you don’t know where sensitive data lives, how can you protect it?

The Manual Approach Problem

At Airbnb, we were fortunate to have a large security team composed mostly of software engineers. This allowed us to build our internal data security platform from scratch. But most companies don't have this luxury, and frankly, building an internal data security platform is probably the wrong approach for 99% of companies out there. Data classification is a complex problem, and getting it right requires years of work, constant updates, and regular retraining, which ends up being far more expensive than purchasing a tool. This is why most companies don’t even bother developing tools, and instead tackle the data sprawl problem manually. Classifications are labeled by the dataset owner or by a centralized data governance team. Data redaction happens through manual scripting, and only if a leak happens to be discovered. Data subject rights requests are handled manually via a SQL script that often hasn't been updated in years. The problem with this approach is that it doesn’t scale across structured data stores, data warehouses, unstructured data, and third parties; it captures only a point in time; and it’s prone to human error, leaving significant gaps in your security posture. And for companies that are late to the game and already have massive amounts of unlabeled data, this can be a daunting and expensive effort.

The Data Classification Problem

There are tens, if not hundreds, of products that claim to classify data accurately, quickly, and automatically. It might seem that data classification has been commoditized and is something you can get from an existing vendor or from your cloud provider. But most of those tools fail miserably when applied to messy production data. In our own experience, the majority of classifications they generate end up being false positives, so teams spend more time sifting through the results than they would have spent classifying the data manually. Classification is nuanced and complicated because data is nuanced and complicated. Tabular data, log files, code, legal documents, conversational data, and so on are all inherently different from one another, and a classification engine needs to support all of these data types effectively, since they're the rule, not the exception, in most data ecosystems.
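
To make the false-positive point concrete, here is a minimal sketch of how a pattern-only classifier can behave on log data. It is an illustration under our own assumptions, not a description of how Teleskope or any particular vendor classifies data: the 16-digit pattern, the sample log lines, and the helper names (`naive_matches`, `validated_matches`) are all hypothetical. The only standard piece is the Luhn checksum, which payment card numbers are designed to satisfy.

```python
import re

# Hypothetical rule: treat any standalone run of 16 digits as a card number.
CARD_PATTERN = re.compile(r"\b\d{16}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit, doubling every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def naive_matches(text: str) -> list[str]:
    """Pattern-only classification: every 16-digit run is flagged."""
    return CARD_PATTERN.findall(text)


def validated_matches(text: str) -> list[str]:
    """Pattern plus checksum: keep only runs that pass the Luhn check."""
    return [m for m in naive_matches(text) if luhn_valid(m)]


# Made-up log lines: an order ID, a timestamp-like trace ID, and a
# well-known test card number.
SAMPLE_LOG = """
order_id=1234567890123456 status=shipped
trace=2024051312000000 latency_ms=42
card=4111111111111111 amount=19.99
"""

if __name__ == "__main__":
    print("pattern only:", naive_matches(SAMPLE_LOG))
    print("with checksum:", validated_matches(SAMPLE_LOG))
```

In this sample the pattern alone flags all three values, while the checksum keeps only the test card number. A checksum helps for this one data type, but most sensitive data (names, addresses, free-text notes) has no such built-in check, which is part of why classification across varied data types is hard.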

The Marketing vs Reality Gap

There’s a huge issue in the cyber startup ecosystem, and perhaps the startup ecosystem as a whole, of exaggerating capabilities or marketing features that just don’t exist. It can be very hard to discern what companies actually do versus what they claim to do, and everything is gatekept behind a series of demo calls. In the data security space in particular, existing vendors, ranging from legacy DLP solutions to modern data governance and DSPM startups, claim to have solved the problem of data sprawl. They promise advanced data classification using “the power of GenAI” and automated remediation of data security issues. Their websites list dozens of use cases and features and present the product as a one-stop shop for data security. But these promises quickly crumble when applied to production data. Classifications deliver mixed results, software starts to break down when scanning gigabytes, let alone terabytes, of data, and thousands of “CRITICAL” and “HIGH SEV” alerts get generated for issues that don’t even need to be addressed. Rather than check feature boxes and create noisy environments that bog down security and engineering teams, we’ve taken a different approach.

Our Thesis

We’re building Teleskope to automate data protection, from detection, to remediation, to prevention. And we’re breaking our approach down into three simple tenets:

1. Accurate classifications and actionable insights are the building blocks of any data security program. Garbage in = garbage out.
2. Automated remediation is the only way to enforce data protection at scale.
3. Proactive prevention, including integration with developer tooling, is the only way to ensure you’re not playing catch-up and are instead maintaining data protection by default.

We’re on a mission to build not only the best-in-class data protection platform, but also a transparent one. Stay tuned for more in-depth, technical blogs about all things data security and governance.