Notes - Security for AI and Data Security (Pt.2)

Summary
- Data security is rapidly becoming a core pillar of cybersecurity as AI adoption and data value surge, driving new demand for protection and compliance.
- DSPM has emerged as the leading data security category, with most top startups already acquired by incumbents like PANW, IBM, and RBRK.
- Platform vendors are best positioned to win as DSPM becomes embedded within broader cloud, endpoint, and backup ecosystems.
- Real-time prevention, granular governance, and broader SaaS/PaaS coverage are key areas for innovation and differentiation going forward.
- Despite consolidation, the market is still in its early innings technologically. As with cloud security in the 2010s, we expect further waves of new startups.
Data security is rapidly emerging as a critical category within cybersecurity, especially with the rise of AI and growing concerns around securing data for AI systems. While this segment is still in its early stages, much like the first generation of cloud security startups, it’s worth exploring what the future of security for AI could look like. In the near term, early-stage startups in this space may be acquired because the market isn’t yet large enough to support independent growth. Established incumbents are likely to dominate until the next wave of startups arrives with innovative solutions tailored to new demands.
Historically, data security has not been a standalone category, but rather an extension of other security disciplines. Data Loss Prevention (DLP) has traditionally been at the heart of data security, designed to prevent sensitive information from leaking — whether through insider threats or external attacks. Early on-premises solutions focused on endpoint DLP, such as Vontu (acquired by Symantec, now Broadcom), and server-based Data Access Governance (DAG) like Varonis (VRNS). Endpoint DLP secures data in motion, preventing sensitive files from being exfiltrated via USB drives or web uploads, while DAG solutions protect data at rest by managing access permissions (i.e., read/write permissions) and maintaining audit trails.
Despite their promise, both DLP and DAG were niche products that never reached the scale of endpoint or network security solutions, which have produced several major public companies. This limited success was partly because data wasn’t always seen as a highly valuable asset, and hackers had less incentive to target it. Moreover, early DLP and DAG solutions offered limited granularity, protecting at the file level rather than the data-string level, creating significant loopholes for sophisticated attackers.
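To make the granularity gap concrete, here is a minimal sketch contrasting a file-level check with a data-string-level check. Everything in it is an illustrative assumption rather than how any particular DLP product works: the `Document` type, the `label` field, the card-number pattern, and both `*_block` functions are hypothetical names. The string-level check uses a regular expression plus the Luhn checksum to find card-like numbers in the content itself.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: 13-16 digits, optionally separated by spaces or dashes.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


@dataclass
class Document:
    name: str
    label: str    # assumed file-level classification, e.g. "confidential"
    content: str


def file_level_block(doc: Document) -> bool:
    """File-level policy: decide only from the file's label."""
    return doc.label == "confidential"


def string_level_block(doc: Document) -> bool:
    """String-level policy: inspect the content for card-like numbers."""
    for match in CARD_PATTERN.finditer(doc.content):
        digits = re.sub(r"\D", "", match.group())
        if 13 <= len(digits) <= 16 and luhn_valid(digits):
            return True
    return False


# A sensitive string pasted into an unlabelled file slips past the
# file-level check but is caught by the string-level check.
pasted = Document("notes.txt", "public", "card 4111 1111 1111 1111 on file")
print(file_level_block(pasted), string_level_block(pasted))
```

The example number is a widely used test card number, not real data. The loophole shows up whenever sensitive content leaves its labelled container, which a file-level policy cannot see.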
Generally, DLP has been more widely adopted as a safeguard against employees exfiltrating trade secrets to competitors, and for a time, DLP was practically synonymous with data security.
However, the landscape is changing fast, especially in the context of AI. Data, both structured and unstructured, is becoming increasingly valuable. Hackers have evolved their monetization strategies: first by encrypting victims’ data for ransom, and now by exfiltrating data to sell or to use as leverage, for example by threatening to report the breach to regulators, which could result in hefty fines for victims.
For enterprises, structured data is more useful than ever, especially with the rise of Modern Data Stacks (MDS). Meanwhile, advancements in machine learning and LLMs are transforming unstructured data — once considered a burden — into a valuable asset. Today, proprietary, curated data is a goldmine, and protecting it is more important than ever.
As the value of both structured and unstructured data continues to rise, it’s logical, and necessary, for organizations to invest more in data security.
In recent quarters, DSPM (Data Security Posture Management) has broken out as a major market segment, with several startups acquired by incumbents. We first anticipated the rise of the DSPM segment back in July 2023.
The logic behind DSPM is straightforward: data is the most critical asset for LLMs. As the supply of high-quality public data is exhausted and enterprises increasingly demand highly accurate, production-grade GenAI applications tailored to their own environments, organizations are prioritizing their data landscapes. This begins with investing in robust MDS and is followed by a growing willingness to invest in data security solutions, especially as the value of data continues to rise, powered by LLM advancements.
What is DSPM?
Think of DSPM as the natural evolution from CSPM (Cloud Security Posture Management). It extends the agentless, API-based approach of CSPM — originally used for ensuring infrastructure configuration hygiene — into the data layer. In essence, DSPM is about making sure that the configuration and handling of data within IT infrastructure are secure, visible, and compliant.
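As a rough illustration of the posture idea, the sketch below evaluates an inventory of data stores against two simple rules. It assumes the inventory has already been collected agentlessly, for example from cloud provider APIs, and that each store has already been classified. The `DataStore` fields, the `SENSITIVE` tag set, and the rule wording are all hypothetical and not taken from any specific DSPM product.

```python
from dataclasses import dataclass

# Hypothetical classification tags treated as sensitive.
SENSITIVE = frozenset({"pii", "financial", "secrets"})


@dataclass(frozen=True)
class DataStore:
    name: str
    public_access: bool
    encrypted_at_rest: bool
    classifications: frozenset  # tags assumed to come from a prior scan


def posture_findings(stores):
    """Return (store name, issue) pairs for stores holding sensitive data."""
    findings = []
    for store in stores:
        if not (store.classifications & SENSITIVE):
            continue  # no sensitive data found, so no finding in this sketch
        if store.public_access:
            findings.append((store.name, "sensitive data is publicly accessible"))
        if not store.encrypted_at_rest:
            findings.append((store.name, "sensitive data is not encrypted at rest"))
    return findings


inventory = [
    DataStore("marketing-assets", True, False, frozenset({"public"})),
    DataStore("customer-exports", True, True, frozenset({"pii"})),
    DataStore("billing-db", False, False, frozenset({"financial", "pii"})),
]

for name, issue in posture_findings(inventory):
    print(f"{name}: {issue}")
```

The point of the design is that findings depend on both the configuration and what the data is: the publicly readable `marketing-assets` store raises nothing, while the same setting on `customer-exports` does.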