
Data-Broke: U.S. Tech Firms’ Counterintelligence Dilemma

Nearly a decade has passed since Chinese state-backed hackers breached the U.S. Office of Personnel Management (OPM) in the spring of 2015. The operation netted Beijing the detailed backgrounds and personal data of over twenty million federal employees, clearance-holders, and applicants, as well as those of their cohabitants and spouses, making it one of the most damaging counterintelligence breaches in U.S. history. Assessing the loss, former Central Intelligence Agency (CIA) and National Security Agency (NSA) chief Michael Hayden offered a blunt, sobering take: “It remains a treasure trove of information that is available to the Chinese [Communist Party] until the people represented by that information age off. There’s no fixing it.” In his estimation, the impact of the breach would take a generation or more to fully subside, until the youngest members of the federal workforce at that time ultimately retired.1

Over the following decade, it would become clear that such counterintelligence hazards would hardly subside at all, for reasons that were not yet fully evident but still perhaps predictable. The aggregation of personal and location data on American consumers, including military service members, intelligence officers, national security officials, and contractors, would become part and parcel of the data-driven advertising behemoth that underpins the modern digital economy. While the threat of sophisticated cyber breaches into sensitive datasets remains, another trend both adds to and compounds the hacking risk: there is little need to steal through cunning espionage what can be obtained at little cost or effort in a vast and open data marketplace.

The question is whether such a status quo can be sustained without causing irrevocable damage to the counterintelligence interests of the United States. Can consumer data be treated as a “strategic resource,” as the most recent National Counterintelligence Strategy asserts, from both the commercial and security perspectives simultaneously?2 Or will one necessarily come at the expense of the other? As the age of “Big Data” and advances in computing have birthed the Artificial Intelligence era, these questions require urgent attention from policymakers. In what follows, we argue that they have, indeed, at critical points come into underappreciated but serious tension.

From OPM to Equifax to Salt Typhoon, the issue is now less that a single sensitive puzzle piece might be collected by U.S. adversaries than that a holistic mosaic has already been aggregated: a vivid and detailed picture of U.S. military, intelligence, and national security rank-and-file personnel is coming into view for any sophisticated adversaries who care to look.

Military Jogging Routes and AdTech Targeting Packages

It’s no secret that the entities that collect and store people’s data are vulnerable to hacking. What is far less understood is the degree to which companies building and managing smartphones, laptops, mobile apps, websites, and myriad other digital technologies and interfaces all collect, aggregate, analyze, and share people’s information. Jogging enthusiasts and open-source intelligence researchers brought this problem to the fore in 2018 when they revealed that Strava, a software application linked to Fitbit devices, had been publicly posting the geolocations of its users. This oversight enabled curious online sleuths to watch U.S. military forces and intelligence officers in countries around the world as they jogged around forward operating bases and visited, presumably, safe houses.3 While hardly a responsible privacy practice (even average runners might find it disturbing to have their total run history, with timestamps, made available online without a password), what would have been an obvious, serious breach in a government context was the product instead of shockingly widespread industry data practices.

Corporations that are even less understood accelerate this personal data collection, aggregation, and analysis further. Companies that manage real-time bidding networks—algorithmically run online auctions for other companies to buy “ad space” on an app, website, and so forth in order to show people ads—make reams of personal data available to countless entities sitting in the virtual auction house, on a constant basis, every single day.

Data brokers, those companies in the business of aggregating and selling people’s data, likewise base their business, and thus their profit margins, on repeatedly selling information they’ve bought, compiled, and inferred to a wide range of buyers, largely at their own discretion.

Connected devices used by American consumers now routinely come with third-party software installed which transmits information about users, activity, and location to the digital advertising market—data brokers and bidding exchanges—to be packaged, traded, and sold. And companies that want to target people with particular messages, and then collect data on their responses, can leverage adtech companies to profile and reach individuals.

The prodigious volumes of data both collected and publicly available on data markets illustrate the remarkable extent to which Americans’ personal information is rendered vulnerable to being hacked, stolen, and compromised. This fact alone should put OPM-style hacks into perspective. As NSA General Counsel April Doss wrote in 2020, governments pose but one facet of the challenge: “Data collected by national security programs [have come to] pale in comparison to the exquisitely detailed user profiles that are being amassed” either by, for, or on behalf of the U.S. tech sector.4

From Dating Apps to General Motors

A host of real-world events and civil society investigations from the post‑OPM decade illustrate why the explosion of commercially available Big Data complicates and accelerates the counterintelligence dilemma facing the United States. Researchers have long demonstrated the ease of identifying, targeting, and even inducing U.S. and allied military service members through their use of social media, dating, and messaging apps.5 Their findings highlight the already high risks posed by poor organizational and operational security in the digital era.

More recently, a group of journalists outlined how even the most conscientious users, including those playing sensitive roles in government, intelligence, or the military, would be hard-pressed to extricate themselves from the digital ecosystem hard-wired into their devices. The team accessed geolocation and related data from in and around a U.S. military installation in Germany, one which is said to house, among other things, elements of America’s nuclear arsenal and intelligence collection platforms. Their assessment is alarming: “Not only is [such] data collection likely capable of revealing military secrets, it is essentially unavoidable at the personal level. . . service members’ lives being simply too intertwined with the technology permitting it.”6

Attempts to cordon off specific locales from such data-harvesting are unlikely to alleviate such counterintelligence concerns. The vast majority of modern smart devices require some degree of geolocation data to function properly. Even if tech companies were prohibited from collecting and selling geolocation data around specific sensitive facilities, the prohibition would fail to cover everywhere else affiliated personnel travel, everyone else they associate with, and everything else they do.7 U.S. law generally requires an affirmative opt-in from users to collect and sell geolocation data, but this appears to serve more as a speed bump (or speeding ticket) than as an adequate barrier against the targeting of U.S. officials and installations.8 Just as the average citizen flies past the privacy policy and terms of service briefly popped onto their screens, clicking “agree” without so much as a substantive glance, so, too, do many U.S. government affiliates, rendering their supposed opt-ins just as illegitimate in practice. The Federal Trade Commission (FTC) has historically taken only a small number of enforcement actions against violators, most recently against General Motors.9 Such actions, however, are mostly designed to bring offending organizations into full (or stricter) compliance with the law, while any punitive fines for unscrupulous activity ($51,744 per violation) are likely insufficient to serve as a deterrent in the broader multi-billion dollar digital advertising industry.10

The inability or unwillingness of the industry to police itself essentially means the data brokerage ecosystem can only be as ethically or legally respectful of consumer privacy as its least scrupulous participants. As legal scholar Andy Wang argues, “The magnitude of harm arising from one broker’s activities depends on what data other brokers in the network are selling.”11

Unlike tangible goods, data is an endlessly duplicable, non-exhaustible, and non-rivalrous good. A “tragedy of the commons” thus prevails: if one broker accumulates and sells nonconsensual or otherwise protected data, the compliance efforts of others can be mooted. Compounding the issue is that so-called know your customer (KYC) practices are either nonexistent or inconsistently applied throughout the data brokerage industry. According to a team of scholars from Duke University, “A malicious actor could easily lie their way around many data brokers’ lax KYC controls, or simply find a broker with virtually no KYC practices whatsoever.”12

Meanwhile, the promise that the data collected and aggregated on U.S. citizens can be safely and irreversibly “anonymized” has been repeatedly debunked as wishful thinking by researchers. For instance, a 2019 study at Imperial College London drew on just fifteen demographic attributes to reidentify individuals in an anonymized dataset, concluding that 99.98 percent of Americans could likewise be reidentified, “seriously challeng[ing] the technical and legal adequacy of the de-identification release-and-forget model.”13 Nordic academics in 2021 likewise found that a year’s worth of usage data by 3.5 million people from as few as four mobile apps was sufficient to reidentify 91.2 percent of them by cross-referencing publicly available information.14 “Americans are the easiest to re-identify,” the authors noted, raising the stakes in a society where only a small handful of companies control the entire mobile app ecosystem.

Endless hype cycles about AI and the future of the digital economy only further accelerate companies’ efforts to collect vast amounts of information, aggregate and analyze disparate datasets, and monetize and sell data previously collected for a limited purpose. In other words, they accelerate the collection and dissemination of what is, at the end of the day, effectively exploitable intelligence on U.S. assets and personnel.

Protectionism, AI Competition, and Strategic Vulnerability

In 2025, national-level economic planning and industrial policy have largely shed the stigma that once surrounded them. As competition with China continues to intensify and a second Trump administration hits its stride, all eyes are on a U.S. tech sector that has achieved an almost mythical strategic significance. Beyond their innovative capacities, the U.S. federal bureaucracy and military-industrial complex have become widely dependent upon—if not inextricably linked with—the major tech multinationals, such as Microsoft, Google, Amazon, Palantir, Starlink, OpenAI, and others. The continued success of these firms is thus considered as much an issue of national security (however spurious those claims may be) as one of geoeconomic vitality.

But as the global scramble for data-fueled AI supremacy continues unabated, pitting the United States against its peer adversary, China, Washington has adopted a protectionist stance in just about every aspect but one: the ubiquitous collection and permissive availability of data on its citizens. Indeed, the United States is far and away the global leader in both the number of data categories available for purchase and the number of companies who provide them.15

Tech sector leaders are keenly aware of this fact. Like many other U.S. industries throughout history (particularly those wary of regulation that might curb anti-competitive or anti-consumer activity), the “national tech champion” card has long served as something of a free pass.16

Particularly in the era of hyper-competition for AI supremacy (however ill-defined the concept17), the U.S. tech sector has largely been victorious in its lobbying push to make the United States among the most permissive locales on earth to collect and trade data on consumers.18 This is despite valiant cajoling by some privacy advocates, regulators, and legislators in recent years. This quintessentially American laissez-faire approach to online privacy has prompted individual states like California, Vermont, and others to take up the burden of introducing their own curbs on the unfettered agglomeration of personal data.19 Valuable as they may be for individuals’ privacy, such piecemeal endeavors are not enough in and of themselves to capture and curtail the wider canvas of personal data collection and sharing—and the counterintelligence risks that come with it.

Lacking much progress from Congress on a comprehensive privacy overhaul, successive administrations have attempted, however belatedly and piecemeal, to curb the outflow of sensitive data to adversaries through espionage or digital commerce. For instance, after the Strava-related data leak of early 2018, the Department of Defense responded in August of that year by restricting the use of geolocation-enabled devices in areas of active military operations.20 The directive ordered a broader risk assessment about digital geolocation, while Pentagon staff noted that enforcement of the order would be conducted on a “case-by-case basis.”21 By spring 2023, the Securities and Exchange Commission (SEC) had rolled out minimum cybersecurity standards and risk assessment guidelines for publicly traded companies.22 The Biden White House had also issued an executive order forbidding the use of commercial spyware by U.S. departments and agencies in a bid to use federal purchasing power to constrain the burgeoning global market for sophisticated tracking and surveillance technologies.23

While these measures implicitly or indirectly nodded to the growing counterintelligence threat to U.S. officials and service-members, a subsequent Biden directive in 2024 was both more blunt and expansive. Executive Order (EO) 14117 tasked the attorney general with determining what kinds of data, irrespective of volume, could be exploited by specific adversarial states if linked to active or former federal employees, contractors, and sensitive government locations.24 It additionally forbade the “bulk sale” of any nonpublic health, biometric, genomic, financial, and geolocation data that might be exploited by U.S. adversaries to identify and target individuals or groups (also delegating the task of defining “bulk” to the Department of Justice). Meanwhile, the Federal Trade Commission (FTC) and Consumer Financial Protection Bureau (CFPB) made their initial forays into stricter enforcement of data broker activity later that year.25

However well-intended, these measures could prove fleeting from a political perspective, and easily circumventable from a practical one. The Trump administration moved quickly after it took office to weaken many of the agencies, like the CFPB, that it considers onerous for businesses or superfluous to the federal bureaucracy.26 Data experts across academia, civil society, and industry-supporting law firms, meanwhile, widely predict that many of the most significant cybersecurity, spyware, and data-protection initiatives from previous years are likely to be deliberately scrapped by the Trump White House, if not simply left to atrophy.27 But from a purely counterintelligence perspective, experts are also quick to point out that previous measures merely acted like Band-Aids on a gaping wound.

This is because the structural focus of EO 14117 and its congressional cousin, the Protecting Americans’ Data from Foreign Adversaries Act of 2024 (PADFAA, for short), on “cross-border data flows” is too narrowly scoped, with the key terms too arbitrarily defined to serve as a meaningful safeguard. Both documents focus squarely on preventing specific adversarial countries, such as China and Iran, as well as any firms controlled in some part by their citizens, from obtaining sensitive troves of data about U.S. entities.

But in both instances, the stress on volume, direct transfers, and specific categories of data overlooks the many pathways that can otherwise be taken to accumulate the data needed to build a comprehensive mosaic. Peter Swire and Samm Sacks, writing for Lawfare, draw parallels with money laundering: “Years of effort have gone into detection and regulation of ‘structuring’ transactions, to prevent the small from adding up to the big. [But it] will be a long and uncertain path to build similar rules for data sales.”28 These measures fail to consider the spider webs of indirect purchases, diversions, and well-obscured fronts—common in the intelligence business for circumventing sanctions and export controls—that foreign adversaries could wield to buy up reams of data indirectly from U.S. firms.29 They also fail to tackle the deeper causes of this stubborn phenomenon of data porousness, such as highly unregulated data broker sales and poorly regulated online advertising practices, leaving government actors ill-equipped to handle the full spectrum of privacy issues, let alone complex national security risk.

And, perhaps most crucially, these concomitant programs struggle to grapple with the intersections between privacy and national security. The executive branch has, for instance, relegated the task of inspecting consumer data transfers with a national security risk to the Justice Department’s National Security Division, meaning that inspectors operate without comprehensive federal privacy and security laws to stand on. Another policy decision has effectively given some national security data authority to the FTC, an agency well-versed in complex data technologies and predatory adtech practices, but one lacking the resources, expertise, and integration with intelligence and law enforcement agencies necessary to tackle nation-state exploitation of data.

Confronting the Dilemma

If policymakers’ speeches and industry lobbying points are to be believed, commercial data is a strategic resource necessary to “win” the “AI race” with China. This argument demands continued widespread, unfettered, permissionless corporate access to vast troves of data (so the rhetoric goes). Not only should consumer data be accessible, but also data held (and, in many cases, copyrighted) by corporations, research institutions, and other entities—all of which can be funneled into corporate AI systems, turbocharged into the military and national security apparatus, and used to gain a competitive edge, particularly against Beijing.

Indeed, there are plentiful ways that commercial data and open-source intelligence can be used to advance national security objectives and fulfill what many would agree are important governmental functions, such as hunting foreign hackers or investigating Russian war crimes in Ukraine.30 Yet, the conundrum lies in the copious national security and counterintelligence problems embedded throughout the practically unrestrained private-sector data collection landscape. However much might be gained by this (often nakedly self-serving) appeal to the national interest as the trump card against stronger AI and data regulations, there are too few incentives to consider what has already been lost: any semblance of anonymity, obscurity, or privacy that once enabled sensitive government entities to function safely.

Ultimately, policymakers must come to terms with the fact that U.S. national security officials, military service members, intelligence officers, and sensitive facilities worldwide already operate at a major disadvantage in the face of both state and nonstate adversaries alike. This vulnerability will only grow with time as long as we have few meaningful protections against data brokers and other types of unfettered data collection, transmission, and use. Protecting service members from espionage, blackmail, sabotage, or worse will mean confronting a stark reality: piecemeal policing at the post-collection stage like that attempted by previous administrations cannot achieve what even a modest federal data privacy framework could. Taming the adtech market will require a great deal of political will, particularly in response to objections from would-be “national champions.”31 But the decade since the OPM hack has made one thing clear: data-dependent tech companies can champion the national economy or they can champion the national security bureaucracy, but doing the former through race-to-the-bottom data hoarding and sale will only put the latter in even greater jeopardy.

This article originally appeared in American Affairs Volume IX, Number 2 (Summer 2025): 156–65.

Notes
1 Dan Verton, “Impact of OPM Breach Could Last More than 40 Years,” FedScoop, July 10, 2015.

2 Liz Sly, “U.S. Soldiers Are Revealing Sensitive and Dangerous Information by Jogging,” Washington Post, January 29, 2018.

3 Tim Hwang, Subprime Attention Crisis: Advertising and the Time Bomb at the Heart of the Internet (New York: Farrar, Straus and Giroux, 2020); Jesse Frederik and Maurits Martijn, “The New Dot Com Bubble Is Here: It’s Called Online Advertising,” Correspondent, November 6, 2019.

4 April Falcon Doss, Cyber Privacy: Who Has Your Data and Why You Should Care (Dallas: BenBella Books, 2020).

5 Issie Lapowsky, “NATO Group Catfished Soldiers to Prove a Point About Privacy,” Wired, February 18; Frank X. Hartle et al., “The Impact of Social Media Geolocation on National Security and Law Enforcement,” Issues in Information Systems 23, no. 1 (2022): 204–13.

6 Dhruv Mehrotra and Dell Cameron, “Anyone Can Buy Data Tracking US Soldiers and Spies to Nuclear Vaults and Brothels in Germany,” Wired, November 19, 2024.

7 Justin Sherman, “Data Brokers and Threats to Government Employees,” Lawfare, October 22, 2024.

8 For a primer, see: Stacey Gray and Pollyanna Sanderson, “Policy Brief: Location Data Under Existing Privacy Laws,” Future of Privacy Forum, December 2020.

9 Federal Trade Commission, “FTC Takes Action against General Motors for Sharing Drivers’ Precise Location and Driving Behavior Data without Consent,” news release, January 16, 2025.

10 Federal Trade Commission, “FTC Takes Action against Gravy Analytics, Venntel for Unlawfully Selling Location Data Tracking Consumers to Sensitive Sites,” news release, December 2, 2024.

11 Andy Z. Wang, “Network Harms,” University of Chicago Law Review 91, no. 7 (November 2024): 2093–137.

12 Brady Kruse, “How Congress Can Rein in Data Brokers,” CyberScoop, December 20, 2023.

13 Luc Rocher, Julien M. Hendrickx, and Yves-Alexandre de Montjoye, “Estimating the Success of Re-Identifications in Incomplete Datasets Using Generative Models,” Nature Communications 10, no. 1 (July 23, 2019): 3069.

14 Vedran Sekara, Laura Alessandretti, Enys Mones, and Håkan Jonsson, “Temporal and Cultural Limits of Privacy in Smartphone App Usage,” Scientific Reports 11, no. 1 (2021).

15 Henrik Twetman and Gundars Bergmans-Korats, Data Brokers and Security: Risks and Vulnerabilities Related to Commercially Available Data (Riga, Latvia: NATO Strategic Communications Centre of Excellence, 2020).

16 Robert Kuttner, “Global Capitalists or National Champions?,” American Prospect, October 17, 2023.

17 Heather Roff, “The Frame Problem: The AI ‘Arms Race’ Isn’t One,” Bulletin of the Atomic Scientists 75, no. 3 (April 2019): 1–4.

18 Dell Cameron, “Surprise! The Latest ‘Comprehensive’ US Privacy Bill Is Doomed,” Wired, June 27, 2024.

19 Jeremy Crampton, “The U.S. Government Must Protect Citizens from Geolocational Disinformation and Surveillance AI,” Scholars Strategy Network, January 31, 2024.

20 Craig Timberg, “Lawmakers Demand Answers about Strava ‘Heat Map’ Revealing Military Sites,” Washington Post, January 31, 2018.

21 Lauren Williams, “DOD Sharply Restricts Use of Geolocation Devices and Services,” Defense One, August 8, 2018.

22 SEC staff, “SEC Adopts Rules on Cybersecurity Risk Management, Strategy, Governance, and Incident Disclosure by Public Companies,” U.S. Securities and Exchange Commission, July 26, 2023.

23 Mark Mazzetti, “Biden Acts to Restrict U.S. Government Use of Spyware,” New York Times, March 27, 2023.

24 Mark Febrizio, “Biden’s Ambitious Executive Order Does More for Data Security than Banning TikTok,” George Washington University Regulatory Studies Center, April 26, 2024.

25 Electronic Privacy Information Center, “FTC Takes Action Against Data Brokers for Selling Sensitive Location Data,” December 3, 2024.

26 Lesley Stahl, “Trump, DOGE Work to Shutter CFPB, an Agency Created in Response to the 2008 Financial Crisis,” CBS News, February 23, 2025.

27 Eric Geller, “More Spyware, Fewer Rules: What Trump’s Return Means for US Cybersecurity,” Wired, November 14, 2024.

28 Peter Swire and Samm Sacks, “Limiting Data Broker Sales in the Name of U.S. National Security: Questions on Substance and Messaging,” Lawfare, February 29, 2024.

29 Justin Sherman, “Data Brokerage and the Third-Country National Security Problem,” Lawfare, April 16, 2025.

30 Lindsay Freeman, “Digital Evidence and War Crimes Prosecutions: The Impact of Digital Technologies on International Criminal Investigations and Trials,” Fordham International Law Journal 41, no. 2 (2018).

31 Henry Farrell and Abraham Newman, “What Happens When Tech Bros Run National Security,” TIME, September 20, 2023.

