Experts Debunk Legitimacy of Data Sets With 16 Billion Credentials Being Circulated

Beware of claims of “colossal” collections of leaked online credentials, for therein almost always lies a heavy dose of exaggeration – as in this week’s news that archives comprising 16 billion stolen login credentials have circulated on the cybercrime underground.
The tranche of data, supposedly amassed by information-stealing malware and data leaks and compromising login credentials for Apple, Facebook and Google accounts, collectively stands as “the largest data breach in history,” asserted Lithuanian tech site Cybernews, which broke the story on Wednesday. The records have circulated under such names as “logins,” “bigdata-index,” “breach-files” and “trojan-logs.”
Experts have questions.
For starters, the numbers don’t add up. Cybersecurity firm Hudson Rock said that on average, an infostealer will harvest about 50 credentials from each endpoint it infects. “For a leak to reach 16 billion credentials, it would require 320 million compromised devices – a figure that’s unrealistic given global infection trends.”
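Hudson Rock’s arithmetic is easy to check. The following is a minimal sketch, assuming the firm’s average of about 50 credentials per infected endpoint applies uniformly; the function and variable names are illustrative only.

```python
def devices_required(total_credentials: int, credentials_per_device: int) -> int:
    """Infected endpoints needed to yield the claimed number of credentials."""
    # Ceiling division: a partially harvested device still counts as one infection.
    return -(-total_credentials // credentials_per_device)


claimed_credentials = 16_000_000_000  # figure reported for the circulated archives
avg_per_device = 50                   # Hudson Rock's stated average per endpoint

required = devices_required(claimed_credentials, avg_per_device)
print(f"{required:,} infected devices")  # 320,000,000 infected devices
```

The result matches the 320 million devices the firm calls unrealistic.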
Instead, “the leak is likely a combination of legacy infostealer credentials, legacy database leaks data and made up lines,” meaning it’s largely padded out, if not completely fabricated by whoever assembled each different data set, said Alon Gal, CTO of Hudson Rock.
Gal said his firm had seen no signs of any recent, very large infostealer campaigns or breaches comprising millions or billions of records, further suggesting that “this leak is more noise than substance.” The data set is a grab bag with no focus and contains no especially sensitive credentials, Gal said. “This leak is a disorganized data dump with little strategic value,” the firm said.
The freshness of the data is also in question. “Someone took a bunch of existing leaks, threw it all together, and slapped a NEW stick on it,” said the malware-tracking expert @vxunderground in a post to social platform X.
“If someone successfully compromised Google, Facebook and Apple all at the same time it would be international headlines and politicians around the globe would be having a psychiatric meltdown,” @vxunderground said.
To be clear, the infostealer scourge is getting worse. Criminals can easily obtain infostealers, which are sold as malware-as-a-service offerings. Lately, StealC, Lumma and RedLine have been the most popular, cybersecurity firm Kela said in a recent report.
Attackers wield infostealers to steal sensitive – and valuable – information that browsers store on a PC, including session tokens that allow attackers to impersonate a legitimate user and bypass multifactor authentication defenses. The malware is designed to swipe credentials for everything from online banking sites to cryptocurrency exchanges and wallets to corporate VPNs.
Stolen data often is bought and sold on automated cybercrime forums and Telegram channels. Sometimes it’s given big, scary-sounding names, despite having been padded with outdated or fake data.
That’s a reminder that for years, people who trade in data leaks have hoarded information of dubious quality from public leaks, private forums and other sources, and regularly repackage it for sale, or just to boost their cybercrime kudos.
In many cases, sheer quantity – such as a set containing hundreds of millions, if not billions, of records – might get mistaken for quality, even when the data is of dubious value and poses little if any actual risk.
Big sets of leaked data of unknown provenance and quality regularly debut. This leak report, notably, follows the appearance in February of the “Alien Txtbase,” so named for the Telegram channel that hosted 284 million unique email addresses, plus associated passwords and details of the websites for which they’d been used to register the accounts. That supposed leak comprised 1.5 terabytes of data with 23 billion rows of information gleaned from infostealer logs.
Not all of that data was legitimate. For example, Hudson Rock said it included numerous “manipulated or fabricated entries.” In many cases, attackers appeared to have altered the credentials, making minor changes to usernames or passwords, for brute-force attack purposes. Thus one record, perhaps valid months or years ago, might become hundreds or thousands of records today, none of which might ever have worked.
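The gap between raw rows and distinct accounts can be made concrete. The following is a minimal sketch, assuming records are simple (site, email, password) tuples; the sample rows and the function name are invented for illustration and are not drawn from any real leak.

```python
def inflation_ratio(rows: int, unique_addresses: int) -> float:
    """Average number of rows per distinct email address."""
    return rows / unique_addresses


# Hypothetical sample: one original record plus two lightly altered copies.
sample = [
    ("shop.example", "alice@example.com", "hunter2"),
    ("shop.example", "alice@example.com", "hunter2!"),
    ("shop.example", "alice@example.com", "Hunter2"),
    ("mail.example", "bob@example.com", "letmein"),
]
unique_emails = {email for _, email, _ in sample}
print(len(sample), len(unique_emails))  # 4 2

# Figures reported for Alien Txtbase: 23 billion rows, 284 million unique addresses.
print(round(inflation_ratio(23_000_000_000, 284_000_000)))  # 81
```

A high ratio does not by itself prove fabrication, since one address can legitimately appear once per website, but it shows why a row count overstates the number of people affected.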
Months or years later, how many times will that mostly bogus – and by then thoroughly outdated – data get repackaged, sold or traded, or become the focus of an alarmist news report?
“My site had a leak back in 2011,” a hacking community forum admin said in a Thursday X post. “At least once a year it’s packaged again as a ‘new leak’ and I have to waste time reviewing to make sure it’s not legit, just in case.”