
773 Million Accounts Now at Risk: Understanding ‘Collection #1,’ the Latest Mega-Breach

It’s official: a new record for “largest data breach in history” has been set. The incident was first revealed by security researcher Troy Hunt on his cybersecurity blog. Hunt writes that he became aware of the breach after being directed to an unidentified hackers’ forum (almost certainly located on a Dark Web site) in which participants were discussing a recent compilation of personal data sets.

The collection, allegedly pulled from a multitude of sources, totaled over 12,000 separate files and more than 87GB of data. In total the compilation, dubbed “Collection #1” by Hunt, consisted of nearly 2.7 billion rows of data, containing over one billion unique email and password combinations. Nearly 773 million of the email addresses were unique.

It is important to understand the implications of Hunt’s find.

Those readers with an interest in tech news may be quick to point out that the 773 million credentials identified by Hunt still do not come close to the infamous Yahoo breach, a crime that affected some 3 billion user accounts. What makes Collection #1 unique is the level of exposure. The Yahoo hack resulted in billions of compromised accounts, meaning the accounts could potentially have been accessed by hackers during the course of the attack. In this incident, by contrast, login credentials were actually posted to a public forum, open for any participant to see.


The Collection #1 database is a testament to the cumulative nature of cybercrime and the organic collaboration among hackers. The post on the forum identified by Hunt referenced “a collection of [over] 2000 databases and Combos (combination of usernames [usually email addresses] and passwords) stored by topic” and provided a directory listing of 2,890 of the files. Hunt reproduced this listing in a publicly viewable Pastebin file.

It is the culture of criminality that makes many (perhaps even the majority) of cybercrimes possible. Hackers the world over contribute to data compilations just like these on Dark Web forums. Sometimes there’s a profit motive, and databases are sold to the highest bidder. In other instances (as seems to be the case with Collection #1) criminals are in it for the love of the game. Troves of files are uploaded for any would-be hacker to capitalize on. The industry term for this is a “paste,” namely when information is “pasted” to a publicly facing website designed to share content.


So what do hackers do with troves of stolen credentials?

As one would guess, hackers have more sophisticated methods than checking passwords one by one against random accounts.

Primarily, there are two not-so-nice campaigns that can be run with a database like Collection #1. The more benign possibility is spamming. Advertisers will go to great lengths to contact potential customers, and an extra 500 million potential leads here and there can’t hurt.

The more malicious use for a credential database is a hacking technique known as credential stuffing.

Once cybercriminals have amassed a good amount of spilled usernames and passwords, they use a program called an account checker to test the stolen credentials against many websites, usually high-value sites such as social media platforms or online marketplaces.

Statistically, 0.1 to 0.2 percent of login attempts in a well-run stuffing campaign succeed. That may sound like a minuscule amount, but when running hundreds of thousands of credential sets against dozens or even hundreds of sites, those successes add up. After getting a hit, the attacker drains the breached account of stored value, credit card numbers, or other high-value information. Unfortunately, credential stuffing attempts are far from rare. According to industry research, 80 percent of all login attempts in online retail during 2017 were part of stuffing operations. It is not uncommon for high-profile companies to get hit during such campaigns. In early November 2018, HSBC, one of the seven largest financial organizations in the world, announced that as many as 14,000 customers had their personal information compromised in a recent data breach. Expert analysis of the incident concluded that the breach had all of the hallmarks of a credential stuffing attack.
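To see how quickly a fraction of a percent adds up, the arithmetic behind the success-rate figures above can be worked through in a few lines of Python. This is only an illustrative sketch: the campaign size (500,000 credential sets tried against 50 sites) is an assumed example chosen to match the "hundreds of thousands" and "dozens" mentioned above, and the function name `expected_hits` is invented for the illustration.

```python
def expected_hits(credential_sets: int, sites: int, success_rate: float) -> float:
    """Expected number of successful logins in a stuffing campaign.

    Every credential set is assumed to be tried once against every site,
    and each attempt is assumed to succeed with the same probability.
    """
    attempts = credential_sets * sites
    return attempts * success_rate


# Assumed, illustrative campaign size (not taken from any real incident).
credential_sets = 500_000
sites = 50

# The 0.1 and 0.2 percent success rates quoted in the article.
for rate in (0.001, 0.002):
    hits = expected_hits(credential_sets, sites, rate)
    print(f"success rate {rate:.1%}: about {hits:,.0f} successful logins")
```

Under these assumptions, 25 million login attempts yield roughly 25,000 to 50,000 compromised accounts, which is why a seemingly negligible success rate is still worth an attacker's time.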


With this in mind, it is easy to understand that as the volume of compromised credentials grows, the risk to digital identities around the world grows dramatically along with it.

For a hacker looking to cause trouble, good quality material is not always easy to come by. In order to execute efficient attacks, hackers need credentials that are still “live,” that is to say, in use by an active account. If the username belongs to an inactive email address or a password has since been changed by a user, there is no longer any value there for a criminal. This means that Internet thieves need to know two things about private data troves to determine their usefulness: whether (a) the data is relatively new, and (b) no do-gooder concerned for the public welfare (private or governmental) has yet sounded the alarm.

Luckily, there are today a slew of security researchers around the world, both private and incorporated, who collect illicitly exposed data and provide the public with ways to check if their data has been compromised. One of the more famous free services is run by Hunt himself: the site Have I Been Pwned (HIBP). HIBP contains nearly 6.5 billion account credentials and is updated regularly. This means that HIBP holds a good portion of the private details that have been compromised in recent history the world over.

The results of testing the Collection #1 files against the HIBP database were chilling. A slice of the email addresses was run through HIBP to see how many of them had been seen before. After a few hundred thousand checks, it was clear that a large portion of the addresses (numbering in the tens of millions) had never been seen before. As for the passwords in Collection #1, about half of the 21 million unique ones had never previously been identified. What this means is that the Collection #1 files contain huge volumes of freshly hacked credentials.
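The sampling approach described above, checking a slice of addresses against a set of previously seen ones and extrapolating, can be sketched as follows. Everything here is hypothetical toy data: the addresses, the total of 1,000, and the function names `unseen_fraction` and `estimate_unseen` are invented for illustration and say nothing about the real Collection #1 figures or about how HIBP performs its checks.

```python
from typing import Iterable, Set


def unseen_fraction(sample: Iterable[str], known: Set[str]) -> float:
    """Share of sampled addresses that are absent from the known set.

    Addresses are compared case-insensitively, so `known` is expected
    to hold lowercase addresses.
    """
    addresses = [addr.strip().lower() for addr in sample]
    if not addresses:
        raise ValueError("sample must contain at least one address")
    unseen = sum(1 for addr in addresses if addr not in known)
    return unseen / len(addresses)


def estimate_unseen(total: int, fraction: float) -> int:
    """Extrapolate the sampled share to the whole collection."""
    return round(total * fraction)


# Hypothetical toy data, for illustration only.
known_addresses = {"alice@example.com", "bob@example.com", "carol@example.com"}
sample_slice = [
    "alice@example.com",
    "dave@example.com",
    "BOB@example.com",
    "erin@example.com",
]

fraction = unseen_fraction(sample_slice, known_addresses)
print(f"unseen share of sample: {fraction:.0%}")
print(f"estimated unseen in a collection of 1,000: {estimate_unseen(1_000, fraction)}")
```

The estimate is only as good as the sample: a slice drawn from a single file may not represent the whole collection, which is one reason such figures are usually quoted as rough orders of magnitude.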


Due to the sheer magnitude of this breach, all conscientious users should take a brief moment to determine whether they were affected. This can be done using free services like those of Hunt and others. In the event that a user discovers that he or she has been “pwned” (i.e., login information has been exposed), there are simple steps to re-secure an account, such as changing the password and adding two-step authentication, a feature readily available on most popular sites today.
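For readers comfortable with a little code, one way to run such a check is sketched below. It uses the password-lookup side of HIBP, known as Pwned Passwords, rather than the email search described above. The sketch assumes the service's publicly documented range endpoint, in which only the first five characters of a password's SHA-1 hash are sent and the service returns matching hash suffixes with counts, so the password itself never leaves the machine. The helper names (`split_hash`, `count_in_range`, `pwned_count`) and the User-Agent string are invented for this example, and the endpoint details should be verified against the current HIBP documentation before relying on them.

```python
import hashlib
import urllib.request
from typing import Tuple

# Assumed endpoint for the Pwned Passwords range search.
RANGE_URL = "https://api.pwnedpasswords.com/range/"


def split_hash(password: str) -> Tuple[str, str]:
    """Return the SHA-1 hash of the password as (first 5 hex chars, remaining 35)."""
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:5], digest[5:]


def count_in_range(suffix: str, body: str) -> int:
    """Find `suffix` in a response made of 'SUFFIX:COUNT' lines; 0 if absent."""
    for line in body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix.upper():
            return int(count)
    return 0


def pwned_count(password: str) -> int:
    """Ask the range endpoint how often the password appears in known breaches."""
    prefix, suffix = split_hash(password)
    request = urllib.request.Request(
        RANGE_URL + prefix,
        headers={"User-Agent": "pwned-check-sketch"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
    return count_in_range(suffix, body)
```

Calling `pwned_count("some password")` returns the number of times that password has been recorded in the breach data the service holds. Any result above zero is a strong signal to stop using that password everywhere it is in use.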

The opinions expressed here by contributors are their own and are not the view of OpsLens, which seeks to provide a platform for experience-driven commentary on today's trending headlines in the U.S. and around the world.
Samuel Siskind

Samuel Siskind studied intelligence research at the American Military University in West Virginia. He served as a squad commander in the Israel Defense Forces (IDF) Corps of Combat Engineers, in the Corps' ground battalions and later in its Intelligence Wing at regional and divisional stations. For the past five years, Samuel has worked as a consultant and researcher on physical and information security issues for private and governmental institutions in the US, Africa, India, and Israel. He currently lives in Jerusalem.
