Wednesday, June 4, 2025

21st Century Data Breaches: A Brief History of Major Incidents


The modern age has introduced modern problems, one of which is the growing prevalence of data breaches and leaks. These incidents have emerged alongside the exponential increase in the volume of data, particularly private data generated by individuals, businesses, institutions, and governments. Such data has long held immense value, even before the invention of digital storage solutions. It has consistently attracted unauthorized people and organizations wanting to exploit it for personal or external gain.

While not all data breaches and leaks are malicious or cause harm, this paper focuses on intentional and impactful ones. The examples discussed include the Equifax breach, which compromised the sensitive information of over 140 million individuals (Mathews, 2019); the 2013 and 2014 Yahoo breaches, which remain the largest in history by scale, exposing 3 billion accounts (Haselton, 2017); and the Shanghai National Police database leak, which exposed the personal details of over a billion people (Jazeera, 2022). Other significant incidents, such as the Indian Council of Medical Research data leak (Curryer, 2023), highlight the evolving methods and diverse motivations behind these attacks. Older cases, such as the US Department of Veterans Affairs data breach in 2006 (May 22, 2006 Data Security Breach Leaves 26.5 Million Veterans Vulnerable to Identity Theft | US Senator Ed Markey of Massachusetts, 2006), and retail-focused breaches like the Target incident in 2013 (Target Settles “Nightmare Before Xmas” Data Breach for $18.5 Million, 2017), show how vulnerabilities span sectors and years.

We will explore the causes of these cases, the damage they inflicted, and the measures taken retrospectively to address them. Finally, we will discuss practical methods to enhance personal and organizational cybersecurity, aiming to reduce the risk of falling victim to such incidents.

Understanding Data Breaches and Historical Methods

Understanding Data Breaches

A data breach can be defined as “Any security incident in which unauthorized parties access sensitive or confidential information, including personal data (Social Security numbers, bank account numbers, healthcare data) and corporate data (customer records, intellectual property, financial information).” (Kosinski, 2024).

To distinguish between the two, a data leak is “an unauthorized disclosure of sensitive, confidential, or personal information from an organization’s systems or networks to an external party” (What Is a Data Leak? | Microsoft Security, n.d.).

To put this in simpler terms, consider a realistic example of how an incident involving a leak and breach of personal information unfolds.

Imagine setting up an account on a website—whether for social media, a cloud storage service, an email platform, or even applying for a driver’s license. During this process, a person provides personal information to a third party, which stores the data to maintain the individual’s account or ID (Vigderman & Turner, 2024). The data breach begins when this third party, either through an error or a cyberattack, allows an unauthorized fourth party to access their data storage systems (Higgins & Higgins, 2024). This means personal data, which the individual trusted the third party to protect, is now in someone else’s hands. Having gained unauthorized access, this fourth party can sell, trade, or misuse sensitive information, often for purposes that harm the affected individuals. For instance, stolen data frequently becomes a commodity on dark markets (Kosinski, 2024), where it is bought and sold for malicious activities. These malicious activities can then range from simple targeted marketing to more extreme cases like identity theft (Kosinski, 2024), financial fraud, or even blackmail (Higgins & Higgins, 2024).

As an organization collects more data over time, the risk of being targeted by cybercriminals or of mishandling that data also increases (Metomic, n.d.). Additionally, the more widely an organization makes its data accessible, the greater the risk of exposure, whether through breaches or human error (data.org, 2024).

Organizations bear a significant responsibility for the data they store, as they are often the primary targets of attacks: many attackers are looking for financial gain (Bhadouria, 2022). Ensuring the security of this data is both a moral obligation and a legal one (for example, under the GDPR and the CCPA) to protect individuals’ privacy and trust.

Despite advances in cybersecurity, no system is entirely immune to data breaches, as attackers continuously find new methods to exploit vulnerabilities (National Security Agency/Central Security Service, 2024).

To show how many ways data can be stolen and used for malicious purposes, it is worth outlining the most common causes of breaches and leaks. We categorize these causes as external or internal (Cheng et al., 2017).

Examples of External Causes

Malware – various kinds of malicious software may be used to rob people and organizations of their data; notable kinds include ransomware, Trojan horses, keyloggers, and phishing kits (Bhadouria, 2022)

Social engineering – this method is usually combined with other methods, most often malware (specifically phishing kits), to make it even more effective

Physical data theft – although not the most common cause, it still happens and can cause major damage; it often involves stealing USB flash drives, laptops, and other storage devices

Hacking/Exploiting – usually exploiting vulnerabilities in systems, for example through zero-day exploits, man-in-the-middle attacks, or denial-of-service attacks (generally accompanied by another exploit and/or malware)

Examples of Internal Causes

Insider espionage – employees can intentionally share private organizational data to harm the company, to sell it for profit, or both; they could also be agents of a competing company

Human error – a broad category that covers failing to keep security systems up to date, mishandling data, falling for phishing attempts, unintentionally installing malware, leaving portable storage devices (e.g., hot-pluggable drives) unattended, and sending private credentials to the wrong person

Significant Data Breaches and Leaks in History

US Army Veterans Breach

A case from 2006 illustrates why data should be physically stored in a safe place where it cannot easily be lost or stolen. The victims of this breach were US military veterans discharged from 1975 to 2006, potentially up to 26.5 million people (Perera, 2006).

A laptop was stolen from the house of a Veterans Affairs employee who, without authorization, had taken unencrypted data home to work on a statistical analysis for an annual study. The data included veterans’ names, dates of birth, Social Security numbers, disability ratings, and the names of their spouses (Perera, 2006).

Fortunately for the department and the veterans, the stolen laptop was later recovered during the investigation. Investigators concluded that the laptop had not been targeted for its data but had simply been taken in a burglary of the employee’s house, and forensic analysis determined that the veterans’ data had not been accessed. The incident served as a major warning to the Department of Veterans Affairs to rework its internal security measures for handling data. The result was daily reporting of such incidents, encryption of all sensitive data and work laptops, more transparency about data breach cases, and a new data breach analysis team (Mosquera, 2012).

This breach was one of the earlier wake-up calls to the importance of secure handling of government data.

Target Data Breach

The 2013 Christmas-season breach at the US retail giant Target shows that almost anyone can become a victim of data theft, whether or not they have good security measures in place. The breach compromised at least roughly 40 million credit and debit cards and the account information of 70 million customers, including full names, phone numbers, email addresses, and the associated credit or debit card information (Stempel & Bose, 2015).

Hackers infiltrated Target’s network through a supply chain attack combined with spear phishing: a phishing email containing Trojan horse malware with keylogging ability duped an employee of Target’s third-party vendor, an HVAC company. The hackers then waited until they had captured the credentials they needed to get into the vendor’s systems. From there, they broke into the Target network and went after its point-of-sale systems, capturing thousands of card swipes per day (Kassner, 2015).

Interestingly, both the HVAC company and Target used anti-malware software that could have detected these attacks. Target’s malware detection tool did detect the breach and raised a warning, but for unknown reasons the warning was ignored. The HVAC company, on the other hand, used a free trial of the Malwarebytes anti-malware tool, which does not offer real-time protection and therefore detects malicious software only when a manual scan is started (Kassner, 2015).

The aftermath of this case was not as drastic as one might expect, as Target remains a flourishing company years later. The stolen credit and debit card information was later found for sale on dark web marketplaces. The then chief executive officer was fired, Target’s profit fell to almost one-half of the prior year’s fourth-quarter figure, and the stock price fell 9% in the two months following the discovery of the breach (McGrath, 2014). The most significant losses from this incident likely stem from the numerous lawsuits and related investigations, restorative measures, and downtime, which had totaled at least roughly 250 million US dollars by 2024 (Niedbala, 2024).

New cybersecurity measures put in place after the incident included improved monitoring of system activity, more secure point-of-sale systems, updated firewall rules and policies, limited or restricted vendor access to Target’s network, revised privileges on Target personnel accounts, and expanded two-factor authentication, among others (Kassner, 2015).

Yahoo Data Breaches

In 2013 and 2014, Yahoo experienced two of perhaps the largest data breaches in history. Roughly 3 billion accounts were compromised in total, which was every Yahoo account that existed at the time of the 2013 breach. Besides the sheer size of this leak, Yahoo also initially failed to disclose its full extent, at first acknowledging only that the information of 1 billion accounts had been breached. The leaked data mainly comprised names, passwords, email addresses, and security questions and answers, though some of the data (namely passwords) was encrypted (Larson, 2017).

We discuss the 2014 breach first, as it was the first to be disclosed by the company, and then the 2013 breach, which was disclosed later and is the larger of the two.

The alleged perpetrators of the 2014 attack were of Russian origin and held Russian and Canadian citizenship. It was also alleged that the attack was state-sponsored. The attackers sent a phishing email to some Yahoo employees; a single click was enough for them to break into the network and move further from there. The main target of the hack was the enormous user database, which had been growing rapidly since the company was founded. Once the hackers had accessed the database, they copied all the data and saved it on their own computers. They also gained access to cryptographic values of individual accounts, which allowed them to generate access cookies for any email account they wanted without needing a password. The account management tool available for the database did not allow simple text searches, so they turned to the victims’ recovery email addresses to identify individuals requested by the Russian agents involved in this case. In total, 500 million users were affected by this leak (Williams, 2017).

In 2017, the alleged perpetrators were charged with computer hacking, economic espionage, and other crimes related to this case (US Office of Public Affairs, 2017).

In 2016, Yahoo was finalizing its acquisition by Verizon, so it had to disclose the size of the breach that took place in 2014. It initially understated the breach to Verizon, claiming it was a minor one, but then corrected the figure to 500 million affected users. This incident lowered Yahoo’s purchase price by 350 million US dollars. After the acquisition, Yahoo revised its disclosure several times, most recently in 2017, when it finally admitted that the 2013 breach had potentially compromised the information of 3 billion accounts (McAndrew, 2018).

The data breached in 2013 was sold on the dark web, as the perpetrators behind this hack, an anonymous hacker group, had a financial motive. It is not clear exactly how they gained access to Yahoo’s databases, but it is strongly suspected that, as in 2014, phishing links and exploited vulnerabilities were involved. Notable victims of this leak included employees of the FBI, the NSA, the White House, and officials in the U.K. (Larson, 2016).

These breaches were a significant factor in the firm’s decline, as the numerous settlements and fines the company faced had a notable financial impact. Additionally, the breaches severely damaged Yahoo’s reputation, leading to a major loss of users’ trust (Senouci, 2023).

Equifax Data Breach

Equifax is a credit reporting company in the United States that reports on the financial health of US citizens. In 2017, its systems were breached, and the sensitive data of roughly 145 million Americans ended up in the hands of a foreign state. This breach is intriguing and unusual in several respects, as we will explore (epic.org, 2020).

The breach began in March of that year, when hackers got into the Equifax system by exploiting a vulnerability in the web application of a dispute portal. Because the dispute portal was connected to the company’s databases, the hackers gained access to names, Social Security numbers, birth dates, addresses, and driver’s license numbers. What is more, after breaching the systems through this exploit, the hacker group found credentials stored in plain text. It took Equifax until the end of July to notice that something was wrong, by which point the hackers had already stolen all the data they needed. The delay arose because, until July, the company had not renewed an encryption certificate that was needed to continually decrypt and re-encrypt internal data traffic. Since the perpetrators encrypted the stolen data before transmitting it to their location of choice, and the traffic could not be decrypted without a valid certificate, the transfers went completely undetected (Fruhlinger, 2020).
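An expired certificate of the kind described above can be caught early by checking expiry dates routinely. The following is a minimal sketch, not a description of Equifax's actual tooling: the function names `days_until_expiry` and `needs_renewal` and the 30-day warning threshold are assumptions made for illustration. It relies only on `ssl.cert_time_to_seconds` from the Python standard library, which parses the `notAfter` text format used in certificate details.

```python
import ssl
import time
from typing import Optional

SECONDS_PER_DAY = 86_400


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return the number of days left before a certificate expires.

    `not_after` uses the "notAfter" text format reported by Python's ssl
    module, e.g. "Jan 01 00:00:00 2025 GMT". A negative result means the
    certificate has already expired.
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    current = time.time() if now is None else now
    return (expires_at - current) / SECONDS_PER_DAY


def needs_renewal(not_after: str, warn_days: int = 30,
                  now: Optional[float] = None) -> bool:
    """Hypothetical policy: flag certificates with `warn_days` or fewer days left."""
    return days_until_expiry(not_after, now) <= warn_days
```

Run on a schedule against every certificate an organization depends on, a check like this would raise a warning weeks before expiry instead of leaving the lapse unnoticed for months.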

The personal identity information of 145 million Americans had leaked. It was all the data needed for mass identity theft, loan fraud, and tax fraud, and experts expected it to end up on dark web markets eventually. Oddly, neither scenario materialized (Fruhlinger, 2020).

Finally, after an investigation, the US Department of Justice charged four members of China’s military with hacking the Equifax systems. This explained why there had been no identity thefts or offers of the data on dark web markets: the main motive of the breach was state-sponsored espionage on US citizens for the benefit of the Chinese government. Specifically, the Federal Bureau of Investigation has theorized that the Chinese government could use this information to identify American officials or spies in financial trouble and then bribe or blackmail them into giving up information about US intelligence (Federal Bureau of Investigation, 2020).

Overall, this case highlights how important it is to keep security systems up to date and functioning, as neglecting them can lead to serious incidents. As for the costs of this breach, Equifax reported that it invested 1.4 billion US dollars in upgrading its security systems. There were also some fines and compensation payments, but these were on the order of the low tens of millions of US dollars, which, for a company the size of Equifax, was not a noticeable setback (Fruhlinger, 2020).

Shanghai Police Database Leak

Although not officially confirmed by the Chinese government, the 2022 Shanghai police database leak is regarded as one of the largest leaks in history, affecting around 1 billion people and involving several billion records, with multiple sources supporting its authenticity.

The leak came to light on a cybercrime forum, where a user was selling the data, claiming to have sourced it from Alibaba’s cloud network. The user also provided a sample of the data to prove it was real; portions of this sample were later verified as authentic. The data contained the names, birthplaces, addresses, national ID numbers, phone numbers, and case details of mainly Chinese citizens, dating from 1995 to 2020 (Whittaker & Page, 2022).

The cause of the leak was most likely human error. The CEO of Binance, Zhao Changpeng, said that the company’s threat intelligence team had detected 1 billion records for sale on dark web markets, and later stated “that a government developer’s blog post on the China Software Development Network (CSDN) accidentally included the credentials to a Shanghai police database” (Todd, 2022).
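Accidental publication of credentials, as reported here, is the kind of mistake a simple pre-publication scan can sometimes catch. The sketch below is illustrative only: the pattern, the function name `find_possible_secrets`, and the sample draft are assumptions, and real secret scanners use far richer rule sets. It flags lines where a credential-like keyword is followed by `:` or `=` and a value.

```python
import re
from typing import List

# Hypothetical, deliberately simple pattern: a credential-like keyword
# followed by ":" or "=" and a value. It will miss many real secrets and
# may flag harmless lines, so treat hits as prompts for manual review.
SECRET_PATTERN = re.compile(
    r"(password|passwd|secret|api[_-]?key|access[_-]?key|token)\s*[:=]\s*\S+",
    re.IGNORECASE,
)


def find_possible_secrets(text: str) -> List[int]:
    """Return 1-based line numbers that look like they contain a credential."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if SECRET_PATTERN.search(line)
    ]


# Made-up draft of a blog post snippet, for illustration only.
draft = '''host = "db.example.internal"
password = "hunter2"
print("hello")
ACCESS_KEY: abc123
'''
flagged = find_possible_secrets(draft)  # lines 2 and 4 are flagged
```

The function returns line numbers and not the matched text, so that the scan report itself does not repeat the credential.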

The Chinese government has still not confirmed the leak, but interestingly, after it became public, the hashtag “Shanghai data leak” was blocked on the Weibo microblogging service. Some have speculated that the government does not want to confirm the leak, despite a sizeable number of news outlets and experts doing so, because it would reflect badly on Chinese governmental cybersecurity, especially since the first claims of the breach were published as “China has vowed to improve protection of online user data privacy, instructing its tech giants to ensure safer storage after public complaints about mismanagement and misuse” (Ni, 2022).

We can only assume that the database is still being traded on the dark web markets, and whoever gains access to it could potentially use it for various malicious purposes.

ICMR Indian Medical Database Leak

Also unconfirmed by the national government, this 2023 leak could, if confirmed, be the largest data breach in Indian history. It affected 815 million people, and the breached data contained Aadhaar ID card numbers, passport details, names, and contact details (Alles Technology, 2023).

The data was discovered being brokered by a user on an online data breach hacking forum. According to that user, “the data was extracted from information submitted by Indian residents to the Indian Council of Medical Research (ICMR) when they had COVID-19 tests”. To support the claim that the database was not fake, the user provided a sample of 100,000 records. An analysis by Resecurity (the organization that first noticed the user’s offer) confirmed that the ID card numbers were authentic, suggesting the data breach was real (Cluley, 2023).

Afterward, the ICMR denied that it had been hacked but acknowledged the existence of the data breach and was reportedly investigating the incident. The leak is all the more severe because the Aadhaar ID includes biometric data and is widely used in India to verify identity for many services, including banking, government services, and telecommunications (Alex, 2023).

As of 2024, the investigation of this case was still ongoing.

Prevention Recommendations

As already mentioned, an individual sometimes simply cannot avoid becoming a victim of a data breach, as it is at times entirely out of their control. Whether it is a government’s fault or a grocery store’s mistake that eventually leads to an individual’s data being leaked, there are still ways to minimize the risk of this happening in the era of mass digitalization.

The following text recommends general methods for avoiding involvement in a data breach and for minimizing the damage if one’s data does get breached.

Preemptive Methods

Limit data sharing – This is perhaps the most important measure for an individual: always stay mindful of what data you enter on which sites, avoid entering sensitive data on unprotected sites (sites that do not use HTTPS), and avoid entering personally identifiable information (phone number, address, name, and so on) on any website except when creating an account or making a purchase

Routinely change your passwords – Some experts recommend changing passwords every 90 days; in our experience, this is not a realistic expectation for day-to-day users, so we recommend changing the passwords of all accounts at least once a year (password managers can make this easier, but they come with their own risks, such as the vendor’s data being breached)

Do not use the same password for all accounts – Avoid using the same password (even a strong one) for more than two or three accounts at most

Keep your operating system and antivirus program (if you have one) updated – Keeping systems updated is general advice for avoiding cybersecurity disasters

Avoid accessing or sharing sensitive information when connected to public Wi-Fi

Use strong, unique passwords
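The password advice in this list can be illustrated with a short sketch. It is only one possible approach, under stated assumptions: the function names `generate_password` and `has_reuse`, the default length of 20, and the rule that every password must contain all four character classes are choices made for this example. Randomness comes from Python's `secrets` module, which is intended for security-sensitive use.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Generate a random password with a lowercase letter, an uppercase
    letter, a digit, and a symbol."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit every character class")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until all four character classes are present.
        if (
            any(c.islower() for c in candidate)
            and any(c.isupper() for c in candidate)
            and any(c.isdigit() for c in candidate)
            and any(c in string.punctuation for c in candidate)
        ):
            return candidate


def has_reuse(passwords_by_account: dict) -> bool:
    """Return True if any password is shared by two or more accounts."""
    values = list(passwords_by_account.values())
    return len(set(values)) < len(values)
```

Generating a separate password per account in this way means that a breach at one service does not expose the credentials used at another.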

Proactive Methods

Research the organization you are going to share your data with – This is tedious but can reduce the risk of a data breach: learn as much as you can about the organization, especially its cybersecurity measures, its history of prior incidents, and its general reputation

Use privacy-oriented web services (browsers, email clients, messaging apps, and others)

Educate yourself and others about the newest cyber threats – Be especially aware of the newest phishing trends, as phishing methods become more inventive every year

Data encryption – Encrypt all sensitive data that one does not want to be stolen

Use multi-factor authentication

Conduct regular security checks
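To make the multi-factor authentication recommendation more concrete, the sketch below shows how the six-digit codes produced by common authenticator apps can be computed. It follows the published HOTP (RFC 4226) and TOTP (RFC 6238) algorithms with HMAC-SHA1; the function names and the `verify` helper are assumptions made for illustration, and a production system should rely on a vetted library instead of this code.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password as described in RFC 4226."""
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238) using HMAC-SHA1."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify(secret: bytes, submitted: str, at: Optional[float] = None) -> bool:
    """Hypothetical helper: compare codes in constant time."""
    return hmac.compare_digest(totp(secret, at), submitted)
```

Because the code depends on a shared secret and the current time, a stolen password alone is not enough to sign in, which is why the measure limits the damage of leaked credentials.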

Conclusion

Besides providing a basic understanding of the nature of data breaches, this paper has presented several notable data breach cases, each unique in its own way, whether in its cause, methodology, aftermath, the organization’s reaction, or the number of people affected. Together, they show how much governments’ and companies’ attitudes to data protection matter, and how important it is to comply strictly with at least the fundamental cybersecurity measures: keeping all systems updated, making sure vendors comply with your security requirements, and educating employees about common social engineering techniques that hackers use, such as phishing campaigns. These cases also demonstrate the damage breaches can cause, both in the present and in the future. We have also included some preemptive and proactive techniques to give an idea of how to minimize the damage data breaches can do and, in the best case, prevent them from happening.

