Facebook: 419 Million Scraped User Phone Numbers Exposed
Social Network Says Problem Fixed, But TechCrunch Reports Many Still Accurate
Facebook has confirmed that unprotected, internet-connected databases holding more than 419 million users' phone numbers contained data that had been scraped from the social network.
The existence of the online server containing the databases was first reported by TechCrunch, which said it had been discovered by researchers at GDI Foundation. The databases included Facebook user records for 133 million U.S. accounts, 50 million Vietnamese accounts and 18 million British accounts, among other regions, researchers determined. They say some records also contained a user's name, gender and country of residence.
Neither the news outlet nor GDI was able to identify the owner of the server, which was not password-protected, but both said the databases were removed after they contacted the web host.
Facebook has confirmed that the data is legitimate. "This data set is old and appears to have information obtained before we made changes last year to remove people's ability to find others using their phone numbers," a Facebook spokesman tells Information Security Media Group. "The data set has been taken down and we see no evidence that Facebook accounts were compromised. The underlying issue was addressed as part of a Newsroom post on April 4, 2018, by Facebook's chief technology officer."
But TechCrunch said that it verified multiple entries in the database, checking a user's listed Facebook ID against known phone numbers associated with that account, as well as using Facebook's own password-reset mechanism, which reveals partial phone numbers for users' accounts.
In other words, although Facebook moved to restrict access to users' phone numbers more than a year ago, many users don't appear to have changed their phone number since then.
All Phone Numbers May Have Been Scraped
The April 2018 post from CTO Mike Schroepfer, "An Update on Our Plans to Restrict Data Access on Facebook," mentions telephone numbers specifically in the context of Facebook's "search and account recovery" feature.
"Until today, people could enter another person's phone number or email address into Facebook search to help find them," he wrote. "Malicious actors have also abused these features to scrape public profile information by submitting phone numbers or email addresses they already have through search and account recovery."
As of April 2018, Schroepfer said it was likely that attackers had obtained a copy of every phone number that Facebook had collected from its users and that appeared on a public profile.
This Data Set Uploaded in August
Sanyam Jain, a security researcher and member of the GDI Foundation who discovered the databases, told TechCrunch that they contained personal information for at least several celebrities.
Jain couldn't be immediately reached for further comment.
But Victor Gevers, chairman of GDI Foundation, tweeted on Thursday that the information Jain found online had only been deployed last month, which suggests that it remains not only current but useful - potentially for fraudsters.
"Although Facebook had disabled the API that shares users mobile phone & address details back in 2011, this data leak with scraped Facebook details was deployed recently in August 2019 on the latest version (4.0.12) of MongoDB. There is also a mail server running on that server https://t.co/Q7ulAnGp6W" - Victor Gevers (@0xDUDE), September 5, 2019
Gevers tells ISMG that it's not clear why someone would be storing this data online. "I honestly don't know what the purpose is of the data that is being stored on the server," he says. "But it looks like it is being maintained for a purpose," including multiple custom fields being used to describe the data, which collectively show that "99.9 percent of all the records were updated in the last month."
More Copies of Data Found
Later on Thursday, however, CNET reported that another copy of the data had been found online by U.K.-based security researcher Elliott Murray. Gevers said the data spotted by Murray dates from the end of January, which is seven months earlier than the data found by GDI Foundation's Jain.
"Data structure is also different," Gevers tweets. "We can expect a lot Facebook data clones floating around. Clone wars?"
Risk: SIM-Swapping Attacks
The massive databases of phone numbers for Facebook users, many of which still work, don't just pose a privacy risk. Attackers could also abuse the phone numbers to send spam messages or phishing lures via text, as well as to identify potential targets for SIM swapping or hijacking attacks, in which an attacker steals a victim's phone number. Controlling a target's phone number can be powerful, because it enables the attacker to hijack the user's identity and gain access to many online services that rely on the phone number as an identity verification mechanism or authentication channel, for example via one-time passwords sent via SMS (see: Alleged SIM Swappers Charged Over Cryptocurrency Thefts).
This isn't the first time that Facebook has lost massive quantities of data to outsiders via scraping. In May, as TechCrunch first reported, Mumbai-based social media marketing company Chtrbox left a database online that appeared to contain profile data for millions of users of Instagram, which is part of Facebook. The information was stored on Amazon Web Services and not protected by a password (see: Database May Have Exposed Instagram Data for 49 Million).
Subsequently, however, Facebook's investigation revealed that the database contained information for 350,000 accounts. It also banned Chtrbox, saying the firm had violated its rules against scraping public information from Instagram profiles.
Crucially, however, Facebook didn't appear to have detected - or at least proactively moved to block - such activities when they were underway.
Cambridge Analytica Scandal
Facebook's 2018 efforts to clamp down on outside use of users' data were launched in the wake of the Cambridge Analytica scandal, in which information on 87 million Facebook users was obtained by a Cambridge University lecturer, Aleksandr Kogan, via a personality quiz app on Facebook called "This Is Your Digital Life."
Kogan later gave that data to Cambridge Analytica, and the company reportedly used it to develop psychographic profiles that could be used for political advertising.
Facebook has been facing a range of global inquiries by regulators and lawsuits sparked by Cambridge Analytica obtaining user data (see: Lawmakers, Privacy Advocates Slam FTC's Facebook Settlement). The U.S. Federal Trade Commission recently fined Facebook $5 billion for misusing its users' data in violation of a 2012 agreement (see: It's Official: FTC Fines Facebook $5 Billion).
Managing Editor Jeremy Kirk contributed to this story. This story has been updated to reflect that additional copies of the scraped Facebook user data have been discovered online.