Information about the logistics of phone calls, known as metadata, can reveal personally sensitive details.
The U.S. National Security Agency (NSA) and others have essentially said “it’s only metadata,” but a new study from Stanford University’s Security Lab disagrees.
Computer science doctoral students Jonathan Mayer and Patrick Mutchler used only telephone metadata, some public social networks, and basic pattern matching to reveal sensitive information.
The study didn’t require any spying, only the phone records of 546 volunteers. MetaPhone, an Android app installed on participants’ phones, sent device logs to the researchers.
This week, they published their results so far, based on data collected since last November:
“At the outset of this study, we shared the same hypothesis as our computer science colleagues—we thought phone metadata could be very sensitive. We did not anticipate finding much evidence one way or the other, however, since the MetaPhone participant population is small and participants only provide a few months of phone activity on average.
“We were wrong. We found that phone metadata is unambiguously sensitive, even in a small population and over a short time window. We were able to infer medical conditions, firearm ownership, and more, using solely phone metadata.”
The metadata included the caller’s and recipient’s phone numbers, the serial numbers of the phones they used, the call time and duration, and on occasion, the physical location of each caller.
At one level, simply knowing who received calls led to some rudimentary inferences. A call to a political campaign, for instance, most likely means the person supports the candidate, and a lengthy call to a religious institution implies the person is of that faith.
Other inferences were derived, for instance, by using the metadata in conjunction with public data from Facebook, Yelp, and Google Places.
The 546 participants collectively contacted 33,688 unique numbers. The researchers were able to identify the owners of 18 percent of those numbers.
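To make the matching step concrete, here is a minimal sketch in Python of looking up called numbers in a public directory and tallying the categories each caller reached. The directory entries, phone numbers, call log, and function names are all hypothetical placeholders invented for illustration. The study's actual matching against Facebook, Yelp, and Google Places is not described in detail here, so this should be read as an assumption about the general approach, not the researchers' code.

```python
from collections import Counter

# Hypothetical public directory: phone number -> (listing name, category).
# Stands in for lookups against public sources such as business listings.
DIRECTORY = {
    "555-0101": ("Example Cardiology Clinic", "medical"),
    "555-0102": ("Example Firearms Dealer", "firearms"),
    "555-0103": ("Example Pharmacy", "medical"),
}

# Hypothetical call log: (participant id, number called).
CALLS = [
    ("p1", "555-0101"),
    ("p1", "555-0103"),
    ("p1", "555-0199"),  # not in the directory, so it stays unidentified
    ("p2", "555-0102"),
    ("p2", "555-0198"),  # not in the directory, so it stays unidentified
]


def identified_fraction(calls, directory):
    """Return the share of unique called numbers found in the directory."""
    unique_numbers = {number for _, number in calls}
    matched = {number for number in unique_numbers if number in directory}
    return len(matched) / len(unique_numbers)


def categories_by_caller(calls, directory):
    """Count, per participant, how many calls went to each known category."""
    result = {}
    for caller, number in calls:
        if number in directory:
            category = directory[number][1]
            result.setdefault(caller, Counter())[category] += 1
    return result


if __name__ == "__main__":
    print(identified_fraction(CALLS, DIRECTORY))
    print(categories_by_caller(CALLS, DIRECTORY))
```

Even this toy version shows why a partial match rate matters: a participant only needs a few identified numbers in a sensitive category for a pattern to emerge.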
In many cases, they were also able to determine if the caller was in a relationship and, if so, the number of the significant other.
They also drew inferences from series of calls. One caller’s sequence indicated an apparent heart problem, another had or knew someone with multiple sclerosis, a third showed a keen interest in the AR semiautomatic rifle, and a fourth – well, you can infer from calls to a home improvement store, a locksmith, a hydroponics dealer, and a head shop.
Yet another participant had a “long, early morning call with her sister,” followed by several calls over the next few weeks with the local branch of Planned Parenthood.
Keep in mind three major differences between this study and the NSA’s collection. First, participants knew they were being watched. Second, the NSA has access to system-level metadata from hubs at phone and Internet companies, not just device logs and public social media. Third, the NSA has the world’s most sophisticated computer systems.
Metadata can be revealing even without computers or phones. As a thought experiment, for instance, Kieran Healy, an associate professor of sociology at Duke University, decided to find out if metadata could have impacted the kind of people who went on to author the Fourth Amendment, which prohibits “unreasonable search and seizure.”
Writing as a Loyalist in the American colonies of 1775, he mathematically analyzed overlapping membership metadata from various suspected organizations to reveal the social network of 256 socially active Bostonians.
The trail eventually led him to “the name of a traitor like Paul Revere.”
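As a rough illustration of how overlapping membership lists alone can single out a well-connected person, the Python sketch below counts shared memberships between every pair of people and ranks individuals by their total overlap. The names, clubs, and scoring rule are hypothetical, and this simple count is a stand-in under stated assumptions, not a reproduction of Healy’s actual analysis.

```python
# Hypothetical membership metadata: person -> set of organizations joined.
MEMBERSHIP = {
    "Person A": {"Club 1", "Club 2", "Club 3"},
    "Person B": {"Club 1"},
    "Person C": {"Club 2", "Club 3"},
    "Person D": {"Club 3"},
}


def co_membership(membership):
    """Map each ordered pair of distinct people to their number of shared groups."""
    people = sorted(membership)
    return {
        (a, b): len(membership[a] & membership[b])
        for a in people
        for b in people
        if a != b
    }


def most_connected(membership):
    """Return the person whose shared memberships with everyone else sum highest."""
    totals = {}
    for (person, _), shared in co_membership(membership).items():
        totals[person] = totals.get(person, 0) + shared
    return max(totals, key=totals.get)


if __name__ == "__main__":
    print(most_connected(MEMBERSHIP))
```

No content of any conversation or meeting is involved. The ranking falls out of who belongs to what.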
Hopefully, the Stanford research and similar efforts will help to eliminate from our public discourse the expression “it’s only metadata.”