Columbia and Google Researchers Identify New Privacy Concerns for Mobility Metadata
Stripping a big data set of names and personal details is no guarantee of privacy. Previous research has shown that individual shoppers, Netflix subscribers and even taxicab riders are identifiable in heaps of supposedly anonymous data.
Now, a team of computer science researchers at Columbia University and Google has identified new privacy concerns by demonstrating that geotagged posts on just two social media apps are enough to link accounts held by the same person. The team will present its results at the World Wide Web conference in Montreal on April 14.
Of the many digital traces we leave in daily life, location metadata may be the most revealing. Our real-world movements are so distinctive that most people can be identified from a few data points within a single data set. With as few as four credit card purchases, individual shoppers can be picked out from among millions of other credit card users.
The new study takes these previous findings a step further by showing that individuals can be identified with a high degree of confidence by matching their movements across two data sets. “If you look unique in how you make phone calls, it is possible to connect that to where you’ve made credit card purchases,” said study coauthor Chris Riederer, a graduate student in computer science at Columbia Engineering.
The team developed an algorithm that compares geotagged posts on Twitter with posts on Instagram and Foursquare to link accounts held by the same person. It works by calculating the probability that one person posting at a given time and place could also be posting in a second app, at another time and place. The Columbia team found that the algorithm can also identify shoppers by matching anonymous credit card purchases against logs of mobile phones pinging the nearest cell tower. This method, they found, outperforms other matching algorithms applied to the same data sets.
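To make the matching idea concrete, here is a minimal Python sketch of how location traces from two services might be compared. It is not the researchers' algorithm, which computes a probability; this simplified stand-in uses a hand-tuned score that rewards posts made close together in time and space and penalizes pairs that would require impossibly fast travel. The function names, thresholds and penalty value are all assumptions chosen for illustration.

```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(a))


def pair_score(trace_a, trace_b, max_speed_kmh=120.0, near_km=1.0, window_h=1.0):
    """Score how plausibly two traces belong to one person.

    Each trace is a list of (time_in_hours, lat, lon) tuples.
    All thresholds are illustrative assumptions, not values from the study.
    """
    score = 0.0
    for t1, lat1, lon1 in trace_a:
        for t2, lat2, lon2 in trace_b:
            dt = abs(t1 - t2)
            if dt > window_h:
                continue  # posts too far apart in time to say anything
            d = haversine_km(lat1, lon1, lat2, lon2)
            if d > near_km + max_speed_kmh * dt:
                score -= 5.0  # one person could not have covered this distance
            elif d <= near_km:
                score += 1.0  # same place at nearly the same time
    return score


def link_accounts(service_a, service_b):
    """Map each account in service_a to its best-scoring account in service_b."""
    return {
        name_a: max(service_b, key=lambda name_b: pair_score(trace_a, service_b[name_b]))
        for name_a, trace_a in service_a.items()
    }


# Hypothetical toy data: two accounts on each service, posting around New York City.
twitter = {
    "tw_alice": [(9.0, 40.8075, -73.9626), (18.0, 40.7580, -73.9855)],
    "tw_bob": [(9.2, 40.6782, -73.9442)],
}
instagram = {
    "ig_1": [(9.1, 40.6785, -73.9440)],
    "ig_2": [(9.3, 40.8077, -73.9630), (18.2, 40.7583, -73.9850)],
}

print(link_accounts(twitter, instagram))
```

In this toy data, each account pairs with the one whose posts appear in the same places at about the same times, while accounts posting miles apart within minutes of each other are ruled out. A real system would need a probabilistic model and far more data per user.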