Publication
IEEE/ACM TON
Paper
Structural Data De-Anonymization: Theory and Practice
Abstract
In this paper, we study the quantification, practice, and implications of structural data de-anonymization, including social data, mobility traces, and so on. First, we answer several open questions in structural data de-anonymization by quantifying perfect and (1−ε)-perfect structural data de-anonymization, where ε is the error tolerated by a de-anonymization scheme. To the best of our knowledge, this is the first work to quantify structural data de-anonymization under a general data model, which closes the gap between structural data de-anonymization practice and theory. Second, we conduct the first large-scale study on the de-anonymizability of 26 real-world structural datasets, including social networks, collaboration networks, communication networks, autonomous systems, peer-to-peer networks, and so on. We also quantitatively show the perfect and (1−ε)-perfect de-anonymization conditions of the 26 datasets. Third, following our quantification, we present a practical attack: a novel single-phase, cold-start, optimization-based de-anonymization (ODA) algorithm. An experimental analysis of ODA shows that ~77.7%–83.3% of the users in Gowalla (196,591 users and 950,327 edges) and 86.9%–95.5% of the users in Google+ (4,692,671 users and 90,751,480 edges) are de-anonymizable in different scenarios, which implies that structure-based de-anonymization is powerful in practice. Finally, we discuss the implications of our de-anonymization quantification and our ODA attack, and provide some general suggestions for future secure data publishing.
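To make the idea of a seed-free ("cold start") structure-based attack concrete, the sketch below greedily matches nodes between an anonymized graph and an auxiliary graph purely by structural similarity, using node degree as a crude similarity measure. This is a hypothetical illustration of the general approach assumed from the abstract, not the paper's ODA algorithm, which solves a cost-based optimization over richer structural features.

```python
# Illustrative seed-free structural matching (NOT the paper's ODA algorithm):
# align nodes of an anonymized graph with an auxiliary (known-identity)
# graph using only structure -- here, node degree -- with no seed mappings.

def degree_sequence(edges):
    """Return {node: degree} for an undirected edge list."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def greedy_structural_match(anon_edges, aux_edges):
    """Map each anonymized node to the unmatched auxiliary node with the
    closest degree. High-degree nodes (the most structurally distinctive)
    are matched first, mimicking how structural attacks anchor on hubs."""
    anon_deg = degree_sequence(anon_edges)
    unmatched = degree_sequence(aux_edges)
    mapping = {}
    for a in sorted(anon_deg, key=anon_deg.get, reverse=True):
        best = min(unmatched, key=lambda x: abs(unmatched[x] - anon_deg[a]))
        mapping[a] = best
        del unmatched[best]
    return mapping
```

For example, matching an anonymized three-node star against an isomorphic auxiliary star correctly re-identifies the hub, since its degree is structurally unique; real attacks extend this idea to neighborhood and higher-order structural features to disambiguate the many low-degree nodes that degree alone cannot separate.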