Publication
ISC2 2015
Conference paper
Efficient algorithms for identifying privacy vulnerabilities
Abstract
The automatic identification of privacy vulnerabilities in datasets is an important step in the privacy-preserving data publishing process, and an area of increasing interest for commercial data masking products. In this paper, we propose two multi-threaded algorithms for discovering privacy vulnerabilities in datasets, in the form of combinations of attributes leading to few records. Our algorithms fully utilize the execution environment and outperform the state-of-the-art to the extent that we had to design a multi-threaded counterpart of the state-of-the-art method to form the baseline for our experiments. Through experimental evaluation on a large set of datasets, we show that our algorithms can analyze microdata consisting of millions of records in less than 10 minutes, whereas the baseline method required more than 3 hours.
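
To make the notion of "combinations of attributes leading to few records" concrete, the following is a minimal single-threaded sketch. It is not either of the algorithms proposed in the paper, whose details the abstract does not give. The function name `find_risky_combinations`, the threshold parameter `k`, and the sample records are all assumptions introduced for illustration. The sketch enumerates attribute combinations by brute force and reports the minimal ones for which some combination of values is shared by fewer than `k` records.

```python
from collections import Counter
from itertools import combinations


def find_risky_combinations(records, attributes, k=2, max_size=None):
    """Return the minimal attribute combinations for which at least one
    value combination is shared by fewer than k records.

    Hypothetical helper for illustration only; a brute-force search,
    not the paper's method.
    """
    max_size = max_size or len(attributes)
    risky = []
    for size in range(1, max_size + 1):
        for combo in combinations(attributes, size):
            # Any superset of a risky combination is also risky, because
            # adding attributes can only split groups further. Skip it so
            # that only minimal combinations are reported.
            if any(set(found) <= set(combo) for found in risky):
                continue
            # Count how many records share each value combination.
            counts = Counter(tuple(rec[a] for a in combo) for rec in records)
            if counts and min(counts.values()) < k:
                risky.append(combo)
    return risky


# Illustrative data (made up for this example).
records = [
    {"zip": "10001", "age": 34, "sex": "F"},
    {"zip": "10001", "age": 34, "sex": "M"},
    {"zip": "10001", "age": 51, "sex": "F"},
    {"zip": "10001", "age": 51, "sex": "M"},
    {"zip": "10002", "age": 34, "sex": "F"},
    {"zip": "10002", "age": 34, "sex": "M"},
]

print(find_risky_combinations(records, ["zip", "age", "sex"], k=2))
# → [('zip', 'sex'), ('age', 'sex')]
```

In this sample no single attribute isolates a record, but the pairs (zip, sex) and (age, sex) each single out one individual. The number of candidate combinations grows exponentially with the number of attributes, which is one reason pruning and parallel execution matter at the scale of millions of records.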