A privacy reinforcement approach against de-identified dataset
Abstract
Protecting individual privacy is a key issue in data dissemination. Powerful search utilities now increase the risk of re-identification by making it easier than before to collect and validate information. Even when a de-identification process has been applied, attackers may still recognize an individual in a released dataset by matching it against information they already hold. In this paper, we propose an approach that mitigates the identity disclosure problem by generating plurals in a given dataset. The approach uses a decision tree to guide the selection of quasi-identifiers, and several masking techniques can be employed for privacy reinforcement. Besides being applicable to different privacy metrics, the approach can achieve a better trade-off between data integrity and privacy protection through flexible data masking. © 2011 IEEE.