Hiding sensitive patterns from sequence databases: Research challenges and solutions
Abstract
Sequence data arise in many applications, ranging from telecommunications and web usage analysis to marketing and healthcare. Disseminating these data offers remarkable opportunities for discovering interesting patterns, but doing so in a privacy-preserving way is challenging. Although a wide range of techniques exists for anonymizing sequential data, the discovery of sensitive sequential patterns through data mining algorithms may still lead to serious privacy violations. This is because mining such patterns enables intrusive inferences about the habits of a portion of the population, or provides the means for unsolicited advertising and user profiling. In this paper, we present the problem of hiding sensitive sequential patterns and survey existing work that attempts to address it. In addition, we discuss the important research challenges that pertain to solving this problem and present a roadmap for future work. © 2013 IEEE.