IBM J. Res. Dev.

Toward smarter healthcare: Anonymizing medical data to support research studies


Healthcare is a major industry in the Smarter Planet initiative of IBM and a key area where analytics can have a substantial impact by improving disease prediction and treatment. To facilitate healthcare analytics, patient data usually need to be widely disseminated. This, however, may risk the disclosure of private and sensitive patient information. In this paper, we illustrate the importance of preserving medical data privacy and the inapplicability of several popular techniques to preserve the privacy of structured medical data. Subsequently, we review a privacy-preserving approach for the dissemination of patient records. This approach involves patient record de-identification, anonymization of diagnosis codes contained in the records, and a method for balancing data utility with privacy. This approach is practical in that it allows healthcare data providers to specify fine-grained privacy and utility requirements, and it is able to construct anonymized data with a desired balance between utility and privacy. The effectiveness of the approach is demonstrated through a case study using electronic medical records. We conclude this paper with a roadmap for future trends in medical data privacy.


01 Jan 2014

