About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
Journal of Biomedical Informatics
Paper
Disassociation for electronic health record privacy
Abstract
The dissemination of Electronic Health Record (EHR) data, beyond the originating healthcare institutions, can enable large-scale, low-cost medical studies that have the potential to improve public health. Thus, funding bodies, such as the National Institutes of Health (NIH) in the U.S., encourage or require the dissemination of EHR data, and a growing number of innovative medical investigations are being performed using such data. However, simply disseminating EHR data, after removing identifying information, may risk privacy, as patients can still be linked with their record, based on diagnosis codes. This paper proposes the first approach that prevents this type of data linkage using disassociation, an operation that transforms records by splitting them into carefully selected subsets. Our approach preserves privacy with significantly lower data utility loss than existing methods and does not require data owners to specify diagnosis codes that may lead to identity disclosure, as these methods do. Consequently, it can be employed when data need to be shared broadly and be used in studies, beyond the intended ones. Through extensive experiments using EHR data, we demonstrate that our method can construct data that are highly useful for supporting various types of clinical case count studies and general medical analysis tasks. © 2014 Elsevier Inc.