A publication process model to enable privacy-aware data sharing
Abstract
As the Internet continues to permeate and connect communities, businesses, and things, there is an increasing demand for new approaches and technologies to analyze and synthesize data generated from diverse and distributed sources. In addition, this data must be accessible to users with different analytic objectives and viewpoints. We examine these topics in light of the growing number of data consortia in sectors such as finance and healthcare, whose role is to share data among a set of contributing members. We address the need for data consortia to apply data customization and context-alignment services to make heterogeneous data relevant for their subscribers. Such services include record linkage, record selection, and scaling and homogeneity analysis. In addition, the often personal or business-sensitive nature of such data requires that privacy-preservation methods be employed to avoid improper disclosures. We provide a publication process model for data consortia that allows users to extract the maximum amount of information from these heterogeneous databases in a privacy-aware manner. We describe the Operational Riskdata eXchange (ORX) as a successful case study to illustrate these concepts. © 2011 IBM.