About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
KAIS
Paper
The Mask of ZoRRo: preventing information leakage from documents
Abstract
In today’s enterprise world, information about business entities such as a customer’s or patient’s name, address, and social security number is often present in both relational databases as well as content repositories. Information about such business entities is generally well protected in databases by well-defined and fine-grained access control. However, current document retrieval systems do not provide user-specific, fine-grained redaction of documents to prevent leakage of information about business entities from documents. Leaving companies with only two choices: either providing complete access to a document, risking potential information leakage, or prohibiting access to the document altogether, accepting potentially negative impact on business processes. In this paper, we present ZoRRo, an add-on for document retrieval systems to dynamically redact sensitive information of business entities referenced in a document based on access control defined for the entities. ZoRRo exploits database systems’ fine-grained, label-based access-control mechanism to identify and redact sensitive information from unstructured text, based on the access privileges of the user viewing it. To make on-the-fly redaction feasible, ZoRRo exploits the concept of k-safety in combination with Lucene-based indexing and scoring. We demonstrate the efficiency and effectiveness of ZoRRo through a detailed experimental study.