Publication
ICDE 2009
Conference paper
On the efficiency of provenance queries
Abstract
While models for data provenance have been extensively studied in the literature, the efficient evaluation of the resulting provenance queries remains an open problem. Traditional query optimization techniques, like the use of general-purpose indexes, or the materialization of provenance data, fail on different fronts to address the problem. Provenance-specific optimization techniques, like the use of customized indexes, similarly prove inadequate since the techniques are bound to specific provenance models. Therefore, the need to develop generic provenance-aware techniques quickly becomes apparent. In this paper, we argue for such a generic technique in the form of a provenance index structure that can be used to efficiently evaluate provenance queries in a variety of contexts. By highlighting the limitations of existing techniques, we identify the set of key properties of the generic index, including a novel property called duality which guarantees that the single index can evaluate both backward provenance queries (which data items from a set I are associated with an item from set O) and forward provenance queries (which items from O are associated with an item from I). © 2009 IEEE.
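To make the two query directions concrete, the following is a minimal sketch of what answering both backward and forward provenance queries from a single structure might look like. It is not the index structure proposed in the paper, whose design the abstract does not describe. The class name `DualProvenanceIndex`, its methods, and the sample items are all hypothetical, and the naive pair of in-memory maps is assumed purely for illustration.

```python
from collections import defaultdict


class DualProvenanceIndex:
    """Toy in-memory index (hypothetical; not the paper's structure).

    Stores associations between items of an input set I and an output set O
    so that both query directions can be answered from one object.
    """

    def __init__(self):
        # output item -> input items associated with it (backward direction)
        self._inputs_of = defaultdict(set)
        # input item -> output items associated with it (forward direction)
        self._outputs_of = defaultdict(set)

    def add(self, input_item, output_item):
        """Record that input_item is associated with output_item."""
        self._inputs_of[output_item].add(input_item)
        self._outputs_of[input_item].add(output_item)

    def backward(self, output_item):
        """Backward query: which items from I are associated with this item from O?"""
        return set(self._inputs_of.get(output_item, ()))

    def forward(self, input_item):
        """Forward query: which items from O are associated with this item from I?"""
        return set(self._outputs_of.get(input_item, ()))


# Sample associations (made up for illustration).
index = DualProvenanceIndex()
for i, o in [("i1", "o1"), ("i2", "o1"), ("i2", "o2")]:
    index.add(i, o)

print(index.backward("o1"))  # inputs associated with o1
print(index.forward("i2"))   # outputs associated with i2
```

This sketch simply stores every association twice, once per direction, which doubles the space used. It only shows the query interface that duality implies, not how a single index would provide it efficiently.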