On the efficiency of provenance queries

Anastasios Kementsietsidis; Min Wang

doi:10.1109/ICDE.2009.206

ICDE 2009

Conference paper

08 Jul 2009

Abstract

While models for data provenance have been extensively studied in the literature, the efficient evaluation of the resulting provenance queries remains an open problem. Traditional query optimization techniques, like the use of general-purpose indexes, or the materialization of provenance data, fail on different fronts to address the problem. Provenance-specific optimization techniques, like the use of customized indexes, similarly prove inadequate since the techniques are bound to specific provenance models. Therefore, the need to develop generic provenance-aware techniques quickly becomes apparent. In this paper, we argue for such a generic technique in the form of a provenance index structure that can be used to efficiently evaluate provenance queries in a variety of contexts. By highlighting the limitations of existing techniques, we identify the set of key properties of the generic index, including a novel property called duality which guarantees that the single index can evaluate both backward provenance queries (which data items from a set I are associated with an item from set O) and forward provenance queries (which items from O are associated with an item from I). © 2009 IEEE.
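
To make the two query directions concrete, the following is a minimal sketch of backward and forward provenance lookups over a set of (input, output) associations. It is not the index structure proposed in the paper, whose design the abstract does not describe. The class name `ToyProvenanceIndex`, its methods, and the item labels are hypothetical, and this naive version keeps two separate maps rather than the single dual index the abstract argues for.

```python
from collections import defaultdict


class ToyProvenanceIndex:
    """Hypothetical toy index over (input item, output item) associations.

    Assumption: provenance is given as explicit pairs linking an item of
    set I to an item of set O. This is an illustration only, not the
    authors' index.
    """

    def __init__(self):
        # Output item -> set of associated input items (backward direction).
        self._inputs_of = defaultdict(set)
        # Input item -> set of associated output items (forward direction).
        self._outputs_of = defaultdict(set)

    def add(self, input_item, output_item):
        """Record that input_item is associated with output_item."""
        self._inputs_of[output_item].add(input_item)
        self._outputs_of[input_item].add(output_item)

    def backward(self, output_item):
        """Which items from I are associated with this item from O?"""
        return set(self._inputs_of.get(output_item, ()))

    def forward(self, input_item):
        """Which items from O are associated with this item from I?"""
        return set(self._outputs_of.get(input_item, ()))


# Usage with made-up item labels.
index = ToyProvenanceIndex()
index.add("i1", "o1")
index.add("i2", "o1")
index.add("i2", "o2")

backward_result = index.backward("o1")  # inputs associated with o1
forward_result = index.forward("i2")    # outputs associated with i2
```

Keeping two maps answers both directions but stores every association twice. The duality property named in the abstract concerns a single index that serves both query types.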