Publication
LICS 2017
Conference paper

Foundations of information integration under bag semantics

View publication

Abstract

During the past several decades, the database theory community has successfully investigated several different facets of the principles of database systems, including the development of various data models, the systematic exploration of the expressive power of database query languages, and, more recently, the study of the foundations of information integration via schema mappings. For the most part, all these investigations have been carried out under set semantics, that is, both the database relations and the answers to database queries are sets. In contrast, SQL deploys bag (multiset) semantics and, as a result, theory and practice diverge at this crucial point. Our main goal in this paper is to embark on the development of the foundations of information integration under bag semantics, thus taking the first step towards bridging the gap between theory and practice in this area. Our first contribution is conceptual, namely, we give rigorous bag semantics to GLAV mappings and to the certain answers of conjunctive queries in the context of data exchange and data integration. In fact, we introduce and explore two different versions of bag semantics that, intuitively, correspond to the maximum-based union of bags and to the sum-based union of bags. After this, we establish a number of technical results, including results about the computational complexity of the certain answers of conjunctive queries under bag semantics and about the existence and computation of universal solutions under these two versions of bag semantics. Our results reveal that the adoption of more realistic semantics comes at a price, namely, algorithmic problems in data exchange and data integration that were tractable under set semantics become intractable under bag semantics.

Date

Publication

LICS 2017

Authors

Share