Configuring bitmap materialized views for optimizing XML queries
In recent years the inverted lists evaluation model along with holistic stack-based algorithms have been established as the most prominent techniques for evaluating XML queries on large persistent XML data. In this framework, we are using materialized views for optimizing XML queries. We consider a novel approach which instead of materializing the answer of a view materializes exactly the inverted sublists that are necessary for computing the answer of the view. This originality allows storing view materializations as compressed bitmaps, a solution that minimizes the materialization space and empowers performing optimization operations as CPU-efficient bitwise operations. To realize the potential of bitmap materialized views in optimizing query performance, we define and address the following problem (view configuration problem): given an XML tree and its schema find a template of tree-pattern views (view configuration) such that: (a) the views of this configuration can answer all the queries that can be issued against the schema, (b) their materialization fits in the space provided, and (c) evaluating the queries using these views minimizes the overall query evaluation cost. We consider an instance of this problem for tree pattern queries. Our intension is to find view configurations whose materializations are small enough to be stored in main memory. We find two candidate solution configurations and we identify cases where views can be excluded from materialization in a configuration without affecting query performance. In order to compare our approach with an approach which also can support the optimization of every query on the schema, we implemented an improvement of a state-of-the-art approach which is based on structural indexes. Our experimental results show that our approach is stable, greatly improves evaluating queries without materialized views, outperforms the structural index approach on all test cases and is very close to the optimal. These results characterize our approach as the best candidate for supporting the optimization of queries in the framework of the inverted lists model.