QED: Out-of-the-box datasets for SPARQL query evaluation
Abstract
The SPARQL query language is probably the most popular technology for querying the Semantic Web and is supported by most triple stores and graph databases [6, 7, 3]. Several benchmarks have been developed to evaluate their efficiency (a good overview is provided by the W3C¹), but correctness tests are far less common. In fact, to the best of our knowledge, the W3C compliance tests² are the only publicly available and commonly applied test suite [1, 4]. However, these tests mostly contain fairly synthetic queries over similarly artificial example data and, in particular, comprise only a few more complex queries that nest different SPARQL features and thus model real user queries more faithfully. A simple text search reveals, for example, that the UNION keyword occurs in only nine³ rather simple SELECT queries, such as the following query Qex