Property graph schema optimization for domain-specific knowledge graphs
Abstract
Enterprises are creating domain-specific knowledge graphs by curating and integrating their business data from multiple sources. Ontologies provide a semantic abstraction for such knowledge graphs, describing their data in terms of the entities involved and the relationships among them. Considerable effort has gone into building systems that enable efficient querying over knowledge graphs represented as property graphs. However, the problem of schema optimization in the property graph setting has been largely ignored. In this work, we show that graph schema design has a significant impact on query performance, and we propose two algorithms that generate an optimized property graph schema from the domain ontology. To the best of our knowledge, we are the first to present an ontology-driven approach to property graph schema optimization. The rich semantic relationships in an ontology offer a variety of opportunities to reduce edge traversals and consequently improve graph query performance. Our experimental study with two real-world knowledge graphs shows that our algorithms produce high-quality schemas, achieving up to two orders of magnitude speed-up compared to alternative schema designs.
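To make the notion of reducing edge traversals concrete, the following is a minimal sketch of one possible opportunity: when an ontology declares a one-to-one relationship, the target entity's attributes can be folded into the source node as properties, so a lookup no longer has to follow an edge. It is an illustration under stated assumptions, not one of the two algorithms proposed in this work. The `Node` class, the `merge_one_to_one` and `lookup` helpers, and the `Company`/`Address` example data are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy property-graph node (hypothetical, for illustration only)."""
    label: str
    props: dict = field(default_factory=dict)
    edges: dict = field(default_factory=dict)  # edge type -> list of target Nodes


def merge_one_to_one(node, edge_type, prefix):
    """Return a new node with the single target of `edge_type` folded in as properties."""
    targets = node.edges.get(edge_type, [])
    if len(targets) != 1:
        raise ValueError("expected exactly one target for a one-to-one relationship")
    merged_props = dict(node.props)
    for key, value in targets[0].props.items():
        merged_props[f"{prefix}_{key}"] = value
    # Keep every other edge type; drop the one that was merged away.
    remaining = {t: list(ns) for t, ns in node.edges.items() if t != edge_type}
    return Node(node.label, merged_props, remaining)


def lookup(node, path):
    """Follow the edge types in path[:-1], then read property path[-1].

    Returns the property value and the number of edges traversed.
    """
    hops = 0
    current = node
    for edge_type in path[:-1]:
        current = current.edges[edge_type][0]
        hops += 1
    return current.props[path[-1]], hops


# Direct mapping of the ontology: Company -[hasAddress]-> Address
address = Node("Address", {"city": "Zurich"})
company = Node("Company", {"name": "Acme"}, {"hasAddress": [address]})
value_before, hops_before = lookup(company, ["hasAddress", "city"])

# Optimized schema: the address attributes live on the Company node itself.
optimized = merge_one_to_one(company, "hasAddress", "address")
value_after, hops_after = lookup(optimized, ["address_city"])
```

In this toy setting the same question is answered with one edge traversal under the direct mapping and with none under the merged schema. The merge returns a new node rather than modifying the original, so both designs can be compared side by side.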