Large-scale relation extraction from web documents and knowledge graphs with human-in-the-loop
The Semantic Web movement has produced a wealth of curated collections of entities and facts, often referred as Knowledge Graphs. Creating and maintaining such Knowledge Graphs is far from being a solved problem: it is crucial to constantly extract new information from the vast amount of heterogeneous sources of data on the Web. In this work we address the task of Knowledge Graph population. Specifically, given any target relation between two entities, we propose an approach to extract positive instances of the relation from various Web sources. Our relation extraction approach introduces a human-in-the-loop component in the extraction pipeline, which delivers significant advantage with respect to other solely automatic approaches. We test our solution on the ISWC 2018 Semantic Web Challenge, with the objective to identify supply-chain relations among organizations in the Thomson Reuters Knowledge Graph. Our human-in-the-loop extraction pipeline achieves top performance among all competing systems.