Active learning for multi-relational data construction

Hiroshi Kajino; Akihiro Kishimoto; Adi Botea; Elizabeth Daly; Spyros Kotoulas

doi:10.1145/2736277.2741103

WWW 2015

Conference paper

18 May 2015

Abstract

Knowledge on the Web relies heavily on multi-relational representations, such as RDF and Schema.org. Automatically extracting knowledge from documents and linking existing databases are common approaches to constructing multi-relational data. Complementary to such approaches, there is still a strong demand for manually encoding human expert knowledge. For example, human annotation is necessary for constructing a common-sense knowledge base, which stores facts implicitly shared in a community, because such knowledge rarely appears in documents. As human annotation is both tedious and costly, an important research challenge is how to make the best use of limited human resources while maximizing the quality of the resulting dataset. In this paper, we formalize the problem of dataset construction as active learning problems and present the Active Multi-relational Data Construction (AMDC) method. AMDC repeatedly interleaves multi-relational learning and expert input acquisition, allowing us to acquire helpful labels for data construction. Experiments on real datasets demonstrate that our solution increases the number of positive triples by a factor of 2 to 17, and that the predictive performance of the multi-relational model in AMDC is the highest or comparable to the best throughout the data construction process.
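The interleaving of learning and label acquisition described in the abstract can be illustrated with a minimal sketch. This is not the AMDC method itself: the co-occurrence scorer `fit_scores`, the "query the highest-scoring triples" rule, and the names `active_construction`, `oracle`, `rounds`, and `batch_size` are all hypothetical choices made for illustration, standing in for the multi-relational model and query strategy defined in the paper.

```python
from collections import Counter


def fit_scores(labeled):
    """Toy stand-in for a multi-relational model (an assumption, not AMDC's model).

    Counts positive labels per (subject, relation) and (relation, object) pair
    and scores a triple by the sum of the two counts.
    """
    sr, ro = Counter(), Counter()
    for (s, r, o), is_positive in labeled.items():
        if is_positive:
            sr[(s, r)] += 1
            ro[(r, o)] += 1
    return lambda t: sr[(t[0], t[1])] + ro[(t[1], t[2])]


def active_construction(candidates, oracle, rounds, batch_size):
    """Alternate between refitting the scorer and asking the expert for labels.

    candidates: list of (subject, relation, object) triples
    oracle:     callable returning True/False for a triple (the human expert)
    """
    labeled = {}
    for _ in range(rounds):
        # Learning step: refit on every label collected so far.
        score = fit_scores(labeled)
        pool = [t for t in candidates if t not in labeled]
        if not pool:
            break
        # Query step (assumed strategy): ask about the triples the current
        # model considers most likely to be positive.
        pool.sort(key=score, reverse=True)
        for t in pool[:batch_size]:
            labeled[t] = oracle(t)
    return labeled
```

Calling `active_construction(candidates, lambda t: t in truth, rounds=2, batch_size=2)` with a small candidate list and a set `truth` of known-positive triples returns a dictionary mapping each queried triple to its expert label. Any scorer or query rule with the same interface could be swapped in.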
