Publication
WWW 2015
Conference paper

Active learning for multi-relational data construction


Abstract

Knowledge on the Web relies heavily on multi-relational representations, such as RDF and Schema.org. Automatically extracting knowledge from documents and linking existing databases are common approaches to constructing multi-relational data. Complementary to such approaches, there is still a strong demand for manually encoding human expert knowledge. For example, human annotation is necessary for constructing a common-sense knowledge base, which stores facts implicitly shared in a community, because such knowledge rarely appears in documents. As human annotation is both tedious and costly, an important research challenge is how to make the best use of limited human resources while maximizing the quality of the resulting dataset. In this paper, we formalize the problem of dataset construction as an active learning problem and present the Active Multi-relational Data Construction (AMDC) method. AMDC repeatedly interleaves multi-relational learning and expert input acquisition, allowing us to acquire helpful labels for data construction. Experiments on real datasets demonstrate that our solution increases the number of positive triples by a factor of 2.28 to 17.0, and that the predictive performance of the multi-relational model in AMDC is the highest, or comparable to the best, throughout the data construction process.
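The interleaving of model fitting and expert labeling described in the abstract can be illustrated with a minimal sketch. This is not the AMDC method itself: the scoring function below is a hypothetical stand-in for a multi-relational model, and the names `active_construction`, `oracle`, `rounds`, and `batch` are assumptions introduced only for this illustration. The sketch re-scores the unlabeled triples after each round and asks the expert (here a simulated oracle) about the highest-scoring ones.

```python
import random
from itertools import product


def active_construction(entities, relations, oracle, rounds=5, batch=4, seed=0):
    """Toy active loop: re-score unlabeled triples, then query the top-scoring ones."""
    rng = random.Random(seed)
    # Candidate pool: every (subject, relation, object) triple with distinct entities.
    pool = [(s, r, o) for s, r, o in product(entities, relations, entities) if s != o]
    labels = {}  # triple -> bool, the expert labels collected so far

    def score(triple):
        # Hypothetical stand-in for a multi-relational model (an assumption,
        # not the model used in the paper): a triple gains score for each
        # labeled positive with the same relation that shares its subject or
        # object, and loses score for each such labeled negative.
        s, r, o = triple
        total = 0
        for (s2, r2, o2), is_positive in labels.items():
            if r2 != r:
                continue
            shared = (s2 == s) + (o2 == o)
            total += shared if is_positive else -shared
        return total

    for _ in range(rounds):
        unlabeled = [t for t in pool if t not in labels]
        if not unlabeled:
            break
        rng.shuffle(unlabeled)                   # random tie-breaking
        unlabeled.sort(key=score, reverse=True)  # stable sort keeps the shuffle order among ties
        for triple in unlabeled[:batch]:
            labels[triple] = oracle(triple)      # expert input acquisition
    return labels


# Usage with a simulated expert who knows a small hidden set of true triples.
entities = ["alice", "bob", "carol", "dave"]
relations = ["knows"]
truth = {("alice", "knows", "bob"), ("carol", "knows", "bob"), ("dave", "knows", "bob")}
labels = active_construction(entities, relations, lambda t: t in truth, rounds=2, batch=3)
positives = [t for t, is_positive in labels.items() if is_positive]
```

Querying the triples the model considers most likely to be positive is one of several possible query strategies; an uncertainty-based strategy would instead pick the triples whose scores lie closest to the decision boundary.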

Date

18 May 2015
