LUSTRE: An interactive system for entity structured Representation and Variant Generation

Kun Qian; Nikita Bhutani; Yunyao Li; H. V. Jagadish; Mauricio Hernandez

doi:10.1109/ICDE.2018.00189

ICDE 2018

Conference paper

24 Oct 2018

Abstract

Many data analysis and data integration applications need to account for multiple representations of entities. The variations in entity mentions arise in complex ways that are hard to capture using a textual similarity function. More sophisticated functions require knowledge of the underlying structure in the representation of entities. People traditionally identify these structures manually and write programs to manipulate them; such work is tedious and cumbersome. We have built LUSTRE, an active-learning-based system that can learn the structured representations of entities interactively from a few labels. In the background, it automatically generates programs to map entity mentions to their representations and to standardize them to a unique representation. Furthermore, LUSTRE provides a user-friendly interface that allows users to declaratively specify normalization and variant generation functions for downstream applications.
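
To make the terms in the abstract concrete, the following is a minimal sketch of what mapping a mention to a structured representation, normalizing it, and generating variants could look like for person names. It is not LUSTRE's implementation: the pattern is hand-written rather than learned from labels, and the names NAME_PATTERN, parse_mention, normalize, and generate_variants are hypothetical, chosen only for illustration.

```python
import re
from typing import Dict, List, Optional

# Assumption: a hand-written structure for two-token person names.
# A system like the one described would learn such a structure from
# a few user labels; here it is fixed for illustration only.
NAME_PATTERN = re.compile(
    r"^(?:(?P<last1>[A-Za-z]+),\s*(?P<first1>[A-Za-z]+)"   # "Last, First"
    r"|(?P<first2>[A-Za-z]+)\s+(?P<last2>[A-Za-z]+))$"     # "First Last"
)


def parse_mention(mention: str) -> Optional[Dict[str, str]]:
    """Map a raw mention to a structured representation, or None if it does not fit."""
    match = NAME_PATTERN.match(mention.strip())
    if match is None:
        return None
    first = match.group("first1") or match.group("first2")
    last = match.group("last1") or match.group("last2")
    return {"first": first.capitalize(), "last": last.capitalize()}


def normalize(rep: Dict[str, str]) -> str:
    """Standardize a structured representation to one canonical string."""
    return f"{rep['first']} {rep['last']}"


def generate_variants(rep: Dict[str, str]) -> List[str]:
    """Produce alternative surface forms from the same structured representation."""
    first, last = rep["first"], rep["last"]
    return [f"{first} {last}", f"{last}, {first}", f"{first[0]}. {last}"]


if __name__ == "__main__":
    rep = parse_mention("Smith, John")
    if rep is not None:
        print(normalize(rep))
        print(generate_variants(rep))
```

Because both "Smith, John" and "john smith" parse to the same representation in this sketch, they normalize to the same string even though a plain textual similarity function would treat them as fairly different.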