Synthesizing extraction rules from user examples with SEER
Maeda F. Hanafi, Azza Abouzied, et al.
SIGMOD 2017
The amount of textual data has reached a new scale and continues to grow at an unprecedented rate. IBM's SystemT software is a powerful text-analytics system that offers a query-based interface to reveal the valuable information that lies within these mounds of data. However, traditional server architectures cannot analyze so-called big data efficiently, despite the high memory bandwidth that is available. The authors show that a streaming hardware accelerator implemented in reconfigurable logic can improve the throughput of SystemT's information extraction queries by an order of magnitude. They also show how such a system can be deployed by extending SystemT's existing compilation flow and by using a multithreaded communication interface that makes efficient use of the accelerator's bandwidth.