Token-based dictionary pattern matching for text analytics
Raphael Polig, Kubilay Atasu, et al.
FPL 2013
The amount of textual data has reached a new scale and continues to grow at an unprecedented rate. IBM's SystemT software is a powerful text-analytics system that offers a query-based interface to reveal the valuable information that lies within these mounds of data. However, traditional server architectures are not capable of analyzing so-called big data efficiently, despite the high memory bandwidth that is available. The authors show that by using a streaming hardware accelerator implemented in reconfigurable logic, the throughput rates of SystemT's information-extraction queries can be improved by an order of magnitude. They also show how such a system can be deployed by extending SystemT's existing compilation flow and by using a multithreaded communication interface that can efficiently use the accelerator's bandwidth.