Accelerating business analytics applications
Abstract
Business text analytics applications have seen rapid growth, driven by the mining of data for various decision-making processes. Regular expression processing is an important component of these applications, consuming as much as 50% of their total execution time. While prior work on accelerating regular expression processing has focused on Network Intrusion Detection Systems, business analytics applications impose different requirements on regular expression processing efficiency. We present an analytical model of accelerators for regular expression processing, covering memory bus-, I/O bus-, and network-attached accelerators, with a focus on business analytics applications. Based on this model, we advocate the use of vector-style processing for regular expressions in business analytics applications, leveraging the SIMD hardware available in many modern processors. In addition, we show how SIMD hardware can be enhanced to improve regular expression processing even further. We demonstrate a realized speedup greater than 1.8 across the entire range of data sizes of interest. In comparison, the alternative strategies deliver only marginal improvement for large data sizes and perform worse than the SIMD solution for small data sizes. © 2012 IEEE.