Publication
IISWC 2009
Conference paper
Workload characterization and optimization of high-performance text indexing on the Cell Broadband Engine™ (Cell/B.E.)
Abstract
In this paper we examine text indexing on the Cell Broadband Engine™ (Cell/B.E.), an emerging workload on an emerging multicore architecture. The Cell Broadband Engine is a microprocessor jointly developed by Sony Computer Entertainment, Toshiba, and IBM (herein, we refer to it simply as the "Cell"). The importance of text indexing is growing not only because it is the core task of commercial and enterprise-level search engines, but also because it appears more and more frequently in desktop and mobile applications, and on network appliances. Text indexing is a computationally intensive task. Multi-core processors promise a multiplicative increase in compute power, but this power is fully available only if workloads exhibit the right amount and kind of parallelism. We present the challenges and the results of mapping text indexing tasks to the Cell processor. The Cell has become known as a platform capable of impressive performance, but only when algorithms have been parallelized with attention paid to its hardware peculiarities (expensive branching, wide SIMD units, small local memories). We propose a parallel software design that provides essential text indexing features at a high throughput (161 Mbyte/s per chip on Wikipedia inputs) and we present a performance analysis that details the resources absorbed by each subtask. Not only does this result affect traditional applications, but it also enables new ones such as live network traffic indexing for security forensics, until now believed to be too computationally demanding to be performed in real time. We conclude that, at the cost of a radical algorithmic redesign, our Cell-based solution delivers a 4× performance advantage over a recent commodity machine such as the Intel Q6600. In a per-chip comparison, ours is the fastest text indexer that we are aware of. © 2009 IEEE.