About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
Journal of Supercomputing
Paper
Analysis and performance results of computing betweenness centrality on IBM Cyclops64
Abstract
This paper presents a joint study of application and architecture to improve the performance and scalability of an irregular application-computing betweenness centrality-on a many-core architecture IBM Cyclops64. The characteristics of unstructured parallelism, dynamically non-contiguous memory access, and low arithmetic intensity in betweenness centrality pose an obstacle to an efficient mapping of parallel algorithms on such many-core architectures. By identifying several key architectural features, we propose and evaluate efficient strategies for achieving scalability on a massive multi-threading many-core architecture. We demonstrate several optimization strategies including multi-grain parallelism, just-in-time locality with explicit memory hierarchy and non-preemptive thread execution, and fine-grain data synchronization. Comparing with a conventional parallel algorithm, we get 4X-50X improvement in performance and 16X improvement in scalability on a 128-cores IBM Cyclops64 simulator. © 2009 Springer Science+Business Media, LLC.