About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
Concurrency: Practice and Experience
Paper
Efficient communication primitives on hypercubes
Abstract
We give practical algorithms, complexity analysis and implementation for one‐to‐all broadcasting, all‐to‐all personalized communication and matrix transpose (with two‐dimensional partitioning of the matrix) on hypercubes. We assume the following communication characteristics: circuit‐switched, e‐cube routing and one‐port communication model. For one‐to‐all broadcasting, we give an algorithm that combines the well‐known recursive doubling algorithm[1] and the algorithm based on edgedisjoint spanning trees[2]. The measured times of the combined algorithm are always superior to those of the edge‐disjoint spanning tree algorithm and outperform the recursive doubling algorithm. For all‐to‐all personalized communication we propose a hybrid algorithm that combines the well‐known recursive doubling algorithm[3,4] and the recently proposed direct‐route algorithm[5,6] Our hybrid algorithm balances between data transfer time and start‐up time of these two algorithms, and its communication complexity is estimated to be better than the two previous algorithms for a range of machine parameters. For matrix transpose with two‐dimensional partitioning of the matrix, we relate a two‐phase algorithm to the previous result in Reference 7. The algorithm is predicted to be better than the recursive transpose algorithm[8] by n nearest‐neighbor communications[4]. It takes advantage of circuit‐switched routing and is congestion‐free within each phase. We also suggest a way of storing the matrix such that the transpose operation can be realized in one phase without congestion. Copyright © 1992 John Wiley & Sons, Ltd