L. Joskowicz, Elisha Sacks
aaai 1994
We give an optimal algorithm for broadcasting on SIMD hypercubes without the hardware support for indirect addressing; i.e., an indirect memory reference takes as many cycles as the number of different memory locations (offsets) accessed by all active processors. With communication restricted to one dimension at a time, our algorithm broadcasts m elements in m + n - 1 communication steps and 2m direct memory references in an n-cube. The well-known algorithm based on the spanning binomial tree needs mn communication steps and 2m direct memory references. A previous algorithm based on the n edge-disjoint spanning trees takes m + n - 1 communication steps but needs at least 2 mn direct memory references. We then give a reduction algorithm on SIMD hypercubes, which is optimal in terms of the communication times and memory references and nearly optimal (about a factor of n (n - 1) when m ≫ n) with respect to arithmetic time. We also generalize our algorithms to the case in which communication is restricted to k dimensions at a time, for any integer k in the range of 1 {slanted equal to or less-than} k ≤ n, with their optimality preserved. © 1991.
L. Joskowicz, Elisha Sacks
aaai 1994
David W. Jacobs, Daphna Weinshall, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence
Matteo Baldoni, Nirmit Desai, et al.
AAMAS 2009
Karan Bhanot, Ioana Baldini, et al.
AIES 2023