Paper

Matrix transpose on meshes with wormhole and XY routing

Abstract

We give nearly optimal algorithms for matrix transpose on meshes with wormhole and XY routing and with a 1-port or 2-port communication model. For an N x N mesh, where N = 3 · 2n and each mesh node has a submatrix of size m to be transposed, our best algorithm takes about Nm/3.27 time steps. The lower bound is Nm/3.414. While there is no previously known algorithm for matrix transpose on meshes with wormhole and XY routing, a naive algorithm, which is naturally adapted from the well-known Recursive Exchange Algorithm, has a complexity of about Nm. That is our best algorithm improves over the naive algorithm by about a factor of 3.27, and is about a factor of 3.414/3.27 of the lower bound. © 1998 Elsevier Science B.V. All rights reserved.

Related