Theoretical Computer Science

Communication complexity of PRAMs

View publication


We propose a model, LPRAM, for parallel random access machines with local memory that captures both the communication and computational requirements in parallel computation. For this model, we present several interesting results, including the following:. Two n × n matrices can be multiplied in 0(n3/p) computation time and 0(n2/p 2 3) communication steps using p processors (for p = 0(n3/log 3 2 n)). Furthermore, these bounds are optimal for arithmetic on semirings (using +, × only). It is shown that any algorithm that uses comparisons only and that sorts n words requires Ω(n log n/(p log(n/p))) communication steps for 1 < p < n. We also provide an algorithm that sorts n words and uses (-)(n log n/p) computation time and (-)(n log n/p log(n/p))) communication steps. These bounds also apply for computing an n-point FFT graph. It is shown that computing any binary tree τ with n nodes and height h requires Ω(n/p + log n + √h) communication steps, and can always be computed in 0(n/p + min(√n, h)) steps. We also present a simple linear-time algorithm that generates a schedule for computing τ in at most 2Dopt(τ) steps, where Dopt(τ) represents the minimum communication delay for computing τ. It is also shown that various problems that are expressed as DAGs exhibit a communication-delay/computation-time trade-off. © 1990.


13 Mar 1990


Theoretical Computer Science