David S. Kung
DAC 1998
Analyzing a huge, ever-growing backlog of data is a major challenge in all aspects of computing, and inverse covariance matrices play an important role in it. We target data uncertainty quantification, for which the diagonal entries of the inverse covariance matrix provide a very useful measure. In previous work, we introduced a novel method that reduces overall complexity by at least two orders of magnitude. At the same time, a state-of-the-art message-passing interface (MPI) implementation allowed us to reach a sustained performance of up to 73% of peak (730 TFLOPS on the full 72-rack Blue Gene/P configuration at Jülich). Thanks to its reduced complexity, this work has attracted significant interest, and we have received numerous requests concerning its use in various fields. A common denominator of these requests is that almost all came from people with no or, at best, limited high-performance computing background. Nevertheless, all of them are interested in analyzing huge data sets by suitably adapting the method to their particular applications. The bottleneck is that potential users are reluctant to climb the steep learning curve required to become proficient in parallel computing with the de facto standard, MPI. We therefore turned to the Partitioned Global Address Space (PGAS) programming model, and in particular to the Unified Parallel C (UPC) language. In this work, we give a comprehensive description of the framework and demonstrate the efficiency of the state-of-the-art MPI implementation. In addition, we show that one can develop an easy-to-follow yet efficient UPC implementation that is also easy to debug and maintain, features that significantly boost overall productivity. Copyright © 2011 John Wiley & Sons, Ltd.