About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Abstract
The analysis of a huge backload of ever-accumulating data presents a huge challenge in all respects of computing. Inverse covariance matrices in this respect are very important. We target data uncertainty quantification, a very useful measure of which is provided by inverse covariance matrix diagonal entries. In previous work, we introduced a novel method that reduces overall complexity by at least two orders of magnitude. At the same time, a state-of-the-art message-passing interface (MPI) implementation allowed us to reach a sustained performance of up to 73% (730 TFLOPS on the full 72 Blue Gene/P rack configuration at Jülich). Thanks to its reduced complexity, this work has attracted significant interest, and thus, we have received numerous requests concerning its exploitation in various fields. A common denominator in these requests is that they almost all came from people with no or, in the best case, limited high-performance computing background. Nevertheless, all interest is in analyzing huge data sets, suitably adapting the method to particular applications. A bottleneck then is that potential users are reluctant to pay for a steep learning curve to get proficient in parallel computing using the de facto standard: MPI. Thus, we turned to the Partitioned Global Address Space programming model and in particular the Unified Parallel C language. In this work, we gave a comprehensive description of the framework and demonstrated the efficiency of the state-of-the-art MPI implementation. In addition, we showed that one can develop an easy-to-follow yet efficient Unified Parallel C implementation, which is also easy to debug and maintain, features that significantly boost overall productivity. Copyright © 2011 John Wiley & Sons, Ltd.