M. Shub, B. Weiss
Ergodic Theory and Dynamical Systems
The minimal-block-storage Distributed Square Block Packed (DSBP) format for distributed-memory computing on symmetric and triangular matrices is presented. Three algorithm variants (Basic, Static, and Dynamic) of the blocked right-looking Cholesky factorization are designed for the DSBP format, implemented, and evaluated. On our target machine, all variants outperform standard full-storage implementations while saving almost half the storage. Communication overhead is shown to be virtually eliminated by the Static and Dynamic variants, both of which take advantage of hardware parallelism to hide communication costs. The Basic variant is shown to yield comparable or slightly better performance than the full-storage ScaLAPACK routine PDPOTRF, but it is clearly outperformed by both Static and Dynamic. Models of execution assuming zero communication costs and overhead are developed. For medium- and larger-sized problems, the Static schedule is near optimal on our target machine, based on comparisons with these models and measurements of synchronization overhead. © 2009 ACM.
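To make the abstract's two central ideas concrete, the following is a minimal single-process sketch, not the paper's implementation. It assumes a square block size `nb` that divides the matrix order, stores only the lower-triangular blocks (p(p+1)/2 blocks instead of p², which is where the "almost half" storage saving comes from), and runs a blocked right-looking Cholesky factorization over them. All function names are hypothetical, and the sketch leaves out data distribution, communication, and the Basic, Static, and Dynamic scheduling variants entirely.

```python
import math


def pack_lower_blocks(a, nb):
    """Copy the lower triangle of symmetric matrix `a` into nb x nb blocks.

    Returns a dict keyed by block coordinates (i, j) with i >= j.
    Assumption: the matrix order is a multiple of nb.
    """
    n = len(a)
    assert n % nb == 0, "sketch assumes nb divides the matrix order"
    p = n // nb
    return {
        (i, j): [[float(a[i * nb + r][j * nb + c]) for c in range(nb)]
                 for r in range(nb)]
        for i in range(p) for j in range(i + 1)
    }


def cholesky_block(d):
    """Unblocked right-looking Cholesky of one diagonal block, in place."""
    nb = len(d)
    for k in range(nb):
        d[k][k] = math.sqrt(d[k][k])
        for r in range(k + 1, nb):
            d[r][k] /= d[k][k]
        # Update the trailing lower triangle with the new column k.
        for c in range(k + 1, nb):
            for r in range(c, nb):
                d[r][c] -= d[r][k] * d[c][k]
    # Zero the strictly upper part so the block holds exactly L.
    for r in range(nb):
        for c in range(r + 1, nb):
            d[r][c] = 0.0


def solve_panel(b, l):
    """Overwrite block b with b * l^{-T}, where l is lower triangular."""
    nb = len(l)
    for row in b:
        for c in range(nb):
            s = row[c] - sum(row[k] * l[c][k] for k in range(c))
            row[c] = s / l[c][c]


def update_block(a, li, lj):
    """Trailing update: a -= li * lj^T."""
    nb = len(a)
    for r in range(nb):
        for c in range(nb):
            a[r][c] -= sum(li[r][t] * lj[c][t] for t in range(nb))


def blocked_cholesky(blocks, p):
    """Factor the packed lower-triangular blocks in place; p = block rows."""
    for k in range(p):
        cholesky_block(blocks[k, k])                  # factor diagonal block
        for i in range(k + 1, p):
            solve_panel(blocks[i, k], blocks[k, k])   # panel below it
        for j in range(k + 1, p):                     # update what lies right
            for i in range(j, p):
                update_block(blocks[i, j], blocks[i, k], blocks[j, k])
    return blocks
```

Calling `blocked_cholesky(pack_lower_blocks(A, nb), len(A) // nb)` on a symmetric positive definite `A` leaves the Cholesky factor in the same packed blocks. In a distributed setting the three inner steps are where communication would occur, which is the cost the Static and Dynamic variants are described as hiding.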
Karthik Visweswariah, Sanjeev Kulkarni, et al.
IEEE International Symposium on Information Theory - Proceedings
Naga Ayachitula, Melissa Buco, et al.
SCC 2007
Frank R. Libsch, Takatoshi Tsujimura
Active Matrix Liquid Crystal Displays Technology and Applications 1997