Composable Locality Optimizations for Accelerating Parallel Forest Computations
Graph algorithms with large, sparse inputs oftentimes perform poorly on cache-based platforms. For parallel spanning forest and minimum spanning forest computations, we propose three locality optimizations, Update, Stages, and PSM, that improve the cache performance. Update exploits the 'graft-and-shortcut' pattern of the base algorithms. PSM transforms irregular accesses into regular ones. Stages is a meta approach based on evolution random graph theory that we use to combine Update and Stages. On the target platform, our combined optimization achieves up to 19 times speedups over the base algorithms.