Optimization of link bandwidth for parallel communication performance
Abstract
The efficiency of the computer network has long been regarded as a bottleneck in the parallel computing paradigm. It is important to have an efficient methodology for obtaining network performance measures, especially for large-scale systems such as exa-scale systems. Communication performance is often investigated either by static complexity analysis based on a given network topology or by detailed network simulation, which is often time-consuming. To provide a dynamic and scalable measure of communication performance, we first propose an aggregate multi-stage queueing network model that captures the application's communication load, and we derive closed-form expressions for system performance, namely throughput and delay. Trace simulation results obtained from a sophisticated simulator, Venus, show that the proposed model is accurate yet simple. Second, we develop a link bandwidth optimization framework that allocates link bandwidth across the network so as to maximize the system communication throughput. Specifically, we apply the derived optimal bandwidth allocation to dimensioning the link bandwidth of an exploratory direct network and to slimming a fat-tree network. Our results show that the proposed methodology is cost-effective for evaluating system performance and exploring designs of existing and next-generation network systems. © 2009 IEEE.
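To give a concrete feel for what allocating link bandwidth under a fixed budget can look like, the sketch below uses the classical square-root capacity assignment for independent M/M/1 links. This is an illustrative stand-in and not the model or objective of the paper: the abstract does not state the closed-form expressions, and the paper maximizes throughput, whereas this textbook rule minimizes a delay cost. The function names, the load values, and the bandwidth budget are all hypothetical.

```python
import math


def sqrt_allocation(loads, total_bandwidth):
    """Square-root capacity assignment (illustrative assumption, not the paper's method).

    loads: offered load per link, in the same units as bandwidth (all > 0).
    Minimizes sum(a_i / (c_i - a_i)) subject to sum(c_i) == total_bandwidth.
    """
    if any(a <= 0 for a in loads):
        raise ValueError("every link load must be positive")
    offered = sum(loads)
    if offered >= total_bandwidth:
        raise ValueError("bandwidth budget must exceed the total offered load")
    spare = total_bandwidth - offered
    root_sum = sum(math.sqrt(a) for a in loads)
    # Each link gets its offered load plus a share of the spare bandwidth
    # proportional to the square root of that load.
    return [a + spare * math.sqrt(a) / root_sum for a in loads]


def delay_cost(loads, bandwidths):
    """Sum of M/M/1 mean queue lengths a_i / (c_i - a_i), proportional to mean delay."""
    return sum(a / (c - a) for a, c in zip(loads, bandwidths))


# Hypothetical three-link example.
loads = [1.0, 4.0, 9.0]
total = 20.0
alloc = sqrt_allocation(loads, total)
proportional = [a * total / sum(loads) for a in loads]
print(alloc)
print(delay_cost(loads, alloc), delay_cost(loads, proportional))
```

In this toy setting the square-root rule yields a lower delay cost than splitting the same budget in proportion to load, which illustrates why an optimized allocation can outperform a naive one at equal total bandwidth.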