IET Computers and Digital Techniques

On the trade-off of mixing scientific applications on capacity high-performance computing systems

Network contention is seen as a major hurdle to achieving higher throughput in today's large-scale high-performance computing systems, even more so given the current trend of employing blocking networks, driven by the need to reduce cost. The effect is further aggravated by current system schedulers, which allocate jobs as soon as nodes become available and thus produce job fragmentation: the tasks of one job may be spread throughout the system instead of being allocated contiguously. This fragmentation increases the probability of sharing network resources with other applications, which in turn produces higher inter-application network contention. In this study, the authors perform a broad analysis of diverse applications' performance variability due to topology connectivity and fragmentation, and classify applications according to their sensitivity to these two factors. Having established the inherent characteristics of the applications, the authors then analyse their performance in a shared environment, that is, when mixed with other applications. They show that inter-application contention can be a significant source of degradation even in networks with high connectivity. Their results suggest several task allocation policies: grouping sensitive and insensitive applications, reducing the number of applications sharing the first-level switch, or isolating sensitive applications. © The Institution of Engineering and Technology 2013.