Hardware thread-level speculation performance analysis
Abstract
This paper presents performance analysis for hardware Thread-Level Speculation (TLS) in the IBM Blue Gene/Q computer. Unlike traditional multi-thread programming model which uses lock to ensure the consistency of shared data, TLS is a harware mechanism to detect and resolve memory access conflicts among threads. The model shows good performance prediction, as verified by the experiments. This study helps to understand potential gains from using special purpose TLS hardware to accelerate the performance of codes that, in a strict sense, require serial processing to avoid memory conflicts. Furthermore, based on analysis and measurements of the TLS behavior and its overhead together with OpenMP comparison, a strategy is proposed to help utilize this hardware feature. The results also suggest potential improvement for the future TLS architectural designs.