Mixed speculative multithreaded execution models

Polychronis Xekalakis; Nikolas Ioannou; Marcelo Cintra

doi:10.1145/2355585.2355591

Transactions on Architecture and Code Optimization

05 Oct 2012

Abstract

The current trend toward multicore architectures has placed great pressure on programmers and compilers to generate thread-parallel programs. Improved execution performance can no longer be obtained via traditional single-thread instruction-level parallelism (ILP) but must instead come from multithreaded execution. One notable technique that facilitates the extraction of parallel threads from sequential applications is thread-level speculation (TLS). This technique allows programmers and compilers to generate threads without checking for inter-thread data and control dependences, which are then transparently enforced by the hardware. Most prior work on TLS has concentrated on thread selection and on mechanisms that efficiently support the main TLS operations, such as squashes, data versioning, and commits. This article seeks to enhance TLS functionality by combining it with other speculative multithreaded execution models. The main idea is that TLS already requires extensive hardware support, which, when slightly augmented, can accommodate other speculative multithreaded techniques. Because bottlenecks differ across applications, and even across phases of a single program, it is reasonable to expect that a more versatile system will execute a given program more efficiently. To this end, we first show that mixed execution models that combine TLS with Helper Threads (HT), RunAhead execution (RA), and MultiPath execution (MP) perform better than any of the models alone. Using a simple model that we propose, we show that the benefits come from being able to extract additional ILP without harming the thread-level parallelism (TLP) extracted by TLS. We then show that by combining all of these speculative multithreaded models into a single unified execution model, ILP can be further enhanced with only minimal additional hardware cost. © 2012 ACM.
