Publication
GCC and GNU Toolchain Developers' Summit 2007
Conference paper
Loop-aware SLP in GCC
Abstract
The GCC auto-vectorizer currently exploits data parallelism only across iterations of innermost loops. Two other important sources of data parallelism are across iterations of outer loops and in straight-line code. We recently embarked upon extending the scope of auto-vectorization opportunities beyond inner-loop inter-iteration parallelism, in these two directions. This paper describes the latter effort, which will allow GCC to vectorize unrolled loops, structure accesses in loops, and important computations like FFT and IDCT (as well as several test cases in missed-optimization PRs). Industry compilers like icc and xlC already support SLP-like vectorization, each in a different way. We introduce a new approach to SLP vectorization in loops that leverages our analysis of adjacent memory references, originally developed for vectorizing strided accesses. We extend the current loop-based vectorization framework to also look for parallelism within a single iteration, yielding a hybrid vectorization framework. This work also opens additional interesting opportunities for enhancing the vectorizer, including partial vectorization (currently it is an "all or nothing" approach), permutations, and MIMD (Multiple Instruction Multiple Data, as in the subadd vector operations of SSE3 and BlueGene). We describe how SLP-like vectorization can be incorporated into the current vectorization framework.