Publication
SBAC-PAD 2016
Conference paper
Speeding Up Stencil Computations with Kernel Convolution
Abstract
A technique to speed up stencil computation is introduced. Computation and data reuse schemes are developed for its application to 1- and 3-dimensional stencils. The approach traverses the data domain fewer times than a state-of-the-art, straightforward iterative stencil implementation would. Performance results are shown for a variety of platforms, exemplifying how it can be straightforwardly applied with existing techniques and frameworks. The technique, named Aggregate Stencil-Loop Iteration (ASLI), works by applying a stencil obtained by convolving the original stencil operator with itself one or more times. This more complex operator creates new opportunities for in-register data reuse and increases the FLOPs-to-load ratio. The total number of FLOPs decreases for 1D but increases for 2D and 3D star-shaped stencils. In both scenarios, speed-up relative to the state-of-the-art is achieved. ASLI is relatively easy to implement and works synergistically with existing methods to optimize stencil computations.
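The core idea in the abstract, replacing several sweeps of a stencil with one sweep of the self-convolved stencil, can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the 3-point weights, the periodic boundaries, the helper names (`apply_stencil`, `self_convolve_2d`, `naive_flops`), and the naive FLOP count (one multiply per point plus the additions, ignoring coefficient symmetry) are all assumptions chosen for illustration.

```python
import numpy as np

def apply_stencil(u, w):
    """One sweep of a 1D stencil with odd-length weights w (periodic boundaries assumed)."""
    r = len(w) // 2
    out = np.zeros_like(u)
    for k, wk in enumerate(w):
        # np.roll(u, r - k)[i] == u[i + k - r], so weight k hits offset k - r
        out += wk * np.roll(u, r - k)
    return out

def self_convolve_2d(s):
    """Convolve a square 2D stencil with itself (full convolution)."""
    n = s.shape[0]
    out = np.zeros((2 * n - 1, 2 * n - 1))
    for i in range(n):
        for j in range(n):
            out[i:i + n, j:j + n] += s[i, j] * s
    return out

def naive_flops(n_points):
    """Naive per-output cost: n_points multiplies plus n_points - 1 additions."""
    return 2 * n_points - 1

# Illustrative 3-point stencil (weights are an assumption, not from the paper).
w = np.array([0.25, 0.5, 0.25])
w2 = np.convolve(w, w)          # aggregated 5-point stencil

rng = np.random.default_rng(0)
u = rng.standard_normal(64)

twice = apply_stencil(apply_stencil(u, w), w)   # two traversals of the domain
once = apply_stencil(u, w2)                     # one traversal, wider stencil

# 1D: two 3-point sweeps versus one 5-point sweep.
flops_1d_iter = 2 * naive_flops(len(w))
flops_1d_agg = naive_flops(len(w2))

# 2D 5-point star (discrete Laplacian, used here only as an example shape).
star = np.array([[0.0, 1.0, 0.0],
                 [1.0, -4.0, 1.0],
                 [0.0, 1.0, 0.0]])
star2 = self_convolve_2d(star)
flops_2d_iter = 2 * naive_flops(np.count_nonzero(star))
flops_2d_agg = naive_flops(np.count_nonzero(star2))

print(np.allclose(twice, once))
print(flops_1d_iter, flops_1d_agg)
print(flops_2d_iter, flops_2d_agg)
```

Under this naive count, the aggregated 1D stencil needs fewer FLOPs per output point than two separate sweeps, while the aggregated 2D star needs more, which matches the trend the abstract describes. In both cases the data is traversed once instead of twice.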