Conference paper
Scalable instruction-level parallelism through tree-instructions
Abstract
We describe a representation of instruction-level parallelism which does not require checking dependencies at run-time, and which is suitable for processor implementations with varying issue width. In this approach, a program is represented as a sequence of tree-instructions, each containing multiple primitive operations. These tree-instructions are executable in either one or multiple cycles, do not require a specific processor organization, and are generated assuming an implementation capable of large instruction-level parallelism. During instruction cache reloading/accessing, tree-instructions are decomposed into subtrees which fit the actual resources available in an implementation. The resulting subtrees require only a simple instruction-dispatch mechanism, as in the case of statically scheduled processors. The representation makes it practical to use the same parallelized code in implementations with different issue widths; simulation results indicate that the instruction-level parallelism achievable with this approach degrades by less than 10% with respect to code compiled for each specific implementation.
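The decomposition step described in the abstract can be illustrated with a small sketch. It is not the paper's mechanism: the `Node` model, the `decompose` and `count_ops` helpers, and the greedy depth-first cut policy are all assumptions made for illustration. The sketch models a tree-instruction as a tree whose nodes carry primitive operations, treats the issue width as a plain operation count, and ignores branch conditions, resource classes, and the fact that the real decomposition happens in hardware at instruction cache reload time.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    """Hypothetical model: primitive operations on one path segment of a tree-instruction."""
    ops: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def count_ops(node: Node) -> int:
    """Total number of primitive operations in the tree rooted at `node`."""
    return len(node.ops) + sum(count_ops(c) for c in node.children)


def _take(node: Node, budget: int, pending: List[Node]) -> Tuple[Node, int]:
    """Copy as much of `node` as fits in `budget`; queue the rest in `pending`."""
    n = min(len(node.ops), budget)
    kept = Node(ops=node.ops[:n])
    budget -= n
    if n < len(node.ops):
        # Leftover operations and everything below them go to a later subtree.
        pending.append(Node(node.ops[n:], node.children))
        return kept, budget
    for child in node.children:
        if budget == 0 and count_ops(child) > 0:
            # No room left: this whole path is deferred to a later subtree.
            pending.append(child)
        else:
            kept_child, budget = _take(child, budget, pending)
            kept.children.append(kept_child)
    return kept, budget


def decompose(root: Node, width: int) -> List[Node]:
    """Split a tree into subtrees that each hold at most `width` operations."""
    if width < 1:
        raise ValueError("width must be at least 1")
    subtrees: List[Node] = []
    pending: List[Node] = [root]
    while pending:
        kept, _ = _take(pending.pop(0), width, pending)
        subtrees.append(kept)
    return subtrees


# Example tree with 8 primitive operations spread over three paths.
example = Node(["a", "b"], [
    Node(["c", "d", "e"]),
    Node(["f"], [Node(["g"]), Node(["h"])]),
])

if __name__ == "__main__":
    for width in (8, 4, 1):
        print(width, [count_ops(t) for t in decompose(example, width)])
```

Under these assumptions, the same `example` tree stays whole at width 8, splits into subtrees of 4, 1, and 3 operations at width 4, and becomes eight single-operation subtrees at width 1. This mirrors the abstract's point that one representation can be fitted to implementations of different issue widths.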