Publication
Communications of the ACM
Paper
The Bulk Multicore architecture for improved programmability
Abstract
A novel, general-purpose multicore architecture called the Bulk Multicore was designed to enable a highly programmable environment. Novel support for scalable hardware cache coherence relieved the programmer and runtime system of having to manage the sharing of data. The Bulk Multicore provided high-performance sequential memory consistency to the software and introduced several novel hardware primitives to help minimize the chance of parallel-programming errors. These primitives, which included low-overhead data-race detection, deterministic replay of parallel programs, and high-speed disambiguation of sets of addresses, were to be used to build an advanced program-development-and-debugging environment. A key idea in the Bulk Multicore was that the hardware automatically executed all software as a series of atomic blocks, called Chunks, each made up of a large number of dynamic instructions.
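As a rough illustration of the Chunk idea, the following Python sketch groups a stream of dynamic memory accesses into fixed-size chunks and checks whether the address sets of two chunks conflict. It is a software toy under stated assumptions, not the hardware mechanism: the names `Access`, `Chunk`, `make_chunks`, and `conflicts`, the fixed chunk size, and the use of exact Python sets for address disambiguation are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Access:
    """One dynamic memory access (hypothetical model, not a hardware format)."""
    address: int
    is_write: bool


@dataclass
class Chunk:
    """A block of dynamic accesses treated as a single atomic unit."""
    reads: Set[int] = field(default_factory=set)
    writes: Set[int] = field(default_factory=set)
    size: int = 0

    def record(self, access: Access) -> None:
        # Track the address in the read set or the write set of this chunk.
        if access.is_write:
            self.writes.add(access.address)
        else:
            self.reads.add(access.address)
        self.size += 1


def make_chunks(stream: Iterable[Access], chunk_size: int) -> List[Chunk]:
    """Split a dynamic access stream into consecutive chunks.

    chunk_size is an assumed parameter; the abstract only says that a chunk
    holds a large number of dynamic instructions.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    chunks: List[Chunk] = []
    current = Chunk()
    for access in stream:
        current.record(access)
        if current.size == chunk_size:
            chunks.append(current)
            current = Chunk()
    if current.size:
        # Keep the final, possibly shorter, chunk.
        chunks.append(current)
    return chunks


def conflicts(committing: Chunk, other: Chunk) -> bool:
    """Return True if the committing chunk wrote an address the other chunk
    read or wrote, so the two cannot both appear to execute atomically."""
    return bool(committing.writes & (other.reads | other.writes))


if __name__ == "__main__":
    core_a = [Access(0x10, True), Access(0x20, False), Access(0x30, True)]
    core_b = [Access(0x20, False), Access(0x10, False)]
    chunks_a = make_chunks(core_a, chunk_size=2)
    chunks_b = make_chunks(core_b, chunk_size=2)
    for index, chunk in enumerate(chunks_a):
        print(index, conflicts(chunk, chunks_b[0]))
```

Comparing whole address sets per chunk, rather than checking each access individually, is the design point the sketch tries to convey; a real implementation would need a far more compact and faster set representation than Python sets.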