IBM second-generation RISC machine organization
Abstract
A highly concurrent second-generation superscalar reduced-instruction-set computer (RISC) is described. It combines a powerful RISC architecture with sophisticated hardware design techniques to achieve a short cycle time and a low cycles-per-instruction (CPI) ratio. Like earlier RISC processors, this design employs a register-oriented instruction set, a CPU that is hardwired rather than microcoded, and a pipelined implementation. Unlike earlier RISC processors, it also employs several advanced architectural and implementation features, including separate instruction and data caches, zero-cycle branches, multiple-instruction dispatch, and simultaneous execution of fixed- and floating-point instructions. The CPU has a four-word data bus to main memory, a four-word instruction-fetch bus from the I-cache arrays, and a two-word data bus between the D-cache and the floating-point unit. These provide the high instruction and data bandwidths required for a high-performance superscalar implementation. In a single cycle, four instructions can be executed simultaneously (a branch, a condition-register instruction, a fixed-point instruction, and a floating-point instruction). The floating-point unit has a multiply-add instruction that executes with the same delay as a multiply or an add. With multiply-add counted as two instructions, this yields a peak execution rate of five instructions per cycle.
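The peak-rate arithmetic stated above can be sketched in a few lines. This is a minimal illustration rather than a model of the actual machine: the four instruction classes and the rule of counting multiply-add as two come from the text, while the names `PEAK_ISSUE_PER_CYCLE`, `peak_rate`, and `best_case_cpi` are hypothetical and introduced only for this example.

```python
# Illustrative sketch of the peak-rate arithmetic; all identifiers are
# hypothetical and not taken from the described design.

# One instruction of each class can execute in the same cycle (from the text).
PEAK_ISSUE_PER_CYCLE = {
    "branch": 1,
    "condition_register": 1,
    "fixed_point": 1,
    "floating_point": 1,
}


def peak_rate(count_multiply_add_as_two: bool) -> int:
    """Return peak instructions per cycle under the chosen counting rule."""
    rate = sum(PEAK_ISSUE_PER_CYCLE.values())
    if count_multiply_add_as_two:
        # A multiply-add performs a multiply and an add with the delay of
        # one operation, so the floating-point slot is credited twice.
        rate += 1
    return rate


def best_case_cpi(rate: int) -> float:
    """Best-case cycles per instruction is the reciprocal of the peak rate."""
    return 1.0 / rate


if __name__ == "__main__":
    for as_two in (False, True):
        r = peak_rate(as_two)
        print(f"multiply-add counted as two: {as_two}, "
              f"peak rate: {r}, best-case CPI: {best_case_cpi(r)}")
```

These are upper bounds only: sustained rates depend on instruction mix, dependencies, and cache behavior, none of which this sketch models.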