Dheeraj Sreedhar, J.H. Derby, et al.
HiPC 2014
A 440 000-transistor second-generation RISC floating-point chip is described. The pipeline latency is only two cycles, and a double-precision result is produced every cycle. System throughput and accuracy are increased by using a floating-point multiply-add-fused (MAF) unit, which carries out a double-precision accumulate D = (A × B) + C as a two-cycle pipelined execution with only one rounding error. While the cycle time (40 ns) is competitive with other CMOS RISC systems, the floating-point performance stretches to the range of bipolar RISC systems (7.4-13 MFLOPS LINPACK). © 1990 IEEE
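The accuracy benefit of rounding once can be illustrated in software. The sketch below is not the chip's datapath: it emulates a fused multiply-add by computing A × B + C exactly with Python's `fractions.Fraction` and rounding a single time, then compares that with the ordinary two-rounding sequence. The function names and the operand values are assumptions chosen for illustration.

```python
from fractions import Fraction


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate D = (A x B) + C with a single rounding.

    The product and sum are formed exactly as rationals; the only
    rounding happens in the final conversion back to a double.
    """
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Separate multiply and add: the product is rounded, then the sum."""
    return a * b + c


# Hypothetical operands: the exact product is 1 - 2**-60, which is not
# representable in double precision and rounds to 1.0.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(unfused_multiply_add(a, b, c))  # 0.0: the low-order bits were lost
print(fused_multiply_add(a, b, c) == -(2.0**-60))  # True
```

With separate operations the intermediate product is rounded to 1.0, so the final result collapses to zero; with a single rounding the small residual survives.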
Leland Chang, Robert K. Montoye, et al.
IEEE Journal of Solid-State Circuits
Alejandro Rico, Jeff H. Derby, et al.
CF 2010
Leland Chang, Yutaka Nakamura, et al.
VLSI Circuits 2007