Performance and power evaluation of an in-line accelerator
Alejandro Rico, Jeff H. Derby, et al.
CF 2010
A 440 000-transistor second-generation RISC floating-point chip is described. The pipeline latency is only two cycles, and a double-precision result is produced every cycle. System throughput and accuracy are increased by using a floating-point multiply-add-fused (MAF) unit, which carries out a double-precision accumulate D = (A × B) + C as a two-cycle pipelined execution with only one rounding error. While the cycle time (40 ns) is competitive with other CMOS RISC systems, the floating-point performance stretches to the range of bipolar RISC systems (7.4-13 MFLOPS LINPACK). © 1990 IEEE
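The "only one rounding error" property of a fused multiply-add can be illustrated in software. The sketch below is not the chip's implementation: it emulates a fused unit by computing (A × B) + C exactly with rational arithmetic and rounding once to double precision, then compares that with a separate multiply followed by an add. The function names and operand values are hypothetical and chosen for illustration. The `peak_mflops` helper assumes that one multiply-add per cycle counts as two floating-point operations, an assumption the abstract does not state.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    """Multiply and round, then add and round again (two roundings)."""
    return a * b + c


def fused_multiply_add(a: float, b: float, c: float) -> float:
    """Emulate a fused unit: form a*b + c exactly, then round once."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)  # no rounding here
    return float(exact)  # single rounding to double precision


def peak_mflops(cycle_ns: float, flops_per_cycle: int = 2) -> float:
    """Peak rate, assuming one multiply-add (2 flops) issues every cycle."""
    return flops_per_cycle * 1000.0 / cycle_ns


# Illustrative operands: the exact product is 1 - 2**-60, which is not
# representable in double precision and rounds to 1.0 when stored.
a = 1.0 + 2.0**-30
b = 1.0 - 2.0**-30
c = -1.0

print(unfused_multiply_add(a, b, c))  # 0.0: the 2**-60 term is lost
print(fused_multiply_add(a, b, c))    # -2**-60: the term survives
print(peak_mflops(40.0))              # 50.0 under the stated assumption
```

With these operands the separate multiply discards the low-order term before the add, so the result cancels to zero, while the single-rounding path returns the exact answer. Under the two-flops-per-cycle assumption, the 40 ns cycle time corresponds to a 50 MFLOPS peak, which puts the quoted 7.4-13 MFLOPS LINPACK figure in context as sustained rather than peak performance.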