Experience porting Mach to the RP3 large-scale shared-memory multiprocessor
Abstract
The Research Parallel Processing Prototype (RP3) is a research vehicle developed at the IBM T.J. Watson Research Center to explore the hardware and software aspects of highly parallel computation. The RP3 is a shared-memory machine designed to scale to 512 processors; a 64-processor machine has been in operation since October 1988. A parallel programming environment based on Mach has been developed, and a variety of programming models have been tested on the machine. The Mach kernel has been extended to support a rich set of software-controllable architectural features, such as non-coherent caches, local and interleaved storage, and performance monitors. This paper describes the experience of porting Mach to the RP3, focusing both on the performance tuning process and on exploiting the RP3 architecture. Performance was significantly improved by concentrating on kernel activities, such as spin locking and busy-wait synchronization, that have global performance impact. However, identifying the real sources of congestion was often more difficult than providing solutions. © 1992.