Verification strategy for the Blue Gene/L chip
Abstract
The Blue Gene®/L compute chip contains two PowerPC® 440 processor cores, private L2 prefetch caches, a shared L3 cache and double-data-rate synchronous dynamic random access memory (DDR SDRAM) memory controller, a collective network interface, a torus network interface, a physical network interface, an interrupt controller, and a bridge interface to slower devices. System-on-a-chip verification problems require a multilevel verification strategy in which the strengths of each layer offset the weaknesses of another layer. The verification strategy we adopted relies on the combined strengths of random simulation, directed simulation, and code-driven simulation at the unit and system levels. The strengths and weaknesses of the various techniques and our reasons for choosing them are discussed. The verification platform is based on event simulation and cycle simulation running on a farm of Intel-processor-based machines, several PowerPC-processor-based machines, and the internally developed hardware accelerator Awan. The cost/performance tradeoffs of the different platforms are analyzed. The success of the first Blue Gene/L nodes, which worked within days of receiving them and had only a small number of undetected bugs (none fatal), reflects both careful design and a comprehensive verification strategy. © Copyright 2005 by International Business Machines Corporation.