Abstract
The growing complexity of the hardware and software in modern computing systems, together with semiconductor technology scaling, is increasing the likelihood of Silent Data Corruption (SDC). SDC occurs when incorrect data is delivered to the user, e.g., written to memory or the I/O system, and no error is triggered. Such events may have catastrophic effects in life-critical applications and represent a significant cost penalty for businesses. The purpose of this panel is to provide real examples of silent data corruption and to discuss solutions for avoiding it. The presentations address SDC generated at the semiconductor device level as well as at the virtualization software level. Techniques for reducing SDC, from the circuit level to the system level, will be covered. Results of an extensive SDC study carried out at Los Alamos National Laboratory (LANL) on high-performance computing (HPC) platforms are also given. © 2008 IEEE.