Publication
NAS 2008
Conference paper
Reliability assurance of RAID storage systems for a wide range of latent sector errors
Abstract
Low-cost disk drives, which are increasingly being adopted in today's data storage systems, offer higher capacity but lower reliability, which leads to more frequent rebuilds and to a higher risk of unrecoverable or latent media errors. An intra-disk redundancy scheme has been proposed to cope with such errors and enhance the reliability of RAID systems. Empirical field results recently reported in the literature, however, suggest that unrecoverable media errors occur more often than the data sheet specifications provided by the disk manufacturers indicate. Our results demonstrate that this increase in the number of unrecoverable errors adversely affects the reliability improvement due to intra-disk redundancy. We demonstrate that, by revising the parameter choice of the intra-disk redundancy scheme, we can obtain essentially the same reliability as that of a system operating without unrecoverable sector errors. The I/O and throughput performance are evaluated by means of analysis and event-driven simulations. The effects of the spatial locality of errors and of the error-burst length distribution on the system reliability are also investigated. © 2008 IEEE.