Design and evaluation of gracefully degradable disk arrays
Abstract
The availability of inexpensive, small magnetic disks has made it possible to build a reliable, high-performance disk system by integrating a number of such disks in an array. To achieve high reliability in such systems, equivalent to that of larger disks, parity or other error-correcting codes may be used. In systems where data availability is critical, dual copy methods have traditionally been used. Recently, some parity-based schemes have been proposed for providing fault tolerance with much less hardware. However, these new techniques do not provide good performance under a failure, because the workload on the functional disks increases while a disk in the array is failed. The dual copy methods degrade much more gracefully than the new techniques. In this paper, we propose a new technique for making disk arrays fault-tolerant which combines the advantages of both the parity schemes and the dual copy methods. The proposed technique offers a wide variety of options in providing fault tolerance, with dual copy methods and single parity schemes being the two extreme cases. We present results from simulations to show that the proposed technique offers better performance during all phases of operation: in normal operation, during a failure, and while reconstructing data on a failed disk. We also show that the proposed scheme allows faster reconstruction of data on the failed disk and thereby improves data availability. © 1993 Academic Press, Inc.