Publication
PDCS 2004
Conference paper
Using active disks for failure detection: Two phase commit without blocking
Abstract
Recent advances in network attached disk technology have inspired a host of research on distributed storage systems [1, 2, 3, 4]. Naturally, part of the appeal of such systems is the opportunity they afford for widely replicated data; however, with wide data redundancy comes a host of consistency issues. This paper addresses the problem of writing concurrently to multiple network attached devices with a two-phase commit write protocol. Most work in this area proposes using three-phase commit protocols to avoid blocking [5, 6, 2]. We introduce a novel reconciliation protocol, managed by the storage devices themselves, to resolve a blocked transaction should one occur. In our system, the set of shared disks implementing a replicated object jointly coordinates access to that object. This approach allows shorter access times in the common case where clients and storage devices do not fail, reverting to a separate procedure to resolve blocking and maintain data consistency only when failures occur.
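The abstract does not spell out the reconciliation protocol, so the following Python sketch is only an illustration of the general idea, not the paper's method. It assumes a client acting as the two-phase commit coordinator, a hypothetical `Disk` class for each replica, and a simple assumed resolution rule: if any disk has already committed, the remaining prepared disks commit; otherwise they abort. It also assumes every disk is reachable during reconciliation. All names (`Disk`, `two_phase_write`, `reconcile`, `crash_after`) are invented for this example.

```python
from enum import Enum


class State(Enum):
    INIT = "init"
    PREPARED = "prepared"
    COMMITTED = "committed"
    ABORTED = "aborted"


class Disk:
    """Hypothetical storage device holding one replica of an object."""

    def __init__(self, name):
        self.name = name
        self.value = None      # last committed value
        self.pending = None    # value staged during phase 1
        self.state = State.INIT

    def prepare(self, value):
        # Phase 1: stage the write and vote yes.
        self.pending = value
        self.state = State.PREPARED
        return True

    def commit(self):
        # Phase 2: make the staged value visible.
        self.value = self.pending
        self.pending = None
        self.state = State.COMMITTED

    def abort(self):
        # Discard the staged value.
        self.pending = None
        self.state = State.ABORTED


def two_phase_write(disks, value, crash_after=None):
    """Client-driven two-phase commit.

    crash_after (an assumption for this demo) simulates the client
    failing after sending that many commit messages.
    """
    if not all(d.prepare(value) for d in disks):
        for d in disks:
            d.abort()
        return False
    for i, d in enumerate(disks):
        if crash_after is not None and i >= crash_after:
            return False  # client crashed; remaining disks stay blocked
        d.commit()
    return True


def reconcile(disks):
    """Disk-side resolution of a blocked transaction (assumed rule)."""
    committed_somewhere = any(d.state is State.COMMITTED for d in disks)
    for d in disks:
        if d.state is State.PREPARED:
            if committed_somewhere:
                d.commit()   # the client had already decided to commit
            else:
                d.abort()    # no disk saw a commit, so aborting is safe here
```

In the failure-free case `two_phase_write` completes in two rounds and `reconcile` is never called, which mirrors the abstract's point that the extra procedure is paid for only when a failure leaves disks blocked in the prepared state.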