About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
SRDS 2006
Conference paper
Improvements and reconsideration of distributed snapshot protocols
Abstract
Distributed snapshots are an important building block for distributed systems, and, among other applications, are useful for constructing efficient checkpointing protocols. In addition to the imposed overhead of the existing distributed snapshot protocols, those protocols are not trivially applicable (if at all) in many of today's distributed systems, e.g., grid, mobile, and sensors systems. After presenting the shortages and the inapplicability of the most popular existing distributed snapshot protocols, this paper discusses improvement directions for the protocols. In addition, it presents a new and an important improvement for the most popular distributed snapshot protocol, which was presented by Chandy and Lamport in 1985. Although the proposed improvement is simple and easy to implement, it has significant benefits in reducing the software and hardware overheads of distributed snapshots. Then, the paper presents proofs for the safety and progress of the new protocol. Lastly, it presents a performance analysis of the protocol using stochastic models. © 2006 IEEE.