Publication
PODC 1993
Conference paper
Unifying self-stabilization and fault-tolerance
Abstract
In this paper we combine two previously disparate aspects of reliable distributed computing: self-stabilization, i.e., tolerance of systemic failures, and fault-tolerance, i.e., tolerance of process failures. We define what it means for a protocol to solve a problem while tolerating both types of failures, and we demonstrate a "compiler" that transforms a protocol tolerating process failures in a synchronous system into one tolerating both process and systemic failures. For asynchronous systems, we present a protocol that solves a crucial problem (Consensus) while tolerating both process and systemic failures.