About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
PODC 1995
Conference paper
Fault-local distributed mending
Abstract
As communication networks grow, existing fault handling tools that involve global measures such as global time-outs or reset procedures become increasingly unaffordable, since their cost grows with the size of the network. Rather, for a fault handling mechanism to scale to large networks, its cost must depend only on the number of failed nodes (which, thanks to today's technology, grows much slower than the networks). Moreover, it should allow the non-faulty regions of the networks to continue their operation even during the recovery of the faulty parts. This abstract introduces the concepts fault locality, and of fault-locally mendable problems, which are problems for which there exist correction algorithms (applied after faults) whose cost depends only on the (unknown) number of faults. We show that any problem is fault locally mendable. The solution involves a novel technique combining data structures and 'local votes' among nodes, that may be of interest in itself.