Conference paper
Distributed systems diagnosis using belief propagation
Abstract
In this paper, we focus on diagnosis in distributed computer systems using end-to-end transactions, or probes. The diagnostic problem is formulated as probabilistic inference in a bipartite noisy-OR Bayesian network. Because exact inference in such networks is intractable in general, we apply belief propagation (BP), a popular approximation technique that has proven successful in various applications, from image analysis to probabilistic decoding. Another attractive property of BP for our application is its natural parallelism, which allows a distributed implementation of diagnosis in a distributed system and thereby improves diagnostic speed and robustness. We derive lower bounds on the diagnostic error in bipartite Bayesian networks, and particularly in noisy-OR networks, and provide promising empirical results for belief propagation on both randomly generated and realistic noisy-OR problems.
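To make the problem formulation concrete, the following is a minimal sketch of a bipartite noisy-OR model in which hidden node states (faulty or healthy) explain observed probe outcomes. It is not the paper's implementation: it computes posterior fault marginals by exact enumeration, which is feasible only for a handful of nodes, and is meant as the reference quantity that BP would approximate on larger networks. The function names, the per-node inhibition probabilities `q`, and the `leak` parameter are all assumptions introduced for illustration.

```python
from itertools import product


def probe_fail_prob(states, probe_nodes, q, leak=0.0):
    """Noisy-OR probability that a probe fails given the node states.

    states[i] is 1 if node i is faulty, else 0.
    q[i] is the (assumed) probability that a faulty node i does NOT
    cause the probe to fail; leak is the (assumed) probability of a
    failure with no faulty node on the probe's path.
    """
    p_ok = 1.0 - leak
    for i in probe_nodes:
        if states[i]:
            p_ok *= q[i]
    return 1.0 - p_ok


def posterior_fault_marginals(priors, probes, outcomes, q, leak=0.0):
    """Exact P(node i faulty | probe outcomes) by enumerating all states.

    priors[i] is the prior fault probability of node i, probes[j] lists
    the nodes that probe j passes through, and outcomes[j] is True if
    probe j failed. Cost is O(2^n), so this is for tiny examples only.
    """
    n = len(priors)
    marginals = [0.0] * n
    evidence = 0.0
    for states in product((0, 1), repeat=n):
        # Prior weight of this joint assignment of node states.
        weight = 1.0
        for i, s in enumerate(states):
            weight *= priors[i] if s else 1.0 - priors[i]
        # Likelihood of the observed probe outcomes under noisy-OR.
        for nodes, failed in zip(probes, outcomes):
            p_fail = probe_fail_prob(states, nodes, q, leak)
            weight *= p_fail if failed else 1.0 - p_fail
        evidence += weight
        for i, s in enumerate(states):
            if s:
                marginals[i] += weight
    if evidence == 0.0:
        raise ValueError("observed outcomes have zero probability")
    return [m / evidence for m in marginals]


# Three nodes, two probes: probe 0 covers nodes 0 and 1 and failed,
# probe 1 covers nodes 1 and 2 and succeeded.
posterior = posterior_fault_marginals(
    priors=[0.1, 0.1, 0.1],
    probes=[(0, 1), (1, 2)],
    outcomes=[True, False],
    q=[0.0, 0.0, 0.0],  # noise-free: any faulty node fails its probes
)
```

In this noise-free example the successful probe clears nodes 1 and 2, so the failed probe can only be explained by node 0. With nonzero `q` or `leak`, the posterior spreads across several candidate nodes, which is where approximate inference becomes relevant as the network grows.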