Reliable Fault Diagnosis with Few Tests

Andrzej Pelc; Eli Upfal

doi:10.1017/S0963548398003563

Combinatorics, Probability and Computing

Paper

01 Jan 1998

Abstract

We consider the problem of fault diagnosis in multiprocessor systems. Processors perform tests on one another: fault-free testers correctly identify the fault status of tested processors, while faulty testers can give arbitrary test results. Processors fail independently with constant probability p < 1/2 and the goal is to identify correctly the status of all processors, based on the set of test results. For 0 < q < 1, q-diagnosis is a fault diagnosis algorithm whose probability of error does not exceed q. We show that the minimum number of tests to perform q-diagnosis for n processors is Θ(n log(1/q)) in the nonadaptive case and n + Θ(log(1/q)) in the adaptive case. We also investigate q-diagnosis algorithms that minimize the maximum number of tests performed by, and performed on, processors in the system, constructing testing schemes in which each processor is involved in very few tests. Our results demonstrate that the flexibility yielded by adaptive testing permits a significant saving in the number of tests for the same reliability of diagnosis.
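The fault model in the abstract can be illustrated with a small simulation. The sketch below is not the authors' testing scheme: it is a naive nonadaptive baseline in which each processor is tested by the next k processors in cyclic order and diagnosed by majority vote. All function names are hypothetical, and the "arbitrary" answers of faulty testers are modeled as fair coin flips, which is an assumption (a worst-case adversary could behave differently).

```python
import random


def sample_faults(n, p, rng):
    """Mark each of n processors faulty independently with probability p."""
    return [rng.random() < p for _ in range(n)]


def run_test(tester, tested, faulty, rng):
    """Return True if `tester` reports `tested` as faulty.

    Fault-free testers answer correctly. Faulty testers may answer
    arbitrarily; here that is modeled as a coin flip (assumption).
    """
    if faulty[tester]:
        return rng.random() < 0.5
    return faulty[tested]


def majority_diagnosis(n, k, faulty, rng):
    """Naive nonadaptive scheme using n * k tests in total (requires k < n).

    Processor i is tested by processors i+1, ..., i+k (mod n) and is
    declared faulty when a strict majority of its testers say so.
    """
    diagnosis = []
    for i in range(n):
        testers = [(i + j) % n for j in range(1, k + 1)]
        votes = sum(run_test(t, i, faulty, rng) for t in testers)
        diagnosis.append(2 * votes > k)
    return diagnosis


def estimate_error(n, p, k, trials, seed=0):
    """Empirical probability that at least one processor is misdiagnosed."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        faulty = sample_faults(n, p, rng)
        if majority_diagnosis(n, k, faulty, rng) != faulty:
            errors += 1
    return errors / trials
```

Calling, for example, `estimate_error(50, 0.1, 7, 1000)` gives an empirical error rate that can be compared against a target q. This baseline spends k tests on every processor, so it says nothing about the optimal bounds stated above; it only makes the roles of p, q, and the number of tests concrete.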
