Publication
IM 2011
Conference paper
Elimination based fault localization in shared resource environments
Abstract
Fault localization is the process of identifying the component(s) that are the exact source of a failure, given a set of observed failure indications. Despite being a focus of research for a long time, fault localization remains a challenge because of the complexity of current distributed environments. The growing adoption of cloud computing, in which multiple applications share multiple resources, increases the complexity of the problem. Existing probing techniques are inefficient because of the large number of applications and resources. The availability and utilization of such shared-resource environments motivate the search for novel fault localization techniques. In this paper, we present an elimination-based fault localization method that leverages resources shared among applications. Shared resources are used as Readily Available Probes to find the real-time state of applications. These probes are used to eliminate non-faulty resources, leaving a minimal subset of resources that are likely to be the faulty components. We show that this method significantly reduces the effort required to design and implement probes. Various experiments demonstrate that our method reduces the time taken and increases the efficiency and accuracy of problem determination. © 2011 IEEE.
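The elimination idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm, only one plausible reading of it under a simplifying assumption: an application that is observed to be working implies that every resource it uses is working. The function name `localize_faults`, the application names, and the resource names are all hypothetical.

```python
from typing import Dict, Set


def localize_faults(
    app_resources: Dict[str, Set[str]], healthy_apps: Set[str]
) -> Set[str]:
    """Return the resources that remain suspect after elimination.

    Assumption (not from the paper): a healthy application implies that
    all of the resources it depends on are healthy.
    """
    # Every resource used by a failing application starts as a suspect.
    suspects: Set[str] = set()
    for app, resources in app_resources.items():
        if app not in healthy_apps:
            suspects |= resources

    # Each healthy application acts as a readily available probe:
    # the resources it uses are eliminated from the suspect set.
    for app in healthy_apps:
        suspects -= app_resources.get(app, set())

    return suspects


# Hypothetical environment: three applications sharing some resources.
app_resources = {
    "billing": {"db3", "net1", "vm1"},
    "search": {"db1", "net1", "vm2"},
    "mail": {"db2", "net1", "vm1"},
}
healthy_apps = {"search", "mail"}  # "billing" is reporting failures

suspects = localize_faults(app_resources, healthy_apps)
print(suspects)
```

In this toy case the failing application shares `net1` and `vm1` with applications that are working, so both are eliminated and only `db3` remains a suspect. No dedicated probe had to be designed or sent to reach that conclusion.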