IBM J. Res. Dev

A probabilistic model for quantifying the resilience of networked systems



Resilience is an important aspect of computing systems. Previous work on resilience has often focused on the design and architectural aspects of such systems rather than on quantifying resilience. Where quantification is attempted, it is often restricted to a limited portion of the system. In networked systems, where multiple heterogeneous components interact in complex ways, quantifying resilience becomes a nontrivial problem. This paper proposes a model for quantifying resilience on the basis of the interdependencies of services and their adaptation. The model combines performance and adaptability metrics to compute the resilience of each individual service; these per-service values are then applied to a Markov network that computes the overall system resilience. The adaptation metric, here called adaptivity, measures how often a service adapts and evaluates the efficiency of those adaptations in terms of performance improvement. This paper also presents an evaluation that considers critical infrastructure systems. © 1957-2012 IBM.
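The abstract does not state the paper's formulas, so the following is only a toy sketch of the pipeline it describes, not the authors' definitions. Everything here is an assumption made for illustration: the names `Service`, `adaptivity`, `service_resilience`, and `system_resilience`, the weighted sum used to combine performance with adaptivity, and the pairwise potential that rewards dependent services for sharing a state. The system-level step uses a small pairwise Markov network over binary up/down states, evaluated by brute-force enumeration.

```python
import itertools
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Service:
    name: str
    performance: float                      # assumed normalized to [0, 1]
    adaptations: List[Tuple[float, float]]  # (performance before, after)
    observation_windows: int                # windows in which adapting was possible


def adaptivity(s: Service) -> float:
    """Hypothetical adaptivity: adaptation frequency times mean performance gain."""
    if not s.adaptations or s.observation_windows <= 0:
        return 0.0
    frequency = min(1.0, len(s.adaptations) / s.observation_windows)
    gains = [max(0.0, after - before) for before, after in s.adaptations]
    efficiency = sum(gains) / len(gains)
    return frequency * efficiency


def service_resilience(s: Service, weight: float = 0.5) -> float:
    """Assumed combination: weighted sum of performance and adaptivity."""
    return weight * s.performance + (1.0 - weight) * adaptivity(s)


def system_resilience(
    resilience: Dict[str, float],
    dependencies: List[Tuple[str, str]],
    coupling: float = 1.0,
) -> float:
    """Expected fraction of services that are up under a pairwise Markov network.

    Unary potentials come from per-service resilience; each dependency edge
    multiplies the weight by exp(coupling) when both endpoints share a state.
    """
    names = sorted(resilience)
    total = 0.0
    up = 0.0
    for bits in itertools.product([0, 1], repeat=len(names)):
        state = dict(zip(names, bits))
        w = 1.0
        for n in names:
            w *= resilience[n] if state[n] else 1.0 - resilience[n]
        for a, b in dependencies:
            if state[a] == state[b]:
                w *= math.exp(coupling)
        total += w
        up += w * sum(bits) / len(names)
    return up / total


if __name__ == "__main__":
    grid = Service("power", 0.9, [(0.4, 0.6), (0.5, 0.7)], 4)
    scada = Service("control", 0.7, [], 4)
    r = {s.name: service_resilience(s) for s in (grid, scada)}
    print(system_resilience(r, [("power", "control")]))
```

Enumeration costs 2^n evaluations for n services, so this form only suits very small examples; a larger network would need an approximate inference method.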


08 Oct 2013

