CLOUD 2022
Conference paper

Towards More Effective and Explainable Fault Management Using Cross-Layer Service Topology


As microservice architectures become prominent, existing fault management techniques for dealing with service disruptions become limiting, mainly because of the amount of data that must be analyzed. This paper emphasizes the need to consider the cross-layer topology of a cloud service in order to intelligently identify and correlate observability data, and thereby support more efficient, more accurate, and more explainable fault management. Towards this goal, the paper presents a tool that discovers the cross-layer topology of a cloud microservice application and discusses the benefits of using that topology to implement effective fault management.
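
To illustrate the idea of using a cross-layer topology to narrow down observability data, the following is a minimal sketch and not the paper's tool. It assumes the topology is available as a directed graph whose edges point from a component to the components it calls or runs on; all component names, the `EDGES` list, and the alert format are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical cross-layer topology (assumption, not from the paper).
# Each edge points from a component to something it calls or runs on.
EDGES = [
    ("service:checkout", "service:payment"),  # application-layer call
    ("service:checkout", "pod:checkout-1"),   # service -> pod
    ("service:payment", "pod:payment-1"),
    ("pod:checkout-1", "node:worker-a"),      # pod -> node
    ("pod:payment-1", "node:worker-b"),
    ("service:search", "pod:search-1"),
    ("pod:search-1", "node:worker-c"),
]


def build_graph(edges):
    """Build an adjacency map from (source, target) pairs."""
    graph = defaultdict(set)
    for src, dst in edges:
        graph[src].add(dst)
    return graph


def reachable(graph, start):
    """Return every component reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbor in graph.get(current, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def relevant_alerts(graph, affected_service, alerts):
    """Keep only alerts raised on components the affected service depends on."""
    scope = reachable(graph, affected_service)
    return [alert for alert in alerts if alert["component"] in scope]
```

Under these assumptions, an alert on `node:worker-b` would be kept when investigating `service:checkout`, because the topology links the two through `service:payment` and `pod:payment-1`, while an alert on `node:worker-c` would be filtered out. The path through the graph also gives a simple explanation of why an alert was considered related.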