Publication
DSN-W 2018
Conference paper

Diagnosing Failures of Cloud Management Actions


Abstract

Cloud management actions such as system patches and software updates are a regular activity in large-scale cloud deployments. Collected data shows that these actions are highly prone to failure. System administrators currently spend hours on manual troubleshooting because few solutions exist that can diagnose such failures automatically. This paper addresses the automatic analysis of cloud management action failures and the determination of their root causes. The proposed failure diagnosis approach identifies the system attributes of cloud instances that differ when an action fails. Furthermore, it does not require knowledge of the source code. The design and implementation of the proposed solution are presented, and the solution is evaluated using realistic management actions.
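
To make the idea of attributes that differ in case of a failure concrete, here is a minimal sketch. It is not the paper's algorithm, which the abstract does not describe. It assumes each cloud instance is summarized as a flat dictionary of attributes, and it flags attribute values shared by every failed instance and never observed on a successful one. The function name, attribute names, and sample values are all hypothetical.

```python
from collections import defaultdict


def distinguishing_attributes(succeeded, failed):
    """Return attribute values common to all failed instances and absent from successful ones.

    Illustrative only: both arguments are lists of flat attribute dictionaries.
    """
    # Collect every value each attribute takes on instances where the action succeeded.
    ok_values = defaultdict(set)
    for inst in succeeded:
        for key, value in inst.items():
            ok_values[key].add(value)

    suspects = {}
    if not failed:
        return suspects

    # An attribute is a suspect if all failed instances agree on its value
    # and that value never appears on a successful instance.
    for key in set().union(*(inst.keys() for inst in failed)):
        values = {inst.get(key) for inst in failed}
        if len(values) == 1:
            (value,) = values
            if value is not None and value not in ok_values[key]:
                suspects[key] = value
    return suspects


# Hypothetical attribute snapshots taken before a patch action.
succeeded = [
    {"os": "rhel7", "kernel": "3.10.0-693", "free_disk_gb": 40},
    {"os": "rhel7", "kernel": "3.10.0-862", "free_disk_gb": 35},
]
failed = [
    {"os": "rhel7", "kernel": "3.10.0-514", "free_disk_gb": 40},
    {"os": "rhel7", "kernel": "3.10.0-514", "free_disk_gb": 12},
]

print(distinguishing_attributes(succeeded, failed))
```

With this sample data only the kernel version is reported, because it is the one attribute that is identical across the failed instances and unseen among the successful ones. A comparison of this kind needs only observed instance state, which is in keeping with the abstract's point that source code knowledge is not required.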
