Enhancing OpenStack fault tolerance for provisioning computing environments
Abstract
With the rise of cloud computing and the virtualization of resources, cloud management systems are becoming a key differentiator for the quality of service offered by cloud providers. OpenStack is considered the de facto open-source cloud management system at the Infrastructure-as-a-Service (IaaS) layer. Despite efforts to harden the high availability of OpenStack, its fault tolerance during the provisioning of resources has yet to be proven. In this paper, we present TestStack, a testing framework for the fault tolerance of OpenStack. We expose the limitations of OpenStack by injecting runtime failures into a highly available OpenStack environment. Our test results reveal inconsistencies in OpenStack's behavior in the presence of failures, which we address with FTStack, our proposed solution for hardening its fault tolerance.