Publication
CLOUD 2015
Conference paper
Remote Restart for a High Performance Virtual Machine Recovery in a Cloud
Abstract
In this paper, we present a scalable, parallel virtual machine (VM) planning and failover method that enables high availability at the VM level in a data center. The solution is implemented and used in IBM's CMS enterprise private cloud as a high-availability feature for efficient failover in large data centers with large numbers of servers, VMs, and disks. The restart system we introduce performs planning and execution dynamically at failover time, and keeps the recovery time within the time budget allowed by the service level agreement (SLA). Relative to the initial serial failover, failover time is reduced by a factor of up to 11 for the parallel implementation, and by a factor of up to 44 for the implementation that combines parallel failover with parallel storage mapping. As part of our future work, we plan to explore the applicability of this planning and failover solution to disaster recovery.