Publication
IBM J. Res. Dev.
Paper

Resilient cloud computing

Abstract

IBM SmartCloud® Enterprise+ (SCE+) is IBM's premier cloud computing offering for enterprise customers. The SCE+ cloud is designed to be a resilient, highly available system with no single point of failure. SCE+ uses an integrated set of enterprise-class servers, network elements, and storage components that have internal redundancy, RAS (reliability, availability, and serviceability) features, and high mean time between failures. SCE+ is also a managed cloud: its management system uses IBM service-management and platform-management tools. We employ several techniques to help keep the customer's virtual servers running and to avoid system failures. For example, virtual servers are automatically restarted when they crash or when the hosting physical server fails. In addition, network elements are interconnected to allow alternate network traversal paths, and data are mirrored to offset storage failure; each of these server, storage, and network elements also has a redundant internal configuration. Beyond guest-level availability, important customer workloads, such as enterprise resource planning applications and databases, require highly available clusters. More generally, we describe approaches used to achieve resiliency in SCE+. © 1957-2012 IBM.
