Publication
FTCS 1998
Conference paper

Software exploitation of a fault-tolerant computer with a large memory

Abstract

The DM/6000 hardware (a prototype fault-tolerant RS/6000 built at the T. J. Watson Research Center) provides fault tolerance and a large, nonvolatile main memory. Running a commercial, general-purpose operating system on it does nothing, by itself, to increase software availability. In fact, the time needed to rebuild the contents of a large memory may decrease availability. We describe our techniques for hiding most of the main memory from the operating system, which must then access it only through services separate from the operating system. This can allow the memory and those access services to achieve much higher availability, which in turn increases the availability of the system as a whole. We also performed simulation studies to determine the conditions under which this system organization can improve performance for recoverable database applications.
