Publication
Symposium on Reliability in Distributed Software and Database Systems 1981
Conference paper

AUDITOR: A FRAMEWORK FOR HIGH AVAILABILITY OF DB/DC SYSTEMS.

Abstract

A critical function supporting high availability in a distributed system of processors is the fast detection of, and recovery from, hardware and software failures. The software component that performs this function for a prototype distributed database/data communication (DB/DC) system being designed at IBM Research, San Jose, Calif., is called the auditor. An auditor resides in each processor in the network and performs internal surveillance of the database/data communication systems running on that processor. In addition, all the auditors in the network participate in protocols designed to provide external surveillance of failures and recovery from them. Protocols are presented that support the auditor's surveillance and recovery functions under multiple concurrent failures of hardware and software components.
