S.M. Sadjadi, S. Chen, et al.
TAPIA 2009
This paper proposes a design methodology for building highly available systems and describes a set of operating system services that support this goal. The techniques are intended for a parallel environment and can be generalized to any distributed system. We describe a methodology for providing basic services for high availability, specific services for restart, and an implementation of these services.
G. Almasi, G. Almasi, et al.
Digest of Technical Papers - IEEE International Solid-State Circuits Conference
Ram Chillarege, Nicholas S. Bowen
FTCS 1989
Nicholas S. Bowen, D.A. Elko, et al.
IBM Systems Journal