Communication compilation for unreliable networks
Abstract
Parallel programs that run over generic protocols (e.g., TCP) on a cluster of workstations often neither perform nor scale as well as one would expect. One reason is that both the performance and the scalability of parallel applications depend heavily on the speed of communication, yet the generic protocols used to guarantee reliable message delivery add unnecessary overhead that degrades application performance. The main thesis we explore in this paper is that knowledge of application behavior can be used to design more efficient protocols. In particular, we investigate automatic techniques for generating optimized, application-specific network protocols for parallel applications running on unreliable networks. Our algorithms assume that the application's communication can be represented by a context-free grammar. Such algorithms form the basis for a communication compiler.
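To make the grammar assumption concrete, the following minimal Python sketch describes a communication pattern as a context-free grammar and checks whether a message trace conforms to it. The grammar, the message names send_work and recv_result, and the functions derives and accepts are hypothetical illustrations chosen for this example; they are not the algorithms developed in the paper.

```python
# Hypothetical grammar for a master that repeatedly sends work and then
# receives a result. Keys are nonterminals; every other symbol is a
# terminal, i.e. a message type seen on the network.
GRAMMAR = {
    "S": [("ROUND", "S"), ()],               # zero or more rounds
    "ROUND": [("send_work", "recv_result")],  # one request/reply pair
}


def derives(symbols, trace):
    """Return True if the symbol sequence can derive exactly `trace`.

    Naive top-down recognizer. It terminates for this grammar, but it is
    an assumption of the sketch that the grammar is not left-recursive.
    """
    if not symbols:
        return not trace
    head, rest = symbols[0], symbols[1:]
    if head not in GRAMMAR:
        # Terminal: it must match the next message in the trace.
        return bool(trace) and trace[0] == head and derives(rest, trace[1:])
    # Nonterminal: try each production in turn.
    return any(derives(production + rest, trace)
               for production in GRAMMAR[head])


def accepts(trace):
    """Check a whole message trace against the start symbol S."""
    return derives(("S",), tuple(trace))


if __name__ == "__main__":
    print(accepts(["send_work", "recv_result", "send_work", "recv_result"]))
    print(accepts(["send_work", "send_work"]))
```

Under this kind of description, the set of messages that may legally come next is known in advance, which is the sort of application knowledge a specialized protocol could exploit.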