Publication
Journal of Parallel and Distributed Computing
Paper

Design alternatives for virtual interface architecture and an implementation on IBM Netfinity NT cluster

Abstract

The Virtual Interface Architecture (VIA) specification has been developed to standardize user-level network interfaces that provide low-latency, high-bandwidth communications. Few hardware and software implementations of VIA exist. Since the VIA specification is flexible, different choices exist for implementing various components of VIA such as doorbells, address translation methods, and completion queues. Although previous studies have evaluated the overall performance of different VIA implementations, there has not been a comparative study on the performance of VIA components. In this paper, we evaluate and compare the performance of different implementations of essential VIA components. We discuss the pros and cons of each design approach and describe the support required to implement each of them. Then, we discuss an experimental implementation of the Virtual Interface Architecture for the IBM SP Switch-Connected NT cluster, one of the newest clustering platforms available. We discuss different design issues involved in this implementation. In particular, we explain how the virtual-to-physical address translation is implemented efficiently with a minimal Network Interface Card (NIC) memory requirement. We show how caching the VIA descriptors on the NIC can reduce the communication latency. We also present an efficient scheme for implementing the VIA doorbells without any hardware support. We provide a comprehensive performance evaluation study and discuss the impact of several hardware improvements on the performance of our implementation. The performance of the implemented VIA surpasses that of other existing software implementations of VIA and is comparable to that of a hardware VIA implementation. The peak measured bandwidth for our system is 101.4 MBytes/s and the one-way latency for short messages is 18.2 μs. © 2001 Academic Press.