Virtual machine (VM) technologies have made much progress in improving the efficiency of virtualizing CPU and memory. However, achieving high performance for I/O virtualization remains a challenge, especially for high-speed networking devices such as 10 Gigabit Ethernet (10GbE) NICs, and commonly used software-based I/O virtualization approaches usually suffer significant performance degradation compared with native hardware. One promising approach to address the performance issue of I/O virtualization is to use devices that support single root I/O virtualization (SR-IOV), which has been standardized by the PCI-SIG. With SR-IOV, a PCI Express (PCIe) device can present itself as multiple virtual devices. By dedicating a virtual device to a single VM, it is possible for the VM to access the virtual device hardware directly, thus reducing overheads such as context/control switches and extra memory copies. However, SR-IOV has limitations: it requires special hardware support, and it complicates VM tasks such as checkpointing, migration, and record/replay. Therefore, it is important to fully understand the performance benefit of SR-IOV before adopting it. Unfortunately, little previous work provides such information. In this paper, we present a detailed performance evaluation of a 10GbE SR-IOV PCIe device from Neterion in the KVM (Kernel-based Virtual Machine) virtualization environment. Our focus is not just performance metrics such as bandwidth and latency, but also other aspects of the system such as CPU utilization, memory access, VM exits, and host/guest interrupts. We have also studied several important factors that affect networking performance in both virtualized and native systems. These include the MTU size, the use of a single processor versus multiple processors, IRQ affinity, and IRQ distribution.
Our experiments show that the hardware-based SR-IOV approach provides superior performance to the software-based approach in KVM. SR-IOV can achieve close to line-rate TCP communication (9.3 Gbps) for both transmitting (Tx) and receiving (Rx) with the standard 1500-byte Ethernet MTU, although it does consume more CPU cycles than the native (non-virtualized) case. Overall, our evaluation demonstrates that the SR-IOV approach has great potential to achieve high-performance I/O in a virtualized environment. © 2010 IEEE.
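To put the reported 9.3 Gbps in context, the following minimal sketch estimates the upper bound on TCP payload throughput over 10GbE for a given MTU. It is not taken from the paper: the function name `tcp_goodput_gbps` is hypothetical, and the calculation assumes standard Ethernet framing overhead (preamble, header, FCS, inter-frame gap), IPv4 and TCP headers without IP options, and optionally the 12-byte TCP timestamp option.

```python
# Rough upper bound on TCP goodput over Ethernet for a given MTU.
# Assumptions (not from the paper): IPv4, no IP options, standard framing.

ETH_OVERHEAD = 8 + 14 + 4 + 12  # preamble+SFD, MAC header, FCS, inter-frame gap (bytes)
IP_HEADER = 20                  # IPv4 header without options (bytes)
TCP_HEADER = 20                 # TCP header without options (bytes)


def tcp_goodput_gbps(link_gbps=10.0, mtu=1500, tcp_options=12):
    """Return the maximum TCP payload rate in Gbps.

    tcp_options is the number of TCP option bytes per segment
    (12 when timestamps are enabled, 0 when they are not).
    """
    payload = mtu - IP_HEADER - TCP_HEADER - tcp_options
    wire_bytes = mtu + ETH_OVERHEAD  # bytes occupied on the wire per frame
    return link_gbps * payload / wire_bytes


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(f"MTU {mtu}: {tcp_goodput_gbps(mtu=mtu):.2f} Gbps")
```

Under these assumptions the bound for a 1500-byte MTU is roughly 9.4 Gbps with timestamps enabled, so a measured 9.3 Gbps leaves little headroom on the link itself.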