Graphics processing units (GPUs) are widely used in high performance computing (HPC) and cloud computing to accelerate workloads. Virtualization provides flexible access to these resources while improving utilization and throughput, and is essential to resource disaggregation, which allows ubiquitous access to remote resources across nodes. However, remote GPU virtualization at scale suffers severe performance degradation from inter-node communication and resource consolidation overhead, especially for data-intensive workloads. We propose HFGPU, a GPU virtualization solution based on application programming interface (API) remoting that is transparent to application code. We define a virtual device manager that allows remote GPUs to be seen, managed, and used as though they were local. To perform at scale, we combine multi-adapter InfiniBand networking with a novel distributed I/O forwarding mechanism that eliminates consolidation bottlenecks and reduces data movement. Experiments with up to 1024 NVIDIA V100 GPUs demonstrate less than 1% overhead for data-intensive operations.