Rim: Offloading Inference to the Edge

Yitao Hu; Weiwu Pang; Xiaochen Liu; Rajrup Ghosh; Bongjun Ko; Wei-Han Lee; Ramesh Govindan

doi:10.1145/3450268.3453521

IoTDI 2021

Conference paper

18 May 2021

Abstract

Video cameras are among the most ubiquitous sensors in the Internet-of-Things. Video and audio applications, such as cross-camera activity detection, avatar extraction, or language translation, will in the future offload processing to an edge cluster of GPUs. Rim is a management system for such clusters that satisfies the throughput and latency requirements of these applications while enabling high cluster utilization. It uses coarse-grained knowledge of application structure to profile the throughput of applications on resources, then uses these profiles to place applications on cluster nodes so that those requirements are met and utilization stays high. It dynamically adapts placement to load and failures. Experiments show that, on maximal workloads on a testbed, Rim can satisfy the requirements of all applications, whereas competing approaches designed for low-latency GPU execution cannot.

Conference paper

Olympian: Scheduling GPU usage in a deep neural network model serving system

Yitao Hu, Swati Rallapalli, et al.

Middleware 2018

Conference paper

Kestrel: Video analytics for augmented multi-camera vehicle tracking

Hang Qiu, Xiaochen Liu, et al.

IoTDI 2018
