Scalable Federated Learning with System Heterogeneity
Abstract
Federated learning (FL) is a distributed learning framework that inherently provides data privacy and parallel computation over a set of participating devices (clients). In real-life applications, these clients vary widely in their resources (storage, RAM, CPU/GPU speed, network speed, etc.). However, most previous FL studies do not account for this system heterogeneity and assume that all clients can operate on the same full-size deep neural network (DNN) model. In this work, we present ScaleFL, a scalable FL approach that tackles system heterogeneity by hierarchically downscaling the DNN model for clients with limited resources. ScaleFL injects early exit networks into the given DNN to form a multi-exit model. During FL, the model is adaptively split along depth (exits) and width (hidden dimensions) based on the resource budget of each participating client. We provide an interactive proof-of-concept demonstration that illustrates the system flow on image classification and NLP benchmark workloads.
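The two-dimensional split described above can be illustrated with a minimal sketch. It is not the ScaleFL implementation: the names `SubmodelSpec`, `LEVELS`, and `split_model`, the four budget levels, and the specific width ratios are all hypothetical and chosen only for illustration. The sketch assumes a model summarized by its per-layer hidden widths, with one early exit attached after each listed layer index.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubmodelSpec:
    """Hypothetical description of how much of the global model a client keeps."""
    num_exits: int      # depth: number of exits retained, counted from the input
    width_ratio: float  # width: fraction of hidden units kept in each layer


# Assumed budget levels, ordered from weakest to strongest client.
LEVELS = [
    SubmodelSpec(num_exits=1, width_ratio=0.25),
    SubmodelSpec(num_exits=2, width_ratio=0.5),
    SubmodelSpec(num_exits=3, width_ratio=0.75),
    SubmodelSpec(num_exits=4, width_ratio=1.0),
]


def split_model(layer_widths, exit_after, spec):
    """Return the hidden widths of the submodel selected by `spec`.

    layer_widths: hidden width of each layer in the full model.
    exit_after:   exit_after[k] is the index of the layer that exit k follows.
    """
    # Depth split: keep layers up to and including the last retained exit.
    last_layer = exit_after[spec.num_exits - 1]
    kept = layer_widths[: last_layer + 1]
    # Width split: shrink every kept layer by the same ratio (at least 1 unit).
    return [max(1, int(w * spec.width_ratio)) for w in kept]


full_widths = [64, 128, 256, 512]  # assumed 4-layer model
exits = [0, 1, 2, 3]               # assumed: one exit after each layer

weakest = split_model(full_widths, exits, LEVELS[0])
strongest = split_model(full_widths, exits, LEVELS[3])
```

Under these assumptions, the weakest client trains only the first layer at a quarter of its width and stops at the first exit, while the strongest client trains the full model. A real system would additionally need to slice the weight tensors themselves and aggregate the overlapping parameters across clients, which this sketch omits.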