Heterogeneity-Aware Adaptive Federated Learning Scheduling
Abstract
Federated learning (FL) is becoming an important distributed machine learning approach that addresses privacy and security concerns while training a shared model across many clients that keep their data local. One of the key challenges in FL is heterogeneity in both hardware resources and local datasets, which arises naturally from incorporating diverse clients. Because of resource heterogeneity, the availability of participating clients is not stable over time and their resource usage patterns change dynamically. This leads to resource wastage and straggler issues. Data heterogeneity introduces additional challenges, causing model bias and poor model performance. However, most existing FL systems are not well suited to heterogeneous environments because they do not adapt to diverse, dynamically changing resource usage patterns and accuracy trends during the training process. To this end, we propose a heterogeneity-aware scheduling approach that adapts to accuracy trends and varying resource usage patterns. Our proposed scheduling provides different scheduling knobs for achieving different goals, such as resource-efficient fast training, resource fairness, accuracy fairness, and high model performance. To the best of our knowledge, this is the first effort to mitigate the effects of both resource and data heterogeneity while providing adaptive scheduling based on dynamically changing resource usage patterns and accuracy trends.