Publication
SC 2021
Conference paper
FedAT: A High-Performance and Communication-Efficient Federated Learning System with Asynchronous Tiers
Abstract
Federated learning (FL) involves training a model over massively distributed devices while keeping the training data localized and private. This form of collaborative learning exposes new tradeoffs among model convergence speed, model accuracy, balance across clients, and communication cost, and raises new challenges such as the straggler problem and the communication bottleneck. To address these issues, we present FedAT, a novel federated learning system with asynchronous tiers. FedAT synergistically combines synchronous, intra-tier training and asynchronous, cross-tier training. By bridging synchronous and asynchronous training through tiering, FedAT minimizes the straggler effect while improving test accuracy. FedAT uses a weighted aggregation heuristic to balance training across clients for further accuracy improvement, and it compresses uplink and downlink communications with an efficient compression algorithm to minimize communication cost. Results show that FedAT improves prediction performance by up to 21.09% and reduces communication cost by up to 8.5x compared to state-of-the-art FL methods.
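To make the tiered aggregation concrete, below is a minimal sketch, not the authors' implementation. It assumes clients within a tier are averaged synchronously in FedAvg style and that the server combines per-tier models asynchronously, up-weighting tiers that push updates less often. The function names, the scalar stand-in "models," and the inverse-frequency weighting are illustrative assumptions; the paper's exact heuristic may differ.

```python
import numpy as np

def intra_tier_fedavg(client_models, client_sizes):
    """Synchronously average one tier's client models,
    weighting each client by its local dataset size (standard FedAvg)."""
    total = sum(client_sizes)
    return sum(m * (n / total) for m, n in zip(client_models, client_sizes))

def cross_tier_aggregate(tier_models, tier_update_counts):
    """Asynchronously combine per-tier models on the server.
    Tiers that have pushed fewer updates (the slower ones) receive
    larger weights, so fast tiers do not dominate the global model.
    Inverse-frequency weighting is an assumption made for illustration."""
    inv = np.array([1.0 / max(c, 1) for c in tier_update_counts])
    weights = inv / inv.sum()
    return sum(w * m for w, m in zip(tier_models, weights))

# Toy usage: two tiers, scalar "models" for illustration only.
fast_tier = intra_tier_fedavg([1.0, 1.2, 0.8], [100, 150, 50])  # updates often
slow_tier = intra_tier_fedavg([2.0, 2.2], [80, 120])            # updates rarely
global_model = cross_tier_aggregate([fast_tier, slow_tier],
                                    tier_update_counts=[10, 2])
print(global_model)
```

Up-weighting slower tiers keeps the global model from drifting toward the fast clients' data, which is the balancing role the abstract attributes to FedAT's weighted aggregation heuristic.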