EuroSys 2024
Conference paper

DeTA: Minimizing Data Leaks in Federated Learning via Decentralized and Trustworthy Aggregation

Abstract

Federated learning (FL) relies on a central authority to oversee and aggregate model updates contributed by multiple participating parties in the training process. This centralization of sensitive model updates naturally raises concerns about the trustworthiness of the central aggregation server, as well as the risks posed by server failures or breaches, which could result in the loss or leakage of model updates. Moreover, recent attacks have demonstrated that, by obtaining leaked model updates, malicious actors can reconstruct substantial amounts of private data belonging to training participants. This underscores the critical need to rethink the existing FL system architecture to mitigate emerging attacks in an evolving threat landscape. One straightforward approach is to fortify the central aggregator with confidential computing (CC), which offers hardware-assisted protection for runtime computation and can be remotely verified for execution integrity. However, a growing number of security vulnerabilities have surfaced alongside the adoption of CC, indicating that relying solely on this single defense may not provide the resilience needed to thwart data leaks. To address the security challenges inherent in the centralized aggregation paradigm and to enhance system resilience, we introduce DeTA, an FL system architecture that employs a decentralized and trustworthy aggregation strategy with a defense-in-depth design. In DeTA, FL parties locally divide and shuffle their model updates at the parameter level, creating random partitions designated for multiple aggregators, all of which are shielded within CC execution environments. Moreover, to accommodate the multi-aggregator FL ecosystem, we have implemented a two-phase authentication protocol that enables new parties to verify all CC-protected aggregators and establish secure channels to upload their model updates. With DeTA, model aggregation algorithms can operate without any alterations, yet each aggregator is now oblivious to the model architecture, possessing only a fragmented and shuffled view of each model update. This approach effectively mitigates attacks that tamper with the aggregation process or exploit leaked model updates, while preserving training accuracy and incurring minimal performance overhead.
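To make the parameter-level divide-and-shuffle step concrete, the following Python sketch partitions a flattened model update into random, disjoint fragments for multiple aggregators and shows how the original update can be restored. The function names (split_update, reassemble), the flat-vector representation, and the use of a shared permutation seed are illustrative assumptions, not the paper's actual implementation; the sketch only demonstrates why each aggregator sees a meaningless fragment while position-wise aggregation remains consistent when all parties apply the same secret permutation.

    # Hypothetical sketch of DeTA-style parameter-level split-and-shuffle.
    # Names and the flat-vector representation are illustrative assumptions.
    import numpy as np

    def split_update(update: np.ndarray, n_aggregators: int, seed: int):
        """Shuffle a flattened model update and partition it across aggregators.

        Each aggregator receives a random, disjoint slice of parameter values
        with no knowledge of the model architecture or parameter positions.
        """
        rng = np.random.default_rng(seed)            # seed shared among parties,
        perm = rng.permutation(update.size)          # hidden from aggregators
        shuffled = update[perm]
        partitions = np.array_split(shuffled, n_aggregators)
        return partitions, perm

    def reassemble(partitions, perm) -> np.ndarray:
        """Invert the shuffle: only a holder of `perm` can restore parameter order."""
        shuffled = np.concatenate(partitions)
        update = np.empty_like(shuffled)
        update[perm] = shuffled                      # undo the permutation
        return update

    # Toy usage: three aggregators each see only a shuffled fragment.
    flat_update = np.arange(10, dtype=np.float64)    # stand-in for model weights
    parts, perm = split_update(flat_update, n_aggregators=3, seed=42)
    assert np.allclose(reassemble(parts, perm), flat_update)

Under this assumed scheme, every party applies the same secret permutation, so averaging each fragment position at an aggregator is equivalent to averaging the corresponding original parameters across parties. This is consistent with the abstract's claim that existing aggregation algorithms can operate without alteration even though no single aggregator ever observes a complete, ordered model update.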