International Workshop on Federated Learning in the Age of Foundation Models
Abstract
Training machine learning models in a centralized fashion often faces significant challenges in real-world use cases: training data is distributed across many sources, creating and maintaining a central data repository demands substantial computational resources, and regulations such as GDPR and HIPAA restrict the sharing of sensitive data. Federated learning (FL) is a paradigm that mitigates these challenges by training a global model over distributed data without requiring the data to be shared. The extensive application of machine learning to real-world, distributed, and sensitive data makes familiarization with and adoption of FL a relevant and timely topic for the scientific community.

Recently, foundation models such as ChatGPT have revolutionized the field of machine learning by demonstrating remarkable capabilities across a wide range of tasks. These models have democratized machine learning development, empowering developers to focus on adapting a foundation model to their specific task rather than building complex models from scratch. This paradigm shift can lower the barriers to entry for machine learning development and enable a broader community of developers to create high-quality models. However, as the model development process becomes increasingly accessible, new bottlenecks emerge: compute power and data access. While foundation models can perform exceptionally well across various tasks, they pose two challenges: 1) training them requires vast amounts of data and compute, and 2) fine-tuning them for specific applications requires specialized and potentially sensitive data. Acquiring and centralizing datasets for both training and fine-tuning raises data privacy concerns, legal constraints (such as GDPR and HIPAA), and heavy computational burdens.

FL is a promising solution to these challenges in the era of foundation models. Its fundamental goal is to train models collaboratively across decentralized devices or data silos while the data remains securely on those devices or within the organizations that own it. By adopting federated learning approaches, we can leverage the vast amounts of distributed data and compute available across different sources while respecting privacy regulations and data ownership.

The rise of foundation models amplifies the importance and relevance of FL as a crucial research direction. With foundation models becoming the norm in machine learning development, the focus shifts from model architecture design to the issues surrounding privacy-preserving and distributed learning. Advances in FL methods can unlock the full potential of foundation models, enabling efficient and scalable training while safeguarding sensitive data.

With this in mind, we invite original research contributions, position papers, and work-in-progress reports on all aspects of federated learning in the age of foundation models. Because foundation models are a relatively recent development, their full impact on federated learning is not yet well explored or understood. We hope to provide a platform that brings together students, scholars, and industry professionals from around the world to discuss the latest advances, share insights, and identify future directions in this exciting field.
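To make the federated setting concrete, the sketch below illustrates one round of FedAvg-style training: each client updates the global model on its own private data, and the server aggregates only the resulting parameters, never the raw data. This is a minimal illustrative sketch, not any specific system discussed at the workshop; the linear model, the client data, and all function names are assumptions introduced here for clarity.

    # Minimal sketch of FedAvg-style aggregation (illustrative assumptions only):
    # a linear regression model trained by local gradient descent on each client;
    # only parameter vectors are exchanged, never the clients' raw data.
    import numpy as np

    def local_update(global_weights, X, y, lr=0.1, epochs=5):
        """Each client refines the global model on its own private data."""
        w = global_weights.copy()
        for _ in range(epochs):
            grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
            w -= lr * grad
        return w

    def federated_round(global_weights, client_datasets):
        """Server aggregates client updates, weighted by local dataset size."""
        updates, sizes = [], []
        for X, y in client_datasets:
            updates.append(local_update(global_weights, X, y))
            sizes.append(len(y))
        sizes = np.array(sizes, dtype=float)
        # Weighted average of client models: the FedAvg aggregation step.
        return np.average(np.stack(updates), axis=0, weights=sizes / sizes.sum())

    # Toy usage: three clients whose private data come from the same linear model.
    rng = np.random.default_rng(0)
    true_w = np.array([2.0, -1.0])
    clients = []
    for n in (50, 80, 30):
        X = rng.normal(size=(n, 2))
        clients.append((X, X @ true_w + 0.1 * rng.normal(size=n)))

    w = np.zeros(2)
    for _ in range(20):
        w = federated_round(w, clients)
    print("estimated weights:", w)  # approaches true_w without pooling raw data

In practice the same loop is run with large neural networks and with additional mechanisms (secure aggregation, differential privacy, client sampling), but the core pattern of local training followed by server-side averaging is the one shown above.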