Exploring Serverless Computing for Neural Network Training

Lang Feng; Prabhakar Kudva; Dilma Da Silva; Jiang Hu

doi:10.1109/CLOUD.2018.00049

CLOUD 2018

Conference paper

07 Sep 2018

Abstract

Serverless, or functions-as-a-service (FaaS), runtimes have shown significant efficiency and cost benefits for event-driven cloud applications. Although serverless runtimes are limited to applications requiring lightweight computation and memory, such as machine learning prediction and inference, they have shown improvements on these applications beyond other cloud runtimes. Training deep learning models, by contrast, can be both compute and memory intensive. We investigate the use of serverless runtimes for training large models with data parallelism, show the challenges and limitations that arise from the tightly coupled nature of such models, and propose modifications to the underlying runtime implementations that would mitigate them. For hyperparameter optimization of smaller deep learning models, we show that serverless runtimes can provide significant benefit.