Serving deep learning models in a serverless platform

Vatche Isahagian; Vinod Muthusamy; Aleksander Slominski

doi:10.1109/IC2E.2018.00052

IC2E 2018

Conference paper

16 May 2018

Abstract

Serverless computing has emerged as a compelling paradigm for the development and deployment of a wide range of event-based cloud applications. At the same time, cloud providers and enterprise companies are heavily adopting machine learning and artificial intelligence, either to differentiate themselves or to provide their customers with value-added services. In this work, we evaluate the suitability of a serverless computing environment for inference with large neural network models. Our experiments run on AWS Lambda using the MXNet deep learning framework. The results show that while inference latency can be within an acceptable range, longer delays due to cold starts can skew the latency distribution and hence risk violating more stringent SLAs.
