Deep learning, driven by large neural network models, is overtaking traditional machine learning methods for understanding unstructured and perceptual data in domains such as speech, text, and vision. At the same time, the "As-a-Service" business model for the cloud is fundamentally transforming the information technology industry. These two trends, deep learning and "As-a-Service," are colliding to give rise to a new business model for cognitive application delivery: deep learning as a service in the cloud. In this paper, we discuss the software architecture behind IBM's deep learning as a service (DLaaS). DLaaS gives developers the flexibility to use popular deep learning libraries, such as Caffe, Torch, and TensorFlow, in the cloud in a scalable and resilient manner with minimal effort. The platform uses a distribution and orchestration layer that distributes learning across compute nodes, making it feasible to learn from large amounts of data in a reasonable amount of time. A resource provisioning layer enables flexible job management on heterogeneous resources, such as graphics processing units and central processing units, in an infrastructure-as-a-service cloud.