ModelOps: Cloud-based lifecycle management for reliable and trusted AI
This paper proposes a cloud-based framework and platform for end-to-end development and lifecycle management of artificial intelligence (AI) applications. We build on our previous work on platform-level support for cloud-managed deep learning services and show how the principles of software lifecycle management can be leveraged and extended to enable automation, trust, reliability, traceability, quality control, and reproducibility of AI pipelines. Based on a discussion of use cases and current challenges, we describe a framework for managing AI application lifecycles and its key components. We also present concrete examples that illustrate how this framework enables managing and executing model training and continuous learning pipelines while infusing trusted AI principles.