IBM at Ray Summit
- San Francisco, CA, United States
About
IBM and Ray have collaborated for years to make Ray an efficient and production-ready solution for enterprise workloads. Most recently, we expanded our partnership to advance KubeRay and its community, making Kubernetes the recommended platform for running Ray in the enterprise.
Today, AI foundation models and LLMs bring new and unique requirements for distributed computing. As a result, IBM and the Ray community are working to make Ray the most scalable and efficient framework for foundation model and LLM data preparation and validation, including on IBM's watsonx data and AI platform.
For presentation times of featured talks, see the Agenda section below. Note: all times are displayed in your local time.
Agenda
Serverless computing is a development model that lets developers build and run applications without having to manage servers. Two popular open source frameworks for running serverless workloads are Ray Serve and Knative Serving. Each framework takes a slightly different approach to serverless: Ray Serve focuses primarily on serving machine learning models, whereas Knative focuses more generally on automatically scaling HTTP services. Despite these differences, there are many opportunities for the two communities to learn from one another, which this talk will highlight. Drawing on experience participating in both communities and building open technologies with both frameworks, this talk compares and contrasts the approaches that Ray and Knative take to serverless. We also share best practices and lessons learned for serverless development, as well as potential pitfalls and difficulties that serverless users should be aware of. Finally, we highlight key pillars for the next generation of serverless applications, including possible areas of collaboration between the Ray and Knative communities.
Speakers:
- Paul Schweigert - Senior Software Engineer, IBM
- Michael Maximilien - Distinguished Engineer, IBM
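To make the contrast concrete, here is a minimal sketch of a Ray Serve deployment, illustrating the model-serving focus described in the abstract. The deployment class, replica count, and echo logic are illustrative assumptions, not code from the talk:

```python
# Minimal Ray Serve sketch (hypothetical example, not from the talk).
from ray import serve
from starlette.requests import Request

@serve.deployment(num_replicas=2)  # Serve spreads replicas across the cluster
class Echo:
    # In a real application this class would wrap an ML model.
    async def __call__(self, request: Request) -> str:
        body = await request.body()
        return f"echo: {body.decode()}"

# Deploys the application and exposes it over HTTP (port 8000 by default).
serve.run(Echo.bind())
# In a standalone script you would keep the process alive here
# so the deployment stays up.
```

A comparable Knative Serving application would instead be declared as a Service manifest that Knative autoscales with HTTP request load, including scaling to zero when idle, which is the difference in approach the talk explores.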
We demonstrate the integration of Ray with CodeFlare and Red Hat OpenShift Data Science Pipelines (RHODS Pipelines) for automatically scaling the execution of end-to-end workflows that train and validate foundation models on the OpenShift Container Platform (OCP). Workflow pipelines in foundation model development typically involve running various preprocessing steps to deduplicate data sources, filter out biased and low-quality data, and remove hateful and profane content. The preprocessed, cleaned data are then tokenized and used to further train or fine-tune existing generative pre-trained models. Autoscaling is critical in the execution of this workflow because these steps are usually very compute intensive, and some of them are iterated several times. RHODS Pipelines is a tool for specifying workflow pipelines as DAGs. It uses Tekton as the workflow engine to deploy pods that execute the workflow DAG in a Kubernetes cluster. However, RHODS Pipelines + Tekton lacks a way for the user to automatically scale a task in the DAG across parallel pods. CodeFlare is a tool that creates the configurations needed to deploy a Ray cluster on OCP and to submit Ray tasks for parallel execution. We explore the integration of Ray with CodeFlare and RHODS Pipelines so that the entire end-to-end workflow DAG, or any subset of it, can be easily specified, independently managed, and automatically scaled by individual developers. We will show foundation model use cases that benefit from a simple interface for providing specific parameters and specifying the DAG in RHODS Pipelines, letting the tool generate all the configurations and artifacts needed to run foundation model workflows in parallel with Ray on OpenShift.
Speakers:
- Yuan-Chi Chang - Research Staff Member, IBM Research
- Alex Corvin - Software Engineering Manager, Red Hat
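As a rough illustration of the scaling pattern this integration enables, the sketch below fans one preprocessing step of such a pipeline out as parallel Ray tasks. The function name, shard paths, and cleaning logic are hypothetical placeholders, and a Ray cluster (for example, one deployed via CodeFlare) is assumed to already exist:

```python
# Hypothetical sketch of one DAG step running as parallel Ray tasks.
import ray

ray.init()  # connects to the Ray cluster, e.g. one created via CodeFlare

@ray.remote
def preprocess_shard(path: str) -> str:
    # Placeholder for deduplication, quality filtering, and
    # hate/profanity removal on one data shard.
    cleaned_path = path + ".cleaned"
    # ... actual cleaning logic would go here ...
    return cleaned_path

shards = [f"data/shard-{i}.jsonl" for i in range(8)]
# Fan out one task per shard; Ray schedules them across worker pods,
# which is the parallel scale-up that RHODS Pipelines + Tekton alone lacks.
results = ray.get([preprocess_shard.remote(s) for s in shards])
print(results)
```

In the integrated setup described in the talk, a step like this would be one node of the RHODS Pipelines DAG, with CodeFlare generating the Ray cluster configuration so the step scales independently of the rest of the workflow.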