Publication
SoCC 2022
Short paper
Cloud-native Workflow Scheduling using a Hybrid Priority Rule and Dynamic Task Parallelism
Abstract
Demand for efficient cloud-native workflow scheduling is growing, as many data science workloads are composed of several tasks with dependencies. As container technology becomes more prevalent in cloud communities, containerized workflow orchestration tools have been introduced and have become the standard for scheduling workflows. However, current schedulers use simple heuristics and rely on the user's choice of priority and parallelism level for tasks, without accounting for workflow-specific information. We introduce a workflow-aware scheduling algorithm that uses workflow information to schedule tasks without user input, with the objective of improving resource utilization and minimizing weighted workflow completion time, defined as workflow duration multiplied by the user-specified workflow priority. Our scheduler comprises two strategies: a hybrid priority rule inspired by production planning ideas, and a task-splitting rule that sets the parallelism level based on a convex task processing time curve. Using simulation, we demonstrate that our algorithm (1) produces an efficient balance of weighted workflow completion time and resource utilization and (2) outperforms deterministic parallelism.
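To make the two quantities named in the abstract concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's algorithm: the abstract defines weighted completion time as duration multiplied by workflow priority, but summing that product over workflows is an assumption, and the curve `work / k + overhead * k` is a hypothetical convex processing-time model chosen only because it has a clear minimum. All names (`Workflow`, `processing_time`, `best_parallelism`) are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Workflow:
    name: str
    priority: float  # user-specified workflow priority, used as a weight
    duration: float  # time from submission to completion


def weighted_completion_time(workflows):
    """Priority times duration, summed over workflows (the sum is an assumption)."""
    return sum(w.priority * w.duration for w in workflows)


def processing_time(work, overhead, k):
    """Hypothetical convex curve: splitting shrinks work per piece,
    but each extra piece adds a fixed coordination overhead."""
    return work / k + overhead * k


def best_parallelism(work, overhead, max_k):
    """Pick the parallelism level in 1..max_k that minimizes the curve."""
    return min(range(1, max_k + 1),
               key=lambda k: processing_time(work, overhead, k))


workflows = [Workflow("etl", 2.0, 10.0), Workflow("train", 1.0, 5.0)]
total = weighted_completion_time(workflows)
level = best_parallelism(work=16.0, overhead=1.0, max_k=8)
```

Because the assumed curve is convex in `k`, splitting a task helps only up to a point; beyond the minimizer, additional parallelism increases processing time, which is the intuition behind choosing the parallelism level from the curve rather than fixing it in advance.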