02 Aug 2021
Release

IBM Research and Red Hat work together to take a load off predictive resource management

PayPal, the global payments giant, has already started putting load-aware scheduling into production.

In today’s ever-changing hybrid cloud field, which is built on many open-source projects, researchers face two fundamental challenges:

  1. Being able to back up their ideas with deep research.
  2. Convincing the open-source community that their idea is important and enhances existing software frameworks.

Working as one team, scientists at IBM Research and Red Hat joined together to overcome these obstacles and produce tangible solutions in just seven months.

Red Hat OpenShift is the connective tissue between the infrastructure our clients use. It allows users to write applications once and run them anywhere. And it standardizes the approach to development, security, and operations on any cloud, from any vendor. But our team saw room for additional features and enhancements in Kubernetes, the container orchestration engine at the core of OpenShift.

There are two separate components to the work:

  • The first is a set of load-aware scheduler plugins, called Trimaran, that factor in the actual usage on the worker nodes, something the default Kubernetes scheduler, which places pods based on their declared resource requests, doesn’t take into account.
  • The second is a controller, called Vertical Pod Autoscaler (VPA), that allows developers to automatically resize their containers.

Today, most developers need to guess how many resources they’ll need, or overestimate just to be sure. This controller can resize a container in real time during runtime. We introduced upstream enhancements in Kubernetes, which let developers easily incorporate more predictive autoscaling algorithms.
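To make the idea of replacing a guess with a data-driven value concrete, here is a minimal Python sketch. It is not VPA's actual recommender, which this post does not describe. The function name `recommend_request`, the 90th-percentile baseline, the 15% safety margin, and the usage samples are all assumptions chosen for illustration.

```python
import math


def recommend_request(usage_samples, percentile=0.9, safety_margin=0.15):
    """Suggest a CPU request (in cores) from observed usage samples.

    Hypothetical helper, not a Kubernetes or VPA API: it takes the
    nearest-rank percentile of the samples and adds a fractional margin.
    """
    if not usage_samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(usage_samples)
    # Nearest-rank percentile: the smallest sample that covers
    # `percentile` of all observations.
    rank = max(math.ceil(percentile * len(ordered)), 1)
    baseline = ordered[rank - 1]
    return baseline * (1 + safety_margin)


# Made-up CPU usage samples (cores) for one container, with a single spike.
samples = [0.20, 0.25, 0.30, 0.22, 0.28, 0.35, 0.31, 0.27, 0.90, 0.26]
guessed_request = 1.0  # what a developer might reserve "just to be sure"
recommended = recommend_request(samples)
print(f"guessed: {guessed_request:.2f} cores, recommended: {recommended:.2f} cores")
```

With these invented samples, the recommendation lands well below the one-core guess because the single spike falls outside the chosen percentile. A more predictive algorithm would replace the percentile step with a forecast of future usage.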

In both cases, the work started as open-source projects. Red Hat works with community-created open-source software and builds upon each project to harden security, fix bugs, patch vulnerabilities and add new features so it is ready for the enterprise.

The autoscaler has just been made generally available, with Red Hat support, in OpenShift 4.8, and load-aware scheduling is expected to be available in the next release of OpenShift.

“Our collaboration with IBM Research takes an upstream-first approach, helping to fuel innovation in the Kubernetes community.

“When innovation first happens in the Kubernetes community, it provides the opportunity for others to provide feedback. We then build on that feedback and apply it in OpenShift to help solve new customer use cases in the platform. Red Hat is one of the top contributors in the Kubernetes community,” Tushar Katarki, Director of Red Hat OpenShift Product Management, told us.

And the aim is for these open-source projects, which started out as research efforts, to blossom into widely used tools that can have a profound impact on Red Hat’s customers.

“The collaboration between IBM Research and Red Hat OpenShift has resulted in numerous enhancements that expand the intelligence of core OpenShift components,” Red Hat’s Director of OpenShift Engineering, Chris Alfonso, said. “The impact to our customers is significant in terms of managing their workloads in complex environments which demand flexibility in compute resource utilization.”

PayPal, the global payments giant, has already started putting load-aware scheduling into production

“In a large-scale environment like PayPal, the platform team has to assure the efficiency of the fleet while keeping safety in mind,” Shyam Patel, the director of Container Platform & Infrastructure at PayPal, told us.

“Standard scheduling uses declarative resource mapping, and at times workload has higher SKU than they need. In this case, we end up wasting resources. Similarly, we don’t want compute resource utilization to go beyond safe allocation. Trimaran offers resource usage-based scheduling capabilities that greatly helps achieving the optimal usage while maintaining a safety net.”
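The contrast Patel draws between declarative resource mapping and usage-based scheduling can be sketched in a few lines of Python. This is an illustration only, not Trimaran's scoring logic: the `Node` fields, both scoring functions, the 80% `safe_limit`, and the node figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    cpu_capacity: float   # total cores on the node
    cpu_requested: float  # sum of declared pod requests, in cores
    cpu_used: float       # measured usage, in cores


def request_based_score(node, pod_request):
    """Score from declared requests only: more unreserved capacity is better."""
    free = node.cpu_capacity - node.cpu_requested - pod_request
    return max(free, 0.0) / node.cpu_capacity * 100


def load_aware_score(node, pod_request, safe_limit=0.8):
    """Score from measured usage, refusing nodes that would pass a safe limit."""
    predicted = (node.cpu_used + pod_request) / node.cpu_capacity
    if predicted > safe_limit:
        return 0.0  # the safety net: do not push utilization past the limit
    return (1 - predicted) * 100


def pick_node(nodes, pod_request, score_fn):
    """Return the node with the highest score for this pod."""
    return max(nodes, key=lambda n: score_fn(n, pod_request))


nodes = [
    # Heavily reserved but mostly idle: its pods asked for more than they use.
    Node("node-a", cpu_capacity=16, cpu_requested=14, cpu_used=3),
    # Lightly reserved but actually busy.
    Node("node-b", cpu_capacity=16, cpu_requested=6, cpu_used=12),
]

print(pick_node(nodes, 2.0, request_based_score).name)
print(pick_node(nodes, 2.0, load_aware_score).name)
```

In this made-up fleet, request-based scoring sends the new pod to node-b, which is already busy, while the usage-based score prefers the idle node-a and rules out node-b because it would exceed the safe limit.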

Both projects are available to download now:

For more information on how to use it, read Red Hat’s post on Trimaran.
