About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
CIDR 2013
Conference paper
NUMA-aware algorithms: The case of data shuffling
Abstract
In recent years, a new breed of non-uniform memory access (NUMA) systems has emerged: multi-socket servers of multi-cores. This paper makes the case that data management systems need to employ designs that take into consideration the characteristics of modern NUMA hardware. To prove our point, we focus on a primitive that is used as the building block of numerous data management operations: data shuffling. We perform a comparison of different data shuffling algorithms and show that a naïve data shuffling algorithm can be up to 3× slower than the highest performing, NUMA-aware one. To achieve the highest performance, we employ a combination of thread binding, NUMA-aware thread allocation, and relaxed global coordination among threads. The importance of such NUMA-aware algorithm designs will only increase, as future server systems are expected to feature ever larger numbers of sockets and increasingly complicated memory subsystems.