About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
CF 2018
Conference paper
GraphBLAS: Handling performance concerns in large graph analytics - Invited paper
Abstract
Emerging applications in health-care, social media analytics, cybersecurity, homeland security, and marketing require large graph analytics. Attaining good performance on these applications on modern day hardware is challenging because of the complex pipelines and deep memory hierarchy of these machines. In this paper, we review the linear algebra formulation of graph-analytics and show that it effectively handles the separation of performance concerns, best handled by system developers, from application logic concerns. The linear algebra formulation leverages the community experience in optimizing both hardware and software for applications that have a substantial linear algebra component. We review the GraphBLAS API, a compact C API for linear algebra formulation of graph algorithms. The core semiring operations are described first, followed by the rest of the API. We then illustrate how commonly used graph algorithms are implemented using the main GraphBLAS API calls. Executing these algorithms on a highly optimized linear algebra run-time validates that the time spent in execution of the algorithm is indeed almost entirely in the library, thus delegating the performance concerns solely to the library developer. Furthermore, the linear algebra formulation consistently outperforms the textbook version of these algorithms by a factor of two to five. Vector and matrix multiplications consume the majority of the computational time, particularly as problem size increases, putting them in the cross hairs for performance optimization.