About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
CGO 2014
Conference paper
Optimizing R VM: Allocation removal and path length reduction via interpreter-level specialization
Abstract
The performance of R, a popular data analysis language, was never properly understood. Some claimed their R codes ran as efficiently as any native code, others quoted orders of magnitude slowdown of R codes with respect to equivalent C implementations.We found both claims to be true depending on how an R code is written. This paper introduces a first classification of R programming styles into Type I (looping over data), Type II (vector programming), and Type III (glue codes). The most serious overhead of R are mostly manifested on Type I R codes, whereas many Type III R codes can be quite fast. This paper focuses on improving the performance of Type I R codes. We propose the ORBIT VM, an extension of the GNU R VM, to perform aggressive removal of allocated objects and reduction of instruction path lengths in the GNU R VM via profile-driven specialization techniques. The ORBIT VM is fully compatible with the R language and is purely based on interpreted execution. It is a specialization JIT and runtime focusing on data representation specialization and operation specialization. For our benchmarks of Type I R codes, ORBIT is able to achieve an average of 3.5X speedups over the current release of GNU R VM and outperforms most other R optimization projects that are currently available. Copyright © 2014 by the Association for Computing Machinery, Inc. (ACM).