Programming with relaxed synchronization
Lakshminarayanan Renganarayana, Vijayalakshmi Srinivasan, et al.
SPLASH 2012
The ubiquitous adoption of systems specialized for AI requires bridging two seemingly conflicting challenges: the need to deliver extreme processing efficiency while employing familiar programming interfaces, making such systems compelling even for non-expert users. We take a significant first step towards this goal and present an end-to-end software stack for the RaPiD AI accelerator developed by IBM Research. We present a set of software extensions, called DeepTools, that leverage and work within popular deep learning frameworks. DeepTools requires no additional user input and enables aggressive, accelerator-specific performance optimization akin to a fully custom framework. DeepTools has two key components: 1) a compiler runtime called DeepRT, which automatically identifies how best to execute a given DNN graph on RaPiD and constructs the requisite program binaries; and 2) an execution runtime called RaPiDLib, which triggers and manages the execution of compute and data-transfer operations on RaPiD. We integrate DeepTools with TensorFlow and map popular DNNs (AlexNet, VGG, ResNet, LSTM) to RaPiD. We demonstrate substantial performance improvement over hand-tuned mappings.
Shubham Jain, Swagath Venkataramani, et al.
DAC 2018
Jason Zebchuk, Harold W. Cain, et al.
PACT 2012
Ankur Agrawal, Chia-Yu Chen, et al.
DAC 2017