RaPiD: AI Accelerator for Ultra-Low Precision Training and Inference. Swagath Venkataramani, Vijayalakshmi Srinivasan, et al. 2021. ISCA 2021.
Efficient AI System Design with Cross-Layer Approximate Computing. Swagath Venkataramani, Xiao Sun, et al. 2020. Proceedings of the IEEE.
DeepTools: Compiler and Execution Runtime Extensions for RaPiD AI Accelerator. Swagath Venkataramani, Jungwook Choi, et al. 2019. IEEE Micro.
A Compiler for Deep Neural Network Accelerators to Generate Optimized Code for a Wide Range of Data Parameters from a Hand-crafted Computation Kernel. Eri Ogawa, Kazuaki Ishizaki, et al. 2019. COOL CHIPS 2019.
Adaptive ensemble prediction for deep neural networks based on confidence level. Hiroshi Inoue. 2019. AISTATS 2019.
Accelerating Spark Datasets by Inlining Deserialization. Jan Wroblewski, Kazuaki Ishizaki, et al. 2017. IPDPS 2017.
Efficient tomographic reconstruction for commodity processors with limited memory bandwidth. Hiroshi Inoue. 2016. ISBI 2016.
SIMD- and cache-friendly algorithm for sorting an array of structures. Hiroshi Inoue, Kenjiro Taura. 2015. VLDB 2015.
Characterization of call-graph profiles in Java workloads. Takuya Nakaike, Hiroshi Inoue, et al. 2014. IISWC 2014.