ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training. Chia-Yu Chen, Jiamin Ni, et al. 2020. NeurIPS 2020.
Efficient AI System Design with Cross-Layer Approximate Computing. Swagath Venkataramani, Xiao Sun, et al. 2020. Proceedings of the IEEE.
A 3.0 TFLOPS 0.62V Scalable Processor Core for High Compute Utilization AI Training and Inference. Jinwook Oh, Sae Kyu Lee, et al. 2020. VLSI Circuits 2020.
Hybrid 8-bit floating point (HFP8) training and inference for deep neural networks. Xiao Sun, Jungwook Choi, et al. 2019. NeurIPS 2019.
DLFloat: A 16-b Floating Point Format Designed for Deep Learning Training and Inference. Ankur Agrawal, Bruce Fleischer, et al. 2019. ARITH 2019.
Accumulation bit-width scaling for ultra-low precision training of deep networks. Charbel Sakr, Naigang Wang, et al. 2019. ICLR 2019.
Innovate Practices on CyberSecurity of Hardware Semiconductor Devices. Alfred L. Crouch, Peter Levin, et al. 2019. VTS 2019.
Training deep neural networks with 8-bit floating point numbers. Naigang Wang, Jungwook Choi, et al. 2018. NeurIPS 2018.
A Scalable Multi-TeraOPS Core for AI Training and Inference. Sunil Shukla, Bruce Fleischer, et al. 2018. IEEE SSC-L.
03 Mar 2025. US12240753. Micro-electromechanical Device Having A Soft Magnetic Material Electrolessly Deposited On A Palladium Layer Coated Metal Beam
23 Dec 2024. US12175359. Machine Learning Hardware Having Reduced Precision Parameter Components For Efficient Parameter Update
21 Jul 2024. JP7525237. Machine Learning Hardware Having Reduced Precision Parameter Components For Efficient Parameter Update
Kaoutar El Maghraoui, Principal Research Scientist and Manager, AIU Spyre Model Enablement, AI Hardware Center