Publication
IEEE Journal of Solid-State Circuits
Paper
Power-Limited Inference Performance Optimization Using a Software-Assisted Peak Current Regulation Scheme in a 5-nm AI SoC
Abstract
Discrete AI inference cards, operating under form-factor and system-defined peak power constraints, must serve diverse inference requests with widely varying power consumption. A peak current-limiting scheme is proposed to maximize inference performance across practical use cases. The peak current management block consists of a card-level current sensing circuit with an AI inference-aware feed-forward and feedback control mechanism. The card-level sensing improves performance by eliminating the need for additional margins for power consumed by off-chip components. Compiler-assisted feed-forward control exploits the predictability of AI inferences and proactively manages peak currents without a static reduction in operating frequency. Measurements from an AI system on chip (SoC), fabricated in 5-nm technology, show up to 41% improvement in BERT-Large inference throughput by engaging the peak current control.
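The contrast the abstract draws between a static frequency reduction and feed-forward plus feedback control can be illustrated with a toy model. The sketch below is not the paper's implementation: the names (`Phase`, `static_scale`, `feed_forward_scale`, `feedback_correct`), the assumption that SoC current scales linearly with clock frequency, and all numeric values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Phase:
    """One segment of an inference workload (hypothetical model)."""
    work: float            # arbitrary work units; completion rate equals the frequency scale
    predicted_amps: float  # assumed compiler estimate of SoC current at full frequency


def static_scale(phases, limit_amps, offchip_margin_amps):
    """Single fixed frequency scale sized for the worst phase plus a fixed off-chip margin."""
    worst = max(p.predicted_amps for p in phases)
    return min(1.0, (limit_amps - offchip_margin_amps) / worst)


def feed_forward_scale(phase, limit_amps, offchip_amps):
    """Per-phase scale chosen in advance from the predicted SoC current and the
    sensed off-chip current, so low-current phases keep running at full frequency."""
    return min(1.0, (limit_amps - offchip_amps) / phase.predicted_amps)


def feedback_correct(scale, sensed_card_amps, offchip_amps, limit_amps):
    """If the sensed card current still exceeds the limit (prediction was too low),
    shrink the scale so the SoC share fits, assuming current is linear in frequency."""
    if sensed_card_amps <= limit_amps:
        return scale
    soc_amps = sensed_card_amps - offchip_amps
    return scale * (limit_amps - offchip_amps) / soc_amps


def run_time(phases, scales):
    """Total time when each phase runs at its own frequency scale."""
    return sum(p.work / s for p, s in zip(phases, scales))


if __name__ == "__main__":
    # Illustrative numbers only; none of these come from the paper.
    phases = [Phase(1.0, 40.0), Phase(1.0, 80.0), Phase(1.0, 50.0)]
    limit, sensed_offchip, worst_case_margin = 60.0, 5.0, 10.0

    s = static_scale(phases, limit, worst_case_margin)
    managed = [feed_forward_scale(p, limit, sensed_offchip) for p in phases]

    print("static time: ", run_time(phases, [s] * len(phases)))
    print("managed time:", run_time(phases, managed))
```

In this toy model the managed schedule finishes sooner because only the high-current phase is slowed, and because sensing the off-chip current replaces the larger worst-case margin. Any resemblance to the measured 41% improvement is not implied; the example shows only the direction of the effect.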