Towards Efficient Key-Value Cache Management for Prefix Prefilling in LLM Inference. Yue Zhu, Hao Yu, et al. 2025. CLOUD 2025.
From Device Passthrough to Host Passout: Exploring RAS Risks in {High-Performance, Virtualized} AI-Systems. Chathura Rajapaksha, Sandhya Koteshwara, et al. 2025. OSDI 2025.
Deep learning software stacks for analogue in-memory computing-based accelerators. Corey Liam Lammie, Hadjer Benmeziane, et al. 2025. Nat. Rev. Electr. Eng.
Granite Time Series: Lightweight Time-Series Models for Industrial Edge AI. Takayuki Katsuki. 2025. AI Alliance Tokyo 2025.
IBM Lithography Roadmap and Need for Future Lithography Tools. Allen Gabor, Martin Burkhardt, et al. 2025. EUVL and Source Workshop 2025.
NSFlow: An End-to-End FPGA Framework with Scalable Dataflow Architecture for Neuro-Symbolic AI. Hanchen Yang, Zishen Wan, et al. 2025. DAC 2025.
Effect of Capping Layer Under Forming Gas Anneal for Back-End-of-Line Oxide Semiconductor FETs. Saketh Ram Mamidala, Antonio La Porta, et al. 2025. DRC 2025.
Innovative BEOL Oxide-Based Devices as Key Enablers for High-Performing Heterogeneous Systems. Valeria Bragaglia, Wooseok Choi, et al. 2025. DRC 2025.
Architecture and Design Approaches towards Large-scale AI Hardware Acceleration. Ashish Ranjan. 2025. DAC 2025.