STARC: Selective Token Access with Remapping and Clustering for Efficient LLM Decoding on PIM Systems. Zehao Fan, Yunzhen Liu, et al. ASPLOS 2026.
Advancing Fluorescence Light Detection and Ranging in Scattering Media with a Physics-Guided Mixture-of-Experts and Evidential Critics. Ismail Erbas, Ferhat Demikiran, et al. NeurIPS 2025.
Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System. Yunhua Fang, Rui Xie, et al. IEEE Computer Architecture Letters, 2025.
Analog AI Accelerators for Transformer-based Language Models: Hardware, Workload, and Power Performance. H. Tsai, H. Benmeziane, et al. IMW 2025.
Analog-AI Hardware Accelerators for Low-Latency Transformer-Based Language Models (Invited). Geoffrey Burr, Sidney Tsai, et al. CICC 2025.
NORA: Noise-Optimized Rescaling of LLMs on Analog Compute-in-Memory Accelerators. Yayue Hou, Hsinyu Tsai, et al. DATE 2025.
Multi-Task Neural Network Mapping onto Analog-Digital Heterogeneous Accelerators. Hadjer Benmeziane, Corey Liam Lammie, et al. NeurIPS 2024.
AIHWKIT-Lightning: A Scalable HW-Aware Training Toolkit for Analog In-Memory Computing. Julian Büchel, William Simon, et al. NeurIPS 2024.