Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models. Aleksandar Terzic, Nicolas Menet, et al. 2025. NeurIPS 2025.
Causal LLM Routing: End-to-End Regret Minimization from Observational Data. Asterios Tsiourvas, Wei Sun, et al. 2025. NeurIPS 2025.
Scaling LLM Planning: NL2FLOW for Parametric Problem Generation and Rigorous Evaluation. Jung koo Kang. 2025. NeurIPS 2025.
SSD controller architecture for similarity search in Vector DBs. Roman Pletka, Jovan Blanusa, et al. 2025. FMS 2025.
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community. Shachar Don-Yehiya, Leshem Choshen, et al. 2025. ACL 2025.
Speculate, then Collaborate: Fusing Knowledge of Language Models during Decoding. Ziyao Wang, Muneeza Azmat, et al. 2025. ICML 2025.
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead. Rickard Gabrielsson, Jiacheng Zhu, et al. 2025. ICML 2025.