DOVE: A Large-Scale Multi-Dimensional Predictions Dataset Towards Meaningful LLM Evaluation. Eliya Habba, Ofir Arviv, et al. ACL 2025.
The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community. Shachar Don-Yehiya, Leshem Choshen, et al. ACL 2025.
Compress then Serve: Serving Thousands of LoRA Adapters with Little Overhead. Rickard Gabrielsson, Jiacheng Zhu, et al. ICML 2025.
LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content. Nimrod Shabtay, Felipe Maia Polo, et al. ICLR 2025.
NumeroLogic: Number Encoding for Enhanced LLMs' Numerical Reasoning. Eliyahu Schwartz, Leshem Choshen, et al. EMNLP 2024.
Fuse to Forget: Bias Reduction and Selective Memorization through Model Fusion. Kerem Zaman, Leshem Choshen, et al. EMNLP 2024.
Deductive Closure Training of Language Models for Coherence, Accuracy, and Updatability. Afra Feyza Akyürek, Ekin Akyürek, et al. ACL 2024.
Kristjan Greenewald, Senior Research Scientist and Manager, Statistical Methods for Large Language Models