Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models. Aleksandar Terzic, Nicolas Menet, et al. NeurIPS 2025.
Scalable Evaluation and Neural Models for Compositional Generalization. Giacomo Camposampiero, Pietro Barbiero, et al. NeurIPS 2025.
A foundation model with multi-variate parallel attention to generate neuronal activity. Francesco Carzaniga, Michael Hersche, et al. NeurIPS 2025.
I-RAVEN-X: Benchmarking Generalization and Robustness of Analogical and Mathematical Reasoning in Large Language and Reasoning Models. Giacomo Camposampiero, Michael Hersche, et al. NeurIPS 2025.
The Case for Cleaner Biosignals: High-fidelity Neural Compressor Enables Transfer from Cleaner iEEG to Noisier EEG. Francesco Carzaniga, Gary Hoppeler, et al. ICLR 2025.
Limits of Transformer Language Models on Learning to Compose Algorithms. Jonathan Thomm, Giacomo Camposampiero, et al. NeurIPS 2024.
Recurrent Transformers Trade-off Parallelism for Length Generalization on Regular Languages. Paul Soulos, Aleksandar Terzic, et al. NeurIPS 2024.
On the role of noise in factorizers for disentangling distributed representations. Kumudu Geethan Karunaratne, Michael Hersche, et al. NeurIPS 2024.
RETRO-LI: Small-Scale Retrieval Augmented Generation Supporting Noisy Similarity Searches and Domain Shift Generalization. Gentiana Rashiti, Kumudu Geethan Karunaratne, et al. ECAI 2024.