Causally Reliable Concept Bottleneck Models
Giovanni De Felice, Arianna Casanova Flores, et al.
NeurIPS 2025
We present 3DGrid-LLM, a multimodal foundation model designed to integrate natural language with three-dimensional electron density grids for applications in molecular and materials science. The architecture extends a large decoder-only language model by incorporating discrete volumetric representations obtained through a 3D VQGAN, enabling joint token-level processing of spatial and textual modalities within a unified framework. Pre-trained on a diverse corpus of molecular and materials datasets, 3DGrid-LLM supports bidirectional text–grid generation, multimodal question answering, and retrieval-augmented 3D reconstruction. Comprehensive evaluations demonstrate consistent improvements over baseline methods in multimodal VQA, chemically informed text generation, and property-aligned retrieval tasks, yielding outputs that are both accurate and physically consistent.
Sarath Swaminathan, Nathaniel Park, et al.
NeurIPS 2025
Megh Thakkar, Quentin Fournier, et al.
ACL 2025
Zhenhan Huang, Tejaswini Pedapati, et al.
IJCAI 2025