A Multi-View Mixture-of-Experts based on Language and Graphs for Molecular Properties Prediction. Victor Shirasuna, Eduardo Almeida Soares, et al. 2024. ICML 2024.
How Do Nonlinear Transformers Acquire Generalization-Guaranteed CoT Ability? Hongkang Li, Meng Weng, et al. 2024. ICML 2024.