WikiContradict: A Benchmark for Evaluating LLMs on Real-World Knowledge Conflicts from Wikipedia. Yufang Hou, Alessandra Pascale, et al. 2024. NeurIPS 2024.
Geometry of naturalistic object representations in recurrent neural network models of working memory. Xiaoxuan Lei, Taku Ito, et al. 2024. NeurIPS 2024.
Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking. Gabriel Rioux, Apoorva Nitsure, et al. 2024. NeurIPS 2024.
Distributional Preference Alignment of LLMs via Optimal Transport. Igor Melnyk, Youssef Mroueh, et al. 2024. NeurIPS 2024.
Unraveling Molecular Structure: A Multimodal Spectroscopic Dataset for Chemistry. Marvin Alberts, Oliver Schilter, et al. 2024. NeurIPS 2024.
Weak Supervision Performance Evaluation via Partial Identification. Felipe Maia Polo, Subha Maity, et al. 2024. NeurIPS 2024.
Prompting LLMs for Social Relation Reasoning via Greedy Segment Optimization. Wanhua Li, Zibin Meng, et al. 2024. NeurIPS 2024.
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision. Zhiqing Sun, Longhui Yu, et al. 2024. NeurIPS 2024.
A Surprisingly Simple Approach to Generalized Few-Shot Semantic Segmentation. Tomoya Sakai, Haoxiang Qiu, et al. 2024. NeurIPS 2024.