Rollout Roulette: A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods. Isha Puri, Shivchander Sudalairaj, et al. (2025). NeurIPS 2025.
Unveiling the Secret Recipe: A Guide for Supervised Fine-Tuning Small LLMs. Aldo Pareja, Nikhil Shivakumar Nayak, et al. (2025). ICLR 2025.
Beyond Uniform Sampling: Offline Reinforcement Learning with Imbalanced Datasets. Zhang-wei Hong, Aviral Kumar, et al. (2023). NeurIPS 2023.
The ThreeDWorld Transport Challenge: A Visually Guided Task-and-Motion Planning Benchmark Towards Physically Realistic Embodied AI. Chuang Gan, Siyuan Zhou, et al. (2022). ICRA 2022.
ThreeDWorld: A Platform for Interactive Multi-Modal Physical Simulation. Chuang Gan, Jeremy Schwartz, et al. (2021). NeurIPS 2021.
OPEn: An Open-ended Physics Environment for Learning Without a Task. Chuang Gan, Abhishek Bhandwaldar, et al. (2021). IROS 2021.
AGENT: A Benchmark for Core Psychological Reasoning. Tianmin Shu, Abhishek Bhandwaldar, et al. (2021). ICML 2021.