Ben Huh, Avinash Baidya
NeurIPS 2022
We propose VRL3, a data-driven framework with a minimalist design for solving challenging visual deep reinforcement learning (DRL) tasks. We analyze major obstacles in taking a data-driven approach and present a suite of design principles, novel findings, and critical insights about data-driven visual DRL. Our framework has three stages: in stage 1, we leverage non-RL datasets (e.g., ImageNet) to learn task-agnostic visual representations; in stage 2, we use offline RL data (e.g., a limited number of expert demonstrations) to convert the task-agnostic representations into more powerful task-specific representations; in stage 3, we fine-tune the agent with online RL. On a set of highly challenging hand manipulation tasks with sparse reward and realistic visual inputs, VRL3 achieves on average 780% better sample efficiency than the previous SOTA. On the hardest task, VRL3 is 1220% more sample efficient and solves the task with only 10% of the computation. These results demonstrate the potential of data-driven deep reinforcement learning.
Hongyu Tu, Shantam Shorewala, et al.
NeurIPS 2022
Chanakya Ekbote, Moksh Jain, et al.
NeurIPS 2022
Shiqiang Wang, Nathalie Baracaldo Angel, et al.
NeurIPS 2022