Learning from Failure: Introducing Failure Ratio in RL

Minori Narita; Daiki Kimura

IJCAI 2020

Workshop paper

11 Jul 2020

Abstract

Deep reinforcement learning combined with Monte-Carlo tree search (MCTS) has demonstrated high performance and has attracted considerable attention. However, its learning converges slowly. In comparison, learning by playing board games with human opponents is more efficient because skills and strategies can be acquired from failure patterns. We assume that failure patterns contain meaningful information that can expedite training by serving as prior knowledge for reinforcement learning. To utilize this prior knowledge, we propose an efficient tree search method that introduces a failure ratio, which takes a high value for failure patterns. We tested our hypothesis by applying this method to the board game Othello. The results show that our method achieves a higher winning ratio than a state-of-the-art method, especially in the early stage of learning.
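
The abstract does not give the formula for the failure ratio, so the following is only an illustrative sketch of how such a term might enter a tree-search selection rule. It assumes a PUCT-style score and a hypothetical definition of the failure ratio as the fraction of simulations through an edge that ended in a loss. The names `EdgeStats`, `failures`, `selection_score`, `select_action`, and the weight `c_fail` are all assumptions made for illustration and are not taken from the paper.

```python
import math
from dataclasses import dataclass


@dataclass
class EdgeStats:
    """Search statistics for one (state, action) edge."""
    prior: float             # policy prior for the action
    visits: int = 0          # number of simulations through this edge
    value_sum: float = 0.0   # sum of backed-up values
    failures: int = 0        # ASSUMPTION: simulations through this edge that ended in a loss

    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0

    def failure_ratio(self) -> float:
        # ASSUMPTION: hypothetical definition, high for edges that often lead to failure.
        return self.failures / self.visits if self.visits else 0.0


def selection_score(edge: EdgeStats, parent_visits: int,
                    c_puct: float = 1.0, c_fail: float = 0.5) -> float:
    """PUCT-style score plus a hypothetical failure-ratio bonus."""
    exploration = c_puct * edge.prior * math.sqrt(parent_visits) / (1 + edge.visits)
    # ASSUMPTION: the bonus steers the search toward failure patterns.
    return edge.mean_value() + exploration + c_fail * edge.failure_ratio()


def select_action(edges: dict, c_puct: float = 1.0, c_fail: float = 0.5):
    """Return the action whose edge has the highest score."""
    parent_visits = sum(e.visits for e in edges.values())
    return max(edges, key=lambda a: selection_score(edges[a], parent_visits, c_puct, c_fail))


edges = {
    "a": EdgeStats(prior=0.5, visits=10, value_sum=2.0, failures=0),
    "b": EdgeStats(prior=0.5, visits=10, value_sum=2.0, failures=6),
}
chosen = select_action(edges)
```

In this sketch the two edges differ only in their failure counts, so the failure-ratio bonus alone decides the choice. Setting `c_fail=0.0` recovers the plain PUCT-style rule.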

Conference paper