Multiagent inverse reinforcement learning for two-person zero-sum games
The focus of this paper is a Bayesian framework for solving a class of problems termed multiagent inverse reinforcement learning (MIRL). Compared to the well-known inverse reinforcement learning (IRL) problem, MIRL is formalized in the context of stochastic games, which generalize Markov decision processes to game theoretic scenarios. We establish a theoretical foundation for competitive two-agent zero-sum MIRL problems and propose a Bayesian solution approach in which the generative model is based on an assumption that the two agents follow a minimax bipolicy. Numerical results are presented comparing the Bayesian MIRL method with two existing methods in the context of an abstract soccer game. Investigation centers on relationships between the extent of prior information and the quality of learned rewards. Results suggest that covariance structure is more important than mean value in reward priors.