Publication
NeurIPS 2019
Workshop paper

Distributional Actor-Critic for Risk-Sensitive Multi-Agent Reinforcement Learning

Abstract

Distributional deep reinforcement learning connects notions of risk to reinforcement learning. In this paper, we introduce a distributional approach to multi-agent deep deterministic policy gradient (MADDPG) that moves away from expectations or point estimates of the value function. The distributional variant of MADDPG predicts a full distribution over returns, and a CVaR-based objective enables agents to exhibit different levels of risk tolerance. Our results demonstrate that the stability of an agent's learning depends on the learning methods of all agents, and that learning in mixed-method settings is more unstable. Introducing a distributional agent into a multi-agent setting can change the game dynamics and equilibrium, in this case because the agent becomes more risk-averse. We evaluate our algorithm on two environments with continuous states and actions: one in which an agent must reach a goal in the presence of an adversarial agent, and one in which multiple adversarial agents chase a single agent around obstacles. There are tradeoffs between risk sensitivity and agent performance, which can shift the overall score equilibrium.
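
As a rough illustration of the CVaR-based risk criterion mentioned above (a minimal sketch, not the paper's implementation; the quantile values and the alpha risk level are hypothetical), the conditional value-at-risk of a return distribution represented by a quantile-based distributional critic can be estimated by averaging the worst alpha-fraction of the predicted quantiles:

```python
import numpy as np

def cvar_from_quantiles(quantiles, alpha=0.25):
    """CVaR_alpha of a return distribution given by equally weighted quantile
    estimates (e.g. the output of a quantile-based critic): the mean of the
    worst alpha-fraction of predicted returns."""
    q = np.sort(np.asarray(quantiles, dtype=float))  # ascending: worst returns first
    k = max(1, int(np.ceil(alpha * q.size)))         # number of tail quantiles to average
    return q[:k].mean()

# Hypothetical critic output: 8 quantile estimates of the return for one state-action pair.
quantile_estimates = [-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0]
print(cvar_from_quantiles(quantile_estimates, alpha=0.25))  # risk-averse value: mean of the 2 worst quantiles
print(float(np.mean(quantile_estimates)))                   # risk-neutral (expected) value for comparison
```

For these numbers the risk-neutral value is 0.0 while the CVaR at alpha = 0.25 is -1.75, so a CVaR-maximizing agent treats this action as much worse than its expected return suggests, which is the mechanism behind the more risk-averse behavior described in the abstract.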

Date

08 Dec 2019
