ComVas: Contextual Moral Values Alignment System
Inkit Padhi, Pierre Dognin, et al.
IJCAI 2024
Autonomous cyber-physical agents play an increasingly large role in our lives. To ensure that they behave in ways aligned with the values of society, we must develop techniques that allow these agents not only to maximize their reward in an environment, but also to learn and follow the implicit constraints of society. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations, and reinforcement learning to learn to maximize environmental rewards. A contextual-bandit-based orchestrator then picks between the two policies: constraint-based and environment reward-based. The contextual bandit orchestrator allows the agent to mix policies in novel ways, taking the best actions from either the reward-maximizing or the constrained policy. In addition, the orchestrator is transparent about which policy is being employed at each time step. We test our algorithms using Pac-Man and show that the agent is able to learn to act optimally, act within the demonstrated constraints, and mix these two functions in complex ways.
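The orchestration idea in the abstract can be illustrated with a minimal sketch. The paper uses a contextual bandit; the stand-in below simplifies this to a non-contextual epsilon-greedy bandit choosing per time step between two stub policies (a reward-maximizing one and a constraint-obeying one). The policy functions, reward values, and class names here are illustrative assumptions, not the paper's implementation.

```python
import random

def reward_policy(state):
    # stand-in for a policy trained with RL on environment reward
    return "move_toward_food"

def constrained_policy(state):
    # stand-in for a policy learned via IRL from demonstrations
    return "avoid_ghost"

class EpsilonGreedyOrchestrator:
    """Picks one of two policies each step; transparent about its choice."""

    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0, 0]       # times each arm (policy) was chosen
        self.values = [0.0, 0.0]   # running mean reward per arm

    def select_arm(self):
        # explore with probability epsilon, otherwise exploit best estimate
        if random.random() < self.epsilon:
            return random.randrange(2)
        return max(range(2), key=lambda a: self.values[a])

    def update(self, arm, reward):
        # incremental running-mean update for the chosen arm
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

random.seed(0)
orch = EpsilonGreedyOrchestrator()
policies = [reward_policy, constrained_policy]

# pull each arm once so both estimates are initialized
for arm in (0, 1):
    orch.update(arm, 1.0 if arm == 1 else 0.3)

for t in range(200):
    arm = orch.select_arm()
    action = policies[arm](state=None)
    # assumed blended reward: here the constrained policy scores higher,
    # so the orchestrator should learn to prefer it
    reward = 1.0 if arm == 1 else 0.3
    orch.update(arm, reward)

print(orch.values)  # arm 1 (constrained policy) ends with the higher estimate
```

A full contextual bandit would additionally condition `select_arm` on state features (e.g. via LinUCB), so the preferred policy can change with the situation rather than globally.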