Metric learning for value alignment

Andrea Loreggia; Nicholas Mattei; Francesca Rossi; K. Brent Venable

AISafety 2019

Conference paper

11 Aug 2019

Abstract

Preferences are central to decision making by both machines and humans. Representing, learning, and reasoning with preferences is an important area of study both within computer science and across the social sciences. When we give our preferences to an AI system, we expect the system to make decisions or recommendations that are consistent with those preferences, but the decisions should also adhere to certain norms, guidelines, and ethical principles. Hence, when working with preferences it is necessary to understand and compute a metric (distance) between preferences, especially if we encode both the user preferences and ethical systems in the same formalism. In this paper we investigate the use of CP-nets as a formalism for representing orderings over actions for AI systems. We leverage a recently proposed metric for CP-nets and a neural network architecture, CPMETRIC, for computing this metric. Using these two tools, we look at how one can build a fast and flexible value alignment system.
