Personalized Assessment of Arousal and Valence from Videos

Matthew Pediaditis; Anca Nicoleta Ciubotaru; Thomas Brunschwiler; Maria Gabrani

doi:10.1109/ICHI48887.2020.9374354

ICHI 2020

Conference paper

01 Nov 2020

Abstract

Human behavior is influenced by numerous subjective factors such as the environment, culture, hormones, and genes. This makes the development of a one-size-fits-all behavioral model challenging, especially in the domain of affect recognition. In this paper we present a method to classify and assess arousal and valence from video in a personalized way. We represent the inherent information in the video independently through three semantically different types of signals, namely motion, appearance and physiology. We use single- and multi-stream LSTM models for data fusion and classification, and compare our results against published values on a publicly available dataset consisting of 40 subjects. We further demonstrate that the personalized approach achieves better performance (Arousal: 78.16% avg. acc.; Valence: 89.22% avg. acc.), while providing more insight into the role of each signal group. For arousal classification we can distinguish subjects whose expressions are dominated by motion from those who exhibit more static expressions. Fusion of all three signal types gave an advantage for only a few subjects, a limitation that may be related to the video recordings being too short.
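
To make the multi-stream idea concrete, the following is a minimal sketch of one way such a model could be wired together; it is not the authors' implementation. It assumes one LSTM per signal type, late fusion by concatenating the final hidden states, and a binary (high/low) output, none of which the abstract specifies. All names (LSTMStream, MultiStreamClassifier, the feature sizes, the hidden size) are hypothetical, the weights are random and untrained, and only the forward pass is shown. Under the personalized setting, one such model would presumably be fitted per subject.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class LSTMStream:
    """One LSTM layer for a single signal type (hypothetical helper)."""

    def __init__(self, input_size, hidden_size, rng):
        self.hidden_size = hidden_size
        n = input_size + hidden_size
        # One weight matrix and bias per gate:
        # i = input, f = forget, g = cell candidate, o = output.
        self.W = {
            gate: [[rng.uniform(-0.1, 0.1) for _ in range(n)]
                   for _ in range(hidden_size)]
            for gate in "ifgo"
        }
        self.b = {gate: [0.0] * hidden_size for gate in "ifgo"}

    def _gate(self, gate, z, activation):
        return [
            activation(sum(w * v for w, v in zip(row, z)) + bias)
            for row, bias in zip(self.W[gate], self.b[gate])
        ]

    def forward(self, sequence):
        """Run the LSTM over a sequence of frames; return the last hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in sequence:
            z = list(x) + h  # current input joined with previous hidden state
            i = self._gate("i", z, sigmoid)
            f = self._gate("f", z, sigmoid)
            g = self._gate("g", z, math.tanh)
            o = self._gate("o", z, sigmoid)
            c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
            h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return h


class MultiStreamClassifier:
    """Late fusion (an assumption): concatenate per-stream hidden states."""

    def __init__(self, input_sizes, hidden_size=4, seed=0):
        rng = random.Random(seed)
        self.streams = {
            name: LSTMStream(size, hidden_size, rng)
            for name, size in input_sizes.items()
        }
        fused_size = hidden_size * len(input_sizes)
        self.w_out = [rng.uniform(-0.1, 0.1) for _ in range(fused_size)]
        self.b_out = 0.0

    def predict_proba(self, signals):
        """Return the (untrained) probability of the 'high' class for one clip."""
        fused = []
        for name, stream in self.streams.items():
            fused.extend(stream.forward(signals[name]))
        logit = sum(w * v for w, v in zip(self.w_out, fused)) + self.b_out
        return sigmoid(logit)


# Synthetic clip: 10 frames per stream, with made-up feature sizes.
sizes = {"motion": 3, "appearance": 5, "physiology": 2}
data_rng = random.Random(1)
clip = {
    name: [[data_rng.random() for _ in range(dim)] for _ in range(10)]
    for name, dim in sizes.items()
}

model = MultiStreamClassifier(sizes)
p_high = model.predict_proba(clip)

# A single-stream variant uses the same class with one signal type.
motion_only = MultiStreamClassifier({"motion": 3})
p_motion = motion_only.predict_proba({"motion": clip["motion"]})
```

Comparing a single-stream model against the fused model for each subject is one way to see which signal group carries the most information for that person, which is the kind of per-subject insight the abstract describes.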
