Publication
NeurIPS 2000
Conference paper

FaceSync: A linear operator for measuring synchronization of video facial images and audio tracks

Abstract

FaceSync is an optimal linear algorithm that finds the degree of synchronization between the audio and image recordings of a human speaker. Using canonical correlation, it finds the best direction to combine all the audio and image data, projecting them onto a single axis. FaceSync uses Pearson's correlation to measure the degree of synchronization between the audio and image data. We derive the optimal linear transform to combine the audio and visual information and describe an implementation that avoids the numerical problems caused by computing the correlation matrices.
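
As a rough illustration of the measurement step described in the abstract, the sketch below projects each audio and image feature vector onto a single axis and then computes Pearson's correlation between the two resulting sequences. It assumes the projection weights have already been obtained (for example by canonical correlation, which is not shown here). The function names `project`, `pearson`, and `sync_score`, the per-frame feature layout, and the zero-variance convention are illustrative assumptions, not details taken from the paper.

```python
import math


def project(frames, weights):
    """Project each feature vector onto a single axis (a dot product per frame)."""
    return [sum(w * x for w, x in zip(weights, frame)) for frame in frames]


def pearson(a, b):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumed convention: a constant signal gives no evidence of sync.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def sync_score(audio_frames, image_frames, w_audio, w_image):
    """Correlate the 1-D projections of time-aligned audio and image features."""
    audio_axis = project(audio_frames, w_audio)
    image_axis = project(image_frames, w_image)
    return pearson(audio_axis, image_axis)


if __name__ == "__main__":
    # Toy data: four time-aligned frames; the weights here are hypothetical.
    audio = [[1.0, 0.0], [2.0, 0.0], [3.0, 1.0], [4.0, 1.0]]
    image = [[2.0], [4.0], [6.0], [8.0]]
    print(sync_score(audio, image, [1.0, 0.0], [1.0]))
```

In this toy case the two projections move together exactly, so the score is at its maximum; flipping the sign of one weight vector gives the opposite extreme. Real audio and image features would be much higher-dimensional, and the choice of weights is where canonical correlation comes in.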
