View-invariant alignment and matching of video sequences
Abstract
In this paper, we propose a novel method to establish temporal correspondence between the frames of two videos. 3D epipolar geometry is used to eliminate the distortion generated by the projection from 3D to 2D. Although the fundamental matrix contains the extrinsic property of the projective geometry between views, it is sensitive to noise. Therefore, we propose using a rank constraint on corresponding points in two views to measure the similarity between trajectories. This rank constraint is more robust and avoids computing the fundamental matrix. A dynamic programming approach based on this similarity measure is proposed to find the non-linear time-warping function for videos containing human activities. In this way, videos of different individuals, taken at different times and from distinct viewpoints, can be synchronized. A temporal pyramid of trajectories is applied to improve the accuracy of the view-invariant dynamic time-warping approach. We show various applications of this approach, such as video synthesis, human action recognition, and computer-aided training. Compared to state-of-the-art techniques, our method shows a clear improvement.
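The two ingredients named in the abstract, a rank-based similarity between corresponding points and a dynamic-programming time warp, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that each frame supplies at least nine tracked 2D points, that the similarity is the smallest singular value of the standard nine-column observation matrix derived from the epipolar constraint, and that a plain dynamic time-warping recursion is used. The function names (`observation_matrix`, `rank_residual`, `frame_cost_matrix`, `dtw_path`) are hypothetical, and the temporal pyramid is not reproduced.

```python
import numpy as np


def observation_matrix(p, q):
    """Stack one row per correspondence of the linear system q_h^T F p_h = 0.

    p, q: arrays of shape (K, 2) holding corresponding image points
    in view 1 and view 2 (assumed layout, not taken from the paper).
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    x, y = p[:, 0], p[:, 1]
    u, v = q[:, 0], q[:, 1]
    ones = np.ones_like(x)
    # Column order matches F flattened row by row.
    return np.column_stack([u * x, u * y, u, v * x, v * y, v, x, y, ones])


def rank_residual(p, q):
    """Smallest singular value of the observation matrix.

    For true correspondences the matrix has rank at most 8, so this
    value is close to zero without ever estimating F itself.
    """
    a = observation_matrix(p, q)
    if a.shape[0] < 9:
        raise ValueError("need at least 9 correspondences")
    return np.linalg.svd(a, compute_uv=False)[-1]


def frame_cost_matrix(video_a, video_b):
    """Pairwise cost between frames; each video has shape (T, K, 2)."""
    cost = np.empty((len(video_a), len(video_b)))
    for i, pts_a in enumerate(video_a):
        for j, pts_b in enumerate(video_b):
            cost[i, j] = rank_residual(pts_a, pts_b)
    return cost


def dtw_path(cost):
    """Classic dynamic time warping; returns a list of (i, j) frame pairs."""
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1])
            acc[i, j] = cost[i - 1, j - 1] + best_prev
    # Backtrack from the last cell to the first along cheapest predecessors.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda cell: acc[cell])
        path.append((i - 1, j - 1))
    return path[::-1]
```

Calling `dtw_path(frame_cost_matrix(video_a, video_b))` yields the frame pairs that align the two sequences under these assumptions. Because the residual is computed directly from image coordinates, the cost does not depend on knowing either camera, which is the sense in which the alignment is view-invariant.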