Toward efficient action recognition: Principal backpropagation for training two-stream networks

Wenbing Huang; Lijie Fan; Mehrtash Harandi; Lin Ma; Huaping Liu; Wei Liu; Chuang Gan

doi:10.1109/TIP.2018.2877936

IEEE TIP

Paper

01 Apr 2019

Abstract

In this paper, we propose novel principal backpropagation networks (PBNets) that revisit the backpropagation algorithms commonly used to train two-stream networks for video action recognition. We contend that existing approaches, which always backpropagate through all frames/snippets, are not optimal for video recognition, since the desired actions occur only in a short period within a video. To remedy this drawback, we design a watch-and-choose mechanism. In particular, the watching stage exploits a dense snippet-wise temporal pooling strategy to discover the global characteristics of each input video, while the choosing stage backpropagates only a small number of representative snippets, selected with two novel strategies, i.e., the Max-rule and the KL-rule. We prove that, with the proposed selection strategies, performing backpropagation on the selected subset also decreases the loss over the whole set of snippets. The proposed PBNets are evaluated on two standard video action recognition benchmarks, UCF101 and HMDB51, where they consistently surpass the state of the art while requiring less memory and computation to achieve high performance.
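The watch-and-choose idea can be illustrated with a small NumPy sketch. The abstract does not define the Max-rule or the KL-rule, so the two selection functions below are hypothetical readings chosen for illustration only: one keeps the snippets with the highest peak class score, the other keeps the snippets whose class distribution is closest (in KL divergence) to the pooled, video-level distribution. Average pooling in `watch`, the function names, and the parameter `k` are likewise assumptions, not the paper's definitions.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def watch(snippet_scores):
    """Watching stage (assumed form): average-pool class scores over all snippets.

    snippet_scores has shape (num_snippets, num_classes).
    """
    return snippet_scores.mean(axis=0)


def choose_max_rule(snippet_scores, k):
    """Hypothetical Max-rule: indices of the k snippets with the highest peak score."""
    peak = snippet_scores.max(axis=1)
    return np.argsort(-peak)[:k]


def choose_kl_rule(snippet_scores, k, eps=1e-12):
    """Hypothetical KL-rule: indices of the k snippets closest to the pooled prediction.

    Closeness is KL(pooled || snippet) between softmax distributions.
    """
    p = softmax(watch(snippet_scores))        # video-level distribution
    q = softmax(snippet_scores, axis=1)       # per-snippet distributions
    kl = (p * (np.log(p + eps) - np.log(q + eps))).sum(axis=1)
    return np.argsort(kl)[:k]


# Toy video: 4 snippets, 3 classes; snippets 0 and 2 carry the action.
scores = np.array([
    [5.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [4.0, 0.0, 0.0],
    [0.0, 0.0, 0.5],
])
selected_max = choose_max_rule(scores, k=2)
selected_kl = choose_kl_rule(scores, k=2)
```

In a training loop, only the snippets at the selected indices would take part in the backward pass, while the forward (watching) pass still sees every snippet; this is where the memory and computation savings described above would come from.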
