Attribute-based vehicle search in crowded surveillance videos
Abstract
We present a novel application for searching for vehicles in surveillance videos based on semantic attributes. At the interface, the user specifies a set of vehicle characteristics (such as color, direction of travel, speed, length, and height) and the system automatically retrieves video events that match the provided description. A key differentiating aspect of our system is its ability to handle challenging urban conditions such as high volumes of activity and environmental factors. This is achieved through a novel multi-view vehicle detection approach that relies on what we call motionlet classifiers, i.e., classifiers learned from vehicle samples clustered in the motion configuration space. We employ massively parallel feature selection to learn compact and accurate motionlet detectors. Moreover, to deal with different vehicle types (buses, trucks, SUVs, cars), we learn the motionlet detectors in a shape-free appearance space, where all training samples are resized to the same aspect ratio; at test time, the aspect ratio of the sliding window is varied to allow the detection of different vehicle types. Once a vehicle is detected and tracked through the video, fine-grained attributes are extracted and ingested into a database to support future search queries such as "Show me all blue trucks longer than 7 ft traveling at high speed northbound last Saturday, from 2pm to 5pm". © 2011 ACM.
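The test-time scheme described in the abstract, where the sliding window's aspect ratio is varied while the detector operates on samples of one fixed aspect ratio, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `sliding_windows` and `to_canonical`, the specific aspect ratios, the stride, and the 32x32 canonical size are all assumptions chosen for illustration.

```python
from typing import Iterator, Sequence, Tuple

Window = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def sliding_windows(frame_w: int, frame_h: int, win_h: int,
                    aspect_ratios: Sequence[float] = (1.0, 1.5, 2.5),
                    stride: int = 8) -> Iterator[Window]:
    """Yield candidate windows of fixed height and several width/height ratios.

    The ratios are hypothetical stand-ins for different vehicle types
    (e.g. a car versus a bus seen from the side).
    """
    for ratio in aspect_ratios:
        win_w = int(round(win_h * ratio))
        # Skip shapes that do not fit inside the frame at all.
        if win_w > frame_w or win_h > frame_h:
            continue
        for y in range(0, frame_h - win_h + 1, stride):
            for x in range(0, frame_w - win_w + 1, stride):
                yield (x, y, win_w, win_h)


def to_canonical(window: Window, canon_w: int = 32,
                 canon_h: int = 32) -> Tuple[float, float]:
    """Return the (x, y) scale factors that map a window to the canonical
    size assumed for training, i.e. the shape-free appearance space."""
    _, _, w, h = window
    return canon_w / w, canon_h / h


if __name__ == "__main__":
    for win in sliding_windows(64, 32, win_h=32,
                               aspect_ratios=(1.0, 2.0), stride=16):
        print(win, to_canonical(win))
```

In a full detector, each window's pixels would be resampled with these scale factors and passed to a classifier trained at the canonical size, so a single detector can respond to vehicles of different proportions.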