We present a novel approach for visual detection and attribute-based search of vehicles in crowded surveillance scenes. Large-scale processing is addressed along two dimensions: 1) large-scale indexing, where hundreds of billions of events need to be archived per month to enable effective search, and 2) learning vehicle detectors with large-scale feature selection, using a feature pool containing millions of feature descriptors. Our method for vehicle detection also explicitly models occlusions and multiple vehicle types (e.g., buses, trucks, SUVs, cars), while requiring very little manual labeling. It runs efficiently, at an average of 66 Hz on a conventional laptop computer. Once a vehicle is detected and tracked across the video, fine-grained attributes are extracted and ingested into a database to allow future search queries such as "Show me all blue trucks longer than 7 ft traveling at high speed northbound last Saturday, from 2 pm to 5 pm." We perform a comprehensive quantitative analysis to validate our approach, showing its usefulness in realistic urban surveillance settings. © 2006 IEEE.
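To make the example query concrete, the following is a minimal sketch of attribute-based filtering over archived vehicle events. It does not reproduce the system's actual database schema or indexing scheme: the `VehicleEvent` record, its field names, the `search` function, the 40 mph cutoff standing in for "high speed", and the sample date are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class VehicleEvent:
    """One archived detection with its extracted attributes (hypothetical schema)."""
    vehicle_type: str   # e.g. "truck", "bus", "suv", "car"
    color: str
    length_ft: float
    speed_mph: float
    heading: str        # e.g. "northbound"
    timestamp: datetime


def search(events, *, vehicle_type, color, min_length_ft,
           min_speed_mph, heading, start, end):
    """Return events matching every attribute constraint.

    The time window is half-open: start <= timestamp < end.
    """
    return [
        e for e in events
        if e.vehicle_type == vehicle_type
        and e.color == color
        and e.length_ft > min_length_ft
        and e.speed_mph >= min_speed_mph
        and e.heading == heading
        and start <= e.timestamp < end
    ]


# Made-up sample events; 2024-06-01 is used here as "last Saturday".
events = [
    VehicleEvent("truck", "blue", 18.0, 52.0, "northbound", datetime(2024, 6, 1, 14, 30)),
    VehicleEvent("truck", "blue", 18.0, 25.0, "northbound", datetime(2024, 6, 1, 15, 0)),   # too slow
    VehicleEvent("car", "blue", 14.0, 60.0, "northbound", datetime(2024, 6, 1, 16, 0)),     # not a truck
    VehicleEvent("truck", "blue", 20.0, 55.0, "northbound", datetime(2024, 6, 1, 18, 0)),   # outside window
]

# "Blue trucks longer than 7 ft traveling at high speed northbound, 2 pm to 5 pm."
matches = search(
    events,
    vehicle_type="truck",
    color="blue",
    min_length_ft=7.0,
    min_speed_mph=40.0,   # assumed threshold for "high speed"
    heading="northbound",
    start=datetime(2024, 6, 1, 14, 0),
    end=datetime(2024, 6, 1, 17, 0),
)
```

A linear scan like this only illustrates the query semantics; at the scale described, with hundreds of billions of events per month, a real deployment would rely on database indexes over these attributes rather than in-memory filtering.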