Query-adaptive fusion for multimodal search
Abstract
We conduct a broad survey of query-adaptive search strategies across a variety of application domains, in which the internal retrieval mechanisms used for search are adapted in response to the anticipated needs of each individual query posed to the system. While these query-adaptive approaches range from meta-search over text collections to multimodal search over video databases, we propose that all such systems can be framed and discussed within a single, unified framework. Throughout the paper, we pay particular attention to the domain of video search, where search cues are available from a rich set of modalities, including textual speech transcripts, low-level visual features, and high-level semantic concept detectors; the relative efficacy of each modality varies greatly across query types. We observe that the state of the art in query-adaptive retrieval frameworks for video collections depends heavily on the definition of query classes, groups of queries that share similar optimal search strategies, whereas applications in text and Web retrieval have incorporated more advanced strategies, such as direct prediction of search method performance and the inclusion of contextual cues from the searcher. We conclude that these advanced strategies, previously developed for text retrieval, hold a broad range of potential applications for future research in multimodal video search. © 2008 IEEE.
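
As an illustrative sketch only (not a method taken from the paper), the query-class idea described above can be expressed as a weighted combination of per-modality retrieval scores, where the weights are selected by a predicted class for the query. All class names, weights, and function names below are hypothetical placeholders.

# Illustrative sketch of query-class-dependent fusion.
# Class names, weights, and the toy classifier are hypothetical, not from the paper.
from collections import defaultdict

# Assumed per-class fusion weights for three modalities: speech transcripts (text),
# low-level visual features, and semantic concept detectors.
CLASS_WEIGHTS = {
    "named_person": {"text": 0.7, "visual": 0.1, "concept": 0.2},
    "sports":       {"text": 0.2, "visual": 0.3, "concept": 0.5},
    "general":      {"text": 0.4, "visual": 0.3, "concept": 0.3},
}

def predict_query_class(query: str) -> str:
    """Toy query-class predictor; a real system would use a trained classifier."""
    if any(tok.istitle() for tok in query.split()):
        return "named_person"
    if any(word in query.lower() for word in ("game", "match", "goal")):
        return "sports"
    return "general"

def fuse_scores(query: str, modality_scores: dict) -> dict:
    """Combine per-modality scores per video shot using class-dependent weights.

    modality_scores maps modality name -> {shot_id: score}.
    """
    weights = CLASS_WEIGHTS[predict_query_class(query)]
    fused = defaultdict(float)
    for modality, scores in modality_scores.items():
        w = weights.get(modality, 0.0)
        for shot_id, score in scores.items():
            fused[shot_id] += w * score
    # Return shots ranked by fused score, highest first.
    return dict(sorted(fused.items(), key=lambda kv: kv[1], reverse=True))

# Example usage with made-up scores for two video shots.
ranked = fuse_scores(
    "Condoleezza Rice press conference",
    {
        "text":    {"shot_001": 0.9, "shot_002": 0.2},
        "visual":  {"shot_001": 0.3, "shot_002": 0.6},
        "concept": {"shot_001": 0.5, "shot_002": 0.4},
    },
)
print(ranked)  # shot_001 ranks first for this person-type query.

In this sketch the adaptation happens entirely through the choice of weight vector; the more advanced strategies mentioned above, such as per-query performance prediction, would instead estimate the weights directly for each query rather than relying on a fixed set of classes.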