Modeling concepts and their relationships for corpus-based query auto-completion
Query auto-completion helps users to formulate their information needs by providing suggestion lists at every typed key. This task is commonly addressed by exploiting query logs and the approaches proposed in the literature fit well in web scale scenarios, where usually huge amounts of past user queries can be analyzed to provide reliable suggestions. However, when query logs are not available, e.g. in enterprise or desktop search engines, these methods are not applicable at all. To face these challenging scenarios, we present a novel corpus-based approach which exploits the textual content of an indexed document collection in order to dynamically generate query completions. Our method extracts informative text fragments from the corpus and it combines them using a probabilistic graphical model in order to capture the relationships between the extracted concepts. Using this approach, it is possible to automatically complete partial queries with significant suggestions related to the keywords already entered by the user without requiring the analysis of the past queries. We evaluate our system through a user study on two different real-world document collections. The experiments show that our method is able to provide meaningful completions outperforming the state-of-the art approach.