Retrieving Potential Causes from a Query Event
Abstract
In contrast to traditional IR, which retrieves a set of topically relevant documents for a user query, we investigate causal retrieval, which involves retrieving a set of documents that describe potential causes leading to an effect specified in the query. We argue that causal relevance differs in nature from traditional topical relevance. Although causally relevant documents are likely to share some terms with the documents that are topically relevant to a query, we expect the majority of them to use a different set of terms to describe the causes that may have led to the effect. To address this, we propose a feedback model that estimates a distribution of terms that are relatively infrequent but carry high weights in the topically relevant distribution, and that are therefore potentially indicative of causal relevance. Our experiments demonstrate that this feedback model is substantially more effective than traditional IR models and a number of causality heuristic baselines.
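As a rough illustration of the idea of favouring terms that are infrequent yet highly weighted in the topically relevant distribution, the following minimal sketch scores candidate feedback terms. It is not the model proposed in this paper: the function name `causal_feedback_terms`, the scoring rule (topical weight multiplied by the log inverse of a smoothed collection probability), the reading of "infrequent" as infrequent in the collection, and the toy data are all assumptions made purely for illustration.

```python
import math
from collections import Counter


def causal_feedback_terms(feedback_docs, collection_tf, collection_len, top_k=5):
    """Rank feedback terms that are topically weighty but rare in the collection.

    feedback_docs: list of tokenised top-ranked documents (lists of strings).
    collection_tf: dict mapping term -> frequency in the whole collection.
    collection_len: total number of tokens in the collection.
    The scoring rule is an illustrative assumption, not the paper's estimator.
    """
    # Topical distribution P(w|R): pooled relative frequency over feedback docs.
    pooled = Counter()
    for doc in feedback_docs:
        pooled.update(doc)
    total = sum(pooled.values())

    scores = {}
    for term, tf in pooled.items():
        p_rel = tf / total
        # Add-one smoothed collection probability P(w|C).
        p_coll = (collection_tf.get(term, 0) + 1) / (collection_len + 1)
        # High topical weight, boosted when the term is rare in the collection.
        scores[term] = p_rel * math.log(1.0 / p_coll)

    norm = sum(scores.values())
    if norm == 0:
        return []
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    # Normalise by the sum over all candidate terms, not only the top_k kept.
    return [(term, score / norm) for term, score in ranked]


# Toy usage with made-up documents and collection statistics.
docs = [["flood", "rain", "dam", "the"], ["flood", "the", "dam", "failure"]]
ctf = {"the": 1000, "flood": 50, "rain": 40, "dam": 5, "failure": 8}
for term, weight in causal_feedback_terms(docs, ctf, collection_len=2000):
    print(f"{term}\t{weight:.3f}")
```

In this toy setting a common word such as "the" is pushed to the bottom of the ranking even though it is as frequent in the feedback documents as "dam", which is the qualitative behaviour the abstract describes. How the actual model balances the two factors is not specified here.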