Publication
CIKM 2009
Conference paper

A scalable and effective full-text search in P2P networks

View publication

Abstract

We consider the problem of full-text search involving multi-term queries in a network of self-organizing, autonomous peers. Existing approaches do not scale well with respect to the number of peers, because they either require access to a large number of peers or incur a high communication cost in order to achieve good query results. In this paper, we present a novel algorithmic framework for processing multi-term queries in P2P networks that achieves high recall while using (per-query) a small number of peers and a low communication cost, thereby enabling high query throughput. Our approach is based on per-query peer-selection strategy using two-dimensional histograms of score distributions. A full utilization of the histograms incurs a high communication cost. We show how to drastically reduce this cost by employing a two-phase peer-selection algorithm. We also describe an adaptive approach to peer selection that further increases the recall. Experiments on a large real-world collection show that the recall is indeed high while the number of involved peers and the communication cost are low. Copyright 2009 ACM.

Date

01 Dec 2009

Publication

CIKM 2009

Authors

Share