The black-box complexity of nearest-neighbor search

Robert Krauthgamer; James R. Lee

doi:10.1016/j.tcs.2005.09.017

Theoretical Computer Science

Paper

08 Dec 2005

The black-box complexity of nearest-neighbor search

View publication

Abstract

We define a natural notion of efficiency for approximate nearest-neighbor (ANN) search in general n-point metric spaces, namely the existence of a randomized algorithm which answers (1+ε)-ANN queries in polylog(n) time using only polynomial space. We then study which families of metric spaces admit efficient ANN schemes in the black-box model, where only oracle access to the distance function is given, and any query consistent with the triangle inequality may be asked. For ε<25, we offer a complete answer to this problem. Using the notion of metric dimension defined in [A. Gupta, et al., Bounded geometries, fractals, and low-distortion embeddings, in: 44th Annu. IEEE Symp. on Foundations of Computer Science, 2003, pp. 534-543] (ǎ la [P. Assouad, Plongements lipschitziens dans Rn, Bull. Soc. Math. France 111 (4) (1983) 429-448]), we show that a metric space X admits an efficient (1+ε)-ANN scheme for any ε<25 if and only if dim(X)=O(loglogn). For coarser approximations, clearly the upper bound continues to hold, but there is a threshold at which our lower bound breaks down - this is precisely when points in the "ambient space" may begin to affect the complexity of "hard" subspaces S⊆X. Indeed, we give examples which show that dim(X) does not characterize the black-box complexity of ANN above the threshold. Our scheme for ANN in low-dimensional metric spaces is the first to yield efficient algorithms without relying on any additional assumptions on the input. In previous approaches (e.g., [K.L. Clarkson, Nearest neighbor queries in metric spaces, Discrete Comput. Geom. 22(1) (1999) 63-93; D. Karger, M. Ruhl, Finding nearest neighbors in growth-restricted metrics, in: 34th Annu. ACM Symp. on the Theory of Computing, 2002, pp. 63-66; R. Krauthgamer, J.R. Lee, Navigating nets: simple algorithms for proximity search, in: 15th Annu. ACM-SIAM Symp. on Discrete Algorithms, 2004, pp. 791-801; K. Hildrum, et al., A note on finding nearest neighbors in growth-restricted metrics, in: Proc. of the 15th Annu. ACM-SIAM Symp. on Discrete Algorithms, 2004, pp. 560-561]), even spaces with dim(X)=O(1) sometimes required Ω(n) query times. © 2005 Elsevier B.V. All rights reserved.

Conference paper