About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
FOCS 2007
Conference paper
The computational hardness of estimating edit distance
Abstract
We prove the first non-trivial communication complexity lower bound for the problem of estimating the edit distance (aka Levenshtein distance) between two strings. A major feature of our result is that it provides the first setting in which the complexity of computing the edit distance is provably larger than that of Hamming distance. Our lower bound exhibits a trade-off between approximation and communication, asserting, for example, that protocols with 0(1) bits of communication can only obtain approximation α ≥ Ω(log d/ log log d), where d is the length of the input strings. This case of O(1) communication is of particular importance, since it captures constant-size sketches as well as embeddings into spaces like L1 and squared-L2, two prevailing algorithmic approaches for dealing with edit distance. Furthermore, the bound holds not only for strings over alphabet ∑ = {0, 1}, but also for strings that are permutations (called the Ulam metric). Besides being applicable to a much richer class of algorithms than all previous results, our bounds are neartight in at least one case, namely of embedding permutations into L 1. The proof uses a new technique, that relies on Fourier analysis in a rather elementary way. © 2007 IEEE.