About cookies on this site Our websites require some cookies to function properly (required). In addition, other cookies may be used with your consent to analyze site usage, improve the user experience and for advertising. For more information, please review your options. By visiting our website, you agree to our processing of information as described in IBM’sprivacy statement. To provide a smooth navigation, your cookie preferences will be shared across the IBM web domains listed here.
Publication
FOCS 2004
Conference paper
Approximating edit distance efficiently
Abstract
Edit distance has been extensively studied for the past several years. Nevertheless, no linear-time algorithm is known to compute the edit distance between two strings, or even to approximate it to within a modest factor. Furthermore, for various natural algorithmic problems such as low-distortion embeddings into normed spaces, approximate nearest-neighbor schemes, and sketching algorithms, known results for the edit distance are rather weak. We develop algorithms that solve gap versions of the edit distance problem: given two strings of length n with the promise that their edit distance is either at most k or greater than l, decide which of the two holds. We present two sketching algorithms for gap versions of edit distance. Our first algorithm solves the k vs. (kn)2/3 gap problem, using a constant size sketch. A more involved algorithm solves the stronger k vs. l gap problem, where l can be as small as O (k2) - still with a constant sketch - but works only for strings that are mildly "non-repetitive". Finally, we develop an n3/7-approximation quasi-linear time algorithm for edit distance, improving the previous best factor of n3/4 [5]; if the input strings are assumed to be non-repetitive, then the approximation factor can be strengthened to n1/3. © 2004 IEEE.