15,958,091 members
Home / Discussions / Algorithms

# Algorithms

 Re: Hi There is a function in PhotoShop Color match! shengcheng_jin10-Nov-11 20:57 shengcheng_jin 10-Nov-11 20:57
 Re: Hi There is a function in PhotoShop Color match! Richard MacCutchan29-Oct-11 7:16 Richard MacCutchan 29-Oct-11 7:16
 Re: Hi There is a function in PhotoShop Color match! shengcheng_jin30-Oct-11 19:37 shengcheng_jin 30-Oct-11 19:37
 Re: Hi There is a function in PhotoShop Color match! the north face15-Nov-11 15:57 the north face 15-Nov-11 15:57
 Levenshtein Distance PIEBALDconsult14-Oct-11 6:05 PIEBALDconsult 14-Oct-11 6:05
 Re: Levenshtein Distance TheGreatAndPowerfulOz18-Oct-11 12:44 TheGreatAndPowerfulOz 18-Oct-11 12:44
 Re: Levenshtein Distance PIEBALDconsult19-Oct-11 6:27 PIEBALDconsult 19-Oct-11 6:27
 Re: Levenshtein Distance Luc Pattyn18-Oct-11 14:08 Luc Pattyn 18-Oct-11 14:08
 For similar strings the relevant data in the 2D matrix eventually is located around the diagonal (although the naive implementation of the algorithm, such as shown by Wikipedia, needs all the matrix elements to come up with the distance). So what you could do is: 1. use a sparse matrix, as ahmed already suggested; [ADDED] maybe in the top triangle, you don't need to store those elements that equal their top neighbour plus one, and in the bottom triangle those that equal their left neighbour plus one. However this is likely to fail dramatically if the strings start out with a couple of differences...[/ADDED] 2. or use a band centered around the main diagonal, making the memory cost linear rather than quadratic (unless you want to have the band's thickness a constant fraction of the problem size); 3. or develop something similar to the sliding window approach I applied here: LPTextFileDiff: another textfile compare utility.[^] For #2 and #3 you would need: - to estimate the first row and column values from what you already have, that could be tricky and somewhat unreliable; - to detect pathological cases You would not get exact distance for very distant strings, which I guess isn't really important. Another way to attack the problem is by "tokenizing", i.e. replace your input (a sequence of characters) by a sequence of tokens, where a token would be a specific identifier, a keyword, an operator, white space, etc., whatever is relevant in the language you're dealing with. That may reduce the dimension of the problem by a factor of 3 to 5 I'd say. If you're not inclined to create a full tokenizer (which is much simpler than a full parser), you could still look for popular strings (say all the keywords) and replace each of them by a different Unicode character that is not supposed to be present otherwise. [ADDED]And then, maybe you don't really want Levinshtein at all, any good measure of difference could be sufficient. How about this stochastic one: ```int distance=0; for(a fixed number of iterations, say 300) { from the shortest string select a random substring with length 1% search for it in the longer string; if (no hit) { distance+=BIG_NUMBER; } else { if (multiple hits) find the one that is closest to the original, position wise (and relative, work in %) distance+=square of distance of positions in smaller and larger string } }``` This may take more cycles but it won't cost you a lot of memory! [/ADDED] Luc Pattyn [My Articles] Nil Volentibus Arduum iSadmodified 18-Oct-11 22:02pm.
 Re: Levenshtein Distance Luc Pattyn18-Oct-11 17:23 Luc Pattyn 18-Oct-11 17:23
 Re: Levenshtein Distance PIEBALDconsult18-Oct-11 17:54 PIEBALDconsult 18-Oct-11 17:54
 Re: Levenshtein Distance Luc Pattyn19-Oct-11 0:31 Luc Pattyn 19-Oct-11 0:31
 Re: Levenshtein Distance PIEBALDconsult19-Oct-11 2:40 PIEBALDconsult 19-Oct-11 2:40
 Re: Levenshtein Distance Luc Pattyn19-Oct-11 3:01 Luc Pattyn 19-Oct-11 3:01
 Re: Levenshtein Distance PIEBALDconsult19-Oct-11 3:16 PIEBALDconsult 19-Oct-11 3:16
 Re: Levenshtein Distance Luc Pattyn19-Oct-11 3:51 Luc Pattyn 19-Oct-11 3:51
 Re: Levenshtein Distance TheGreatAndPowerfulOz19-Oct-11 4:58 TheGreatAndPowerfulOz 19-Oct-11 4:58
 Re: Levenshtein Distance PIEBALDconsult19-Oct-11 6:24 PIEBALDconsult 19-Oct-11 6:24
 Re: Levenshtein Distance YvesDaoust19-Oct-11 3:31 YvesDaoust 19-Oct-11 3:31
 Re: Levenshtein Distance Luc Pattyn19-Oct-11 3:49 Luc Pattyn 19-Oct-11 3:49
 Re: Levenshtein Distance YvesDaoust19-Oct-11 4:06 YvesDaoust 19-Oct-11 4:06
 Re: Levenshtein Distance Luc Pattyn19-Oct-11 4:16 Luc Pattyn 19-Oct-11 4:16
 Re: Levenshtein Distance YvesDaoust19-Oct-11 4:27 YvesDaoust 19-Oct-11 4:27
 Re: Levenshtein Distance PIEBALDconsult19-Oct-11 6:26 PIEBALDconsult 19-Oct-11 6:26
 Re: Levenshtein Distance PIEBALDconsult19-Oct-11 6:24 PIEBALDconsult 19-Oct-11 6:24
 Re: Levenshtein Distance PIEBALDconsult19-Oct-11 6:28 PIEBALDconsult 19-Oct-11 6:28
 Last Visit: 31-Dec-99 18:00     Last Update: 3-Aug-24 9:42 Refresh ᐊ Prev1...86878889909192939495 Next ᐅ