[Pleac-commits] CVS: pleac/pleac pleac_haskell.data,1.71,1.72
Status: Alpha
Brought to you by:
ggc
From: Pixel <pi...@us...> - 2008-11-05 09:38:41
|
Update of /cvsroot/pleac/pleac/pleac In directory ddv4jf1.ch3.sourceforge.com:/tmp/cvs-serv31169 Modified Files: pleac_haskell.data Log Message: add "levenshtein" versions easier to understand Index: pleac_haskell.data =================================================================== RCS file: /cvsroot/pleac/pleac/pleac/pleac_haskell.data,v retrieving revision 1.71 retrieving revision 1.72 diff -u -r1.71 -r1.72 --- pleac_haskell.data 4 Nov 2008 21:07:07 -0000 1.71 +++ pleac_haskell.data 5 Nov 2008 09:38:28 -0000 1.72 @@ -2163,10 +2163,30 @@ import Data.List -- Calculates the Levenshtein, or edit distance, between two strings. --- (version taken on internet) -levenshtein sa sb = last $ foldl transform [0..length sa] sb +-- here is a basic/slow version: +levenshtein_basic s t = distance (length s) (length t) + where distance i 0 = i + distance 0 j = j + distance i j = minimum [ distance (i-1) j + 1 + , distance i (j-1) + 1 + , distance (i-1) (j-1) + (if s!!(i-1)==t!!(j-1) then 0 else 1) ] + +-- a fast version based on the previous one, adding memoization +-- (note the recursive use of "d") +levenshtein s t = d + where d = [ [distance m n | n<-[0..length t]] | m<-[0..length s] ] + distance i 0 = i + distance 0 j = j + distance i j = minimum [ d!!(i-1)!!j + 1 + , d!!i!!(j-1) + 1 + , d!!(i-1)!!(j-1) + (if s!!(i-1)==t!!(j-1) then 0 else 1) ] + +-- a more efficient/cryptic version. +-- (it computes the "distance" figures for each "sb" chars, +-- ie it memoizes only what is needed) +levenshtein' sa sb = foldl transform [0..length sa] sb where - transform xs@(x:xs') c = scanl compute (x+1) (zip3 sa xs xs') + transform xs@(x:xs') c = scanl compute (x+1) (zip3 sa xs xs') where compute z (c', x, y) = minimum [y+1, z+1, x + if c == c' then 0 else 1] |