# Posted here to open discussion

Sam

I think I've found an issue with your code: what you have implemented for the Jaro distance metric (Jaro.java / Jaro.cs) does not correspond to the formula you list for Jaro on your webpage (reference http://www.dcs.shef.ac.uk/~sam/stringmetrics.html#jarowinkler).

(The formula you're trying to implement is attached as the GIF taken from your webpage.)

In your code, you calculate the final Jaro distance as below:

    // calculate jaro metric
    transpositions /= 2;
    double tmp1;
    tmp1 = commonMatches / (3.0 * firstWord.Length) + commonMatches / (3.0 * secondWord.Length) +
        (commonMatches - transpositions) / (3.0 * commonMatches);
    return tmp1;

However, the last term in the equation [(|s'| - Ts',t') / (2 |s'|)] does not seem to be captured by the above code. Even though you divide the transpositions value in half, it is the whole numerator that needs to be divided by two, not just the transpositions term. I checked the SecondString library and it makes the same error, unless the error is in the equation you have listed (I can't know for sure, since I haven't been able to locate the original Jaro papers).
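To make the mismatch concrete, here is the algebra as I read it. This assumes that T in the formula is the raw transposition count (before the code halves it) and writes m = |s'| for the number of common characters; both are my reading, not something the page states.

$$
\underbrace{\frac{m - T/2}{m}}_{\text{last term in the code}} = \frac{2m - T}{2m},
\qquad
\underbrace{\frac{m - T}{2m}}_{\text{last term in the formula}},
\qquad
\frac{2m - T}{2m} - \frac{m - T}{2m} = \frac{1}{2}.
$$

Under that reading, the two last terms differ by a constant 1/2 whenever m > 0, which becomes 1/6 in the final score if the formula carries the same overall factor of 1/3 as the code.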

The SecondString Java source for Jaro is below. Note that it also halves the number of transpositions at an earlier point in the code.

    double dist =
        ( common1.length()/((double)str1.length()) +
          common2.length()/((double)str2.length()) +
          (common1.length()-transpositions)/((double)common1.length()) ) / 3.0;
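For comparison, here is a minimal Java sketch that evaluates both code expressions and a literal reading of the quoted last term on the same counts. The class and method names are hypothetical, and the counts are made up for illustration. I'm also assuming that the first two terms and the overall factor of 1/3 in the formula match the code, and I pass the halved transposition count as a double to sidestep any integer-division question.

```java
public class JaroLastTerm {

    // Mirrors the expression in the first snippet, after "transpositions /= 2".
    // m = number of common characters, halfT = halved transposition count.
    public static double asInFirstSnippet(int len1, int len2, int m, double halfT) {
        return m / (3.0 * len1) + m / (3.0 * len2) + (m - halfT) / (3.0 * m);
    }

    // Mirrors the SecondString expression (transpositions already halved).
    public static double asInSecondString(int len1, int len2, int m, double halfT) {
        return (m / (double) len1 + m / (double) len2 + (m - halfT) / (double) m) / 3.0;
    }

    // Literal reading of the quoted last term (|s'| - T) / (2 |s'|),
    // ASSUMING T is the raw (unhalved) transposition count.
    public static double asInQuotedFormula(int len1, int len2, int m, int rawT) {
        return (m / (double) len1 + m / (double) len2 + (m - rawT) / (2.0 * m)) / 3.0;
    }

    public static void main(String[] args) {
        // Hypothetical counts: two 6-character strings, 6 common characters,
        // 2 of them out of order (raw count 2, halved count 1).
        int len1 = 6, len2 = 6, m = 6, rawT = 2;
        double halfT = rawT / 2.0;

        double first = asInFirstSnippet(len1, len2, m, halfT);
        double second = asInSecondString(len1, len2, m, halfT);
        double quoted = asInQuotedFormula(len1, len2, m, rawT);

        System.out.println("first snippet:  " + first);
        System.out.println("SecondString:   " + second);
        System.out.println("quoted formula: " + quoted);
        System.out.println("difference:     " + (first - quoted));
    }
}
```

On these counts the two code expressions agree with each other, and both sit 1/6 above the literal reading of the formula. With identical strings (no transpositions) that literal reading would give 5/6 rather than 1, which makes me suspect the equation rather than the code, though I can't confirm that without the original papers.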

Can you shed some light on this?