I used cc8compt() as part of an analysis to determine the radius in pixels of black circles in an image (see attached asym_circles_11x8.png). To my surprise, the number of clusters found was not equal to the number of circles in the image. The problem is that the algorithm keeps a running count of candidate labels and then eliminates the duplicates that belong to a single cluster. The number of candidate labels can exceed the largest value a byte can hold (255), resulting in wrap-around and an erroneous result.
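To illustrate why the count of candidate labels can be larger than the final number of clusters, here is a minimal sketch of a generic two-pass labelling in C. It is not the PDL code: the function names (count_components, find_root) are made up for this example, it only looks at the west and north neighbours (4-connectivity), and it keeps labels in a long so nothing wraps.

```c
#include <stdlib.h>

/* Follow parent links until reaching a label that is its own parent. */
static long find_root(const long *parent, long x)
{
    while (parent[x] != x)
        x = parent[x];
    return x;
}

/* img: nx*ny pixels, row-major, nonzero = foreground.
   Returns the number of 4-connected clusters, or -1 on allocation failure.
   *provisional receives how many labels were handed out in the first pass. */
long count_components(const unsigned char *img, int nx, int ny, long *provisional)
{
    long n, newlabel = 0, count = 0, k;
    long *lab, *parent;
    int i, j;

    *provisional = 0;
    if (nx <= 0 || ny <= 0)
        return 0;
    n = (long)nx * ny;
    lab = calloc((size_t)n, sizeof *lab);             /* 0 = unlabelled */
    parent = malloc((size_t)(n + 1) * sizeof *parent);
    if (!lab || !parent) {
        free(lab);
        free(parent);
        return -1;
    }
    for (j = 0; j < ny; j++) {
        for (i = 0; i < nx; i++) {
            long west, north;
            if (!img[j * nx + i])
                continue;
            west  = (i > 0) ? lab[j * nx + i - 1] : 0;
            north = (j > 0) ? lab[(j - 1) * nx + i] : 0;
            if (!west && !north) {                    /* assign new label */
                ++newlabel;
                parent[newlabel] = newlabel;
                lab[j * nx + i] = newlabel;
            } else if (west && north) {               /* record equivalence */
                long rw = find_root(parent, west);
                long rn = find_root(parent, north);
                lab[j * nx + i] = west;
                if (rw != rn)
                    parent[rn] = rw;
            } else {
                lab[j * nx + i] = west ? west : north;
            }
        }
    }
    for (k = 1; k <= newlabel; k++)                   /* one root per cluster */
        if (parent[k] == k)
            count++;
    *provisional = newlabel;
    free(lab);
    free(parent);
    return count;
}
```

For a U-shaped blob (rows "1 0 1" and "1 1 1") this hands out two labels in the first pass but reports a single cluster, so an image with fewer than 256 clusters can still need more than 255 candidate labels.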
The relevant/problematic assignment is around line 1211 in PDL-2.015. I'm marking the bug as critical because it is silent but deadly.
    if ($b(m=>i, n=>j) > 0) { /* Check 4 neighbour already seen */
      if (i>0 && $b(m=>i1, n=>j)>0) /*West*/
        neighbour[nfound++] = $b(m=>i1, n=>j); /* Store label of it */
      if (j>0 && $b(m=>i, n=>j1)>0) /*North*/
        neighbour[nfound++] = $b(m=>i, n=>j1);
      if (j>0 && i>0 && $b(m=>i1, n=>j1)>0 && $COMP(con)==8) /*North-West*/
        neighbour[nfound++] = $b(m=>i1, n=>j1);
      if (j>0 && i<(nx-1) && $b(m=>i2, n=>j1)>0 && $COMP(con)==8) /*North-East*/
        neighbour[nfound++] = $b(m=>i2, n=>j1);
      if (nfound==0) { /* Assign new label */
        $b(m=>i, n=>j) = newlabel++;
      }
      else {
        $b(m=>i, n=>j) = neighbour[0];
        if (nfound>1 && pass == 1) { /* Assign equivalents */
          for(k=1; k<nfound; k++)
            AddEquiv( equiv, (PDL_Long)$b(m=>i, n=>j),
                      neighbour[k] );
        }
      }
    }
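The wrap-around itself can be reproduced in isolation. The helper below is a hypothetical stand-in for storing an incrementing label into a byte-typed output, assuming the byte type is an 8-bit unsigned char; it is a sketch of the failure mode, not code from PDL.

```c
/* Hypothetical stand-in: returns what ends up in a byte-typed pixel
   when the n-th candidate label is stored into it. */
unsigned char stored_byte_label(long n)
{
    /* Conversion to unsigned char reduces the value modulo 256
       (assuming 8-bit bytes), so labels above 255 wrap silently. */
    return (unsigned char)n;
}
```

With this, stored_byte_label(255) is 255, but stored_byte_label(256) is 0 and stored_byte_label(257) is 1. Since the snippet above treats a pixel as labelled only when its value is > 0, a label that wraps to 0 would look like background, and one that wraps to 1 would collide with an earlier label.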
Interesting. What would the desired behavior be? I'm thinking either:
The second one's a lot easier to implement (I've already done it), and seems to do the trick. Of course, if the algorithm tries to go over PDL_Long's max (about 2.1E9), the same problem will manifest. But this is not likely to be a problem in any real usage that I can imagine.
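For reference, the remaining limit could be made loud rather than silent with a guard on the label counter. This is only a sketch under assumptions: int32_t stands in for PDL_Long, next_label is a made-up helper name, and none of this is taken from the ccNcompt_byte branch.

```c
#include <stdint.h>

/* Hand out the next label from a 32-bit counter.
   Returns 1 and writes the label to *out on success; returns 0 and leaves
   *out untouched if the counter is already at its maximum, so the caller
   can report an error instead of wrapping. */
int next_label(int32_t *counter, int32_t *out)
{
    if (*counter == INT32_MAX)
        return 0;
    *out = ++(*counter);
    return 1;
}
```

With a counter at 255 the next label is 256 rather than 0, and at INT32_MAX the call fails instead of overflowing.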
Last edit: Derek Lamb 2016-03-16
Please check out the ccNcompt_byte branch and see if that works for you.