Loop in TokeniserWhitespace.tokenizeToArrayList
Infinite loop for specific inputs in uk.ac.shef.wit.simmetrics.tokenisers.TokeniserWhitespace.tokenizeToArrayList
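Sample program:
import uk.ac.shef.wit.simmetrics.similaritymetrics.InterfaceStringMetric;
import uk.ac.shef.wit.simmetrics.similaritymetrics.MongeElkan;

public class MongeElkanLoopRepro { // class name is arbitrary
    public static void main(String[] args) {
        System.out.println("start");
        InterfaceStringMetric l_metric = new MongeElkan();
        String l_address_a = "POST OFFICE HOLD & PHONE";
        String l_address_b = "2665 - C N HIGHLAND AVE";
        // This call never returns for the inputs above.
        float l_score = l_metric.getSimilarity(l_address_a, l_address_b);
        System.out.println("end, score=" + l_score);
    }
}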
Stack trace:
"main" prio=10 tid=0x00007ff70800d800 nid=0x5451 runnable [0x00007ff70ead2000]
   java.lang.Thread.State: RUNNABLE
        at uk.ac.shef.wit.simmetrics.tokenisers.TokeniserWhitespace.tokenizeToArrayList(TokeniserWhitespace.java:121)
        at uk.ac.shef.wit.simmetrics.similaritymetrics.MongeElkan.getSimilarity(MongeElkan.java:170)
        at com.mm.server.inventory.app.MergeWorklistFunctions.main(MergeWorklistFunctions.java:213)
I have a fixed version here:
https://github.com/mpkorstanje/simmetrics/blob/master/src/main/java/uk/ac/shef/wit/simmetrics/tokenisers/TokeniserCSVBasic.java
You'll have to build it from source, though. I'm in the middle of overhauling the whole thing.
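For anyone who can't switch builds right away: the stack trace above shows the main thread still RUNNABLE inside tokenizeToArrayList, so the tokenizer loop is evidently not advancing. The ticket doesn't say which characters trigger it. As a rough sketch only (this is not the library's code and not the linked fix; the class SafeWhitespaceTokenizer and its methods are made-up names), a whitespace tokenizer can be written so that every pass through the loop consumes at least one character and therefore always terminates:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch, not part of simmetrics.
public class SafeWhitespaceTokenizer {

    // Treat Java whitespace and Unicode space separators (e.g. U+00A0) as gaps.
    static boolean isGap(char ch) {
        return Character.isWhitespace(ch) || Character.isSpaceChar(ch);
    }

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            // Skip a run of gap characters; pos only ever moves forward.
            while (pos < input.length() && isGap(input.charAt(pos))) {
                pos++;
            }
            int start = pos;
            // Consume one token; again pos only moves forward.
            while (pos < input.length() && !isGap(input.charAt(pos))) {
                pos++;
            }
            if (pos > start) {
                tokens.add(input.substring(start, pos));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The two inputs from the report.
        System.out.println(tokenize("POST OFFICE HOLD & PHONE"));
        System.out.println(tokenize("2665 - C N HIGHLAND AVE"));
    }
}
```

Both inner loops either advance pos or exit, and the outer loop is only entered while pos is still inside the string, so at least one character is consumed per pass. Treating U+00A0 as a gap is a choice made for this sketch: Character.isWhitespace returns false for the non-breaking space, so a tokenizer that relies on that check alone would not classify it as whitespace.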