Loop in TokeniserWhitespace.tokenizeToArrayList
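Never-ending loop with specific inputs: calling MongeElkan.getSimilarity with the two strings below never returns, and the thread stays RUNNABLE inside uk.ac.shef.wit.simmetrics.tokenisers.TokeniserWhitespace.tokenizeToArrayList.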
Sample program:
    public static void main(String[] args) {
        System.out.println("start");
        InterfaceStringMetric l_metric = new MongeElkan();
        String l_address_a = "POST OFFICE HOLD & PHONE";
        String l_address_b = "2665 - C N HIGHLAND AVE";
        float l_score = l_metric.getSimilarity(l_address_a, l_address_b);
        System.out.println("end, score=" + l_score);
    }
Stack trace:
    "main" prio=10 tid=0x00007ff70800d800 nid=0x5451 runnable [0x00007ff70ead2000]
       java.lang.Thread.State: RUNNABLE
        at uk.ac.shef.wit.simmetrics.tokenisers.TokeniserWhitespace.tokenizeToArrayList(TokeniserWhitespace.java:121)
        at uk.ac.shef.wit.simmetrics.similaritymetrics.MongeElkan.getSimilarity(MongeElkan.java:170)
        at com.mm.server.inventory.app.MergeWorklistFunctions.main(MergeWorklistFunctions.java:213)
I have a fixed version here:
https://github.com/mpkorstanje/simmetrics/blob/master/src/main/java/uk/ac/shef/wit/simmetrics/tokenisers/TokeniserCSVBasic.java
But you'll have to build from source. I'm doing an overhaul of the whole thing.
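For anyone who cannot build from source yet, here is a minimal sketch of a whitespace tokenizer whose loop is guaranteed to advance on every iteration. It is not the library's code and not the fix linked above, and it does not claim to identify the root cause of the hang. The class name, the method name and the delimiter set are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class WhitespaceTokenizerSketch {

    // Assumed delimiter set for illustration; the library's own set may differ.
    private static final String DELIMITERS = " \t\r\n\u00A0";

    static boolean isDelimiter(char c) {
        return DELIMITERS.indexOf(c) >= 0;
    }

    // Splits input on the delimiters above. Every pass through the outer
    // loop either skips one delimiter or consumes at least one token
    // character, so the loop terminates for any input.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (isDelimiter(input.charAt(pos))) {
                pos++; // skip using the same test that ends a token
                continue;
            }
            int end = pos;
            while (end < input.length() && !isDelimiter(input.charAt(end))) {
                end++;
            }
            tokens.add(input.substring(pos, end));
            pos = end; // end > pos here, so progress is guaranteed
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The two inputs from the report above.
        System.out.println(tokenize("POST OFFICE HOLD & PHONE"));
        System.out.println(tokenize("2665 - C N HIGHLAND AVE"));
    }
}
```

The point of the design is that skipping and splitting use one and the same delimiter test, so a character can never count as a gap for splitting while failing to count as one for skipping.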