I am trying to use your package for similarity metrics for strings.
When I use CosineSimilarity, EuclideanDistance and JaccardSimilarity, I always get zero or one and never anything in between, even though I use a float type for the results.
Levenshtein and MongeElkan work fine.
I just use the simple example file and make changes on the metrics.
It seems as though some of the metrics require a different Tokeniser on initialization. Simply initialize the metric with a new Tokeniser and it should work:
CosineSimilarity cosSim = new CosineSimilarity(new TokeniserQGram2());
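To illustrate why the tokeniser matters, here is a minimal standalone sketch. It does not use the library itself; the class TokeniserDemo and its methods are made up for this example, and it assumes the default tokeniser splits on whitespace, so a single word becomes a single token. With one token per string, a set-based metric such as Jaccard can only report a full match or no match, whereas 2-character q-grams allow partial overlap. The library's q-gram tokenisers may also pad the string, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class TokeniserDemo {
    // Hypothetical stand-in for a whitespace tokeniser:
    // a single word yields exactly one token.
    static Set<String> whitespaceTokens(String s) {
        return new HashSet<>(Arrays.asList(s.trim().split("\\s+")));
    }

    // Hypothetical stand-in for a q-gram tokeniser with q = 2:
    // overlapping 2-character substrings, without padding.
    static Set<String> bigrams(String s) {
        Set<String> grams = new HashSet<>();
        for (int i = 0; i + 2 <= s.length(); i++) {
            grams.add(s.substring(i, i + 2));
        }
        return grams;
    }

    // Jaccard similarity: size of the intersection divided by size of the union.
    static double jaccard(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0;
        }
        Set<String> intersection = new HashSet<>(a);
        intersection.retainAll(b);
        Set<String> union = new HashSet<>(a);
        union.addAll(b);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        String x = "night";
        String y = "nacht";
        // One token each, no overlap: the score can only be 0 or 1.
        System.out.println(jaccard(whitespaceTokens(x), whitespaceTokens(y)));
        // Bigrams share "ht" only, so the score falls between 0 and 1.
        System.out.println(jaccard(bigrams(x), bigrams(y)));
    }
}
```

The same reasoning applies to the other token-based metrics: when comparing single words, pass a q-gram tokeniser to the constructor as in the line above.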