From version 1.6 onwards, all token-based metrics can return similarity scores that differ from those of previous versions in rare cases where the input strings contain repeated whitespace characters. The difference stems from a bug in the tokeniser, which has now been fixed. This should not cause any major change in scores, but it explains the slight deviations seen for strings that contain such sequences.
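The note above does not say how the tokeniser bug behaved, so the following is only an illustrative sketch, not the library's code. It assumes, hypothetically, that the old tokeniser split on every single space (producing empty tokens for repeated spaces) and that the fixed one splits on runs of whitespace. The function names `naive_tokens`, `fixed_tokens` and `jaccard` are made up for this example, and Jaccard similarity stands in for any token-based metric.

```python
def naive_tokens(text):
    # Hypothetical buggy behaviour: split on each single space, so
    # consecutive spaces yield empty-string tokens.
    return text.split(" ")


def fixed_tokens(text):
    # Split on runs of whitespace; no empty tokens are produced.
    return text.split()


def jaccard(tokens_a, tokens_b):
    # Token-set similarity: size of intersection over size of union.
    set_a, set_b = set(tokens_a), set(tokens_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


left = "open  source"   # two spaces between the words
right = "open source"   # one space

naive_score = jaccard(naive_tokens(left), naive_tokens(right))
fixed_score = jaccard(fixed_tokens(left), fixed_tokens(right))

print(naive_score, fixed_score)
```

Under these assumptions the stray empty token lowers the score slightly for the string with the doubled space, while strings without repeated whitespace score the same either way.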