From: Ted P. <tpederse@d.umn.edu> - 2010-06-13 15:55:13
We are pleased to announce the release of version 0.08 of Text-Similarity.

This version has one important change: when you are using a stoplist, you can now specify stop words using regular expressions.

In previous versions, a stoplist could be specified as follows (in a single file, one word per line):

    a
    of
    in

This will cause a, of, and in to be treated as stop words (they are not used in computing similarity).

As of 0.08 you may continue to use the above format, or you can use regular expressions. For example:

    /\b\w\b/
    /\b\d+\b/

would cause all single-character words and numeric values to be removed.

You can get this new version via CPAN or SourceForge; find links to both at:

http://text-similarity.sourceforge.net

Enjoy,
Ted and Ying

--
Ted Pedersen
http://www.d.umn.edu/~tpederse
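
To illustrate the two stoplist formats described in the announcement, here is a minimal Python sketch of how such a list could be applied to a piece of text. It is not Text-Similarity's own code (the module is written in Perl), and the function names `load_stoplist` and `remove_stop_words` are hypothetical. The sketch assumes that an entry wrapped in slashes is a regular expression and that any other entry is a literal word; how the real module handles details such as letter case is not covered here.

```python
import re

def load_stoplist(lines):
    """Compile each stoplist entry into a regular expression.

    Assumption for this sketch: an entry of the form /.../ is a regex,
    and anything else is a literal word matched on word boundaries.
    """
    patterns = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        if len(entry) >= 2 and entry.startswith("/") and entry.endswith("/"):
            patterns.append(re.compile(entry[1:-1]))
        else:
            patterns.append(re.compile(r"\b" + re.escape(entry) + r"\b"))
    return patterns

def remove_stop_words(text, patterns):
    """Blank out every stoplist match, then normalize whitespace."""
    for pattern in patterns:
        text = pattern.sub(" ", text)
    return " ".join(text.split())

sentence = "a view of 42 trees in x park"

# Plain word list, as in versions before 0.08.
plain = load_stoplist(["a", "of", "in"])
print(remove_stop_words(sentence, plain))

# Regex entries: single-character words and numeric values.
regex = load_stoplist([r"/\b\w\b/", r"/\b\d+\b/"])
print(remove_stop_words(sentence, regex))
```

With the plain list, only the three listed words are dropped. With the two regex entries, the single-character words and the number are dropped while "of" and "in" remain.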