I don't know if this project is still well maintained, but I've been using the latest from CVS and had to make a few changes, which I think should be feature requests:
- added UTF-8 support using InputStreamReader and OutputStreamWriter on training input
- don't load all entries from the fingerprint file only to discard them at comparison time; this causes the app to run out of memory. changed...
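
For anyone wanting to try the same two ideas, here is a rough sketch of what they could look like. It is not the library's actual code or my exact patch. The class name `FingerprintSketch`, both method names, and the `maxEntries` parameter are made up for illustration. It also assumes the fingerprint file holds one `ngram<TAB>count` pair per line, sorted by descending count, so that stopping early keeps the most frequent entries.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the library.
final class FingerprintSketch {

    // Reads "ngram<TAB>count" lines as UTF-8 and stops once maxEntries
    // entries are held, so the rest of the file is never kept in memory.
    // Assumes the file is already sorted by descending count.
    static Map<String, Integer> loadTopEntries(InputStream in, int maxEntries)
            throws IOException {
        Map<String, Integer> entries = new LinkedHashMap<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while (entries.size() < maxEntries
                    && (line = reader.readLine()) != null) {
                int tab = line.indexOf('\t');
                if (tab <= 0) {
                    continue; // skip blank lines and lines without an n-gram
                }
                String ngram = line.substring(0, tab);
                int count = Integer.parseInt(line.substring(tab + 1).trim());
                entries.put(ngram, count);
            }
        }
        return entries;
    }

    // Writes entries back out as UTF-8 in the same assumed line format.
    static void writeEntries(Map<String, Integer> entries, OutputStream out)
            throws IOException {
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, Integer> e : entries.entrySet()) {
                writer.write(e.getKey() + "\t" + e.getValue() + "\n");
            }
        }
    }
}
```

Passing the charset explicitly matters because the one-argument `InputStreamReader` and `OutputStreamWriter` constructors fall back to the platform default encoding. The early exit in the read loop only helps if the comparison really uses a fixed number of top entries, which is the assumption here.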
2009-09-09 05:38:06 UTC in Java Text Categorizing Library