I've made some fairly significant commits today.
These include:
-- Stop Words Support: Lets you specify words that are ignored during
classification (see the IStopWordProvider interface).
-- Training support: Training of the classifier can now be done via the
BayesianClassifier, and the datasource will be updated with the new word
statistics. Thanks to Pete Leschev for the initial code for this (see the
ITrainable interface, which is implemented by BayesianClassifier).
-- A "createTable" method on JDBCWordsDataSource which will create the
database table if it doesn't already exist.
-- BayesianClassifier is now case insensitive by default.
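To give a rough idea of how stop words and case insensitivity fit together, here is a minimal, self-contained sketch. It does not use the library's own classes: the StopWords interface, SimpleStopWords and tokensForClassification are hypothetical names made up for this illustration, and the real IStopWordProvider interface may look different.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

public class StopWordSketch {

    /** Hypothetical stand-in for a stop word provider; not the library's interface. */
    interface StopWords {
        boolean isStopWord(String word);
    }

    /** Stores stop words in lower case so lookups ignore case. */
    static class SimpleStopWords implements StopWords {
        private final Set<String> words = new HashSet<>();

        SimpleStopWords(String... stopWords) {
            for (String w : stopWords) {
                words.add(w.toLowerCase(Locale.ROOT));
            }
        }

        @Override
        public boolean isStopWord(String word) {
            return words.contains(word.toLowerCase(Locale.ROOT));
        }
    }

    /** Splits the input into words, lower-cases them and drops stop words. */
    static List<String> tokensForClassification(String input, StopWords stopWords) {
        List<String> kept = new ArrayList<>();
        for (String token : input.split("\\W+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            String normalised = token.toLowerCase(Locale.ROOT);
            if (!stopWords.isStopWord(normalised)) {
                kept.add(normalised);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        StopWords stop = new SimpleStopWords("the", "a", "is");
        // prints [offer, free, prize]
        System.out.println(tokensForClassification("The offer is a FREE prize", stop));
    }
}
```

Only the words that survive this filtering would contribute to the word statistics, so "FREE" and "free" count as the same word and "the" never counts at all.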
I plan on doing a 0.3 release tomorrow.
Nick