RE: [Classifier4j-devel] Bayesian Case Study
From: Nick L. <nl...@es...> - 2003-11-14 01:33:36
> -----Original Message-----
> From: Matt Collier [mailto:MCo...@my...]
> Sent: Friday, 14 November 2003 11:44 AM
> To: cla...@li...
> Subject: RE: [Classifier4j-devel] Bayesian Case Study
>
> Very nice. Should we keep these in a flat file? This would make a lot of
> sense in my opinion.

That makes sense to me.

> Do we want to modify the default tokenizer and stop list provider, or do we
> want to extend it?

Create a new implementation of the IStopWordProvider interface that reads from a resource. You might want to read a bit about Java interfaces if you haven't already.

> If we want to extend it, can you please shortcut me to doing this. I think I
> understand that we will create a class that "extends default tokenizer" etc,
> but how will this new class be used by the other classes and methods such as
> bayesian.classify? Surely we won't have to modify all this code, or perhaps
> we do. I don't know... which is why I'm asking... :)

Yes, it is a valid question. Fortunately, we thought of this when we coded it a while ago (pat myself on the back!). There is a constructor for BayesianClassifier that looks like:

public BayesianClassifier(IWordsDataSource wd, ITokenizer tokenizer, IStopWordProvider swp)

which allows you to specify your own stop-word provider. As a general rule most of Classifier4J is coded against interfaces, to make this kind of change pretty easy. It means it is very flexible - it's just that we don't have many non-standard implementations....
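To make the suggestion concrete, here is a rough sketch of a stop-word provider that loads its words from a flat file on the classpath. Treat it as an illustration under assumptions rather than the library's code: the class name ResourceStopWordProvider, the fromResource helper, and the file format (one word per line, "#" for comments) are hypothetical. The small IStopWordProvider interface declared here is a stand-in so the example compiles on its own, and its single isStopWord(String) method is assumed to match the real interface. In a real project you would import the Classifier4J interface instead.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Stand-in for the Classifier4J interface so this sketch is self-contained.
// ASSUMPTION: the real interface exposes a single isStopWord(String) method.
interface IStopWordProvider {
    boolean isStopWord(String word);
}

// Hypothetical implementation: reads one stop word per line from a flat file.
class ResourceStopWordProvider implements IStopWordProvider {
    private final Set<String> stopWords = new HashSet<>();

    // Reads stop words from any Reader; blank lines and "#" comments are skipped.
    ResourceStopWordProvider(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            String word = line.trim().toLowerCase(Locale.ROOT);
            if (!word.isEmpty() && !word.startsWith("#")) {
                stopWords.add(word);
            }
        }
    }

    // Convenience factory: loads the list from a classpath resource.
    static ResourceStopWordProvider fromResource(String resourceName) throws IOException {
        InputStream in = ResourceStopWordProvider.class.getResourceAsStream(resourceName);
        if (in == null) {
            throw new IOException("Stop-word resource not found: " + resourceName);
        }
        try (Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8)) {
            return new ResourceStopWordProvider(reader);
        }
    }

    // Case-insensitive lookup; null is never a stop word.
    @Override
    public boolean isStopWord(String word) {
        return word != null && stopWords.contains(word.toLowerCase(Locale.ROOT));
    }
}
```

An instance of this class would then be passed as the third argument (swp) of the BayesianClassifier constructor quoted above, so nothing else in the classification code has to change.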