[Classifier4j-devel] How to Classify Subject Field with defaultStopWords.txt
From: Kashif <ks...@ai...> - 2004-07-16 08:11:23
Hi,

The filter now works with the black list and white list when I compare the "from" field. I would now like to apply the filtering to the "subject" field, but it gives me 0.5 or 0.99 no matter what subject I use.

At the moment I am doing this:

1) I transfer each line (which is a single word) of "defaultStopWords.txt" into an array, stopWordListArray[].
2) I create another instance of IWordsDataSource (swds) and of ITrainableClassifier (sclassifier).
3) I use a for loop to teach matches. I know that I should also train non-matches, but I am not sure with what.
4) I was wondering: does C4J use defaultStopWords.txt automatically, or do we have to load the list somehow?

Here's my code:

    IWordsDataSource swds = new SimpleWordsDataSource();
    ITrainableClassifier sclassifier = new BayesianClassifier(swds);

    // Teach every stop word as a "match"
    for (int i = 0; i < stopWordListArray.length; i++) {
        sclassifier.teachMatch(stopWordListArray[i]);
    }

    // Allocate the result array once, outside the loop
    double[] result = new double[n];
    for (int i = 0; i < n; i++) {
        result[i] = sclassifier.classify(message[i].getSubject());
        System.out.println("The Probability of the message no. " + i + " is: " + result[i]);
    }

Thanks heaps for your help
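
For reference, step 1 could be sketched roughly as follows. This is only a minimal sketch using the standard Java library, assuming the stop-word file holds one word per line and is UTF-8 encoded. The class name StopWordLoader and its load method are hypothetical helpers for illustration and are not part of Classifier4J.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Classifier4J): reads a word list
// with one word per line into a String array.
class StopWordLoader {

    static String[] load(Path file) throws IOException {
        List<String> words = new ArrayList<>();
        // Assumption: the file is UTF-8 encoded.
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            String word = line.trim();
            // Skip blank lines so no empty strings end up in the array.
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words.toArray(new String[0]);
    }
}
```

The returned array could then take the place of stopWordListArray, for example `String[] stopWordListArray = StopWordLoader.load(Paths.get("defaultStopWords.txt"));`, where the path is an assumption about where the file lives.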