RE: [Classifier4j-devel] Simple Implementation
From: Nick L. <nl...@es...> - 2004-07-13 04:17:23
Yes, it's possible to use a flat text file. You'll need to write your own implementation of IWordsDataSource to do it.

Here's some pseudo code to do the training:

    IWordsDataSource wds = new SimpleWordsDataSource();
    ITrainableClassifier classifier = new BayesianClassifier(wds);

    for each message in blackMsgs
        classifier.teachNonMatch(message)

    for each message in whiteMsgs
        classifier.teachMatch(message)

    for each message in bayMsgs
        // classify returns a probability, so the result is a double
        double result = classifier.classify(message)

Nick

-----Original Message-----
From: Kashif [mailto:ks...@ai...]
Sent: Tuesday, 13 July 2004 12:45 PM
To: cla...@li...
Subject: [Classifier4j-devel] Simple Implementation

Hi

I am doing research on Bayesian filters. I am trying to implement Classifier4J and would appreciate a bit of help. Please note that I am doing a very basic implementation, without using a JDBC connection. Later I might move on to JDBC and MySQL.

Background: I am using SearchTerm and arrays for my blacklist, whitelist and baylist (i.e. potential emails for Bayesian filtering).

    SearchTerm blackSt = new OrTerm(blackListSearch);
    SearchTerm whiteSt = new OrTerm(whiteListSearch);
    SearchTerm baySt = new NotTerm(new OrTerm(blackListSearch)); // If not in BlackList

    Message[] blackMsgs = folder.search(blackSt);
    Message[] whiteMsgs = folder.search(whiteSt);
    Message[] bayMsgs = folder.search(baySt); // These are the messages which I want to filter with bayesian

    System.out.println("No of messages found in whitelist : " + whiteMsgs.length);
    System.out.println("No of messages found in blacklist : " + blackMsgs.length);
    System.out.println();
    System.out.println("No of messages ready for bayesian filter : " + bayMsgs.length);

    //Implementation of Bayesian Classifier 4J
    IWordsDataSource wds = new SimpleWordsDataSource();
    IClassifier classifier = new BayesianClassifier(wds);
    System.out.println("Matches = " + classifier.classify("This is a sentence") );

Here's the problem:

1) I understand that I have to train the filter, need to know how I can do it.
2) Is it possible to use a flat file (ie text file) rather than jdbc connection
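
On the flat-file question, the storage half of a custom words data source might look roughly like the sketch below. It is only an illustration: the class name FlatFileWordCounts, its methods, and the tab-separated file layout (word, match count, non-match count) are all hypothetical and not part of Classifier4J. A real implementation would wrap something like this behind IWordsDataSource, whose exact method signatures should be checked against the library. The sketch also assumes words never contain tab characters.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical flat-file store of per-word match / non-match counts. */
class FlatFileWordCounts {
    // word -> { matchCount, nonMatchCount }
    private final Map<String, long[]> counts = new TreeMap<>();

    void addMatch(String word) {
        counts.computeIfAbsent(word, w -> new long[2])[0]++;
    }

    void addNonMatch(String word) {
        counts.computeIfAbsent(word, w -> new long[2])[1]++;
    }

    long matchCount(String word) {
        return counts.getOrDefault(word, new long[2])[0];
    }

    long nonMatchCount(String word) {
        return counts.getOrDefault(word, new long[2])[1];
    }

    /** Writes one "word<TAB>matches<TAB>nonMatches" line per word. */
    void save(Path file) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, long[]> e : counts.entrySet()) {
            lines.add(e.getKey() + "\t" + e.getValue()[0] + "\t" + e.getValue()[1]);
        }
        try {
            Files.write(file, lines, StandardCharsets.UTF_8);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    /** Reads counts back; a missing file yields an empty store. */
    static FlatFileWordCounts load(Path file) {
        FlatFileWordCounts store = new FlatFileWordCounts();
        if (!Files.exists(file)) {
            return store;
        }
        try {
            for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
                String[] parts = line.split("\t");
                if (parts.length != 3) {
                    continue; // skip malformed lines
                }
                store.counts.put(parts[0],
                        new long[] { Long.parseLong(parts[1]), Long.parseLong(parts[2]) });
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return store;
    }
}
```

Loading the file once at startup and saving after a training run keeps the file format trivial to inspect by hand, at the cost of holding every word in memory.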