RE: [Classifier4j-devel] Simple Implementation
From: Nick L. <nl...@es...> - 2004-07-13 04:17:23
Yes, it's possible to use a flat text file. You'll need to write your own
implementation of IWordsDataSource to do it.
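
As a rough sketch of the flat-file side of this, the class below keeps per-word match and non-match counts in memory and saves them to a tab-separated text file. The class name, method names, and file layout are all made up for illustration; none of them come from Classifier4J. To plug it into the library you would still need to wrap it in a class that implements IWordsDataSource using that interface's actual method signatures, which are not shown here. It assumes Java 8 or later and that words never contain tab characters.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, NOT part of Classifier4J: stores word counts in a
// text file with one "word<TAB>matches<TAB>nonMatches" entry per line.
class FlatFileWordCounts {
    private final Path file;
    // word -> { matchCount, nonMatchCount }
    private final Map<String, long[]> counts = new TreeMap<>();

    FlatFileWordCounts(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            load();
        }
    }

    void addMatch(String word) { entry(word)[0]++; }

    void addNonMatch(String word) { entry(word)[1]++; }

    long matchCount(String word) {
        long[] c = counts.get(word);
        return c == null ? 0 : c[0];
    }

    long nonMatchCount(String word) {
        long[] c = counts.get(word);
        return c == null ? 0 : c[1];
    }

    private long[] entry(String word) {
        return counts.computeIfAbsent(word, w -> new long[2]);
    }

    // Reads every well-formed line of the file into the in-memory map.
    private void load() throws IOException {
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                String[] parts = line.split("\t");
                if (parts.length != 3) {
                    continue; // skip blank or malformed lines
                }
                counts.put(parts[0],
                        new long[] { Long.parseLong(parts[1]), Long.parseLong(parts[2]) });
            }
        }
    }

    // Rewrites the whole file from the in-memory map.
    void save() throws IOException {
        try (BufferedWriter out = Files.newBufferedWriter(file, StandardCharsets.UTF_8)) {
            for (Map.Entry<String, long[]> e : counts.entrySet()) {
                out.write(e.getKey() + "\t" + e.getValue()[0] + "\t" + e.getValue()[1]);
                out.newLine();
            }
        }
    }
}
```

Rewriting the whole file on save() keeps the sketch simple; for a large vocabulary you would probably want to save only once, after training has finished.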
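Here's some pseudocode for the training ("message" stands for the text of
each message):
    IWordsDataSource wds = new SimpleWordsDataSource();
    ITrainableClassifier classifier = new BayesianClassifier(wds);
    // train on known non-matches (blacklist)
    for each message in blackMsgs
        classifier.teachNonMatch(message)
    // train on known matches (whitelist)
    for each message in whiteMsgs
        classifier.teachMatch(message)
    // classify() returns a probability, so the result is a double
    for each message in bayMsgs
        double result = classifier.classify(message)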
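
To make the teach-then-classify flow concrete, here is a small self-contained toy. It is not Classifier4J's implementation: the class name, the tokenizer, the 0.01/0.99 clamping, and the 0.5 score for unseen words are all assumptions chosen for illustration. It only borrows the method names teachMatch, teachNonMatch and classify from the pseudocode above, and combines per-word probabilities with the usual product formula P = prod(p) / (prod(p) + prod(1 - p)).

```java
import java.util.HashMap;
import java.util.Map;

// Toy illustration only; this is NOT how Classifier4J is implemented.
class ToyBayesianClassifier {
    // word -> { timesSeenInMatches, timesSeenInNonMatches }
    private final Map<String, int[]> counts = new HashMap<>();

    void teachMatch(String text) { teach(text, 0); }

    void teachNonMatch(String text) { teach(text, 1); }

    private void teach(String text, int slot) {
        for (String word : tokenize(text)) {
            if (!word.isEmpty()) {
                counts.computeIfAbsent(word, k -> new int[2])[slot]++;
            }
        }
    }

    // Returns a score between 0 and 1; higher means "more like a match".
    double classify(String text) {
        double match = 1.0;
        double nonMatch = 1.0;
        for (String word : tokenize(text)) {
            if (word.isEmpty()) {
                continue;
            }
            double p = wordProbability(word);
            match *= p;
            nonMatch *= 1.0 - p;
        }
        return match / (match + nonMatch);
    }

    private double wordProbability(String word) {
        int[] c = counts.get(word);
        if (c == null) {
            return 0.5; // unseen word: neutral (an assumption of this sketch)
        }
        double p = (double) c[0] / (c[0] + c[1]);
        // Clamp so a single word can never force the score to exactly 0 or 1.
        return Math.min(0.99, Math.max(0.01, p));
    }

    private static String[] tokenize(String text) {
        return text.toLowerCase().trim().split("\\W+");
    }
}
```

With one teachMatch("cheap pills now") and one teachNonMatch("meeting agenda attached"), classify("cheap pills") comes out close to 1 and classify("meeting agenda") close to 0. For long texts the raw products can underflow, so a real implementation would limit the number of words considered or work with logarithms.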
Nick
-----Original Message-----
From: Kashif [mailto:ks...@ai...]
Sent: Tuesday, 13 July 2004 12:45 PM
To: cla...@li...
Subject: [Classifier4j-devel] Simple Implementation
Hi
I am doing research on Bayesian filters. I am trying to use
Classifier4J and would appreciate a bit of help.
Please note that I am doing a very basic implementation, without using a
JDBC connection. Later I might move on to JDBC and MySQL.
Background:
I am using SearchTerm objects and arrays for my blacklist, whitelist and
baylist (i.e. candidate emails for Bayesian filtering).
    SearchTerm blackSt = new OrTerm(blackListSearch);
    SearchTerm whiteSt = new OrTerm(whiteListSearch);
    // If not in BlackList
    SearchTerm baySt = new NotTerm(new OrTerm(blackListSearch));
    Message[] blackMsgs = folder.search(blackSt);
    Message[] whiteMsgs = folder.search(whiteSt);
    // These are the messages which I want to filter with bayesian
    Message[] bayMsgs = folder.search(baySt);
    System.out.println("No of messages found in whitelist : " + whiteMsgs.length);
    System.out.println("No of messages found in blacklist : " + blackMsgs.length);
    System.out.println();
    System.out.println("No of messages ready for bayesian filter : " + bayMsgs.length);
    // Implementation of Bayesian Classifier 4J
    IWordsDataSource wds = new SimpleWordsDataSource();
    IClassifier classifier = new BayesianClassifier(wds);
    System.out.println("Matches = " + classifier.classify("This is a sentence"));
Here's the problem:
1) I understand that I have to train the filter, but I need to know how
to do it.
2) Is it possible to use a flat file (i.e. a text file) rather than a JDBC
connection?