[Classifier4j-devel] surprising classify score 0.01
From: Mike M. <set...@ya...> - 2004-01-07 06:44:28
I am enjoying experimenting with your Classifier4J 0.5, but I ran across a result I did not expect.

I have trained a BayesianClassifier with 22 positive examples and 1600 negative examples. Many of the positive examples contain the word "http". None of the negative examples contain this word. The surprising result is that the score of a sentence with "http" is 0.01. Can you help me to understand why?

Here is the sentence and the WordProbability probabilities for each of the words in the sentence that were in the training data:

score = 0.01 for "Mozilla/4.0 (compatible; grub-client-1.3.7; Crawl your own stuff with http://grub.org)"

0.11822660098522167   Mozilla
0.020618556701030927  4
0.07223476297968397   0
0.029239766081871343  compatible
0.10619469026548672   1
0.5454545454545454    3
0.01                  7
0.99                  http

Thanks for providing this great software. It generally works well, but this one result is surprising and I would like to understand it better.

Mike Moore
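
For anyone following the thread, here is a minimal sketch of one way the per-word probabilities listed above could combine into a score of 0.01. It assumes the common naive Bayes combination rule, product(p) / (product(p) + product(1 - p)), and assumes the result is clamped to the range [0.01, 0.99], which the 0.01 and 0.99 values in the list suggest. The class and method names (CombineSketch, combine, clamp) are made up for illustration and are not taken from the Classifier4J source, so treat this as a guess at the behaviour rather than a description of what the library actually does.

```java
public class CombineSketch {
    // Assumed bounds: the 0.01 and 0.99 values in the post suggest scores are clamped.
    public static final double LOWER_BOUND = 0.01;
    public static final double UPPER_BOUND = 0.99;

    // Assumed combination rule: product(p) / (product(p) + product(1 - p)).
    public static double combine(double[] probs) {
        double product = 1.0;
        double inverseProduct = 1.0;
        for (double p : probs) {
            product *= p;
            inverseProduct *= (1.0 - p);
        }
        return product / (product + inverseProduct);
    }

    // Keep the score inside the assumed [LOWER_BOUND, UPPER_BOUND] range.
    public static double clamp(double score) {
        return Math.max(LOWER_BOUND, Math.min(UPPER_BOUND, score));
    }

    public static void main(String[] args) {
        // Word probabilities copied from the post, in the order listed.
        double[] probs = {
            0.11822660098522167,  // Mozilla
            0.020618556701030927, // 4
            0.07223476297968397,  // 0
            0.029239766081871343, // compatible
            0.10619469026548672,  // 1
            0.5454545454545454,   // 3
            0.01,                 // 7
            0.99                  // http
        };
        double raw = combine(probs);
        System.out.println("raw combined score = " + raw);
        System.out.println("clamped score      = " + clamp(raw));
    }
}
```

Under these assumptions the single 0.99 for "http" is outweighed by the six low-probability words: the raw combined value comes out far below 0.01, so the clamped score is 0.01. In other words, one strongly positive word does not dominate when most of the other matched words lean strongly negative.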