[Jboost-users] Questions about JBoost implementations

Hi,

I'm looking into JBoost for text classification.  I generated Java
output code (Predict.java) with "demo/$ java -Xmx100M
jboost.controller.Controller -p 2 -S spambase -j spambase.java", ran
it, and have some questions.

Incidentally, to compile with "javac -cp ../dist/jboost.jar
Predict.java" from demo/  I had to change the paths of some classes in
the main() method.

I ran Predict ("java -cp .:../dist/jboost.jar Predict <
spambase.data") against the original data.  I got two columns of
output that looked like

5.00073612523801        -5.00073612523801
11.864681207163063      -11.864681207163063
8.780744089260097       -8.780744089260097
...

Why are there two columns with the same magnitudes?  I'm guessing that
these are is/is not spam scores, but they seem redundant.
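
To make my guess concrete: if the classifier computes a single real-valued
margin for a two-label problem and reports it once per label with opposite
signs, I would expect output like the above.  The sketch below is only an
illustration of that assumption; the class and method names are made up and
this is not the code JBoost generated.

```java
// Hypothetical illustration only; NOT JBoost's generated Predict code.
public class TwoClassScores {
    // Assumption: one real-valued margin is computed per example and then
    // reported once per label, with opposite signs for the two labels.
    static double[] scores(double margin) {
        return new double[] { margin, -margin };
    }

    public static void main(String[] args) {
        // Margins copied from the first column of the output quoted above.
        double[] margins = { 5.00073612523801, 11.864681207163063, 8.780744089260097 };
        for (double m : margins) {
            double[] s = scores(m);
            // Print the two per-label scores separated by a tab.
            System.out.println(s[0] + "\t" + s[1]);
        }
    }
}
```

If that is what is happening, the second column carries no extra
information and only the sign of the first column matters.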

I expected that changing a value in the first line of spambase.data
would change the first classification score above, but it doesn't.  I
changed the first value in

0,0.64,0.64,0,0.32,0,0,0,0,0,0,0.64,0,0,0,0.32,0,1.29,1.93,0,0.96,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.778,0,0,3.756,61,278,+1;

from 0 to other values, but the first classification score
(5.00073612523801) didn't change.  Why is that?

Thanks,
Glenn