From: Glenn M. <gle...@gm...> - 2010-12-13 19:17:50
Hi,

I'm looking into JBoost to do text classification. I generated Java output code (Predict.java) with

    demo/$ java -Xmx100M jboost.controller.Controller -p 2 -S spambase -j spambase.java

ran it, and have some questions.

Incidentally, to compile with "javac -cp ../dist/jboost.jar Predict.java" from demo/, I had to change the paths of some classes in the main() method.

I ran Predict against the original data:

    java -cp .:../dist/jboost.jar Predict < spambase.data

I got two columns of output that looked like this:

    5.00073612523801 -5.00073612523801
    11.864681207163063 -11.864681207163063
    8.780744089260097 -8.780744089260097
    ...

Why are there two columns with the same magnitudes? I'm guessing that these are is/is-not spam scores, but they seem redundant.

I expected that changing a value in the first line of spambase.data would change the corresponding classification score above, but it doesn't. I changed the first value in

    0,0.64,0.64,0,0.32,0,0,0,0,0,0,0.64,0,0,0,0.32,0,1.29,1.93,0,0.96,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0.778,0,0,3.756,61,278,+1;

from 0 to other values, but the first classification score (5.00073612523801) didn't change. Why is that?

Thanks,
Glenn