[Jboost-users] A basic question about the output of JBoost

Hi,

I'm currently involved in a Natural Language Processing project. My
system takes a plain text (a medical diagnosis) as input, and I need to
output a ranking of the n most plausible labels (each label is a
standard code for the diagnosis). I have found JBoost, and because it
produces a Java class it is perfect to include as part of my whole
system, but I have some basic questions about the output. I have
produced a "predict.class" for the "stem" example in the "demo" folder.
The "java predict < stem.train" command gives me this output:

-36.173392809294306 36.17339280935683 -36.17339280949395 -36.173392809448266
20.278047383272927 -20.85468565374684 -35.32543210576156 -35.81457452982327

I have read that each value corresponds to a label. My question is:
how should I interpret this output? What do the numbers mean? Are they
confidence values?

The labels are (rich, smart, happy, none). What does the sign mean in
multilabel problems? Can I build some kind of ranking of the most
plausible labels for each input example?

Many thanks in advance!

PS: Sorry for such a basic question.

Rodrigo Pizarro G.
Ingeniería Informática
Universidad de Santiago de Chile