Re: [Jboost-users] A basic question about the output

Hello.

Each value does correspond to a label. The value is the predictor's
confidence in selecting that label. In the simplest case, selecting
the largest positive value gives you your "best" prediction. Note that
you may not be that confident in your "best" prediction if its value is
close to zero (or negative).

In the examples provided, the labels would be "smart" and "rich",
respectively.
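
For the ranking you asked about, here is a minimal sketch that sorts the
labels of one output line by descending score. It assumes the columns
follow the label order given in your message (rich, smart, happy, none),
which is consistent with the "smart" and "rich" reading above. The class
RankLabels and the method rank are hypothetical names, not part of JBoost.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class RankLabels {
    // Assumption: column order matches the label order listed in the question.
    static final String[] LABELS = {"rich", "smart", "happy", "none"};

    /** Returns the labels sorted from highest to lowest score. */
    static List<String> rank(double[] scores) {
        // Sort the column indices by their score, largest first.
        List<Integer> idx = new ArrayList<>();
        for (int i = 0; i < scores.length; i++) {
            idx.add(i);
        }
        idx.sort(Comparator.comparingDouble((Integer i) -> scores[i]).reversed());

        // Map the sorted indices back to label names.
        List<String> ranked = new ArrayList<>();
        for (int i : idx) {
            ranked.add(LABELS[i]);
        }
        return ranked;
    }

    public static void main(String[] args) {
        // The two output lines quoted in the question.
        double[] first = {-36.173392809294306, 36.17339280935683,
                          -36.17339280949395, -36.173392809448266};
        double[] second = {20.278047383272927, -20.85468565374684,
                           -35.32543210576156, -35.81457452982327};
        System.out.println(rank(first));   // top label: smart
        System.out.println(rank(second));  // top label: rich
    }
}
```

Taking the first n entries of the returned list gives the n most plausible
labels. Scores that are negative or close to zero still appear in the
ranking, so you may want to drop them if you only want confident labels.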

Let me know if that sufficiently answers your question.

best,
-william

------------
William Beaver
wb...@cs...

On Jan 16, 2008, at 9:34 AM, Rodrigo Pizarro wrote:

> Hi,
>
> I'm currently involved in a project about Natural Language
> Processing. My system takes as input a plain text (a medical
> diagnosis) and I need to output a ranking of the n most plausible
> labels (each label is a standard code for the diagnosis). I have found
> JBoost, and because it produces a Java class, it is perfect to include
> as a part of my whole system, but I have some basic questions about the
> output. I have produced a "predict.class" for the "stem" example in
> the "demo" folder. Running the "java predict < stem.train"
> command gives me this:
>
> -36.173392809294306 36.17339280935683 -36.17339280949395 -36.173392809448266
> 20.278047383272927 -20.85468565374684 -35.32543210576156 -35.81457452982327
>
> I have read that each value corresponds to a label. My question is:
> How can I interpret this output? What do the numbers mean?
> Confidence?
>
> The labels are (rich, smart, happy, none). What about the sign in
> multilabel problems? Can I build some kind of ranking of the most
> plausible labels for each input example?
>
> Many thanks in advance!
>
> PS: sorry for the basicness of my question
>
> Rodrigo Pizarro G.
> Ingeniería Informática
> Universidad de Santiago de Chile