From: Rodrigo P. <piz...@gm...> - 2008-01-16 19:47:50
Hi,

I'm currently involved in a Natural Language Processing project. My system takes plain text (a medical diagnosis) as input, and I need to output a ranking of the n most plausible labels (each label is a standard code for the diagnosis). I have found JBoost, and because it produces a Java class it is perfect to include as part of my whole system, but I have some basic questions about the output.

I have produced a "predict.class" for the "stem" example in the "demo" folder. The output of the "java predict < stem.train" command gives me this:

-36.173392809294306 36.17339280935683 -36.17339280949395 -36.173392809448266
20.278047383272927 -20.85468565374684 -35.32543210576156 -35.81457452982327

I have read that each value corresponds to a label. The labels are (rich, smart, happy, none). My questions are:

How should I interpret this output? What do the numbers mean? Are they confidences?
What does the sign mean in multilabel problems?
Can I build some kind of ranking of the most plausible labels for each input example?

Many thanks in advance!

PS: Sorry for such a basic question.

Rodrigo Pizarro G.
Ingeniería Informática
Universidad de Santiago de Chile
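A minimal sketch of the ranking asked about above, not a statement of how JBoost defines its output. It assumes that each value is a score for the label in the same position and that a larger value means a more plausible label; neither assumption is confirmed in this message. The class name LabelRanking and the method rank are hypothetical names used only for illustration, and treating the last four numbers as the scores of a single example is also an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class LabelRanking {

    // Returns the labels ordered from highest to lowest score.
    // Assumption: scores[i] belongs to labels[i], and a larger score
    // means a more plausible label.
    static List<String> rank(String[] labels, double[] scores) {
        if (labels.length != scores.length) {
            throw new IllegalArgumentException("labels and scores must have the same length");
        }
        // Sort the label indices by score, highest first.
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < labels.length; i++) {
            order.add(i);
        }
        order.sort(Comparator.comparingDouble((Integer i) -> scores[i]).reversed());

        // Map the sorted indices back to label names.
        List<String> ranked = new ArrayList<>();
        for (int i : order) {
            ranked.add(labels[i]);
        }
        return ranked;
    }

    public static void main(String[] args) {
        String[] labels = {"rich", "smart", "happy", "none"};
        // Last four values from the output quoted above, assumed to be one example.
        double[] scores = {20.278047383272927, -20.85468565374684,
                           -35.32543210576156, -35.81457452982327};
        System.out.println(rank(labels, scores));
    }
}
```

Taking the first n entries of the returned list would give the n most plausible labels under the same assumptions.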