From: Aaron A. <aa...@cs...> - 2008-02-16 05:23:55
After personal communication, it was determined that this was a bug in the
ngram code. A hack has been committed to CVS, with a permanent fix intended
for the near future. Currently, all ngram sizes are ignored and the text is
split into a single-word bag of features.

Aaron

On Mon, 28 Jan 2008, Rodrigo Pizarro wrote:

> I think that is a bug.
>
> I tried other training and test sets (about 5 examples of { text, label })
> and I still get a one-level tree with negative leaves. I guess that the
> problem is in dealing with just one text attribute.
>
> In the file FILE_NAME.log, the text attribute says "set with zero elements".
> Is the algorithm not considering text attributes?
>
> The predict_int() method from the generated Java file (-j flag) is
>
>   static private double[] predict_int() {
>     reset_pred();
>     add_pred( /* R */
>       -0.8047189562170501,
>       -0.34657359027997264,
>       -0.34657359027997264,
>       -0.8047189562170501);
>
>     return finalize_pred();
>   }
>
> It is a tree with just one branch, all with negative values! If you try
> the other demo examples (with numeric, set and text attributes in each
> example), the predict_int() method contains many if-then-else statements,
> representing the whole tree.
>
> Am I right?
>
> Rodrigo Pizarro G.
> Ingeniería Informática
> Universidad de Santiago de Chile
>
> On 28-01-2008, at 14:58, Aaron Arvey wrote:
>
>> Try doing something along the lines of
>>
>> ./jboost -b AdaBoost -numRounds 1000 -a -2 -S FILE_NAMES -ATreeType ADD_ALL
>>
>> The output tree is going to be incomprehensible no matter what (620 labels
>> is just too many to view visually). You also probably have an outdated
>> copy of atree2dot2ps.pl. Grab the latest version from CVS and see if that
>> works any better for you. Zoom into the postscript file (png,gif zoom will
>> be blurry, postscript is vector graphics -- I think). Directions for CVS are
>> at http://sourceforge.net/cvs/?group_id=195659.
>>
>> Also, look at the FILE_NAME.info file and see what the error is. This will
>> give you a good idea of the actual progress of the booster. If the error
>> is going down, then you know that something good is happening. If the error
>> is staying very high, then there may be a bug or some other problem.
>>
>> Aaron
>>
>> On Mon, 28 Jan 2008, Rodrigo Pizarro wrote:
>>
>>> Hi,
>>>
>>> I'm training jBoost with examples of the type (text, label). There are
>>> 620 different labels, 1720 training examples, and 1600 test examples.
>>> The problem is that the output tree has only one level with 620
>>> branches, and all the branches have negative values!
>>>
>>> Does JBoost deal correctly with text attributes? Is there some bug? I
>>> tried many different run parameters, but I still get similarly
>>> weird outputs. Any suggestions, please?
>>>
>>> Many thanks in advance!
>>>
>>> Rodrigo Pizarro G.
>>> Ingeniería Informática
>>> Universidad de Santiago de Chile
>>>
>>> _______________________________________________
>>> jboost-users mailing list
>>> jbo...@li...
>>> https://lists.sourceforge.net/lists/listinfo/jboost-users
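
For readers unfamiliar with the terms in the first message of this thread, the
difference between ngram features and a single-word bag of features can be
sketched roughly as below. This is only an illustration and is not JBoost's
tokenizer: the class NgramSketch and the methods ngrams() and bagOfWords() are
hypothetical names, and splitting on whitespace is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative sketch only; NOT taken from the JBoost source.
// All names here are hypothetical.
class NgramSketch {

    // Split the text on whitespace (an assumption) and return every run of
    // n consecutive words as one feature.
    static List<String> ngrams(String text, int n) {
        List<String> features = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty() || n < 1) {
            return features;
        }
        String[] words = trimmed.split("\\s+");
        for (int i = 0; i + n <= words.length; i++) {
            features.add(String.join(" ", Arrays.copyOfRange(words, i, i + n)));
        }
        return features;
    }

    // Behaviour analogous to the workaround described in the thread:
    // the requested ngram size is ignored and every word is its own feature.
    static List<String> bagOfWords(String text) {
        return ngrams(text, 1);
    }

    public static void main(String[] args) {
        String text = "boosting with text attributes";
        System.out.println(ngrams(text, 2));   // [boosting with, with text, text attributes]
        System.out.println(bagOfWords(text));  // [boosting, with, text, attributes]
    }
}
```

With single-word features, word order is lost, so a classifier can no longer
tell phrases apart that contain the same words in a different order.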