From: Yoav F. <yoa...@gm...> - 2008-07-22 18:35:14
Hi William,

Thanks for the info, this is very helpful. Can you or Aaron update the
documentation? Right now the documentation on the jboost web site says
nothing about comments in the data file.

Another thing I am confused about is the spec for labels. I thought a
label could be any character string; Mayank thinks it can only be a
number. I got an error when I tried to use the strings "cell-edge" and
"random-seg", but the error was completely uninterpretable: something
about not allowing "-" in a flag. Maybe this means that "-" is not
allowed in a label value? How about the "-" in -1?

Aaron, the error reports from jboost when it cannot parse the input file
should be more elaborate. At a minimum, they should quote the line that
caused the error. On the other hand, printing the whole man page for any
error makes no sense. I would completely remove this internal man page,
so that only one needs to be maintained!

Yoav

On Jul 22, 2008, at 1:49 PM, William Beaver wrote:

> Hello Yoav,
>
> There is no automatic way to get from the examples in boosting.info
> back to the source of the example. Using the example ID in
> boosting.info (first column) to get the row number of the example in
> the .data file, you then parse out the comment you've placed in the
> data file. This comment is application specific and points you to
> the source of that example. For example, in dros work the comment
> contains a [stack id, slice id, object id] tuple. For dc audio I
> store [audio sourcepath, sample start, sample length].
>
> The comments are C style: '//' line comments and '/*' '*/' inline
> comments. You can place them anywhere now, as long as the .spec
> structure is maintained. I place the comments after the example
> terminator and end the whole thing with a '\n' (just to make it more
> human friendly).
>
> I have some matlab parsing code for the boosting.info file; it's
> mostly a hack, but I have it on my todo list to build a nicer,
> object-based general purpose parser. In the meantime I can send you
> the parsing snippets if you'd like.
>
> from somewhere in Oregon,
> -William
>
>
> On Jul 20, 2008, at 8:33 AM, Yoav Freund wrote:
>
>> Hi William,
>>
>> I would like to use comments to trace back examples with small
>> margins. I see that you fixed a bug in that regard,
>> but I am not sure what the specification for comments is; it is not
>> documented in
>>
>> http://jboost.sourceforge.net/doc.html#input
>>
>> Is a comment anything that appears after the example terminator
>> symbol in the line? Does it need to have // in front of it (lines
>> that start with // are ignored)? Does jboost store the comment
>> somewhere, or can you retrieve the comment based on the example
>> index number in the jboost log file?
>>
>> Yoav
>>
>> On May 16, 2008, at 2:50 PM, william beaver wrote:
>>
>>> Update of /cvsroot/jboost/jboost/src/jboost/tokenizer
>>> In directory sc8-pr-cvs17.sourceforge.net:/tmp/cvs-serv10384/src/jboost/tokenizer
>>>
>>> Modified Files:
>>>     LTStreamTokenizer.java
>>> Log Message:
>>> fixed legacy bugs that prevented comments from being parsed
>>> reliably, or at all.
>>>
>>> Index: LTStreamTokenizer.java
>>> ===================================================================
>>> RCS file: /cvsroot/jboost/jboost/src/jboost/tokenizer/LTStreamTokenizer.java,v
>>> retrieving revision 1.1.1.1
>>> retrieving revision 1.2
>>> diff -C2 -d -r1.1.1.1 -r1.2
>>> *** LTStreamTokenizer.java 16 May 2007 04:06:02 -0000 1.1.1.1
>>> --- LTStreamTokenizer.java 16 May 2008 18:50:01 -0000 1.2
>>> ***************
>>> *** 71,75 ****
>>>       this.terminator= terminator;
>>>       terLen= terminator.length();
>>> !     minLen= terLen >= 2 ? 2 : terLen;
>>>       for (int i= 0, terLines= 0; i < terLen; i++)
>>>         if (terminator.charAt(i) == '\n')
>>> --- 71,75 ----
>>>       this.terminator= terminator;
>>>       terLen= terminator.length();
>>> !     minLen= terLen <= 2 ? 2 : terLen;
>>>       for (int i= 0, terLines= 0; i < terLen; i++)
>>>         if (terminator.charAt(i) == '\n')
>>> ***************
>>> *** 135,139 ****
>>>       while (true) {
>>>         // if(Monitor.logLevel>3) Monitor.log("strLen=" + strLen + " toCopy=" + toCopy + " i=" + i);
>>> !       if (i > strLen - minLen) {
>>>           try {
>>>             numRead= br.read(cBuf, 0, bufLen);
>>> --- 135,139 ----
>>>       while (true) {
>>>         // if(Monitor.logLevel>3) Monitor.log("strLen=" + strLen + " toCopy=" + toCopy + " i=" + i);
>>> !       if (i > strLen - minLen) {
>>>           try {
>>>             numRead= br.read(cBuf, 0, bufLen);
>>> ***************
>>> *** 206,210 ****
>>>           if (Util.even(numEscapes)) { // line comment starts!
>>>             curTok += strBuf.substring(toCopy, i - numEscapes / 2);
>>> !           ongoingComment= true;
>>>           } else // escpaed, part of token
>>>             curTok
>>> --- 206,210 ----
>>>           if (Util.even(numEscapes)) { // line comment starts!
>>>             curTok += strBuf.substring(toCopy, i - numEscapes / 2);
>>> !           lineComment= true;
>>>           } else // escpaed, part of token
>>>             curTok
>>>
>>>
>>> -------------------------------------------------------------------------
>>> This SF.net email is sponsored by: Microsoft
>>> Defy all challenges. Microsoft(R) Visual Studio 2008.
>>> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
>>> _______________________________________________
>>> jboost-cvs mailing list
>>> jbo...@li...
>>> https://lists.sourceforge.net/lists/listinfo/jboost-cvs
>
> -william
>
> ------------
> William Beaver
> wb...@cs...
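
The lookup William describes (example ID from the first column of boosting.info, then the matching row of the .data file, then the comment placed after the example terminator) might be sketched roughly as below. This is not jboost code and not William's MATLAB parser. It assumes one example per line, 0-based example IDs, a ';' terminator, and a '//' line comment after the terminator; the function name and the sample rows are made up for illustration, so check each assumption against your own .spec and data files.

```python
def comment_for_example(data_lines, example_id, terminator=";"):
    """Return the '//' comment that follows the terminator on one row, or None.

    Assumptions (hypothetical, not taken from jboost itself):
    - data_lines[k] holds exactly one example, and ids are 0-based
    - the comment, if present, is a '//' line comment after the terminator
    """
    line = data_lines[example_id]
    # Only text after the first terminator is searched for a comment.
    _, found, tail = line.partition(terminator)
    if not found:
        return None
    marker = tail.find("//")
    if marker < 0:
        return None
    return tail[marker + 2:].strip()


# Made-up rows in the style described above: the comment carries a
# [stack id, slice id, object id] tuple pointing back to the source.
data = [
    "0.5, 1.2, +1; // stack=3 slice=14 object=7",
    "0.1, 0.9, -1; // stack=3 slice=15 object=2",
    "0.7, 0.3, +1;",
]

print(comment_for_example(data, 1))
```

If the data file has blank or comment-only lines, or examples spanning several lines, the row lookup above would need to count examples rather than raw lines.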