From: Aaron A. <aa...@cs...> - 2008-07-23 13:08:15
> Thanks for the info, this is very helpful. Can you or Aaron update the
> documentation? Right now the documentation on the jboost web site says
> nothing about comments in the data file.

Directions for how to get to the JBoost site are available from SourceForge when signed in as an admin.

> Another thing I am confused about is what the specs for labels are. I thought
> it can be any char string. Mayank thinks it can only be numbers.

The demos have both numbers and characters, though no hyphens. I don't know the exact specs, but I'm guessing there's either some other problem or it's the hyphens.

> Aaron, the error reports from jboost when it cannot parse the input file
> should be more elaborate. At a minimum, it should quote the error line.

Agreed. Someone should update this some day... I think part of the difficulty in changing this is how exceptions are currently used/caught, though it's been a while since I looked at this. (Though the last time I looked at this code was to put in exactly what you're talking about.)

> On the other hand, printing the whole manpage for any error makes no sense.
> I would completely remove this internal man page, so only one needs to be
> maintained!

The internal man page should be the one maintained. An error may be filesystem related or because jboost is being run in an unexpected directory. As such, the only thing guaranteed to be accessible is the internal man page (unless there's a memory error, but then everything is going down anyway).

Aaron

> On Jul 22, 2008, at 1:49 PM, William Beaver wrote:
>
>> Hello Yoav,
>>
>> There is no automatic way to get from the examples in boosting.info back to
>> the source of the example. Using the example ID in boosting.info (first
>> column) to get the row number of the example in the .data file, you then
>> parse out the comment you've placed in the data file. This comment is
>> application specific and points you to the source of that example.
>> For example, in dros work the comment contains a [stack id, slice id, object
>> id] tuple. For dc audio I store [audio sourcepath, sample start, sample
>> length].
>>
>> The comments are C style: '//' line comments and '/*' '*/' inline comments.
>> You can place them anywhere now, as long as the .spec structure is
>> maintained. I place the comments after the example terminator and end the
>> whole thing with a '\n' (just to make it more human friendly).
>>
>> I have some matlab parsing code for the boosting.info file; it's mostly a
>> hack, but I have it on my todo list to build a nicer, object-based general
>> purpose parser. In the meantime I can send you the parsing snippets if you'd
>> like.
>>
>> from somewhere in Oregon,
>> -William
>>
>>
>>
>> On Jul 20, 2008, at 8:33 AM, Yoav Freund wrote:
>>
>>> Hi William,
>>>
>>> I would like to use comments to trace back examples with small margins. I
>>> see that you fixed a bug in that regard,
>>> but I am not sure what the specification for comments is; it is not
>>> documented in
>>>
>>> http://jboost.sourceforge.net/doc.html#input
>>>
>>> Is a comment anything that appears after the example terminator symbol in
>>> the line? Does it need to have // in front of it (lines that start with //
>>> are ignored)? Does jboost store the comment somewhere, or can you retrieve
>>> the comment based on the example index number in the jboost log file?
>>>
>>> Yoav
>>>
>>> On May 16, 2008, at 2:50 PM, william beaver wrote:
>>>
>>>> Update of /cvsroot/jboost/jboost/src/jboost/tokenizer
>>>> In directory
>>>> sc8-pr-cvs17.sourceforge.net:/tmp/cvs-serv10384/src/jboost/tokenizer
>>>>
>>>> Modified Files:
>>>> LTStreamTokenizer.java
>>>> Log Message:
>>>> fixed legacy bugs that prevented comments from being parsed reliably, or
>>>> at all.
>>>>
>>>> Index: LTStreamTokenizer.java
>>>> ===================================================================
>>>> RCS file:
>>>> /cvsroot/jboost/jboost/src/jboost/tokenizer/LTStreamTokenizer.java,v
>>>> retrieving revision 1.1.1.1
>>>> retrieving revision 1.2
>>>> diff -C2 -d -r1.1.1.1 -r1.2
>>>> *** LTStreamTokenizer.java 16 May 2007 04:06:02 -0000 1.1.1.1
>>>> --- LTStreamTokenizer.java 16 May 2008 18:50:01 -0000 1.2
>>>> ***************
>>>> *** 71,75 ****
>>>>   this.terminator= terminator;
>>>>   terLen= terminator.length();
>>>> ! minLen= terLen >= 2 ? 2 : terLen;
>>>>   for (int i= 0, terLines= 0; i < terLen; i++)
>>>>   if (terminator.charAt(i) == '\n')
>>>> --- 71,75 ----
>>>>   this.terminator= terminator;
>>>>   terLen= terminator.length();
>>>> ! minLen= terLen <= 2 ? 2 : terLen;
>>>>   for (int i= 0, terLines= 0; i < terLen; i++)
>>>>   if (terminator.charAt(i) == '\n')
>>>> ***************
>>>> *** 135,139 ****
>>>>   while (true) {
>>>>   // if(Monitor.logLevel>3) Monitor.log("strLen=" +
>>>> strLen + " toCopy=" + toCopy + " i=" + i);
>>>> ! if (i > strLen - minLen) {
>>>>   try {
>>>>   numRead= br.read(cBuf, 0, bufLen);
>>>> --- 135,139 ----
>>>>   while (true) {
>>>>   // if(Monitor.logLevel>3) Monitor.log("strLen=" +
>>>> strLen + " toCopy=" + toCopy + " i=" + i);
>>>> ! if (i > strLen - minLen) {
>>>>   try {
>>>>   numRead= br.read(cBuf, 0, bufLen);
>>>> ***************
>>>> *** 206,210 ****
>>>>   if (Util.even(numEscapes)) { // line comment starts!
>>>>   curTok += strBuf.substring(toCopy, i - numEscapes / 2);
>>>> ! ongoingComment= true;
>>>>   } else // escpaed, part of token
>>>>   curTok
>>>> --- 206,210 ----
>>>>   if (Util.even(numEscapes)) { // line comment starts!
>>>>   curTok += strBuf.substring(toCopy, i - numEscapes / 2);
>>>> ! lineComment= true;
>>>>   } else // escpaed, part of token
>>>>   curTok
>>>>
>>>>
>>>> _______________________________________________
>>>> jboost-cvs mailing list
>>>> jbo...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/jboost-cvs
>>
>> -william
>>
>> ------------
>> William Beaver
>> wb...@cs...
>>
>>
>
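
The trace-back William describes (take the example ID from the first column of boosting.info, use it as the row number in the .data file, then read the comment placed after the example terminator) could be sketched roughly as follows. This is only an illustration, not JBoost code: the class and method names are hypothetical, the example terminator is assumed to be ';', each example is assumed to sit on its own line, and the ID is assumed to be a 0-based row index (the thread does not say which base is used). Only a trailing '//' comment is handled; '/* */' comments are ignored here.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of JBoost.
final class ExampleCommentLookup {

    /**
     * Returns the text of a '//' comment that follows the example terminator
     * on the given line, or an empty Optional if there is no terminator or
     * no comment after it.
     */
    static Optional<String> trailingComment(String line, String terminator) {
        int end = line.indexOf(terminator);
        if (end < 0) {
            return Optional.empty();
        }
        // Search only after the terminator, so "//" inside an attribute
        // value (for example a URL) is not mistaken for a comment.
        int start = line.indexOf("//", end + terminator.length());
        if (start < 0) {
            return Optional.empty();
        }
        return Optional.of(line.substring(start + 2).trim());
    }

    /**
     * Looks up the comment for an example, treating exampleId as a
     * 0-based row index into the lines of the .data file (an assumption).
     */
    static Optional<String> commentForExample(
            List<String> dataLines, int exampleId, String terminator) {
        if (exampleId < 0 || exampleId >= dataLines.size()) {
            return Optional.empty();
        }
        return trailingComment(dataLines.get(exampleId), terminator);
    }

    public static void main(String[] args) {
        // Made-up rows; the comment carries an application-specific tuple.
        List<String> data = List.of(
            "0.31, 1.7, pos; // stack=12 slice=4 object=9",
            "0.55, 0.2, neg; // stack=12 slice=5 object=2",
            "0.10, 3.3, pos;");
        System.out.println(
            commentForExample(data, 1, ";").orElse("(no comment)"));
    }
}
```

Returning an Optional keeps "no comment on this row" distinct from an empty comment, which matters if some examples were written without a trace-back tuple.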