From: Aaron A. <aa...@cb...> - 2008-11-17 22:49:14
EqualitySplitterBuilder was already commented out in the CVS. InequalitySplitterBuilder (or other code) has a bug that results in not setting cutIndex (leaving it as -1). The current code

    if (cutIndex==-1) {
      cutIndex=-1;
      System.out.println(this);
    }

is not good (in the sense that it does not fix the problem), but it does give a more informative print-out prior to dying from the ArrayIndexOutOfBoundsException. To make this clearer for future readers, the code has been changed to

    if (cutIndex==-1) {
      System.out.println(this);
    }

If you have a reproducible case where cutIndex is -1, you may be able to track down the cause of the bug. Thanks for the attention to detail, and let us know if you find anything else!

Aaron

On Sun, 16 Nov 2008, Nathan Silberman wrote:
> I just downloaded the latest src of JBoost and found 2 possible bugs:
>
> 1. EqualitySplitterBuilder
> This file imports "import sun.tools.tree.CastExpression;" which it doesn't
> use. Additionally, the ant script doesn't specify that this jar file should
> be pulled in from anywhere, so this import should really be removed to avoid
> compilation problems.
>
> 2. InequalitySplitterBuilder
> Starting on line 251, there is a dubious assignment:
>
>     if (cutIndex==-1) {
>       cutIndex=-1;
>       System.out.println(this);
>     }
>
> Did somebody mean to reassign cutIndex to 1 instead?
>
>     if (cutIndex==-1) {
>       cutIndex=1;  // possible correction?
>       System.out.println(this);
>     }
>
> The current code causes an ArrayIndexOutOfBoundsException.
>
> Nathan
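For reference, a minimal hypothetical variant of that guard (not the change committed to CVS), assuming one would rather fail fast with the builder state in the exception than print it and then hit the ArrayIndexOutOfBoundsException later; the helper name and the Object parameter are illustrative only:

    // Hypothetical alternative, not the committed JBoost change: fail fast with
    // the builder's state in the exception message instead of printing it and
    // then dying later with an ArrayIndexOutOfBoundsException.
    final class CutIndexCheck {
      static void checkCutIndex(int cutIndex, Object builderState) {
        if (cutIndex == -1) {
          throw new IllegalStateException(
              "cutIndex was never set by the split search: " + builderState);
        }
      }
    }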
From: Nathan S. <nat...@gm...> - 2008-11-16 19:35:30
I just downloaded the latest src of JBoost and found 2 possible bugs:

1. EqualitySplitterBuilder
This file imports "import sun.tools.tree.CastExpression;" which it doesn't use. Additionally, the ant script doesn't specify that this jar file should be pulled in from anywhere, so this import should really be removed to avoid compilation problems.

2. InequalitySplitterBuilder
Starting on line 251, there is a dubious assignment:

    if (cutIndex==-1) {
      cutIndex=-1;
      System.out.println(this);
    }

Did somebody mean to reassign cutIndex to 1 instead?

    if (cutIndex==-1) {
      cutIndex=1;  // possible correction?
      System.out.println(this);
    }

The current code causes an ArrayIndexOutOfBoundsException.

Nathan
From: Sina J. <si...@CS...> - 2008-10-23 03:13:03
Hi all,

I have completed the package for running JBoost in Weka. It needs no change in the JBoost code (jboost.java) and almost no change in the Weka code (weka.jar), except for updating the two .props files in Weka that list the directories where the classifiers are located. It also needs the concurrent.jar and junit.jar files, since the JBoost code depends on them, but again no code change is needed. I've provided a doc file as well as a readme describing how the code works and how to install the wekajboost.jar file (the wrapper). I have tested the code with several different arff files, and I would be happy to get feedback about it.

The package is at http://www.cs.princeton.edu/~sina/wjboost/wekajboost.zip , and small zipped source files are at http://www.cs.princeton.edu/~sina/wjboost/wekajboostsrc.zip

Best,
--Sina
From: Aaron A. <aa...@cb...> - 2008-09-30 16:30:38
Dear Vimal,

You can download JBoost and use BrownBoost from the command line. To determine the initial potential (the final probability of giving up on examples), you can use

    ./jboost -S filestem -b BrownBoost -r 0.9 -potential

After you determine a valid value for "r", you can run the algorithm via

    ./jboost -S filestem -b BrownBoost -r 0.9

For example, the following seems to work alright:

    ./jboost -b BrownBoost -S demo/noisy_line -r 0.8

Let me know if you have any BrownBoost-related questions.

Aaron

On Tue, 30 Sep 2008, Vimal Vaghela wrote:
> Respected Sir,
>
> I am doing my research work on boosting methods, and specifically on AdaBoost.
> I need the BrownBoost algorithm that you have used in the JBoost software,
> so please provide it so I can do more work on that.
> I am waiting for your positive reply.
> Thanking you,
> Vimal Vaghela
From: Yoav F. <yoa...@gm...> - 2008-09-30 15:54:30
Dear Vimal,

The BrownBoost / NormalBoost code is still in development; we are working on it right now, and it will be available in a few months.

Best,
Yoav Freund

On Sep 30, 2008, at 2:09 AM, Vimal Vaghela wrote:
> Respected Sir,
>
> I am doing my research work on boosting methods, and specifically on AdaBoost.
> I need the BrownBoost algorithm that you have used in the JBoost software,
> so please provide it so I can do more work on that.
> I am waiting for your positive reply.
> Thanking you,
> Vimal Vaghela
From: Vimal V. <vim...@ya...> - 2008-09-30 09:12:42
Respected Sir,

I am doing my research work on boosting methods, and specifically on AdaBoost. I need the BrownBoost algorithm that you have used in the JBoost software, and some statistics related to BrownBoost, so please provide them so I can do more work on that. I am waiting for your positive reply.

Thanking you,
Vimal Vaghela
From: Vimal V. <vim...@ya...> - 2008-09-30 09:10:05
Respected Sir,

I am doing my research work on boosting methods, and specifically on AdaBoost. I need the BrownBoost algorithm that you have used in the JBoost software, so please provide it so I can do more work on that. I am waiting for your positive reply.

Thanking you,
Vimal Vaghela
From: Aaron A. <aa...@cs...> - 2008-07-04 22:44:08
Hi Sina,

Nice bug catch. The fix has been committed to the CVS repository.

FYI, in the next month there are going to be a couple of restructurings of the JBoost code (ControllerConfiguration.java removed and its functionality moved to Configuration.java, all python/perl code replaced with R, and maybe a couple of other small changes). I'm not sure if the WEKA interface code relies on any of these classes.

Also, I received your other email about adding weak learners to JBoost from WEKA. I'll check into this some over the weekend and get back to you soon. Give me a ping on Monday if you haven't heard back.

Aaron

From: Sina Jafarpour <sina@pr...> - 2008-07-04 18:00
> Hi all,
>
> Just about the -info argument: it seems that the PrintWriter inforStem gets
> its file name from the outputStem parameter instead of the infoFilename, and
> hence the infoFilename variable is never used in the program. It seems to be
> OK whenever we use the -S parameter, but for -n -t -info parameters given
> separately it ignores the -info parameter and uses nonam_out. I will be
> grateful if you have a look at it (class Monitor, public constructor,
> inforFilename=...).
>
> Bests, --Sina
From: Sina J. <si...@pr...> - 2008-07-04 18:00:47
Hi all,

Just about the -info argument: it seems that the PrintWriter inforStem gets its file name from the outputStem parameter instead of the infoFilename, and hence the infoFilename variable is never used in the program. It seems to be OK whenever we use the -S parameter, but for -n -t -info parameters given separately it ignores the -info parameter and uses nonam_out. I will be grateful if you have a look at it (class Monitor, public constructor, inforFilename=...).

Bests, --Sina
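A hedged illustration of the pattern being reported (the class and field names below are placeholders, not the actual JBoost Monitor code): the info writer is opened from the output stem, so a separately supplied -info filename is silently ignored.

    import java.io.FileWriter;
    import java.io.IOException;
    import java.io.PrintWriter;

    // Illustrative sketch only -- not the real Monitor class in JBoost.
    class MonitorSketch {
      private final PrintWriter infoStream;

      MonitorSketch(String outputStem, String infoFilename) throws IOException {
        // Buggy pattern reported above: the writer is derived from outputStem,
        // so infoFilename is never used and -info is effectively ignored.
        //   infoStream = new PrintWriter(new FileWriter(outputStem + ".info"));
        // Likely intended behavior: honor the -info argument when it is given.
        infoStream = new PrintWriter(new FileWriter(infoFilename));
      }
    }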
From: Sina J. <si...@pr...> - 2008-07-04 03:56:09
Dear all,

As part of my research with Prof. Rob Schapire, I needed to run some experiments combining boosting meta-learning algorithms with several different base learners. As a result, I decided to write a wrapper to connect JBoost to Weka. Right now, a class extending the Classifier class in Weka converts the data from the arff format and runs a modification of the Connector class in JBoost, so I can already run JBoost from Weka (although it is still a preliminary version that does not deal with many special cases).

However, the challenging part is to run Weka classifiers as the weak learners in JBoost. I have written the converter to convert back from the JBoost input and output formats to Weka. However, I could not find out exactly how the special weak learning classifier (ADTree) works and which interfaces have to be changed in order to be able to run alien (Weka) classifiers inside the JBoost code. I would be very grateful if you could help me with this.

Thanks in advance,
--Sina Jafarpour
http://www.cs.princeton.edu/~sina
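For readers who want the shape of such a wrapper, here is a minimal, hypothetical skeleton of a Weka classifier (older abstract-class Classifier API) that delegates to an external booster. Only the Weka side (buildClassifier / classifyInstance) is shown; the JBoost calls are represented by placeholder comments, since the actual Connector interface is not shown in this thread, and the class name is made up.

    import weka.classifiers.Classifier;
    import weka.core.Instance;
    import weka.core.Instances;

    // Hypothetical skeleton, not Sina's wekajboost code: a Weka classifier
    // that would hand the training data to an external booster such as JBoost.
    public class JBoostWekaWrapper extends Classifier {

      public void buildClassifier(Instances data) throws Exception {
        // Placeholder: convert 'data' from Weka's arff/in-memory representation
        // to the booster's input format, then invoke the booster's training code.
      }

      public double classifyInstance(Instance instance) throws Exception {
        // Placeholder: convert the instance, score it with the boosted model,
        // and map the score back to a Weka class index.
        return 0.0;
      }
    }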
From: Aaron A. <aa...@cs...> - 2007-08-20 22:17:45
On Sat, 18 Aug 2007, Haiyong Xu wrote:
> Thanks for your reply.
>
> Currently, I am involved in a project using pattern recognition
> techniques, and we are building an ensemble system based on the AdaBoost
> method. JBoost is the only library I found that could be practically
> employed in our project. Therefore, I'd like to dig into the detailed
> implementation to see how to adapt it to our requirements.

Great!

> Through reading part of the source code, mainly AdaBoost.java,
> InequalitySplitterBuilder.java, and InstrumentedAlternatingTree.java, I
> found that when choosing a candidate splitter, AdaBoost.java always
> chooses the one that minimizes a loss criterion, which is defined as
> "2*Sqrt(w_0 * w_1) - w_0 - w_1" in the function BinaryBag::getLoss().
> Actually, this minimization is not exactly the same as the one defined
> in Freund's original paper "The alternating decision tree learning
> algorithm" (1999). (See steps 1 and 2 in Figure 3.)

That is absolutely correct. This is because the weak classifiers c1 and c2 are evaluated in a separate module and the booster is only informed of the (weighted) errors w.r.t. the classes. Using the notation of Freund & Mason 1999, the booster is only told of \sum W- and \sum W+. Then there are two prediction weights determined in two separate phases (one for "YES/TRUE" and one for "NO/FALSE"). This is reminiscent of where this loss function originally came from (Schapire & Singer, "Improved boosting algorithms using confidence-rated predictions", 1999), which actually might be a better reference for understanding much of the reasoning behind the ADTree paper (Freund & Mason 1999). While the loss function is not identical, the motivation is to use a loss which will best separate the classes, not necessarily give the lowest weighted error w.r.t. some potential function (which was the reasoning in the original AdaBoost paper).

You can also implement other loss functions. Two that are on the todo list (http://www.cs.ucsd.edu/~aarvey/jboost/todo.html) are weighted error and information gain. There are many other loss functions that can be used, although the Z-loss of Schapire & Singer 99 has some very nice properties.

> Also, I am still trying to figure out the connection between ADTs and
> the AdaBoost algorithm. As we know, the final prediction in AdaBoost is
> f = a1*h1 + a2*h2 + a3*h3 + ... + aT*hT, where a1 is a coefficient and
> h1 is a weak learner. In an ADT, the final prediction has a similar form
> as the sum of prediction values. Here, I suppose the prediction value in
> an ADT is equivalent to a1*h1 in AdaBoost.

Not exactly. I think the best way to understand the evaluation process of ADTrees is by example. There is an example provided on Wikipedia at http://en.wikipedia.org/wiki/Alternating_decision_tree. There is another example in http://www.cse.ucsd.edu/~yfreund/papers/final-geneclass.pdf. The major difference in ADTrees is that h_{t} may depend on the evaluation of h_{1}...h_{t-1}. Namely, each path is a conjunction of weak hypotheses. For instance, not all examples are evaluated by the t^{th} hypothesis; only those examples that make it to the decision node of the t^{th} hypothesis are actually evaluated. This allows very weak classes of hypotheses (e.g. decision stumps) to create fairly complex hypotheses (e.g. decision rectangles) without doing the large number of iterations needed to create the same classifier with standard boosting. While this may lead to overfitting, it does not seem to do so empirically.

> The margin-based generalization bound analysis proposed by Schapire is a
> very nice thing. And in recent years, many researchers have put effort
> into searching for algorithms that maximize the margin. See papers:
>
> "Soft Margins for AdaBoost", Ratsch, 17th Annual Conference on
> Computational Learning Theory, 2004.
>
> "Boosting Based on a Smooth Margin", Rudin, 17th Annual Conference on
> Computational Learning Theory, 2004.
>
> "Analysis of Boosting Algorithms Using the Smooth Margin Function",
> Rudin, submitted to the Annals of Statistics, 2007.
>
> And I would like to make use of those margin-maximizing algorithms to
> build an AdaBoost ensemble system. At this time, I am thinking about
> implementing it on the basis of the JBoost architecture.

I have actually not read these three particular papers (they're certainly on my stack). However, I know that a pitfall of some "margin maximizing" algorithms is that they maximize the *minimal* margin, not the margin distribution (which is perhaps somewhat ill defined for maximizing). Also keep in mind that while margin maximization is very important, it may not be the only explanation for why AdaBoost generalizes so well (see Reyzin & Schapire 06). Also, maximizing the minimum margin may allow noisy data to severely impair the learning process.

I believe that the "Soft margins..." (Ratsch) paper discusses noise-resistant boosting. While JBoost doesn't have the exact algorithm of Ratsch, BrownBoost (use the CVS version; there may be small bugs in version 1.3.1) can be used to be noise resistant. This works by setting the exact amount of noise you want to allow, which can be tweaked quickly by using the -potential flag. You can then visualize the creation of the margin distribution using the surfing.py script (again, use the CVS version). This may not be exactly what you need, but it may give some intuition as to what is happening in the "boosting the margin" interpretation of boosting.

If you wish to implement a "margin maximizing" booster, I'd recommend starting with one of the simpler methods, e.g. LPBoost. Also, since some of these methods require keeping all previous hypotheses available, I'd recommend keeping the data structures (matrices, etc.) in the booster class. See the (inefficient, though correct) BrownBoost implementation for an example of how to keep past hypotheses stored in the booster class.

> Thank you for your maintenance of the open-sourced JBoost project, and I
> hope we can cooperate on the JBoost project.

I'm glad to hear that you're interested in contributing! I'm not sure if you've contributed to open source projects in the past; either way, I'd recommend reading http://www.cs.ucsd.edu/~aarvey/jboost/contribute.html (I just added some more content to it) and some of the links.

Cheers,
Aaron

> On 8/17/07, Aaron Arvey <aa...@cs...> wrote:
>> Apologies for the late response. See inline comments.
>>
>> On Wed, 15 Aug 2007, Haiyong Xu wrote:
>>
>>> I found the current implementation of JBoost mainly focuses on the
>>> Alternating Decision Tree (ADT), which means the weak learners are
>>> similar to decision stumps.
>>
>> ADTrees are capable of using any weak learner. The nice thing about using
>> decision stumps with ADTrees is that since each path is a conjunction,
>> ADTrees are able to form quite a rich representation with a limited base
>> class of hypotheses. See the original paper for more details:
>> http://www.cse.ucsd.edu/~yfreund/papers/atrees.pdf
>>
>>> Is there any way to construct a boosting system based on another type of
>>> weak learner, say neural networks? Thanks.
>>
>> In JBoost, a weak learner (typically something similar to a decision
>> stump) can be replaced by just about anything, including neural networks.
>> If you wanted to implement a neural network weak learner, you could mimic
>> the existing weak learners at ./src/jboost/learner/. The easiest weak
>> learner to understand is InequalitySplitter.java (which is created by
>> InequalitySplitterBuilder.java). A simple neural net implementation
>> should be fairly straightforward in this framework.
>>
>> Note that there are a variety of reasons why boosting neural nets may not
>> be the best idea. The margin analysis given by Schapire et al (1998,
>> Annals of Statistics) gives generalization bounds in terms of the VC
>> dimension of the base classifier; while these bounds are certainly quite
>> loose, they do provide some intuition why boosting may not overfit.
>>
>> Also note that JBoost can boost decision trees with the command
>>
>>     ./jboost ... -ATreeType ADD_ROOT_OR_SINGLES
>>
>> You also have the other options:
>>
>>     -ATreeType type    The type of ATree to create. There are several options:
>>       ADD_ALL              Create a full ADTree (default)
>>       ADD_ROOT             Add splits only at the root, producing a flat tree.
>>                            This is equivalent to boosting decision stumps.
>>       ADD_SINGLES          Create a decision tree.
>>       ADD_ROOT_OR_SINGLES  Create a linear combination of decision trees.
>>                            This is equivalent to simultaneously growing
>>                            boosted decision trees.
>>
>> An existing implementation of boosting with neural nets (and many other
>> weak classifiers) exists in WEKA. Last time I checked, neural nets were
>> under the name of Multilayer Perceptron.
>>
>> Aaron
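As a concrete illustration of the loss criterion discussed in the message above, here is a small self-contained sketch (not JBoost's BinaryBag/AdaBoost code) that computes the Z-style loss 2*sqrt(w0*w1) - w0 - w1 for each branch of a candidate split from the weighted class totals; the split minimizing the summed loss is the one a confidence-rated booster in the Schapire & Singer style would prefer. The class name and the toy weights are made up for the example.

    // Illustrative sketch only -- not JBoost source code.
    // For one "bag" (one branch of a candidate split), w0 and w1 are the total
    // weights of the negative and positive examples that fall into that branch.
    // The bag contributes 2*sqrt(w0*w1) to the Schapire & Singer Z criterion;
    // subtracting (w0 + w1) expresses it as a (non-positive) change relative
    // to leaving those examples unsplit.
    public class ZLossSketch {

      static double bagLoss(double w0, double w1) {
        return 2.0 * Math.sqrt(w0 * w1) - w0 - w1;
      }

      // A candidate binary split is summarized by its two bags; the split with
      // the smallest (most negative) total loss separates the classes best.
      static double splitLoss(double leftW0, double leftW1,
                              double rightW0, double rightW1) {
        return bagLoss(leftW0, leftW1) + bagLoss(rightW0, rightW1);
      }

      public static void main(String[] args) {
        // Toy weights: a split that isolates most of the positive weight on one
        // side scores lower (better) than an uninformative, balanced split.
        System.out.println(splitLoss(0.40, 0.05, 0.10, 0.45));  // about -0.29
        System.out.println(splitLoss(0.25, 0.25, 0.25, 0.25));  // 0.0
      }
    }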
From: Aaron A. <aa...@cs...> - 2007-08-17 22:16:22
Apologies for the late response. See inline comments.

On Wed, 15 Aug 2007, Haiyong Xu wrote:
> I found the current implementation of JBoost mainly focuses on the
> Alternating Decision Tree (ADT), which means the weak learners are
> similar to decision stumps.

ADTrees are capable of using any weak learner. The nice thing about using decision stumps with ADTrees is that since each path is a conjunction, ADTrees are able to form quite a rich representation with a limited base class of hypotheses. See the original paper for more details: http://www.cse.ucsd.edu/~yfreund/papers/atrees.pdf

> Is there any way to construct a boosting system based on another type of
> weak learner, say neural networks? Thanks.

In JBoost, a weak learner (typically something similar to a decision stump) can be replaced by just about anything, including neural networks. If you wanted to implement a neural network weak learner, you could mimic the existing weak learners at ./src/jboost/learner/. The easiest weak learner to understand is InequalitySplitter.java (which is created by InequalitySplitterBuilder.java). A simple neural net implementation should be fairly straightforward in this framework.

Note that there are a variety of reasons why boosting neural nets may not be the best idea. The margin analysis given by Schapire et al (1998, Annals of Statistics) gives generalization bounds in terms of the VC dimension of the base classifier; while these bounds are certainly quite loose, they do provide some intuition why boosting may not overfit.

Also note that JBoost can boost decision trees with the command

    ./jboost ... -ATreeType ADD_ROOT_OR_SINGLES

You also have the other options:

    -ATreeType type    The type of ATree to create. There are several options:
      ADD_ALL              Create a full ADTree (default)
      ADD_ROOT             Add splits only at the root, producing a flat tree.
                           This is equivalent to boosting decision stumps.
      ADD_SINGLES          Create a decision tree.
      ADD_ROOT_OR_SINGLES  Create a linear combination of decision trees.
                           This is equivalent to simultaneously growing
                           boosted decision trees.

An existing implementation of boosting with neural nets (and many other weak classifiers) exists in WEKA. Last time I checked, neural nets were under the name of Multilayer Perceptron.

Aaron
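To make the "decision stump as weak learner" idea concrete, here is a hedged, self-contained sketch of what an inequality-style stump does. It is not JBoost's actual InequalitySplitter API (the interface and class names below are invented), but it shows the contract a neural-network replacement would have to fill: map an example to a branch.

    // Illustrative sketch only -- not the actual jboost.learner interfaces.
    // A weak splitter in this setting just routes an example to a branch index;
    // an ADTree attaches a prediction weight to each branch.
    interface WeakSplitter {
      /** Returns 0 or 1 for the branch the example falls into. */
      int evaluate(double[] example);
    }

    // The simplest instance: an inequality (decision stump) on one feature.
    class InequalityStumpSketch implements WeakSplitter {
      private final int featureIndex;
      private final double threshold;

      InequalityStumpSketch(int featureIndex, double threshold) {
        this.featureIndex = featureIndex;
        this.threshold = threshold;
      }

      public int evaluate(double[] example) {
        return example[featureIndex] < threshold ? 0 : 1;
      }
    }

    // A neural-network weak learner would implement the same contract, e.g. by
    // thresholding the network's output; training it against the boosting
    // weights is the part that would need to hook into the builder classes.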
From: Haiyong X. <hai...@gm...> - 2007-08-15 14:40:53
Hi there,

I found the current implementation of JBoost mainly focuses on the Alternating Decision Tree (ADT), which means the weak learners are similar to decision stumps. Is there any way to construct a boosting system based on another type of weak learner, say neural networks? Thanks.

-- Haiyong