
#17 Gini is incorrectly calculated

open
nobody
Methods (3)
5
2010-03-18
2010-03-18
Anonymous
No

I have found that the Gini index is calculated incorrectly, although only slightly. The current implementation computes p(1-p), where p is the signal purity defined as s/(s+b). The text "Classification and Regression Trees" by Breiman states on page 103 that the Gini index should be 2p(1-p) for the two-class problem, so the TMVA implementation is missing the leading factor of 2. The TMVA documentation for the Gini index also gives 2p(1-p).

I know this is a small change that only scales all values uniformly by a constant factor. The line that needs to be adjusted is line 59 of the file GiniIndex.cxx.
