Hi,
There is a piece of code in GDecisionTree.cpp (lines 656-658):
.........
std::swap(attrPool, attrPool);
attrPool.erase(attrPool.end() - 1);
}
It's intended to remove the attribute from the pool, and that should happen
for both nominal and continuous attributes, so it should be outside the
braces, right?
Thanks,
tangyan
For categorical attributes, it only makes sense to split on each attribute
once (because each child branch will be homogeneous in that attribute). For
continuous values, it makes sense to split on the same attribute multiple
times (because splitting does not make the child branches homogeneous in that
attribute). So, I think it is better the way it is.
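To illustrate the point, here is a minimal sketch of removing an attribute from the pool only when it is categorical. This is not the actual Waffles code: the helper name `removeIfCategorical`, its parameters, and the use of plain indices for attributes are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Waffles: removes the attribute at
// position `pos` from the pool only when it is categorical (nominal).
// Returns true if the attribute was removed.
bool removeIfCategorical(std::vector<std::size_t>& attrPool,
                         std::size_t pos, bool isCategorical)
{
    assert(pos < attrPool.size());
    if (!isCategorical)
        return false; // continuous: keep it, a later split may use another threshold
    // Swap-and-pop: constant-time removal when the pool order does not matter.
    std::swap(attrPool[pos], attrPool.back());
    attrPool.pop_back();
    return true;
}
```

With a pool of {0, 1, 2, 3}, calling this on position 1 leaves the pool unchanged for a continuous attribute, and leaves {0, 3, 2} for a categorical one.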
Thanks, I got it.
Another question, sorry: in line 670, why not push attr back to attrPool
before returning the new leaf node, just like what you did in lines 675-676?
Oh, that looks like a bug. Thank you for finding it! I put a fix in the git
repository.
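For anyone reading along, one way to guard against this kind of bug is to restore the pool from a small scope guard, so that every return path puts the attribute back. This is only a sketch and not necessarily the fix that went into the repository: `PoolRestorer` and `buildStep` are hypothetical names, and attributes are represented as plain indices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical RAII guard (not Waffles code): pushes `attr` back into the
// pool when it goes out of scope, so every return path restores the pool.
class PoolRestorer
{
public:
    PoolRestorer(std::vector<std::size_t>& pool, std::size_t attr)
        : m_pool(pool), m_attr(attr) {}
    ~PoolRestorer() { m_pool.push_back(m_attr); }
    PoolRestorer(const PoolRestorer&) = delete;
    PoolRestorer& operator=(const PoolRestorer&) = delete;
private:
    std::vector<std::size_t>& m_pool;
    std::size_t m_attr;
};

// Stand-in for one recursive tree-building step. It takes the last
// attribute out of the pool, then returns early (leaf) or late (interior).
// Returns 0 for a leaf and 1 for an interior node.
int buildStep(std::vector<std::size_t>& attrPool, bool makeLeaf)
{
    assert(!attrPool.empty());
    std::size_t attr = attrPool.back();
    attrPool.pop_back();
    PoolRestorer restore(attrPool, attr);
    if (makeLeaf)
        return 0; // early return: the guard still restores the pool
    // ... child nodes would be built here using the reduced pool ...
    return 1;
}
```

Whichever branch is taken, the pool has the same contents after `buildStep` returns as it had before the call.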
Thanks for your great work. Waffles is just what I want.