Learner example

2010-10-28 – 2012-09-14
  • Nobody/Anonymous

    Can you give me a complete example of how to train a GDecisionTree within a
    program, using an array of floats with one target attribute, and how to
    predict results with it? I'm not sure what I should do, because there are no
    examples on the site and some things aren't obvious to me.

     
  • Nobody/Anonymous

    Here is an example. Please let me know if this is not clear, or if you want it
    to do something else.

    // Train a decision tree
    GData d(3); // 3 columns (the last one is the target)
    d.newRows(5); // 5 rows
    d[0][0] = 1.4; d[0][1] = 7.3; d[0][2] = 4.0;
    d[1][0] = 2.4; d[1][1] = 6.2; d[1][2] = 4.0;
    d[2][0] = 5.1; d[2][1] = 5.8; d[2][2] = 5.0;
    d[3][0] = 2.4; d[3][1] = 4.7; d[3][2] = 5.0;
    d[4][0] = 0.2; d[4][1] = 6.3; d[4][2] = 4.0;
    GRand prng(0);
    GDecisionTree tree(&prng);
    tree.train(&d, 1); // 1 label (output) column

    // Test it
    double v[3];
    v[0] = 5.0; v[1] = 5.7;
    tree.predict(v, v + 2); // inputs in v[0..1], prediction written to v[2]
    cout << "Prediction: " << v[2] << "\n";

     
  • Nobody/Anonymous

    Hi. I discovered Waffles a few days ago when I was looking for a decision
    tree classifier implementation.

    I could use decision trees through your command-line "learner" utility, but
    I didn't manage to understand your API well enough to use them from my code.
    Your last example has been very helpful, and I want to ask you a question
    about it: what modifications should I make if I need to work with
    categorical attributes? And what if the target attribute is categorical?

    Thanks in advance! I would really appreciate more simple examples like this
    explaining the API usage!

    Regards from Argentina.

     
  • Mike Gashler

    Mike Gashler - 2010-11-04

    The GDecisionTree class automatically handles both continuous and nominal
    attributes, so all you need to do is put nominal (categorical) attributes in
    your data, and it should just work.

    Often, people like to store their data in the text-based ARFF format. Here is
    an example ARFF file:

    @RELATION mydata

    @ATTRIBUTE x { red, green, blue }

    @ATTRIBUTE y continuous

    @ATTRIBUTE z { true, false }

    @DATA

    red,3.4,true

    green,1.0,false

    red,2.9,true

    blue,5.5,true

    Then, you can load this data and train a decision tree with code like this:

    GData* pData = GData::loadArff("mydata.arff");

    Holder<GData> hData(pData);

    GRand prng(0);

    GDecisionTree tree(&prng);

    tree.train(pData, 1);

    The first line loads the data. The second line makes sure it gets deleted
    later. The third line makes a pseudo-random number generator. The fourth line
    makes a decision tree. The fifth line trains it.

    You could also construct this same data in code like this:

    GMixedRelation* pRel = new GMixedRelation();

    pRel->addAttribute(3); // 3 categories

    pRel->addAttribute(0); // continuous

    pRel->addAttribute(2); // 2 categories

    sp_relation spRel = pRel;

    GData data(spRel);

    data.newRows(4);

    data.row(0)[0] = 0; data.row(0)[1] = 3.4; data.row(0)[2] = 0; // red, 3.4, true
    data.row(1)[0] = 1; data.row(1)[1] = 1.0; data.row(1)[2] = 1; // green, 1.0, false
    data.row(2)[0] = 0; data.row(2)[1] = 2.9; data.row(2)[2] = 0; // red, 2.9, true
    data.row(3)[0] = 2; data.row(3)[1] = 5.5; data.row(3)[2] = 0; // blue, 5.5, true

    GRand prng(0);

    GDecisionTree tree(&prng);

    tree.train(pData, 1);

     
  • Nobody/Anonymous

    (Sorry, I should have built that code before I posted it. To make it build,
    you will need to replace "addAttribute" with "addAttr", and replace "pData"
    with "&data".)

     
  • Nobody/Anonymous

    Hi, I also have a question about the GDecisionTree class. I am writing a
    program that draws the tree. After I train the tree, how can I get
    information about its structure? I mean the nodes, leaves, and split
    functions.

     
  • Mike Gashler

    Mike Gashler - 2010-11-18

    The waffles_plot tool has a command called "printdecisiontree" that will
    print an ASCII representation of the decision tree model to the console.
    Here is an example of how to use it:

    waffles_learn train mydata.arff decisiontree > dt.twt

    waffles_plot printdecisiontree dt.twt mydata.arff

     
  • Mike Gashler

    Mike Gashler - 2010-11-18

    ...if you want to do it in code, instead of using the command-line tools, the
    interface you want may not be implemented. If you take a look at how the
    GDecisionTree::print method works, you will see where such an interface could
    be added.

     
  • Nobody/Anonymous

    I have read all the documentation, but I still can't figure out what that
    random value stands for when you create a tree. In the example above:

    GRand prng(0);

    GDecisionTree tree(&prng);

    Why is it always needed for creating a tree? And why, in this example, is
    it always equal to "0"?

     
  • Nobody/Anonymous

    Sorry, I just noticed that I never replied to this question. I'm sure it is
    now too late, but I will answer in case someone else has the same question...

    The value "0" is a seed for the pseudo-random number generator (PRNG). The
    GDecisionTree class requires a reference to a PRNG because the user may
    specify for it to make random divisions. If you use the tree with its
    default parameters, it will not really use the PRNG that you supply to the
    constructor (except maybe in obscure cases, like to break ties, etc.).

    I suppose I could add a constructor that does not require this parameter,
    but then the user would be limited to options that never require a PRNG for
    any reason.

     
