Using GRandomForest to classify handwritten digits

Help
Anonymous
2012-10-08
2012-10-09

  • Anonymous
    2012-10-08

    I'm trying to use GRandomForest to classify handwritten digits from 0 to 9.

    Given N training images, I extract 12 attributes from each one and train the random forest with the Nx12 matrix of extracted attributes (features) and the Nx1 GMatrix of labels (the digit shown in image n).

    When I call predict I get a double like 3.479 instead of an integer from 0 to 9.

    How can the GRandomForest pick up on the fact that I need categorical classification, not a continuous prediction?

     
  • Mike Gashler
    2012-10-09

    The meta-data of the label matrix that you use for training determines whether the labels are categorical or continuous.
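
    A fractional prediction such as 3.479 is what one would expect when an ensemble treats the label as continuous and averages the outputs of its trees, whereas a categorical label is normally resolved by a vote. The sketch below illustrates that difference only. It is not Waffles code: `averagePrediction` and `majorityVote` are hypothetical helper names, and it assumes the usual bagging behaviour rather than describing how GRandomForest is implemented internally.

```cpp
#include <cstddef>
#include <map>
#include <numeric>
#include <vector>

// Hypothetical helper: continuous label, so the per-tree outputs are averaged.
// Returns 0.0 for an empty input.
double averagePrediction(const std::vector<double>& treeOutputs)
{
    if (treeOutputs.empty())
        return 0.0;
    double sum = std::accumulate(treeOutputs.begin(), treeOutputs.end(), 0.0);
    return sum / static_cast<double>(treeOutputs.size());
}

// Hypothetical helper: categorical label, so the per-tree outputs are
// combined by majority vote. Ties go to the smallest class index.
// Returns -1 for an empty input.
int majorityVote(const std::vector<int>& treeOutputs)
{
    std::map<int, std::size_t> counts;
    for (int label : treeOutputs)
        ++counts[label];

    int best = -1;
    std::size_t bestCount = 0;
    // std::map iterates in ascending key order, so a strict '>' keeps
    // the smallest label when two labels have the same count.
    for (const auto& entry : counts)
    {
        if (entry.second > bestCount)
        {
            best = entry.first;
            bestCount = entry.second;
        }
    }
    return best;
}
```

    For example, tree outputs of 3, 3, 4 and 4 average to 3.5, while votes of 3, 3, 4, 3 and 9 give the class 3.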

    If you load your training data from ARFF format, you can change the line:

    @ATTRIBUTE label continuous
    

    to

    @ATTRIBUTE label {0,1,2,3,4,5,6,7,8,9}
    

    If you load your training data from CSV format (which has no meta-data), then it assumes an attribute is continuous if all of its values are numbers. You can insert a non-numeric value (for example, change "9" to "nine") to make it treat this attribute as categorical.
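
    The CSV rule described above can be sketched as follows. This is an illustration of the heuristic, not the Waffles loader itself: `isNumeric` and `isContinuousColumn` are hypothetical names, and using `std::strtod` to decide what counts as a number is an assumption.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: true if the entire string parses as a number.
bool isNumeric(const std::string& s)
{
    if (s.empty())
        return false;
    char* end = nullptr;
    std::strtod(s.c_str(), &end);
    // The value is numeric only if parsing consumed every character.
    return end == s.c_str() + s.size();
}

// Hypothetical helper: a column is treated as continuous only if every
// value in it is numeric. A single non-numeric value makes it categorical.
bool isContinuousColumn(const std::vector<std::string>& values)
{
    for (const std::string& v : values)
    {
        if (!isNumeric(v))
            return false;
    }
    return true;
}
```

    With this rule, a column containing 0, 1 and 9 counts as continuous, and the same column with "nine" in place of "9" counts as categorical.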

    If you construct your data manually, then you need to do something like this:

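    // Describe a single label attribute with 10 categorical values (digits 0-9).
    GMixedRelation* pMyRelation = new GMixedRelation();
    pMyRelation->addAttr(10);
    // Wrap the relation in a smart pointer and build the label matrix from it.
    sp_relation spMyRelation = pMyRelation;
    GMatrix myLabels(spMyRelation);
    myLabels.newRows(n); // n is the number of training images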

     

  • Anonymous
    2012-10-09

    Excellent. I'm constructing the data manually, and that last step worked like a charm. Thank you very much.

     

