Using GRandomForest to classify handwritten digits

  • Anonymous - 2012-10-08

    I'm trying to use GRandomForest to classify handwritten digits from 0 to 9.

    Given N training images I extract 12 attributes and train the random forest with the Nx12 matrix of extracted attributes (features) and the Nx1 GMatrix of labels (digit represented by image n).

    When I call predict I get a double like 3.479 instead of an integral value from 0-9.

    How can the GRandomForest pick up on the fact that I need categorical classification, not a continuous prediction?

  • Mike Gashler - 2012-10-09

    The meta-data of the label matrix that you use for training determines whether the labels are categorical or continuous.
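    To illustrate why the label type matters, here is a minimal, self-contained sketch of how a bagged ensemble typically combines per-tree outputs. It is not taken from the GRandomForest source; the function names combineAsContinuous and combineAsCategorical are hypothetical. Averaging digit labels can produce a value between classes (much like the 3.479 above), while voting always returns one of the classes.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical helper: combine per-tree outputs for a continuous label
// by averaging. With digit labels, the mean can fall between classes.
double combineAsContinuous(const std::vector<double>& treeOutputs) {
    if (treeOutputs.empty()) return 0.0;
    double sum = 0.0;
    for (double v : treeOutputs) sum += v;
    return sum / static_cast<double>(treeOutputs.size());
}

// Hypothetical helper: combine per-tree outputs for a categorical label
// by majority vote. Ties go to the smallest class index (an arbitrary
// choice for this sketch). Returns -1 if there are no votes.
int combineAsCategorical(const std::vector<int>& treeVotes) {
    std::map<int, std::size_t> counts;
    for (int v : treeVotes) ++counts[v];
    int best = -1;
    std::size_t bestCount = 0;
    for (const auto& kv : counts) {
        if (kv.second > bestCount) {
            best = kv.first;
            bestCount = kv.second;
        }
    }
    return best;
}
```

    For example, five trees voting 3, 3, 4, 3, 5 average to a non-integer value, whereas the majority vote is the digit 3.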

    If you load your training data from ARFF format, you can change the line:

    @ATTRIBUTE label continuous

    to

    @ATTRIBUTE label {0,1,2,3,4,5,6,7,8,9}

    If you load your training data from CSV format (which has no meta-data), then the loader assumes an attribute is continuous if all of its values are numbers. You can insert a non-numeric value (for example, change "9" to "nine") to make it treat this attribute as categorical.
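    The CSV rule can be sketched as follows. This is only an illustration of the rule as stated here, not the library's actual parser; isNumeric and columnLooksContinuous are hypothetical names, and "numeric" here means whatever std::strtod accepts.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Returns true if the entire string is consumed by std::strtod,
// i.e. it parses as a number by strtod's rules.
bool isNumeric(const std::string& s) {
    if (s.empty()) return false;
    char* end = nullptr;
    std::strtod(s.c_str(), &end);
    return end == s.c_str() + s.size();
}

// Returns true if every value in the column is numeric, in which case
// the column would be treated as continuous under the rule sketched here.
// A single non-numeric value makes the column categorical.
bool columnLooksContinuous(const std::vector<std::string>& column) {
    for (const std::string& value : column) {
        if (!isNumeric(value)) return false;
    }
    return true;
}
```

    Under this sketch, a label column containing "0", "7", "9" looks continuous, while "0", "7", "nine" does not.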

    If you construct your data manually, then you need to do something like this:

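    GMixedRelation* pMyRelation = new GMixedRelation();
    pMyRelation->addAttr(10); // one categorical label attribute with 10 values (digits 0-9)
    sp_relation spMyRelation = pMyRelation; // smart pointer takes ownership of the relation
    GMatrix myLabels(spMyRelation); // pass the smart-pointer variable, not the type name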

  • Anonymous - 2012-10-09

    Excellent, loading manually and that last step worked like a charm. Thank you very much.


