#6 Question regarding "The format of label attribute 'predvar1' is not valid"

Milestone: v1.0_(example)

Status: open

Owner: tsoumakas

Labels: None

Priority: 2

Updated: 2015-02-16

Created: 2015-02-16

Creator: Anonymous

Private: No

I was working on a multi-label classification problem using multi-label k-nearest neighbours (MLkNN) in Mulan.

Your library is fantastic, but there are not a lot of resources or documentation available. I was trying out different things and got some results, but I'm not sure if I'm right or totally wrong. I have a few questions.

1. Say I want to predict two variables, predvar1 and predvar2. Am I supposed to give these in the XML file?
2. One of the variables I want to predict, say predvar1, can take three values: 0, 1 or 2. When I give these in the XML file, I get the error "The format of label attribute 'predvar1' is not valid". However, if I do the same for a variable which can take only two values, 0 or 1, it works fine and gives me some accuracy and other metrics. Why is this?
3. Also, for the variables I want to predict, should I give a '?' in the ARFF file?
4. If I give fewer than three labels in the XML file, it gives me an error (using Mulan 1.3).

This is the code I'm using:

import mulan.classifier.lazy.MLkNN;
import mulan.classifier.meta.RAkEL;
import mulan.classifier.transformation.LabelPowerset;
import mulan.data.MultiLabelInstances;
import mulan.evaluation.Evaluator;
import mulan.evaluation.MultipleEvaluation;
import weka.classifiers.trees.J48;
import weka.core.Utils;

public class MulanExp1 {

    public static void main(String[] args) throws Exception {
        String arffFilename = Utils.getOption("arff", args); // e.g. -arff emotions.arff
        String xmlFilename = Utils.getOption("xml", args); // e.g. -xml emotions.xml

        MultiLabelInstances dataset = new MultiLabelInstances(arffFilename, xmlFilename);

        RAkEL learner1 = new RAkEL(new LabelPowerset(new J48()));
        MLkNN learner2 = new MLkNN();

        Evaluator eval = new Evaluator();
        MultipleEvaluation results;

        int numFolds = 10;
        results = eval.crossValidate(learner1, dataset, numFolds);
        System.out.println(results);
        results = eval.crossValidate(learner2, dataset, numFolds);
        System.out.println(results);
    }
}

Discussion

tsoumakas - 2015-02-16

Dear Aftab,

I answer your questions inline.

On 16/2/2015 4:33 AM, Aftab Hassan wrote:

1. Say I want to predict two variables, predvar1 and predvar2. Am I supposed to give these in the XML file?

There are two ways you can do this:
a) Just have these two variables as the last two variables of your dataset and call the MultiLabelInstances constructor with the second argument being the number of output variables, e.g. new MultiLabelInstances(arffFilename, 2);
b) Put these in an xml file according to our schema (http://mulan.sourceforge.net/format.html) and call the constructor you are already using.
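
For option (b), the labels file is a small XML document that lists the label names. Here is a minimal sketch that writes one using only the Java standard library. The class name `LabelsXmlSketch`, the output file name `labels.xml`, and the exact namespace and element layout are my assumptions based on the format page linked above, so please check the output against that page before relying on it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Mulan.
final class LabelsXmlSketch {

    // Builds the labels XML as a string.
    // Assumption: label names contain no characters that need XML escaping.
    static String buildLabelsXml(List<String> labelNames) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"utf-8\"?>\n");
        // Assumption: namespace as shown on the Mulan format page.
        sb.append("<labels xmlns=\"http://mulan.sourceforge.net/labels\">\n");
        for (String name : labelNames) {
            sb.append("  <label name=\"").append(name).append("\"></label>\n");
        }
        sb.append("</labels>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        String xml = buildLabelsXml(Arrays.asList("predvar1", "predvar2"));
        Files.write(Paths.get("labels.xml"), xml.getBytes(StandardCharsets.UTF_8));
        System.out.print(xml);
    }
}
```

The resulting file would then be passed as the second argument of the constructor already used in the question, while option (a) needs no XML file at all.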

2. One of the variables I want to predict, say predvar1, can take three values: 0, 1 or 2. When I give these in the XML file, I get the error "The format of label attribute 'predvar1' is not valid". However, if I do the same for a variable which can take only two values, 0 or 1, it works fine and gives me some accuracy and other metrics. Why is this?

Mulan primarily addresses multi-label learning tasks, i.e. all target variables must be binary. Recently we have also started addressing problems in which all target variables are numeric. Supporting mixed types of target variables, as well as nominal targets such as {0, 1, 2}, is a future goal.
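
To illustrate the binary requirement, here is a small sketch that inspects an ARFF attribute declaration and reports whether it is declared with exactly the values 0 and 1. It is only my approximation of the kind of check that leads to the error in the question; the class and method names are hypothetical, and Mulan's actual validation may differ (for example, this sketch does not handle quoted values).

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of Mulan or Weka.
final class LabelFormatCheckSketch {

    // Returns true only if the declaration is nominal with exactly the values 0 and 1,
    // e.g. "@attribute predvar2 {0,1}".
    static boolean isBinaryLabel(String attributeLine) {
        int open = attributeLine.indexOf('{');
        int close = attributeLine.lastIndexOf('}');
        if (open < 0 || close < open) {
            return false; // no value list, so not a nominal attribute (e.g. numeric)
        }
        Set<String> values = new HashSet<>();
        for (String v : attributeLine.substring(open + 1, close).split(",")) {
            values.add(v.trim());
        }
        return values.equals(new HashSet<>(Arrays.asList("0", "1")));
    }

    public static void main(String[] args) {
        System.out.println(isBinaryLabel("@attribute predvar2 {0,1}"));   // true
        System.out.println(isBinaryLabel("@attribute predvar1 {0,1,2}")); // false
    }
}
```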

3. Also, for the variables I want to predict, should I give a '?' in the ARFF file?

Yes, this is Weka's way of indicating unknown values.
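
As a small illustration of the file syntax only, the sketch below formats one ARFF data row and writes '?' for every label whose value is unknown. The helper name `toArffRow` and the convention of passing `null` for an unknown label are assumptions made for this example.

```java
// Hypothetical helper, not part of Mulan or Weka.
final class ArffRowSketch {

    // Joins feature values and label values into one comma-separated ARFF data row.
    // A null label is written as '?', the ARFF marker for an unknown value.
    static String toArffRow(double[] features, Integer[] labels) {
        StringBuilder sb = new StringBuilder();
        for (double f : features) {
            if (sb.length() > 0) sb.append(',');
            sb.append(f);
        }
        for (Integer label : labels) {
            if (sb.length() > 0) sb.append(',');
            sb.append(label == null ? "?" : label.toString());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Two features followed by two labels (predvar1, predvar2) with unknown values.
        System.out.println(toArffRow(new double[] {0.5, 1.25}, new Integer[] {null, null}));
        // prints 0.5,1.25,?,?
    }
}
```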

4. If I give fewer than three labels in the XML file, it gives me an error (using Mulan 1.3).

Fewer than three labels doesn't make sense for RAkEL, which builds its models on random subsets of the labels.
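
A quick way to see this is to count how many distinct label subsets of size k exist for a given number of labels. The sketch below assumes a subset size of 3, which I believe is RAkEL's default in Mulan but have not verified against version 1.3; the class and method names are hypothetical.

```java
// Hypothetical helper, not part of Mulan.
final class LabelsetCountSketch {

    // Number of distinct label subsets of size k, i.e. the binomial coefficient C(numLabels, k).
    static long countLabelsets(int numLabels, int k) {
        if (k < 0 || k > numLabels) {
            return 0; // no subset of that size exists
        }
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // After step i, result equals C(numLabels - k + i, i), so the division is exact.
            result = result * (numLabels - k + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        int assumedSubsetSize = 3; // assumption: RAkEL's default subset size
        System.out.println(countLabelsets(2, assumedSubsetSize)); // 0
        System.out.println(countLabelsets(6, assumedSubsetSize)); // 20
    }
}
```

With two labels there is no subset of size 3 to train on, whereas six labels give 20 candidate subsets.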

Cheers,
Greg