Question regarding "The format of label attribute 'predvar1' is not valid"
I was working on a multi-label classification problem using multi-label k-nearest neighbours (MLkNN) in Mulan.
Your library is fantastic, but there are not a lot of resources or documentation available. I was trying out different things and got some results, but I'm not sure whether I'm right or totally wrong. I have a few questions.
This is the code I'm using:
import mulan.classifier.lazy.MLkNN;
import mulan.classifier.meta.RAkEL;
import mulan.classifier.transformation.LabelPowerset;
import mulan.data.MultiLabelInstances;
import mulan.evaluation.Evaluator;
import mulan.evaluation.MultipleEvaluation;
import weka.classifiers.trees.J48;
import weka.core.Utils;

public class MulanExp1 {

    public static void main(String[] args) throws Exception {
        String arffFilename = Utils.getOption("arff", args); // e.g. -arff emotions.arff
        String xmlFilename = Utils.getOption("xml", args); // e.g. -xml emotions.xml

        MultiLabelInstances dataset = new MultiLabelInstances(arffFilename, xmlFilename);

        RAkEL learner1 = new RAkEL(new LabelPowerset(new J48()));
        MLkNN learner2 = new MLkNN();

        Evaluator eval = new Evaluator();
        MultipleEvaluation results;
        int numFolds = 10;

        results = eval.crossValidate(learner1, dataset, numFolds);
        System.out.println(results);

        results = eval.crossValidate(learner2, dataset, numFolds);
        System.out.println(results);
    }
}
Anonymous
Dear Aftab,
I answer your questions inline.
On 16/2/2015 4:33 AM, Aftab Hassan wrote:
There are two ways you can do this:
a) Just have these two variables as the last two variables of your dataset and call the MultiLabelInstances constructor with the second argument being the number of output variables, e.g. new MultiLabelInstances(arffFilename, 2);
b) Put these in an xml file according to our schema (http://mulan.sourceforge.net/format.html) and call the constructor you are already using.
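For option (b), a minimal sketch of what the labels file might contain, generated with plain Java so it does not depend on Mulan itself. The label name predvar1 is taken from the error message in the thread title; predvar2 is a hypothetical name for the second output variable, and the class and method names are made up for illustration. The namespace and the flat list of label elements follow the format page linked above as far as I recall it, so please check the output against that page.

```java
class LabelsXmlSketch {

    // Builds a flat (non-hierarchical) labels file: one <label> element per output variable.
    static String buildLabelsXml(String... labelNames) {
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"utf-8\"?>\n");
        xml.append("<labels xmlns=\"http://mulan.sourceforge.net/labels\">\n");
        for (String name : labelNames) {
            xml.append("  <label name=\"").append(escape(name)).append("\"></label>\n");
        }
        xml.append("</labels>\n");
        return xml.toString();
    }

    // Escapes the characters that are not allowed inside an XML attribute value.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        // predvar2 is an assumed name; use the attribute names from your own ARFF file.
        System.out.print(buildLabelsXml("predvar1", "predvar2"));
    }
}
```

Each name must match an attribute name in the ARFF file exactly. Saving the printed text to a file and passing its path as the second argument of the constructor you are already using corresponds to option (b); option (a) needs no file at all, only the label count.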
Mulan primarily addresses multi-label learning tasks, i.e. all target variables should be binary. Recently we have also started addressing problems where all target variables are numeric. Supporting mixed types of target variables, as well as nominal attributes like {0, 1, 2}, is a future goal.
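Since the error in the thread title complains about the format of a label attribute, it can help to check how the label attributes are declared in the ARFF header before loading the file. The sketch below is not Mulan's own validation code; it is a plain string check written under the assumption that a valid label is a nominal attribute with exactly the two values 0 and 1. It only handles simple one-line declarations, and the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class LabelDeclarationCheck {

    // Returns true if the line declares a nominal attribute whose value set is exactly {0, 1}.
    static boolean isBinaryLabelDeclaration(String arffLine) {
        String line = arffLine.trim();
        // ARFF keywords are case-insensitive, so compare without regard to case.
        if (!line.regionMatches(true, 0, "@attribute", 0, "@attribute".length())) {
            return false;
        }
        int open = line.indexOf('{');
        int close = line.lastIndexOf('}');
        if (open < 0 || close < open) {
            return false; // numeric, string or date attribute: not nominal
        }
        Set<String> values = new HashSet<>();
        for (String value : line.substring(open + 1, close).split(",")) {
            values.add(value.trim());
        }
        return values.equals(new HashSet<>(Arrays.asList("0", "1")));
    }

    public static void main(String[] args) {
        String[] declarations = {
            "@attribute predvar1 {0,1}",
            "@attribute predvar1 {0,1,2}",
            "@attribute predvar1 numeric"
        };
        for (String declaration : declarations) {
            System.out.println(declaration + " -> " + isBinaryLabelDeclaration(declaration));
        }
    }
}
```

A label declared as numeric or as {0, 1, 2} fails this check, which matches the restriction described above: such a variable would have to be recoded into binary labels before Mulan can treat it as a target.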
Yes, this is Weka's way of indicating unknown values.
Fewer than three labels doesn't make sense for RAkEL.
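The reason becomes clearer with a little arithmetic. RAkEL (RAndom k-labELsets) trains Label Powerset models on small random subsets of the labels, so it needs enough labels to draw different subsets from. The sketch below counts how many distinct subsets of size k exist for n labels. The value k = 3 is an assumption here (I believe it is the default subset size in Mulan, but please check the RAkEL documentation), and the class and method names are hypothetical.

```java
class LabelsetCount {

    // Number of distinct label subsets of size k that can be drawn from n labels: C(n, k).
    static long countLabelsets(int n, int k) {
        if (k < 0 || k > n) {
            return 0; // no subset of that size exists
        }
        long result = 1;
        for (int i = 1; i <= k; i++) {
            // After step i the value equals C(n - k + i, i), so the division is always exact.
            result = result * (n - k + i) / i;
        }
        return result;
    }

    public static void main(String[] args) {
        int k = 3; // assumed subset size
        for (int n = 2; n <= 6; n++) {
            System.out.println(n + " labels -> " + countLabelsets(n, k) + " possible subsets of size " + k);
        }
    }
}
```

With two labels there is no subset of size three at all, and with exactly three labels there is only one, so the ensemble would reduce to a single Label Powerset model. In that situation using LabelPowerset or MLkNN directly is the simpler choice.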
Cheers,
Greg