#7 Error LabelPowerset with FilteredClassifier and StringToWordVector as a filter

v1.0_(example)
closed
nobody
None
1
2015-03-04
2015-02-23
Anonymous
No

Hi,

Mulan Version : 1.4.0

I am trying to use LabelPowerset with a FilteredClassifier, using StringToWordVector as the filter, on a text dataset. The same configuration works fine with BinaryRelevance, but with LabelPowerset I get the exception messages below:

Feb 23, 2015 12:29:56 PM mulan.classifier.transformation.LabelPowerset makePredictionInternal
SEVERE: null
java.lang.IndexOutOfBoundsException: Index: 26, Size: 26
at java.util.ArrayList.rangeCheck(Unknown Source)
at java.util.ArrayList.get(Unknown Source)
at weka.core.Attribute.value(Attribute.java:795)
at weka.core.AbstractInstance.stringValue(AbstractInstance.java:618)
at weka.core.AbstractInstance.stringValue(AbstractInstance.java:597)
at weka.filters.unsupervised.attribute.StringToWordVector.convertInstancewoDocNorm(StringToWordVector.java:1632)
at weka.filters.unsupervised.attribute.StringToWordVector.input(StringToWordVector.java:688)
at weka.classifiers.meta.FilteredClassifier.distributionForInstance(FilteredClassifier.java:424)
at mulan.classifier.transformation.LabelPowerset.makePredictionInternal(LabelPowerset.java:156)
at mulan.classifier.MultiLabelLearnerBase.makePrediction(MultiLabelLearnerBase.java:110)
at mulan.evaluation.Evaluator.evaluate(Evaluator.java:96)
at mulan.evaluation.Evaluator.evaluate(Evaluator.java:152)
at mulan.evaluation.Evaluator.innerCrossValidate(Evaluator.java:275)
at mulan.evaluation.Evaluator.crossValidate(Evaluator.java:238)
at mulan.wrapper.MulanCrossValidationWrapper.crossValidation(MulanCrossValidationWrapper.java:129)
at mulan.wrapper.MulanCrossValidationWrapper.evaluateAndGenerateOutputfromDir(MulanCrossValidationWrapper.java:58)
at mulan.execution.Execution.main(Execution.java:14)

Feb 23, 2015 12:29:56 PM mulan.evaluation.Evaluator innerCrossValidate
SEVERE: null
java.lang.NullPointerException
at mulan.core.Util.RandomIndexOfMax(Util.java:48)
at mulan.classifier.transformation.LabelPowerset.makePredictionInternal(LabelPowerset.java:161)
at mulan.classifier.MultiLabelLearnerBase.makePrediction(MultiLabelLearnerBase.java:110)
at mulan.evaluation.Evaluator.evaluate(Evaluator.java:96)
at mulan.evaluation.Evaluator.evaluate(Evaluator.java:152)
at mulan.evaluation.Evaluator.innerCrossValidate(Evaluator.java:275)
at mulan.evaluation.Evaluator.crossValidate(Evaluator.java:238)
at mulan.wrapper.MulanCrossValidationWrapper.crossValidation(MulanCrossValidationWrapper.java:129)
at mulan.wrapper.MulanCrossValidationWrapper.evaluateAndGenerateOutputfromDir(MulanCrossValidationWrapper.java:58)
at mulan.execution.Execution.main(Execution.java:14)
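
Reading the two traces together: the first shows a string value being looked up at index 26 in a list of size 26, and the second shows a NullPointerException a few lines later in the same LabelPowerset method. One possible reading, which is an assumption and not something confirmed in this thread, is that the instance carries a string index that is valid in one dataset header but not in the header the filter uses, and that the failed call then leaves the distribution array null. The sketch below reproduces that pattern in plain Java. `StringAttribute`, `distributionFor` and `indexOfMax` are hypothetical stand-ins, not Weka or Mulan classes.

```java
import java.util.ArrayList;
import java.util.List;

public class StringIndexMismatchSketch {

    // Hypothetical stand-in for a string attribute: values live in a list
    // and instances refer to them by index.
    static final class StringAttribute {
        private final List<String> values = new ArrayList<>();

        int addValue(String v) {
            values.add(v);
            return values.size() - 1;
        }

        String value(int index) {
            return values.get(index); // throws if index >= number of values
        }
    }

    // Hypothetical stand-in for a classifier call that resolves the
    // instance's string index against its own header.
    static double[] distributionFor(StringAttribute header, int storedIndex) {
        String text = header.value(storedIndex);
        return new double[] { text.length(), 1.0 };
    }

    // Hypothetical stand-in for picking the most probable class.
    static int indexOfMax(double[] a) {
        int best = 0;
        for (int i = 1; i < a.length; i++) {
            if (a[i] > a[best]) best = i;
        }
        return best;
    }

    public static void main(String[] args) {
        // Header seen by the classifier: 26 string values (indices 0..25).
        StringAttribute classifierHeader = new StringAttribute();
        for (int i = 0; i < 26; i++) classifierHeader.addValue("doc " + i);

        // The instance stores index 26, valid only in some larger header.
        int storedIndex = 26;

        double[] distribution = null;
        try {
            distribution = distributionFor(classifierHeader, storedIndex);
        } catch (IndexOutOfBoundsException e) {
            // Assumed behaviour: the error is logged and execution continues.
            System.out.println("first failure: " + e.getClass().getSimpleName());
        }
        try {
            indexOfMax(distribution); // distribution is still null here
        } catch (NullPointerException e) {
            System.out.println("second failure: " + e.getClass().getSimpleName());
        }
    }
}
```

If this reading is right, the NullPointerException is only a follow-on symptom and the IndexOutOfBoundsException is the one to chase.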

Discussion

  • tsoumakas

    tsoumakas - 2015-02-23

    Could not replicate this error with the latest version of Mulan and sample text data.

     
    • Anonymous

      Anonymous - 2015-02-24

      Hi,

      Mulan's sample text data (e.g. the bibtex dataset) has already been converted with StringToWordVector, whereas I am using a FilteredClassifier to do the conversion during classification.
      I am attaching a sample dataset for you to run the test.
      Kindly let me know if there is an issue with the data format.
      Below is the code from the main function:

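      import mulan.classifier.transformation.LabelPowerset;
      import mulan.data.MultiLabelInstances;
      import mulan.evaluation.Evaluator;
      import mulan.evaluation.MultipleEvaluation;
      import weka.classifiers.bayes.NaiveBayesMultinomial;
      import weka.classifiers.meta.FilteredClassifier;
      import weka.core.Utils;
      import weka.filters.unsupervised.attribute.StringToWordVector;

      public static void main(String[] args) throws Exception {
          // Paths to the ARFF data and the XML label definitions
          String arffFilename = Utils.getOption("arff", args);
          String xmlFilename = Utils.getOption("xml", args);
          MultiLabelInstances dataset = new MultiLabelInstances(arffFilename, xmlFilename);

          // Base classifier that converts the text to word vectors internally
          FilteredClassifier classifier = new FilteredClassifier();
          classifier.setClassifier(new NaiveBayesMultinomial());
          classifier.setFilter(new StringToWordVector());

          LabelPowerset learner1 = new LabelPowerset(classifier);
          Evaluator eval = new Evaluator();
          MultipleEvaluation results;

          int numFolds = 4;
          results = eval.crossValidate(learner1, dataset, numFolds);
          System.out.println(results);
      }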
      

      Thanks in advance :)

       
  • Anonymous

    Anonymous - 2015-02-24

    Hi,

    The sample dataset given on the Mulan website (e.g. the bibtex dataset) has already been converted with StringToWordVector, whereas I am using FilteredClassifier to do the conversion internally during classification. I am attaching some sample data; kindly use these files to run the experiment.
    I am also pasting the sample main method used for cross-validation:

    public static void main(String[] args) throws Exception {
        String arffFilename = Utils.getOption("arff", args);
        String xmlFilename = Utils.getOption("xml", args);

        MultiLabelInstances dataset = new MultiLabelInstances(arffFilename, xmlFilename);
        FilteredClassifier classifier = new FilteredClassifier();
        classifier.setClassifier(new NaiveBayesMultinomial());
        classifier.setFilter(new StringToWordVector());
        LabelPowerset learner1 = new LabelPowerset(classifier);
        Evaluator eval = new Evaluator();
        MultipleEvaluation results;

        int numFolds = 4;
        results = eval.crossValidate(learner1, dataset, numFolds);
        System.out.println(results);
    }
    
     
  • tsoumakas

    tsoumakas - 2015-02-24

    Hi,

    Thanks for the sample data, I managed to replicate the bug.

    It seems that this bug is due to Weka: when I switched from weka-3.7.10.jar to weka-3.7.12.jar, it disappeared.

    Best,
    Greg

     
  • Anonymous

    Anonymous - 2015-02-24

    Hi,

    Thanks a lot.

    Please let me know which version of Mulan you are using. I am currently using Mulan-1.4.0, which I think is not compatible with Weka-3.7.12.

    Thanks :)

     
  • tsoumakas

    tsoumakas - 2015-02-24

    Hi,

    I'm using the latest Mulan-1.5.0, but I believe that Mulan-1.4.0 is also compatible with weka-3.7.12.

    Greg

     
  • apoorv

    apoorv - 2015-02-25

    Hi Greg,

    I tried multiple times with the updated versions you mentioned, but I am still getting the same error.

    Can you please check cross-validation once again and confirm that the error is gone?

    Thanks

     
  • tsoumakas

    tsoumakas - 2015-02-27

    Hi,

    Apologies for the misunderstanding. Upgrading to Weka 3.7.12 fixed the problem for simple holdout evaluation, but not for cross-validation.

    There was a bug in the LabelPowerset algorithm after all, which has now been fixed. You should grab the latest code from our Git repository.

    Greg

     
  • tsoumakas

    tsoumakas - 2015-03-04
    • status: open --> closed
     
