#31 adding data event-by-event with non-trivial variable expressions

Milestone: HEAD
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2014-05-22
Created: 2014-05-19
Creator: carsten
Private: No

The following behaviour occurs when adding data event-by-event to a TMVA::Factory:

Book a Variable with TMVA::Factory like this:

factory->AddVariable("myVar := myVarInMeV/1000.", "My Variable", "GeV", 'F', 0., 1000.);

Now fill a 'std::vector<double> evdata' with some data and call

factory->AddSignalTrainingEvent(evdata, 1.);

This will lead to the following error message(s):

Error in <TBranch::TLeaf>: Illegal data type for myVarInMev/1000./myVarInMev/1000./F

and finally cause

Error in <TTreeFormula::Compile>: Bad numerical expression : "myVarInMeV"
--- <FATAL> DataSetFactory : Expression myVarInMeV/1000. could not be resolved to a valid formula.

during

factory->TrainAllMethods();
factory->TestAllMethods();
factory->EvaluateAllMethods();

From what I understand, this behaviour arises because the internal trees are created with a branch named after the "expression" given, rather than after the variable's name or label.
It would be nice if variables could be given proper expressions, which then appear in the XML file, even when the MVA is filled event by event.

Please find attached a test case reproducing the error under ROOT 5.34/18. Execute with

root -q testcase.C++


Discussion

  • Eckhard von Toerne

    Hi Carsten,

    This is not really a bug report but more of a feature request. The issue you reported can easily be avoided by using vanilla-flavor variable definitions like:
    factory->AddVariable( "var0", "Variable 0", 'F' );
    and calculating the "expressions" by hand before filling the data vector and passing it to factory->AddSignalTrainingEvent(evdata,1.).
    This is not much more work for the user than what you suggested.

    Regards, Eckhard for the TMVA developers
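
A minimal sketch of the workaround described above, assuming every input quantity is stored in MeV and should be booked in GeV. The helper name `makeEventInGeV` is hypothetical and not part of TMVA; the factory calls appear only as comments so that the snippet compiles without ROOT.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of TMVA): evaluates "myVarInMeV/1000."
// by hand for every raw value, so the factory only sees plain variables.
std::vector<double> makeEventInGeV(const std::vector<double>& rawInMeV) {
    assert(!rawInMeV.empty() && "an event needs at least one variable");
    std::vector<double> evdata;
    evdata.reserve(rawInMeV.size());
    for (double valueInMeV : rawInMeV) {
        evdata.push_back(valueInMeV / 1000.);  // MeV -> GeV
    }
    return evdata;
}

// Intended use with the calls quoted in this thread (requires ROOT/TMVA):
//   factory->AddVariable("myVar", "My Variable", "GeV", 'F');
//   factory->AddSignalTrainingEvent(makeEventInGeV(raw), 1.);
```

With this approach the XML weight file records only the plain name ("myVar" here), not the original expression.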

     
  • carsten

    carsten - 2014-05-22

    Hi!

    Call it as you wish (bug report or feature request), but for me, this actually limits the usability of TMVA. The test case I sent you is of course a vastly simplified variation for the case at hand. Here's my case:

    I'm working with a set of ~100 different TTrees. They should all be fed into the TMVA::Factory, but the event weight is computed differently for each tree. Hence I have to add the events one by one, in a way similar to the test case, and cannot add the TTrees directly, because AFAIK TMVA does not support tree-specific weight expressions.

    After the training is completed, I create a TMVA::Reader, load the XML file and evaluate the method on all of the original trees and events. Because most of this is automated, I parse the TMVA XML weight files myself and extract the variable definitions from there. Of course, this requires that the expressions given in the XML file are correct with respect to the original TTrees.

    If you see a nice way to avoid the problem in my case, please let me know.

    Regards,
    Carsten
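
The event-by-event filling with a per-tree weight described above could look roughly like the following sketch. It uses plain C++ containers as stand-ins for the TTrees so that it compiles without ROOT; `Sample` and `collectWeightedEvents` are hypothetical names for illustration, not TMVA or ROOT types.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one input TTree: its rows plus the
// weight rule that applies only to this tree.
struct Sample {
    std::vector<std::vector<double>> rows;  // one inner vector per event
    std::function<double(const std::vector<double>&)> weight;
};

// Flattens all samples into (evdata, weight) pairs, evaluating each
// sample's own weight rule for every event.
std::vector<std::pair<std::vector<double>, double>>
collectWeightedEvents(const std::vector<Sample>& samples) {
    std::vector<std::pair<std::vector<double>, double>> out;
    for (const Sample& s : samples) {
        assert(s.weight && "every sample needs a weight rule");
        for (const std::vector<double>& row : s.rows) {
            out.emplace_back(row, s.weight(row));
        }
    }
    return out;
}

// With ROOT/TMVA available, each pair would then be passed on as
//   factory->AddSignalTrainingEvent(pair.first, pair.second);
```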

     
    • Eckhard von Toerne

      Hi Carsten,

      Ok, now I see your point more clearly...
      The underlying reason is that variable expression support is built on ROOT's TTreeFormula and is thus limited to trees.
      Let me think about this a little bit as there may be an easy way to help and I might post another comment in a few days.

      Regards, Eckhard

       
