From: Kim A. <kia...@ce...> - 2017-08-21 12:24:29
Hi all,

Just to follow up on this, the DNN implementation has been updated so the workaround should no longer be necessary.

Cheers,
Kim

Kim Albertsson wrote:
> Hi Chris!
>> Thanks, I wondered if it was a ‘feature’, but I failed to find any
>> good documentation explaining the new loader class. Is there anything
>> I can read explaining how to go about using it? I’ve found the
>> doxygen docs, but that’s not really what I am looking for.
> There is supposed to be information about the dataloader in the User's
> Guide
> <https://root.cern.ch/gitweb/?p=root.git;a=blob;f=documentation/tmva/UsersGuide/TMVAUsersGuide.pdf>
> but it has unfortunately not been updated with this yet. It is a
> simple transformation, however: methods that dealt with loading and
> preparing input were moved to a separate class. It should work the same
> as the factory did before.
>
> Digging a little further into this I see now that this behaviour has
> been in TMVA since Jun 22, 2009 and I realise that it is possibly a
> bug of the DNN. Could you check whether the output of the MLP is as
> you expect? I will look into a proper fix. For now I can only provide
> the workarounds previously discussed :/.
>> Specifically, on your suggestion below, it’s not clear to me how I go
>> about ‘ensuring the expected order’ as you describe. Is it just a case
>> of adding the Signal first, then the background, with
>> 'tmvaLoader->GetDataSetInfo().AddClass("ClassName")'?
> I think you want
>     tmvaLoader->GetDataSetInfo().AddClass("Background"); // adds class 0
>     tmvaLoader->GetDataSetInfo().AddClass("Signal");     // adds class 1
> to get the expected output signal(background) => 1(0).
>
> Thanks for reporting this to us!
>
> Cheers,
> Kim
>>
>> cheers Chris
>>
>>>
>>> On 12 Jul 2017, at 4:18 pm, Kim Albertsson <kia...@ce...
>>> <mailto:kia...@ce...>> wrote:
>>>
>>> Hi Chris,
>>>
>>> This is a feature of the Dataloader: it creates the class indices
>>> dynamically, making the order that classes are added important.
>>> This is to allow more than two classes and custom class names. One can
>>> check what index the signal class has by querying the DataSetInfo method
>>> GetSignalClassIndex. In your case this would
>>> be tmvaLoader->GetDataSetInfo().GetSignalClassIndex().
>>>
>>> Another approach would be to add the classes first and ensure the
>>> expected order through
>>> tmvaLoader->GetDataSetInfo().AddClass("ClassName"). If you use this
>>> second approach the names must be "Background" and "Signal", as you
>>> use AddSignalTrainingEvent and friends, which expect these classes to
>>> exist.
>>>
>>> Cheers,
>>> Kim
>>>>
>>>> Begin forwarded message:
>>>>
>>>> *From: *Chris Jones <jo...@he...
>>>> <mailto:jo...@he...>>
>>>> *Subject: **[TMVA-users] Signal/Background target responses inverted.*
>>>> *Date: *12 July 2017 at 15:15:31 GMT+2
>>>> *To: *"tmv...@li...
>>>> <mailto:tmv...@li...>"
>>>> <tmv...@li...
>>>> <mailto:tmv...@li...>>
>>>>
>>>> Hi,
>>>>
>>>> After a while away from running TMVA I am back looking at the new
>>>> DNN MVA in ROOT 6.10.02. I have noticed what appears to me to be
>>>> slightly odd behavior: in some of my trainings the target responses
>>>> (1 or 0) for signal and background are inverted. By which I mean
>>>> signal(background) is trained to give 0(1), instead of the expected
>>>> 1(0).
>>>>
>>>> In the end I think I have tracked this down to the fact I use the
>>>> following logic to fill my training and testing samples.
>>>>
>>>> for ( 'some loop over data entries' ) {
>>>>
>>>>   if ( target )
>>>>   {
>>>>     if ( !useForTesting )
>>>>     {
>>>>       tmvaLoader->AddSignalTrainingEvent( InputDoubles, 1.0 );
>>>>     }
>>>>     else
>>>>     {
>>>>       tmvaLoader->AddSignalTestEvent ( InputDoubles, 1.0 );
>>>>     }
>>>>   }
>>>>   else
>>>>   {
>>>>     if ( !useForTesting )
>>>>     {
>>>>       tmvaLoader->AddBackgroundTrainingEvent( InputDoubles, 1.0 );
>>>>     }
>>>>     else
>>>>     {
>>>>       tmvaLoader->AddBackgroundTestEvent ( InputDoubles, 1.0 );
>>>>     }
>>>>   }
>>>>
>>>> }
>>>>
>>>> where 'target' is a boolean that indicates if the data entry is
>>>> signal or background, and 'useForTesting' another boolean to
>>>> indicate if the entry should be used for training or testing.
>>>> InputDoubles is an array with all the input parameters for the given
>>>> data entry.
>>>>
>>>> tmvaLoader is an instance of TMVA::DataLoader.
>>>>
>>>> The issue is that the order in which the above calls are first made
>>>> is not always the same. It depends on the conditionals, and on whether
>>>> the first data entry is declared to be signal or not. What I have found
>>>> is that if the first entry is signal, so 'AddSignalTrainingEvent' is
>>>> called first, then TMVA trains the network to give signal the expected
>>>> response of 1, and background 0. However, if the first data entry is
>>>> background, so AddBackgroundTrainingEvent is called first, then the
>>>> logic is for some reason inverted, and signal is trained to give a
>>>> response of 0....
>>>>
>>>> Note I have used the above logic many times in the past, with
>>>> previous ROOT versions (using the MLP classifier). So this issue is
>>>> new to the new ROOT version (6.10.02).
>>>>
>>>> It is also the case that the use of TMVA::DataLoader is new. So
>>>> I am not clear if the issue is related to this, or the use of the
>>>> DNN classifier.
>>>>
>>>> I have a workaround, which is just to make sure
>>>> AddSignalTrainingEvent is called first (I skip entries until I get
>>>> to the first training signal entry) and this seems to do the job.
>>>> However, I am curious as to what people think about the above
>>>> behavior. I somehow doubt it's intentional, so it looks to me like a
>>>> bug somewhere in TMVA, either in TMVA::DataLoader or perhaps specific
>>>> to the DNN MVA?
>>>>
>>>> cheers Chris
>>>>
>>>> _______________________________________________
>>>> TMVA-users mailing list
>>>> TMV...@li...
>>>> <mailto:TMV...@li...>
>>>> https://lists.sourceforge.net/lists/listinfo/tmva-users
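
The order dependence discussed in this thread can be illustrated without ROOT. The following is only a stand-alone sketch, not TMVA code: ClassRegistry, addClass, indexOf and signalIndexAfter are hypothetical names invented for the illustration, and the only assumption carried over from the thread is that a class receives the next free index the first time its name is seen. How a particular method (DNN, MLP, ...) then maps a class index to a target response is outside the scope of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the class bookkeeping discussed in this thread.
// This is NOT TMVA code: it only mimics "a class gets the next free index
// the first time its name is seen".
class ClassRegistry {
public:
    // Register `name` if it is unseen and return its index.
    std::size_t addClass(const std::string& name) {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return i;  // already known: index unchanged
        }
        names_.push_back(name);
        return names_.size() - 1;
    }

    // Return the index of `name`, or -1 if it was never registered.
    int indexOf(const std::string& name) const {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return static_cast<int>(i);
        }
        return -1;
    }

private:
    std::vector<std::string> names_;
};

// Feed class names in the order the events arrive and report the index that
// "Signal" ends up with. If `preRegister` is true, "Background" and "Signal"
// are registered up front, in that order, before any event is seen.
int signalIndexAfter(const std::vector<std::string>& arrivalOrder,
                     bool preRegister) {
    ClassRegistry registry;
    if (preRegister) {
        registry.addClass("Background");  // index 0
        registry.addClass("Signal");      // index 1
    }
    for (const std::string& name : arrivalOrder) {
        registry.addClass(name);
    }
    const int idx = registry.indexOf("Signal");
    if (preRegister) {
        assert(idx == 1);  // pinned by the up-front registration
    }
    return idx;
}
```

Without pre-registration, signalIndexAfter({"Signal", "Background"}, false) returns 0 while signalIndexAfter({"Background", "Signal"}, false) returns 1, so the index depends on which kind of event happens to come first. With pre-registration the result is 1 for either arrival order, which is the effect the AddClass("Background") / AddClass("Signal") suggestion in the thread aims for.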