From: Konstantinos V. <kon...@ce...> - 2017-11-12 19:49:39
|
Dear Kim,

Thank you for your advice. Indeed, the Root6 distribution of TMVA on lxplus works fine and solves my problem.

As a side remark (my apologies if it is not relevant to this forum), it is rather unfortunate that all Root applications on CVMFS, such as TMVA and RooFit, come under the directory $ROOTSYS/tutorials. One (like me) would expect to see documents and maybe some simple generic demos there, not the whole application code. A good idea might be to rename that directory to something like $ROOTSYS/applications, to guide people and prevent searching on the Web for possibly obsolete distributions.

Thanks again for your kind help.

Costas |
From: Kim A. <kim...@ce...> - 2017-11-11 20:27:22
|
Hi Konstantinos,

Sorry for my brevity, responding from my phone.

TMVA is distributed as part of ROOT6 as well. The package from SourceForge is deprecated (I know this information is unclear in some places; we are working on cleaning this up). You should be able to use it "out-of-the-box" on lxplus, and you have different versions available to you should you need them. Otherwise go with the latest.

Cheers,
Kim |
From: Konstantinos V. <kon...@ce...> - 2017-11-11 17:30:08
|
Dear TMVA experts,

I encountered a problem running TMVA 4.2.0 on lxplus.cern.ch. I downloaded the package from SourceForge and built it on lxplus following the instructions from the User’s Guide, without noticing any problem. Running the baseline example with TMVAClassification.C in a Root session, I noticed an error message at the beginning of execution, saying that ‘TMVAGlob is not a namespace, class or enumeration.’ At the end of the execution, when the GUI popped up, clicking on any button issued errors for undeclared methods belonging to that namespace. No plot could be made. The same situation occurred when I ended the Root session and tried to run TMVAGui.C on the output file TMVA.root from the classification.

I checked the tmvaglob.C file, where the TMVAGlob namespace is defined, and found nothing obviously wrong. This file is properly included both in TMVAlogon.C, which sets up the environment for the GUI, and in test/variables.C, which treats the input variables from TMVA.root.

Interestingly, the GUI runs correctly as part of the Root installation on my laptop (macOS Sierra 10.12.6). Same macros, no modification, and no complaint about namespaces. I could proceed by using TMVA exclusively on my laptop, but this is not the proper solution: it is much slower than the central service cluster, and I want to launch an analysis with my students, who will need to use files, applications, etc. located on the cluster. It is impractical to transfer everything to laptops.

I suspect that the problem on lxplus comes from its Root version (6.06.00), which is not the same as on my laptop (5.34.23). I made some attempts to fix the problem on lxplus myself, e.g. adding the namespace TMVAGlob:: in front of the method names, but nothing worked.

Could you give me some help?

Thank you in advance,
Costas Vellidis |
From: Stefano R. S. <rob...@gm...> - 2017-11-10 16:00:24
|
Hi Kim,

thank you for your answer. However, it’s not very clear to me how I can combine the two variables. “x” is not present in my TTree and it will be the same for every entry. How can I do it as a post-processing step?

Just to be clear, I want my variable E to be larger than a fixed threshold x, where x can’t be larger than 0.1.

Thank you very much!
Roberto |
From: Kim A. <kia...@ce...> - 2017-11-10 15:27:20
|
Hi Stefano,

An idea is to use a linear combination of input variables satisfying your constraints. Given `E > x; x <= 0.1` we could rewrite this as `E - x > 0; x <= 0.1`. Defining `CutRangeMax[2]=0.1` and defining a new variable as `E - x` with constraint `CutRangeMin[3]=0` should be helpful if I understood correctly what you are asking for.

Or if this second constraint is fixed, you can do it as a post-processing step.

Cheers,
Kim |
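
The "post-processing step" mentioned in this exchange is not spelled out in the thread. One possible reading, sketched below in plain C++ without any TMVA calls, is to let the cut optimisation run unconstrained and afterwards discard every working point whose lower cut on energy exceeds the ceiling of 0.1. The `WorkingPoint` struct and the helper names `filterByCeiling` and `passesEnergyCut` are hypothetical and exist only for this illustration; the cut values would have to be read from the trained method by whatever means is available.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Hypothetical record for one optimised working point (not a TMVA type):
// the target signal efficiency and the lower cut found for "energy".
struct WorkingPoint {
    double signalEfficiency;
    double energyCutMin;
};

// Keep only the working points whose lower energy cut does not exceed
// the ceiling, i.e. enforce "energy > x with x <= ceiling" after the fact.
std::vector<WorkingPoint> filterByCeiling(const std::vector<WorkingPoint>& points,
                                          double ceiling) {
    std::vector<WorkingPoint> kept;
    std::copy_if(points.begin(), points.end(), std::back_inserter(kept),
                 [ceiling](const WorkingPoint& p) { return p.energyCutMin <= ceiling; });
    return kept;
}

// Apply the selection "energy > threshold" to a single event.
bool passesEnergyCut(double energy, double threshold) {
    return energy > threshold;
}
```

With working points at energy cuts of 0.05, 0.10 and 0.15, calling `filterByCeiling(points, 0.1)` would keep the first two and drop the third, which matches the requirement that the cut may be energy > 0.05 but not energy > 0.15.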
From: Kim A. <kia...@ce...> - 2017-11-10 15:25:45
|
Hi Ben,

Correlations are available both before and after a classifier. The correlation matrix for input variables vs input variables is always output. To get matrices for input variables vs output variables and more, one has to add ":Correlations" to the factory option string.

Cheers,
Kim |
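
To make concrete what an "input variable vs output variable" correlation measures, here is a small plain C++ sketch of a linear (Pearson) correlation coefficient between one input variable and a classifier response. It is not TMVA code and is only assumed to resemble what such a matrix entry reports; the function name `linearCorrelation` is made up for this example.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Linear (Pearson) correlation coefficient between two equally long samples,
// e.g. one input variable and the classifier response for the same events.
// Returns 0 when the inputs are empty, mismatched, or have zero variance.
double linearCorrelation(const std::vector<double>& x, const std::vector<double>& y) {
    const std::size_t n = x.size();
    if (n == 0 || y.size() != n) return 0.0;

    double meanX = 0.0, meanY = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        meanX += x[i];
        meanY += y[i];
    }
    meanX /= static_cast<double>(n);
    meanY /= static_cast<double>(n);

    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double dx = x[i] - meanX;
        const double dy = y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    if (sxx == 0.0 || syy == 0.0) return 0.0;
    return sxy / std::sqrt(sxx * syy);
}
```

A value close to +1 or -1 would indicate that the response follows that input almost linearly, while a value near 0 says only that there is no linear relation; non-linear dependence is not captured by this number.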
From: Ben S. <ben...@gm...> - 2017-11-10 09:39:19
|
Hello, When TMVA shows the correlation matrix of the input variables does it make this before or after classification? If it is before, is there a way to check correlations after a classifier (say a BDT) is used? Thank you, Ben |
From: Stefano R. S. <rob...@gm...> - 2017-11-09 05:49:10
|
Hi,

yes sorry I chose by the index. What I mean is that if I do CutRangeMax[2] = 0.1 the cut is:

energy < x where x can be max 0.1

If I do CutRangeMin[2] = 0.1 the cut is

energy > x where x is *at least* 0.1

while I want

energy > x where x can be *max* 0.1

Don’t know if it’s clearer now…

Thanks,
Roberto |
From: Peter S. <pet...@gm...> - 2017-11-09 05:43:33
|
Hi,

CutRangeMin and CutRangeMax have to be set for a specific variable. I believe you choose by the index: CutRangeMax[2]=0.1 or something like that. I'm not sure if it's done by index or by name.

cheers,
Peter |
From: Stefano R. S. <rob...@gm...> - 2017-11-08 19:41:10
|
Hi TMVA Users,

I am trying to optimize a combination of rectangular cuts but I don’t know how to set a “maximum value” of the cut on a specific variable. E.g. I want the cut on the variable “energy” to be at most > 0.1: it can be energy > 0.05 but it can’t be energy > 0.15. I tried with CutRangeMax = 0.1 but it gives me null results. Any hint? Do I maybe need to convert my variable energy?

Thank you very much!

Roberto |
From: Peter S. <pet...@gm...> - 2017-10-03 12:14:50
|
Hi,

TMVA does not create an "expression"; depending on the method of your choice, it creates a boosted decision tree, a neural net, or something else. This can be used as usual with TMVA via the TMVA::Reader as demonstrated in the tutorials. Look for TMVARegressionApplication.C in the TMVA tutorials.

cheers,
Peter |
From: uzair l. <lat...@gm...> - 2017-10-03 06:22:40
|
Hello,

I just wanted to know: when we do a TMVA regression analysis, how do we extract the expression that the TMVA training and testing has come up with for the regression? I want to see that expression. Can someone please guide me?

Thanks,
Uzair |
From: uzair l. <lat...@gm...> - 2017-10-03 06:21:44
|
Hello,

I just wanted to know: when we do a TMVA regression analysis, how do we extract the expression that the TMVA training and testing has come up with for the regression? I want to see that expression. Can someone please guide me?

Thanks,
Uzair |
From: Kim A. <kia...@ce...> - 2017-08-21 12:24:29
|
Hi all,

Just to follow up on this, the DNN implementation has been updated so the workaround should no longer be necessary.

Cheers,
Kim

Kim Albertsson wrote:
> Hi Chris!
>> Thanks, I wondered if it was a ‘feature’, but I failed to find any good documentation explaining the new loader class. Is there anything I can read explaining how to go about using it? I’ve found the doxygen docs, but that's not really what I am looking for.
> There is supposed to be information about the dataloader in the User's Guide <https://root.cern.ch/gitweb/?p=root.git;a=blob;f=documentation/tmva/UsersGuide/TMVAUsersGuide.pdf> but it has unfortunately not been updated with this yet. It is a simple transformation, however: methods that dealt with loading and preparing input were moved to a separate class. It should work the same as the factory did before.
>
> Digging a little further into this I see now that this behaviour has been in TMVA since Jun 22, 2009 and I realise that it is possibly a bug of the DNN. Could you check whether the output of the MLP is as you expect? I will look into a proper fix. For now I can only provide the workarounds previously discussed :/.
>> Specifically, on your suggestion below, it's not clear to me how I go about ‘ensuring the expected order’ as you describe. Is it just a case of adding the Signal first, then the background, with 'tmvaLoader->GetDataSetInfo().AddClass("ClassName")'?
> I think you want
> tmvaLoader->GetDataSetInfo().AddClass("Background"); // adds class 0
> tmvaLoader->GetDataSetInfo().AddClass("Signal"); // adds class 1
> to get the expected output signal(background) => 1(0).
>
> Thanks for reporting this to us!
>
> Cheers,
> Kim
>>
>> cheers Chris
>>
>>> On 12 Jul 2017, at 4:18 pm, Kim Albertsson <kia...@ce...> wrote:
>>>
>>> Hi Chris,
>>>
>>> This is a feature of the Dataloader: it creates the class indices dynamically, making the order in which classes are added important. This is to allow more than two classes and custom class names. One can check what index the signal class has by querying the DataSetInfo method GetSignalClassIndex. In your case this would be tmvaLoader->GetDataSetInfo().GetSignalClassIndex().
>>>
>>> Another approach would be to add the classes first and ensure the expected order through tmvaLoader->GetDataSetInfo().AddClass("ClassName"). If you use this second approach the names must be "Background" and "Signal", as you use AddSignalTrainingEvent and friends, which expect these classes to exist.
>>>
>>> Cheers,
>>> Kim
>>>>
>>>> Begin forwarded message:
>>>>
>>>> From: Chris Jones <jo...@he...>
>>>> Subject: [TMVA-users] Signal/Background target responses inverted.
>>>> Date: 12 July 2017 at 15:15:31 GMT+2
>>>> To: "tmv...@li..." <tmv...@li...>
>>>>
>>>> Hi,
>>>>
>>>> After a while away from running TMVA I am back looking at the new DNN MVA in ROOT 6.10.02. I have noticed what appears to me slightly odd behavior, which is that in some of my trainings the target responses (1 or 0) for signal and background are inverted. By which I mean signal(background) is trained to give 0(1), instead of the expected 1(0).
>>>>
>>>> In the end I think I have tracked this down to the fact I use the following logic to fill my training and testing samples.
>>>>
>>>> for ( 'some loop over data entries' ) {
>>>>
>>>>   if ( target )
>>>>   {
>>>>     if ( !useForTesting )
>>>>     {
>>>>       tmvaLoader->AddSignalTrainingEvent( InputDoubles, 1.0 );
>>>>     }
>>>>     else
>>>>     {
>>>>       tmvaLoader->AddSignalTestEvent ( InputDoubles, 1.0 );
>>>>     }
>>>>   }
>>>>   else
>>>>   {
>>>>     if ( !useForTesting )
>>>>     {
>>>>       tmvaLoader->AddBackgroundTrainingEvent( InputDoubles, 1.0 );
>>>>     }
>>>>     else
>>>>     {
>>>>       tmvaLoader->AddBackgroundTestEvent ( InputDoubles, 1.0 );
>>>>     }
>>>>   }
>>>>
>>>> }
>>>>
>>>> where 'target' is a boolean that indicates if the data entry is signal or background, and 'useForTesting' another boolean to indicate if the entry should be used for training or testing. InputDoubles is an array with all the input parameters for the given data entry.
>>>>
>>>> tmvaLoader is an instance of TMVA::DataLoader.
>>>>
>>>> The issue is, the order in which the above calls are first made is not always the same. It depends on the conditionals, and on whether the first data entry is declared to be signal or not. What I have found is that if the first entry is signal, so 'AddSignalTrainingEvent' is called first, then TMVA trains the network to give signal the expected response of 1, and background 0. However, if the first data entry is background, so AddBackgroundTrainingEvent is called first, then the logic is for some reason inverted, and signal is trained to give a response of 0....
>>>>
>>>> Note I have used the above logic many times in the past, with previous ROOT versions (using the MLP classifier). So this issue is new to the new ROOT version (6.10.02).
>>>>
>>>> It is also the case that the use of TMVA::DataLoader is new. So I am not clear if the issue is related to this, or the use of the DNN classifier.
>>>>
>>>> I have a workaround, which is just to make sure AddSignalTrainingEvent is called first (I skip entries until I get to the first training signal entry) and this seems to do the job. However, I am curious as to what people think about the above behavior. I doubt somehow it's intentional, so it looks to me like a bug somewhere in TMVA, either in TMVA::DataLoader or perhaps specific to the DNN MVA?
>>>>
>>>> cheers Chris |
From: Chris J. <jo...@he...> - 2017-07-13 08:44:32
|
On 12/07/17 21:43, Kim Albertsson wrote:
> Hi Chris!
>
>> Thanks, I wondered if it was a ‘feature’, but I failed to find any
>> good documentation explaining the new loader class. Is there anything
>> I can read explaining how to go about using it? I’ve found the
>> doxygen docs, but that’s not really what I am looking for.
>
> There is supposed to be information about the DataLoader in the User's
> Guide
> <https://root.cern.ch/gitweb/?p=root.git;a=blob;f=documentation/tmva/UsersGuide/TMVAUsersGuide.pdf>
> but it has unfortunately not been updated with this yet. It is a simple
> transformation, however: methods that dealt with loading and preparing
> input were moved to a separate class. It should work the same as the
> factory did before.
>
> Digging a little further into this I see now that this behaviour has
> been in TMVA since Jun 22, 2009 and I realise that it is possibly a bug
> of the DNN. Could you check whether the output of the MLP is as you
> expect? I will look into a proper fix. For now I can only provide the
> workarounds previously discussed :/.

No problem. Indeed, I can confirm I did NOT see this behaviour with the MLP, only the DNN.

>> Specifically, on your suggestion below, it’s not clear to me how I go
>> about ‘ensuring the expected order’ as you describe. Is it just a case
>> of adding the Signal first, then the background, with
>> 'tmvaLoader->GetDataSetInfo().AddClass("ClassName")'?
>
> I think you want
>     tmvaLoader->GetDataSetInfo().AddClass("Background"); // adds class 0
>     tmvaLoader->GetDataSetInfo().AddClass("Signal");     // adds class 1
> to get the expected output signal(background) => 1(0).

Thanks, I will try this.

Chris

> Thanks for reporting this to us!
>
> Cheers,
> Kim
|
From: Kim A. <kia...@ce...> - 2017-07-13 06:57:03
|
Since we have two parallel discussions going on, let's merge them and continue in the root-forum.

Cheers,
Kim
|
From: Kim A. <kia...@ce...> - 2017-07-12 21:17:06
|
Hi Chris!

> Thanks, I wondered if it was a ‘feature’, but I failed to find any
> good documentation explaining the new loader class. Is there anything
> I can read explaining how to go about using it? I’ve found the
> doxygen docs, but that’s not really what I am looking for.

There is supposed to be information about the DataLoader in the User's Guide
<https://root.cern.ch/gitweb/?p=root.git;a=blob;f=documentation/tmva/UsersGuide/TMVAUsersGuide.pdf>
but it has unfortunately not been updated with this yet. It is a simple transformation, however: methods that dealt with loading and preparing input were moved to a separate class. It should work the same as the factory did before.

Digging a little further into this I see now that this behaviour has been in TMVA since Jun 22, 2009 and I realise that it is possibly a bug of the DNN. Could you check whether the output of the MLP is as you expect? I will look into a proper fix. For now I can only provide the workarounds previously discussed :/.

> Specifically, on your suggestion below, it’s not clear to me how I go
> about ‘ensuring the expected order’ as you describe. Is it just a case
> of adding the Signal first, then the background, with
> 'tmvaLoader->GetDataSetInfo().AddClass("ClassName")'?

I think you want

    tmvaLoader->GetDataSetInfo().AddClass("Background"); // adds class 0
    tmvaLoader->GetDataSetInfo().AddClass("Signal");     // adds class 1

to get the expected output signal(background) => 1(0).

Thanks for reporting this to us!

Cheers,
Kim

> cheers Chris
>
>> On 12 Jul 2017, at 4:18 pm, Kim Albertsson <kia...@ce...> wrote:
>>
>> Hi Chris,
>>
>> This is a feature of the DataLoader: it creates the class indices
>> dynamically, making the order in which classes are added important.
>> This is to allow more than two classes and custom class names. One can
>> check what index the signal class has by querying the DataSetInfo
>> method GetSignalClassIndex. In your case this would be
>> tmvaLoader->GetDataSetInfo().GetSignalClassIndex().
>>
>> Another approach would be to add the classes first and ensure the
>> expected order through
>> tmvaLoader->GetDataSetInfo().AddClass("ClassName"). If you use this
>> second approach the names must be "Background" and "Signal", as you
>> use AddSignalTrainingEvent and friends, which expect these classes to
>> exist.
>>
>> Cheers,
>> Kim
>>>
>>> Begin forwarded message:
>>>
>>> From: Chris Jones <jo...@he...>
>>> Subject: [TMVA-users] Signal/Background target responses inverted.
>>> Date: 12 July 2017 at 15:15:31 GMT+2
>>> To: "tmv...@li..." <tmv...@li...>
>>>
>>> Hi,
>>>
>>> After a while away from running TMVA I am back looking at the new
>>> DNN MVA in ROOT 6.10.02. I have noticed what appears to me to be
>>> slightly odd behavior: in some of my trainings the target responses
>>> (1 or 0) for signal and background are inverted. By which I mean
>>> signal(background) is trained to give 0(1), instead of the expected
>>> 1(0).
>>>
>>> In the end I think I have tracked this down to the fact I use the
>>> following logic to fill my training and testing samples.
>>>
>>>   for ( 'some loop over data entries' ) {
>>>
>>>     if ( target )
>>>     {
>>>       if ( !useForTesting )
>>>       {
>>>         tmvaLoader->AddSignalTrainingEvent( InputDoubles, 1.0 );
>>>       }
>>>       else
>>>       {
>>>         tmvaLoader->AddSignalTestEvent    ( InputDoubles, 1.0 );
>>>       }
>>>     }
>>>     else
>>>     {
>>>       if ( !useForTesting )
>>>       {
>>>         tmvaLoader->AddBackgroundTrainingEvent( InputDoubles, 1.0 );
>>>       }
>>>       else
>>>       {
>>>         tmvaLoader->AddBackgroundTestEvent    ( InputDoubles, 1.0 );
>>>       }
>>>     }
>>>
>>>   }
>>>
>>> where 'target' is a boolean that indicates if the data entry is
>>> signal or background, and 'useForTesting' another boolean to
>>> indicate if the entry should be used for training or testing.
>>> InputDoubles is an array with all the input parameters for the given
>>> data entry.
>>>
>>> tmvaLoader is an instance of TMVA::DataLoader.
>>>
>>> The issue is, the order in which the above calls are first made is
>>> not always the same. It depends on the conditionals, and on whether
>>> the first data entry is declared to be signal or not. What I have
>>> found is that if the first entry is signal, so
>>> 'AddSignalTrainingEvent' is called first, then TMVA trains the
>>> network to give signal the expected response of 1, and background 0.
>>> However, if the first data entry is background, so
>>> AddBackgroundTrainingEvent is called first, then the logic is for
>>> some reason inverted, and signal is trained to give a response of
>>> 0....
>>>
>>> Note I have used the above logic many times in the past, with
>>> previous ROOT versions (using the MLP classifier). So this issue is
>>> new to the new ROOT version (6.10.02).
>>>
>>> It is also the case that the use of TMVA::DataLoader is new. So I am
>>> not clear if the issue is related to this, or to the use of the DNN
>>> classifier.
>>>
>>> I have a workaround, which is just to make sure
>>> AddSignalTrainingEvent is called first (I skip entries until I get
>>> to the first training signal entry) and this seems to do the job.
>>> However, I am curious as to what people think about the above
>>> behavior. I somehow doubt it's intentional, so it looks to me like a
>>> bug somewhere in TMVA, either in TMVA::DataLoader or perhaps
>>> specific to the DNN MVA?
>>>
>>> cheers Chris
>>>
>>> ------------------------------------------------------------------------------
>>> Check out the vibrant tech community on one of the world's most
>>> engaging tech sites, Slashdot.org <http://slashdot.org/>!
>>> http://sdm.link/slashdot
>>> _______________________________________________
>>> TMVA-users mailing list
>>> TMV...@li...
>>> https://lists.sourceforge.net/lists/listinfo/tmva-users
|
From: Christopher J. <jo...@he...> - 2017-07-12 20:09:18
|
Hi,

Thanks, I wondered if it was a 'feature', but I failed to find any good documentation explaining the new loader class. Is there anything I can read that explains how to use it? I've found the doxygen docs, but that's not really what I am looking for.

Specifically, on your suggestion below, it's not clear to me how I go about 'ensuring the expected order' as you describe. Is it just a case of adding the Signal first, then the Background, with 'tmvaLoader->GetDataSetInfo().AddClass("ClassName")'?

cheers Chris |
From: Kim A. <kia...@ce...> - 2017-07-12 15:18:46
|
Hi Chris,

This is a feature of the DataLoader: it creates the class indices dynamically, which makes the order in which classes are added important. This is to allow more than two classes and custom class names. One can check which index the signal class has by querying the DataSetInfo method GetSignalClassIndex. In your case this would be tmvaLoader->GetDataSetInfo().GetSignalClassIndex().

Another approach would be to add the classes first and ensure the expected order through tmvaLoader->GetDataSetInfo().AddClass("ClassName"). If you use this second approach, the names must be "Background" and "Signal", as you use AddSignalTrainingEvent and friends, which expect these classes to exist.

Cheers,
Kim |
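Editor's note: Kim's reply says the class indices are created dynamically, in the order the classes are first added. The sketch below is a toy model of that idea only. It does not use ROOT or TMVA: `ClassRegistry`, `AddEvent`, `IndexOf` and `SignalIndexFor` are hypothetical names invented for illustration. The assumption that "Signal" should be registered first is inferred from Chris's report that calling AddSignalTrainingEvent first gave the expected response.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a registry that hands out class indices on first use.
// This is NOT TMVA code; the class and its methods are hypothetical.
class ClassRegistry {
public:
    // Return the index of `name`, registering it if it has not been seen yet.
    std::size_t AddClass(const std::string& name) {
        assert(!name.empty() && "class name must not be empty");
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return i;  // already known: index unchanged
        }
        names_.push_back(name);               // new class: next free index
        return names_.size() - 1;
    }

    // Adding an event registers its class implicitly if needed.
    void AddEvent(const std::string& className) { AddClass(className); }

    // Index of a known class, or -1 if it was never registered.
    int IndexOf(const std::string& name) const {
        for (std::size_t i = 0; i < names_.size(); ++i) {
            if (names_[i] == name) return static_cast<int>(i);
        }
        return -1;
    }

private:
    std::vector<std::string> names_;
};

// Index that "Signal" ends up with when events arrive in the given order.
// With preRegister == true the classes are declared up front, Signal first.
inline int SignalIndexFor(const std::vector<std::string>& eventClasses,
                          bool preRegister) {
    ClassRegistry reg;
    if (preRegister) {
        reg.AddClass("Signal");
        reg.AddClass("Background");
    }
    for (const auto& c : eventClasses) reg.AddEvent(c);
    return reg.IndexOf("Signal");
}
```

In this model, a sample whose first entry is background gives "Signal" index 1 unless the classes are declared beforehand, which mirrors the order dependence described in the thread. Whether a real training behaves the same way should be checked with GetSignalClassIndex, as Kim suggests.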
From: Chris J. <jo...@he...> - 2017-07-12 13:36:03
|
Hi,

After a while away from running TMVA I am back looking at the new DNN MVA in ROOT 6.10.02. I have noticed what appears to me to be slightly odd behavior: in some of my trainings the target responses (1 or 0) for signal and background are inverted. By which I mean signal(background) is trained to give 0(1), instead of the expected 1(0).

In the end I think I have tracked this down to the fact that I use the following logic to fill my training and testing samples.

    for ( 'some loop over data entries' ) {

      if ( target )
      {
        if ( !useForTesting )
        {
          tmvaLoader->AddSignalTrainingEvent( InputDoubles, 1.0 );
        }
        else
        {
          tmvaLoader->AddSignalTestEvent    ( InputDoubles, 1.0 );
        }
      }
      else
      {
        if ( !useForTesting )
        {
          tmvaLoader->AddBackgroundTrainingEvent( InputDoubles, 1.0 );
        }
        else
        {
          tmvaLoader->AddBackgroundTestEvent    ( InputDoubles, 1.0 );
        }
      }

    }

where 'target' is a boolean that indicates whether the data entry is signal or background, and 'useForTesting' is another boolean that indicates whether the entry should be used for training or testing. InputDoubles is an array with all the input parameters for the given data entry.

tmvaLoader is an instance of TMVA::DataLoader.

The issue is that the order in which the above calls are first made is not always the same. It depends on the conditionals, and on whether the first data entry is declared to be signal or not. What I have found is that if the first entry is signal, so 'AddSignalTrainingEvent' is called first, then TMVA trains the network to give signal the expected response of 1, and background 0. However, if the first data entry is background, so AddBackgroundTrainingEvent is called first, then the logic is for some reason inverted, and signal is trained to give a response of 0....

Note I have used the above logic many times in the past, with previous ROOT versions (using the MLP classifier). So this issue is new to the new ROOT version (6.10.02).

The use of TMVA::DataLoader is also new, so I am not clear whether the issue is related to this or to the use of the DNN classifier.

I have a workaround, which is just to make sure AddSignalTrainingEvent is called first (I skip entries until I get to the first training signal entry), and this seems to do the job. However, I am curious as to what people think about the above behavior. I somehow doubt it's intentional, so it looks to me like a bug somewhere in TMVA, either in TMVA::DataLoader or perhaps specific to the DNN MVA?

cheers Chris |
From: Helge V. <Hel...@ce...> - 2017-05-31 15:07:42
|
Exactly. 'l' doesn't mean left and 'r' doesn't mean right; rather, "l" means "bkg" and "r" means "signal" :(

Why don't you use the 'BDT plotting' from the 'TMVAGui'? I think it plots the tree correctly.

Helge |
From: louis d'e. <lou...@lp...> - 2017-05-31 14:55:12
|
Sorry, I forgot the attachment.

--
Louis |
From: louis d'e. <lou...@lp...> - 2017-05-31 14:54:33
|
Hi Helge,

first, thanks for the quick reply. I've also drawn the tree, as attached.

What I understood from that piece of code is that if an event has a value of the tested variable (here there is only one) higher than the cut value, it goes right. So in my case, if an event fails cut B, which means that it has a value under 119.3, it couldn't pass the 129.2 cut either, and so no events could go right at node D...

Concerning your answer, would it mean that my drawn tree is completely wrong, since pos=r or pos=l doesn't mean left or right?

Cheers,

--
Louis |
From: Helge V. <Hel...@ce...> - 2017-05-31 14:30:45
|
Hi Louis,

sorry for the confusion :) The 'tree' you have in your xml file is perfectly fine (I just wrote it down on my piece of paper).

BUT the naming of the 'nodes' is confusing, as 'left' or 'right' does not have the 'correct' (literal) meaning; it should rather be 'bkg enhanced' and 'signal enhanced', respectively.

If you look into the code: the decision whether an event 'goes right' is 'inverted' depending on the 'cType' of the node, such that background events always go 'left' and signal events go 'right' :)

    Bool_t TMVA::DecisionTreeNode::GoesRight(const TMVA::Event & e) const
    {
       Bool_t result;
       // first check if the fisher criterium is used or ordinary cuts:

       ...
       result = (e.GetValue(this->GetSelector()) >= this->GetCutValue() );

       ...

       if (fCutType == kTRUE) return result; //the cuts are selecting Signal ;
       else return !result;
    }

cheers,

Helge |
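Editor's note: the inversion Helge describes can be illustrated with a toy model. The sketch below is not the TMVA class: `ToyNode`, its fields and the free function `GoesRight` are hypothetical stand-ins that keep only the two lines of logic quoted in the thread (the comparison with the cut value and the flip on the cut type). The cut values 119.3 and 129.2 come from Louis's message; the cut types assigned to them in the usage note are assumptions, since the xml file is not reproduced here.

```cpp
#include <cassert>
#include <cmath>

// Toy node; NOT TMVA::DecisionTreeNode. Field names are hypothetical.
struct ToyNode {
    double cutValue;  // threshold on the single input variable
    bool   cutType;   // true: values >= cutValue are the signal-like side
};

// Mirrors the quoted logic: compare with the cut, then invert the result
// when the cut type says the passing side is background-like.
inline bool GoesRight(const ToyNode& node, double value) {
    assert(!std::isnan(value) && "value must be a number");
    const bool passes = (value >= node.cutValue);
    return node.cutType ? passes : !passes;
}
```

Under the assumption that the node with cut 119.3 has cut type false, an event with value 125.0 passes the comparison but is sent 'left', so a 'left' child can sensibly cut at 129.2 even though that is above its parent's cut. This is one way to read the tree that Louis found puzzling.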
From: louis d'e. <lou...@lp...> - 2017-05-31 13:33:24
|
Dear TMVA users,

as part of my PhD I was asked to study the feasibility of classifying two different MC generators for the same physics. We would like the BDT to learn the differences in the shapes of the various variables. Since I didn't get any useful results, I decided to do some closure tests with the BDT to learn how to choose the best options for my very specific situation.

That is why I decided to reduce the number of trees to one (so that I could easily access the results without them being polluted by the boosting), and, to keep things as simple as possible, I selected only one variable.

I was expecting to then have my variable range divided into n sub-ranges and to get, for every sub-range, the purity and so S/B.

But I observed some strange behavior in the xml file: some of the left nodes require a cut that is higher than the parent node's, which seems strange.

I attach the xml file so that you can see the effect.

P.S.: I tried adding a second variable, but the same effect was also seen with this configuration.

--
Louis D'Eramo,
PhD Student LPNHE Paris |