tmva-users Mailing List for TMVA Toolkit for Multi Variate Analysis (Page 2)

A ROOT-integrated toolkit for multivariate analysis

Brought to you by: andreas.hoecker, andreashoecker, evtoerne, helgevoss, and 4 others

tmva-users — Open mailing list for TMVA users



Re: [TMVA-users] TMVA understanding related

From: Kim A. <kia...@ce...> - 2018-02-28 13:58:46

Hi Subhasish!

And welcome to TMVA!

Also, please note that the best support for TMVA is now given in the 
ROOT forum <http://root-forum.cern.ch> :)
> 1. How do I generate the histogram root file ?
> : I have events file which contain many things, with many branches and 
> leaves. Now is there a simple programme that can read the complicated 
> root file and generate now root files with only required info for TMVA.
TMVA can work with your root files directly, but this might have a 
performance impact when TMVA converts the data from the rootfile format 
into what it uses internally. Thus it is, as you imply, a good idea to 
preprocess the root file if it is large and only a smaller part is 
needed for learning.

In general, this preprocessing must be done manually, by reading in the 
root file, selecting what branches you want to use, and writing these 
out to a new file.

This is most easily done using the new root feature TDataFrame if you 
have access to root 6.10 or later. TDataFrame is currently under heavy 
development, so the more recent the version of root you can use, the 
better your experience with it will be. Examples and documentation can 
be found here 
<https://root.cern.ch/doc/master/group__tutorial__tdataframe.html> 
and here 
<https://root.cern.ch/doc/master/classROOT_1_1Experimental_1_1TDataFrame.html> 
respectively.

If you are using root 5, or for some reason the above does not work for 
you, the way to go is with TTreeReader. A tutorial can be found here 
<https://root.cern.ch/doc/master/group__tutorial__tree.html> (in 
particular hsimpleReader and tree1), and documentation here 
<https://root.cern.ch/doc/master/classTTreeReader.html>.
> 2.    How can I see the ROC for an analysis ?
Please see the TMVA tutorial found here 
<https://root.cern.ch/doc/master/group__tutorial__tmva.html> (in 
particular TMVAClassification.C). Basically, you open the TMVA GUI and 
generate the plots you want; there is a button for ROC curves. Make sure 
to run the tutorial without the `-b` option, e.g. `root -l 
TMVAClassification.C`.
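
As a rough illustration of what a ROC curve and its integral summarise, here is a minimal standalone C++ sketch. It is not TMVA's implementation: the names RocPoint, rocCurve and rocIntegral are made up for this example, events are unweighted, both score lists are assumed non-empty, and the choice of axes (signal efficiency against background efficiency) is an assumption about the convention.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// One point on the curve: fraction of background and of signal passing a cut.
struct RocPoint {
    double bkgEff;
    double sigEff;
};

// Scan a threshold over all classifier scores, from the highest down, and
// record the fraction of signal and background events that pass it.
// Assumes both inputs are non-empty and unweighted (illustration only).
std::vector<RocPoint> rocCurve(const std::vector<double>& sigScores,
                               const std::vector<double>& bkgScores) {
    std::vector<std::pair<double, bool>> events;  // (score, isSignal)
    for (double s : sigScores) events.emplace_back(s, true);
    for (double s : bkgScores) events.emplace_back(s, false);
    std::sort(events.begin(), events.end(),
              [](const auto& a, const auto& b) { return a.first > b.first; });

    std::vector<RocPoint> curve{{0.0, 0.0}};
    std::size_t nSig = 0, nBkg = 0;
    for (std::size_t i = 0; i < events.size(); ++i) {
        (events[i].second ? nSig : nBkg) += 1;
        // Emit a point only after the last event of a group of tied scores.
        if (i + 1 < events.size() && events[i + 1].first == events[i].first)
            continue;
        curve.push_back({double(nBkg) / bkgScores.size(),
                         double(nSig) / sigScores.size()});
    }
    return curve;
}

// Area under the curve by the trapezoidal rule.
double rocIntegral(const std::vector<RocPoint>& curve) {
    double area = 0.0;
    for (std::size_t i = 1; i < curve.size(); ++i)
        area += 0.5 * (curve[i].sigEff + curve[i - 1].sigEff) *
                (curve[i].bkgEff - curve[i - 1].bkgEff);
    return area;
}
```

For perfectly separated scores, e.g. signal {0.9, 0.8, 0.7} and background {0.3, 0.2, 0.1}, rocIntegral returns 1; if every event has the same score it returns 0.5.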
> 3.    I am also not sure about what kind of linear combination of 
> variables the sample codes are taking and do I have a control on them 
> as well ?
I am not sure I follow you here, could you elaborate? If you want to 
understand the input data in the `TMVAClassification.C` example you can 
inspect the file `tmva_class_example.root` with the command `rootbrowse 
tmva_class_example.root` after running the example.

Please get back to me if you have any further questions.

Cheers,
   Kim

>
> Thanks,
> Subhasish
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
>
>
> _______________________________________________
> TMVA-users mailing list
> TMV...@li...
> https://lists.sourceforge.net/lists/listinfo/tmva-users

[TMVA-users] TMVA understanding related

From: Subhasish B. <sub...@gm...> - 2018-02-28 13:34:05

Hi,

I am a research scholar in high energy particle physics working on collider
physics theory. Assume I am a beginner to data analysis with TMVA, though I
have used ROOT earlier for data analysis with some code. I have the
following few questions in mind before doing analysis with TMVA.

1.   How do I generate the histogram root file ?
     I have an events file which contains many things, with many branches and
     leaves. Is there a simple programme that can read the complicated root
     file and generate new root files with only the required info for TMVA ?

2.    How can I see the ROC for an analysis ?

3.    I am also not sure about what kind of linear combination of variables
the sample codes are taking and do I have a control on them as well ?



Thanks,
Subhasish

Re: [TMVA-users] BDT training results dependence on normalization

From: Bobovnikov, I. <ily...@de...> - 2018-02-27 18:28:56

Dear Kim,

Thank you for the answer.
That works for me.

Sorry all for the spamming, there was a long delay between sending and reaching the mailing list.

Best regards, Ilya 

----- Original Message -----
From: "Kim Albertsson" <kia...@ce...>
To: "Ilya Bobovnikov" <ily...@de...>
Cc: "TMVA-users" <TMV...@li...>
Sent: Thursday, 22 February, 2018 12:20:08
Subject: Re: [TMVA-users] BDT training results dependence on normalization

Hi Ilya,

Sorry for the late reply. In the future please post to the root forum at 
https://root-forum.cern.ch, this is where the primary support for TMVA 
is located now (This mailing list still works but the response times are 
expected to be better with the forum :) ).

When using BDTs for classification (excluding BoostType=GradBoost) 
there is an internal rescaling done before training as this "should" be 
performed implicitly as part of the boosting procedure. This can be 
disabled by adding "SkipNormalisation=True" to the method options.

There should be a textual output of this in the log just before the 
training starts where you can read a bit more.

Cheers,
   Kim

Bobovnikov, Ilya wrote:
> Dear experts,
>
> I am doing BDT training for classification in TMVA.
> My expectation was that BDT training results should depend on overall normalization of my signal and background, since node splitting function like significance (for example S/Sqrt(S+B)) depend on it. But I got the same results for the training with different normalization. Can it be that I am missing some default options or it should be like this?
>
>
> I am using ROOT 6.06
>
> And I am using the options
> factory->PrepareTrainingAndTestTree( mycuts, "NormMode=None" );
> factory->BookMethod( TMVA::Types::kBDT, "BDT"+BDTname,
>                      "!H:!V:NTrees=850:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=SDivSqrtSPlusB:nCuts=20:NegWeightTreatment=Pray" );
>
>
> For example I took
>
> --- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 52014.9
> --- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 3.92221
>
> I got
>
> --- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi-
> --- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:
> --- Factory                  : --------------------------------------------------------------------------------
> --- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321
>
> And then scaled a bit
>
> --- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 183752
> --- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 13.8161
>
> And got the same (the BDT distributions are identical)
>
> --- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi-
> --- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:
> --- Factory                  : --------------------------------------------------------------------------------
> --- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321
>
>
> Best regards, Ilya
>
> P.S. code example and outputs for these two cases are attached.

[TMVA-users] BDT training results dependence on normalization

From: Bobovnikov, I. <ily...@de...> - 2018-02-27 01:11:23

Attachments: LogNormalWeightedScaledSepMethodAnother LogNormalWeightedSepMethodAnother myTMVA.C

Dear experts, 

I am doing BDT training for classification in TMVA. 
My expectation was that BDT training results should depend on overall normalization of my signal and background, since node splitting function like significance (for example S/Sqrt(S+B)) depend on it. But I got the same results for the training with different normalization. Can it be that I am missing some default options or it should be like this?  
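
For reference, the dependence on normalization described above can be written out explicitly. This is only the arithmetic of the figure of merit S/Sqrt(S+B); the scale factors k, a and b are symbols introduced here for illustration, and nothing below describes how TMVA applies the criterion internally.

```latex
\[
\frac{kS}{\sqrt{kS + kB}} = \sqrt{k}\,\frac{S}{\sqrt{S + B}},
\qquad
\frac{aS}{\sqrt{aS + bB}} = \sqrt{a}\,\frac{S}{\sqrt{S + (b/a)\,B}} .
\]
```

A common factor k therefore changes the value of the figure of merit but not the ordering of two selections ranked by it, whereas different factors for signal (a) and background (b) change the effective background and can change that ordering.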


I am using ROOT 6.06

And I am using the options
factory->PrepareTrainingAndTestTree( mycuts, "NormMode=None" );
factory->BookMethod( TMVA::Types::kBDT, "BDT"+BDTname,
                     "!H:!V:NTrees=850:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=SDivSqrtSPlusB:nCuts=20:NegWeightTreatment=Pray" );


For example I took

--- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 52014.9
--- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 3.92221

I got 

--- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi- 
--- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:   
--- Factory                  : --------------------------------------------------------------------------------
--- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321

And then scaled a bit

--- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 183752
--- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 13.8161

And got the same (the BDT distributions are identical)

--- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi- 
--- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:   
--- Factory                  : --------------------------------------------------------------------------------
--- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321


Best regards, Ilya

P.S. code example and outputs for these two cases are attached.

Re: [TMVA-users] BDT training results dependence on normalization

From: Kim A. <kia...@ce...> - 2018-02-22 11:22:13

Hi Ilya,

Sorry for the late reply. In the future please post to the root forum at 
https://root-forum.cern.ch, this is where the primary support for TMVA 
is located now (This mailing list still works but the response times are 
expected to be better with the forum :) ).

When using BDTs for classification (excluding BoostType=GradBoost) 
there is an internal rescaling done before training as this "should" be 
performed implicitly as part of the boosting procedure. This can be 
disabled by adding "SkipNormalisation=True" to the method options.

There should be a textual output of this in the log just before the 
training starts where you can read a bit more.
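
To make the effect of such a rescaling concrete, here is a small standalone C++ sketch. It assumes, for illustration only, that the rescaling gives signal and background the same total weight (half the event count each); the exact formula TMVA uses may differ. Event, significance, sumOfWeights and equalizeClasses are hypothetical names, not TMVA API.

```cpp
#include <cmath>
#include <vector>

struct Event {
    double weight;
    bool isSignal;
};

// Figure of merit from the question: S / sqrt(S + B).
double significance(double s, double b) {
    return (s + b > 0.0) ? s / std::sqrt(s + b) : 0.0;
}

// Sum of weights for one class (signal if 'signal' is true).
double sumOfWeights(const std::vector<Event>& events, bool signal) {
    double sum = 0.0;
    for (const Event& e : events)
        if (e.isSignal == signal) sum += e.weight;
    return sum;
}

// Hypothetical stand-in for the internal rescaling: scale each class so that
// signal and background each carry half of the total event count.
void equalizeClasses(std::vector<Event>& events) {
    const double sumSig = sumOfWeights(events, true);
    const double sumBkg = sumOfWeights(events, false);
    if (sumSig <= 0.0 || sumBkg <= 0.0) return;  // nothing sensible to do
    const double target = 0.5 * events.size();
    for (Event& e : events)
        e.weight *= target / (e.isSignal ? sumSig : sumBkg);
}
```

Two samples that differ only by per-class scale factors have different values of significance() before the rescaling but identical event weights after equalizeClasses, which would be consistent with the identical results reported in the question.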

Cheers,
   Kim

Bobovnikov, Ilya wrote:
> Dear experts,
>
> I am doing BDT training for classification in TMVA.
> My expectation was that BDT training results should depend on overall normalization of my signal and background, since node splitting function like significance (for example S/Sqrt(S+B)) depend on it. But I got the same results for the training with different normalization. Can it be that I am missing some default options or it should be like this?
>
>
> I am using ROOT 6.06
>
> And I am using the options
> factory->PrepareTrainingAndTestTree( mycuts, "NormMode=None" );
> factory->BookMethod( TMVA::Types::kBDT, "BDT"+BDTname,
>                      "!H:!V:NTrees=850:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=SDivSqrtSPlusB:nCuts=20:NegWeightTreatment=Pray" );
>
>
> For example I took
>
> --- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 52014.9
> --- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 3.92221
>
> I got
>
> --- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi-
> --- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:
> --- Factory                  : --------------------------------------------------------------------------------
> --- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321
>
> And then scaled a bit
>
> --- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 183752
> --- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 13.8161
>
> And got the same (the BDT distributions are identical)
>
> --- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi-
> --- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:
> --- Factory                  : --------------------------------------------------------------------------------
> --- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321
>
>
> Best regards, Ilya
>
> P.S. code example and outputs for these two cases are attached.

[TMVA-users] BDT training results dependence on normalization

From: Alexis K. <Ale...@ce...> - 2018-02-16 10:04:56

Dear experts,
I am posting on behalf of a colleague who for some strange reason could not post directly to the forum.

Regards

Alexis


----- Forwarded Message -----
From: "Ilya Bobovnikov" <ily...@de...>
To: "TMVA-users" <TMV...@li...>
Sent: Thursday, 15 February, 2018 11:51:17
Subject: BDT training results dependence on normalization

Dear experts, 

I am doing BDT training for classification in TMVA. 
My expectation was that BDT training results should depend on overall normalization of my signal and background, since node splitting function like significance (for example S/Sqrt(S+B)) depend on it. But I got the same results for the training with different normalization. Can it be that I am missing some default options or it should be like this?  


I am using ROOT 6.06

And I am using the options
factory->PrepareTrainingAndTestTree( mycuts, "NormMode=None" );
factory->BookMethod( TMVA::Types::kBDT, "BDT"+BDTname,
                     "!H:!V:NTrees=850:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=SDivSqrtSPlusB:nCuts=20" );


For example I took

--- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 52014.9
--- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 3.92221

I got 

--- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi- 
--- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:   
--- Factory                  : --------------------------------------------------------------------------------
--- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321

And then scaled a bit

--- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 183752
--- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 13.8161

And got the same (the BDT distributions are identical)

--- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi- 
--- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:   
--- Factory                  : --------------------------------------------------------------------------------
--- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321


Best regards, Ilya

P.S. code example and outputs for these two cases are attached.

------------------------------------------------
Dr. Alexis Kalogeropoulos

Princeton Univ., Dept of Physics 
CMS group
alk...@ce... <mailto:alk...@ce...>
Ale...@pr... <mailto:Ale...@pr...>
--------------------------------------------------

[TMVA-users] BDT training results dependence on normalization

From: Bobovnikov, I. <ily...@de...> - 2018-02-16 01:09:10

Attachments: LogNormalWeightedScaledSepMethodAnother LogNormalWeightedSepMethodAnother myTMVA.C

Dear experts, 

I am doing BDT training for classification in TMVA. 
My expectation was that BDT training results should depend on overall normalization of my signal and background, since node splitting function like significance (for example S/Sqrt(S+B)) depend on it. But I got the same results for the training with different normalization. Can it be that I am missing some default options or it should be like this?  


I am using ROOT 6.06

And I am using the options
factory->PrepareTrainingAndTestTree( mycuts, "NormMode=None" );
factory->BookMethod( TMVA::Types::kBDT, "BDT"+BDTname,
                     "!H:!V:NTrees=850:MinNodeSize=2.5%:MaxDepth=3:BoostType=AdaBoost:AdaBoostBeta=0.5:UseBaggedBoost:BaggedSampleFraction=0.5:SeparationType=SDivSqrtSPlusB:nCuts=20:NegWeightTreatment=Pray" );


For example I took

--- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 52014.9
--- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 3.92221

I got 

--- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi- 
--- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:   
--- Factory                  : --------------------------------------------------------------------------------
--- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321

And then scaled a bit

--- DataSetFactory           :     Background      -- number of events       : 86123  / sum of weights: 183752
--- DataSetFactory           :     Signal          -- number of events       : 9117   / sum of weights: 13.8161

And got the same (the BDT distributions are identical)

--- Factory                  : MVA              Signal efficiency at bkg eff.(error):       | Sepa-    Signifi- 
--- Factory                  : Method:          @B=0.01    @B=0.10    @B=0.30    ROC-integ. | ration:  cance:   
--- Factory                  : --------------------------------------------------------------------------------
--- Factory                  : BDTmutau       : 0.705(510)  0.938(269)  0.978(164)    0.974    | 0.769    2.321


Best regards, Ilya

P.S. code example and outputs for these two cases are attached.