From: Christophe C. <chr...@tu...> - 2012-11-09 14:58:26
Dear experts,

I'm having trouble training a BDT with TMVA. As far as I can tell, the classifier distribution does not look too bad for the BDT:

* https://dl.dropbox.com/u/171315/tmva/classifier_distribution_BDT.png

nor for a Fisher discriminant, which I use as a reference since I have always had quite good results with it:

* https://dl.dropbox.com/u/171315/tmva/classifier_distribution_Fisher.png

However, the cut efficiencies as well as the ROC curve look very strange:

* https://dl.dropbox.com/u/171315/tmva/cut_efficiencies_BDT.png
* https://dl.dropbox.com/u/171315/tmva/cut_efficiencies_Fisher.png
* https://dl.dropbox.com/u/171315/tmva/ROC_curve.png

I'm using a single TTree as input, with sWeights from an sPlot set via

    factory->SetWeightExpression("sweight_sig", "Signal");
    factory->SetWeightExpression("sweight_bkg", "Background");

to discriminate between signal and background events for training and testing. There are about 130k signal and 20k background events in my sample.

As I have never experienced such problems before, I guess it has something to do with the sWeights. Have you ever tried this before, or do you have a clue what's going wrong?

Thanks,
Christophe

PS: The TMVA output is: https://dl.dropbox.com/u/171315/tmva/tmva_output.txt

The factory options are:

    V:!Silent:Color:DrawProgressBar:Transformations=I;D;P;G,D:AnalysisType=Classification

The split options are:

    SplitMode=Random:!V:SplitSeed=0:nTrain_Signal=0:nTest_Signal=0:nTrain_Background=0:nTest_Background=0

The BDT options are:

    !V:NTrees=400:nEventsMin=400:MaxDepth=3:BoostType=AdaBoost:SeparationType=GiniIndex:nCuts=20:PruneMethod=NoPruning:VarTransform=Decorrelate

I also tried adding IgnoreNegWeightsInTraining.