Hi Jamie,
Increasing the weight was not my idea :P
My suggestion is to transform the output energy to log scale before training, and to use the network to predict log(energy) instead.
That way it is as if you had a new cost function, log(Truth/Response), which may serve your purpose better.
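A minimal sketch of the log-scale idea in Python, using only the standard library. The function names and energy values are hypothetical and chosen only for illustration; the point is that a squared error on log(energy) depends only on the ratio Truth/Response, so the same relative mistake costs the same at low and high energy.

```python
import math

def to_log_target(energy):
    """Map an energy (must be > 0) to log scale for use as training target."""
    return math.log(energy)

def from_log_target(log_energy):
    """Invert the transform to turn a network output back into an energy."""
    return math.exp(log_energy)

def squared_log_cost(truth, response):
    """Squared error on log targets, equal to log(truth / response) ** 2."""
    return (to_log_target(truth) - to_log_target(response)) ** 2

# Hypothetical example: a 10% overestimate at low and at high energy.
low_energy_cost = squared_log_cost(0.01, 0.011)
high_energy_cost = squared_log_cost(1.0, 1.1)
print(low_energy_cost, high_energy_cost)  # the two costs agree up to rounding
```

Note that the transform requires strictly positive energies, so targets at exactly 0 would need to be excluded or offset first.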
Best regards,
Jiahang
-----Original Message-----
From: Jamie Ballin [mailto:jamie.ballin@...]
Sent: 1/20/2009 (Tuesday) 11:57 AM
To: tmva-users@...
Subject: Re: [TMVA-users] MLPs for regression
Hi everyone,
First of all, I use distance as target variable/cost. Cost C = (Truth -
Response). I improved low energy results based on Jiahang's suggestion
of increasing their weight, so now,
C = (2 - Truth) * (Truth - Response)
(Truth is uniform over [0, 1]).
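The two cost expressions above can be sketched in Python as follows. This takes the formulas exactly as written (a signed difference, with Truth in [0, 1]); the function names are hypothetical and this is not the TMVA implementation.

```python
def plain_cost(truth, response):
    """Unweighted cost as written above: C = Truth - Response."""
    return truth - response

def weighted_cost(truth, response):
    """Weighted cost as written above: C = (2 - Truth) * (Truth - Response).

    For Truth in [0, 1] the factor (2 - Truth) falls linearly from 2 to 1,
    so low-energy events count up to twice as much as high-energy ones.
    """
    return (2.0 - truth) * (truth - response)

# Hypothetical example: the same absolute miss of 0.1 at both ends of the range.
print(weighted_cost(0.0, 0.1))  # weight 2 at Truth = 0
print(weighted_cost(1.0, 0.9))  # weight 1 at Truth = 1
```

The weight only doubles the emphasis at the low end, so it is a much milder correction than a relative cost such as (Truth - Response)/Truth.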
Plotting the performance in terms of Response/Truth vs. Truth strikes me
as being a legitimate way of evaluating the net - after all, this is a
practical exercise, and not an academic one! (This relates to Andreas'
remark).
In any case, training on E_reco/E_truth doesn't produce any noticeable
improvement :-(
Also, some of the input variables are highly non-uniform and
non-Gaussian: there's not much I can do about that unfortunately, but
does anyone have an idea about how robust MLPs are to this? (I always
make sure, though, that they are normalised to [0, 1]).
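For reference, a minimal sketch of the kind of [0, 1] normalisation meant here, assuming simple min-max scaling (the actual procedure used is not stated; the function name and sample values are hypothetical). It also shows that such scaling changes only the range, not the shape, so a skewed input stays skewed.

```python
def min_max_normalise(values):
    """Linearly rescale a list of numbers to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant input carries no information; map it to 0 everywhere.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical, highly non-uniform input: one large value dominates the range.
skewed = [1.0, 2.0, 3.0, 1000.0]
scaled = min_max_normalise(skewed)
print(scaled)  # the first three values end up squeezed close to 0
```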
There seems to be a lot of literature (which I admit I find somewhat
impenetrable) regarding how neural networks are modified for regression
purposes (viz. Generalised neural networks, Bayesian inference) -
apparently it requires a substantial change to the structure of the
network. Does anyone have experience of this?
Thanks for your comments - it's a good discussion!
Jamie
Peter Speckmayer wrote:
> Hi Jiahang,
>
> Actually I am using TMVA (MLP mainly) for the same purpose ...
>
> I think the problem is the estimator you have chosen. If you take Ereco/Etruth, you
> automatically get a lot of problems in the low energy region. There, a small
> fluctuation of Ereco will give you values far away from 1. These outliers then
> do the damage to your regression.
>
> I would suggest you try (Ereco-Etruth) instead, this is what works for me (and what
> is used in TMVA regression).
>
> cu,
> Peter
>
>
>
>
>
> zhong jiahang wrote:
>
>> Dear Jamie,
>>
>> Explanation b sounds reasonable, as the MLP uses (truth-response)^2 as its error
>> function.
>> IMHO, it would be better to leave the error function for users to define,
>> according to their different cases.
>> Maybe you'd like to try a logarithmic transformation of the output, i.e.
>> training/predicting with log(energy), if you are determined to use
>> prediction/truth for performance evaluation.
>>
>> Best regards,
>> Jiahang
>>
>>
>>> Dear neural net experts,
>>>
>>> I'm using a MLP for regression: having used the TMVA MLP regression
>>> package, another package
>>> (http://www.cs.toronto.edu/~radford/fbm.software.html) and even written
>>> my own implementation (!) in Erlang (!!), I come across the same problem
>>> every time. So, I'm convinced there is an error in my usage, rather than in the
>>> implementations, and I'd really appreciate some expert advice.
>>>
>>> I am trying to predict the energy of particles in the CMS experiment as
>>> a function of variables such as ECAL energy, HCAL energy, eta, phi etc.
>>> I'm not limited by statistics (I have hundreds of thousands of simulation
>>> events). In my own private implementation, I have varied the transfer
>>> function between a sigmoid, tanh, even a straight line. No
>>> transformation is applied to the output and input nodes. The target
>>> value is uniform over the range 0, 1 (or -1, 1 depending on the model's
>>> implementation).
>>>
>>> In any case, poor performance manifests itself by a very large
>>> overestimation of energy for low-energy particles. I define performance
>>> in terms of predicted/truth, so ideally this would be 1. Instead, with
>>> every implementation, I get a curve such as the one shown here (ratio as a
>>> function of truth):
>>> http://www.hep.ph.ic.ac.uk/~jballin/ratio_NoRandom_10Epochs_DeltaA.pdf
>>>
>>> Online training, batch-based training, more epochs, more samples, more
>>> nodes, more layers... I've tried it all!
>>>
>>> To some extent, I can straighten the response curve by subtracting a
>>> constant term from the response, and multiplying by some factor, but
>>> this isn't ideal. I suspect:
>>>
>>> a) the transfer function is inappropriate
>>>
>>> b) the cost function for computing the output neuron deltas is poor:
>>> this is some variant of cost C = (Truth - response), but for low energy
>>> particles, this implies a tiny cost. Wouldn't something like a relative
>>> cost be more appropriate, rather than an absolute one? How can this be
>>> implemented?
>>>
>>> Many thanks,
>>>
>>> Jamie
>>>
>>> N.B. CMS operates with high thresholds, meaning that low energy
>>> particles are sometimes 'lost'. Now, I have also tried adding an input
>>> node with a value of unity, and this doesn't change much.
>>>
>>>
>>>
>> ------------------------------------------------------------------------------
>> This SF.net email is sponsored by:
>> SourceForge Community
>> SourceForge wants to tell your story.
>> http://p.sf.net/sfu/sf-spreadtheword
>> _______________________________________________
>> TMVA-users mailing list
>> TMVA-users@...
>> https://lists.sourceforge.net/lists/listinfo/tmva-users
>>
--
Jamie Ballin
High Energy Physics Group, Physics Department
535, Blackett Laboratory, Imperial College
Prince Consort Road, London, SW7 2AZ
Tel: 020 759 47818
http://www.hep.ph.ic.ac.uk and http://www.jamieballin.co.uk
email: j.ballin06@...