Hi All,
I want to set up a global learning rate instead of a learning-rate matrix. I am doing the below, but I got a model with all-NaN network weights. I must have done something wrong; can anyone tell me what went wrong?
AffineComponent input-dim=xxx output-dim=xxx learning-rate=xxx param-stddev=xxx bias-stddev=xxx
thanks,
Yan
What you are encountering is instability; it is a common problem in
neural network training.
The "preconditioned" version of the component has two things that
prevent instability: it has the natural-gradient update, and it has
the "max-change" parameter which limits the parameter change per
minibatch.
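(As a rough illustration of that per-minibatch limiting, here is a minimal NumPy sketch. It is not Kaldi's actual implementation; the function name and the use of the Frobenius norm are assumptions for illustration.)

```python
import numpy as np

def apply_max_change(delta, max_change):
    """Scale a proposed parameter update so its Frobenius norm does not
    exceed max_change. Illustrative only, not Kaldi's actual code."""
    norm = np.linalg.norm(delta)
    if norm > max_change:
        delta = delta * (max_change / norm)
    return delta

# A large update that would otherwise destabilize training:
delta = np.full((2, 2), 50.0)            # Frobenius norm = 100
limited = apply_max_change(delta, max_change=1.0)
print(np.linalg.norm(limited))           # clipped to 1.0
```

The direction of the update is preserved; only its magnitude is capped, which is what keeps a single bad minibatch from blowing up the weights into NaNs.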
If you want to do without those, you could try just decreasing the
learning rate until it's stable.
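(For comparison, a preconditioned component line has the same shape as the one above; the component name and extra parameter names below are from memory and may differ by Kaldi version, so check nnet-component.h for your release before using them:)

AffineComponentPreconditionedOnline input-dim=xxx output-dim=xxx learning-rate=xxx param-stddev=xxx bias-stddev=xxx rank-in=xxx rank-out=xxx max-change-per-sample=xxx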
Dan
On Wed, Jul 15, 2015 at 11:16 AM, Yan Yin riyijiye1976@users.sf.net wrote: