L2 regularization

Forum: Help

Creator: Barbara

Created: 2009-04-25

Updated: 2013-04-09

Barbara - 2009-04-25

Dear all,

I wonder why L2 regularization (a Gaussian prior over the parameters) is implemented in TADM (see probs.cc) as follows, where f is the log-likelihood, i.e. LL(p,q) = - sum_i p(x_i) log q(x_i):

L2 (ridge) penalty: f = f + sum(x^2) / 2*sigma

Why is it f *plus* the regularization term (sum(x^2) / 2*sigma) instead of LL *minus* it (as described in papers, among others Chen & Rosenfeld 1999)? What's the explanation for the plus in the implementation?

If LL were sum_i p(x_i) log q(x_i), then the plus would make sense, but it's - sum_i p(x_i) log q(x_i), so shouldn't it actually be:

f = f - sum(x^2) / 2*sigma

(i.e., shouldn't line 312 in probs.cc read f -= pen; instead of f += pen;?)

Thanks for any help!

Barbara

- Miles Osborne - 2009-04-26
  
  It has been a long time since I looked at this, but if I remember correctly it does the right thing. You just need to check the signs of the various terms.
  
  (You can easily verify this by running the code with and without the prior; with the prior, and with a small variance, the weights all get smaller, as expected.)
  
  Miles
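
The check suggested above can be tried on a toy problem outside TADM. The following is only a minimal sketch, not TADM's code: it assumes the optimizer *minimizes* f, that f is the negative log-likelihood, and that the penalty has the form quoted in the question, sum(x^2) / (2*sigma). The data set, the logistic model, and the names `objective`, `gradient`, and `minimize` are all hypothetical. Under those assumptions, adding the penalty to a minimized f is the same as subtracting it from a maximized log-likelihood, and the fitted weights should shrink.

```python
import math

# Hypothetical toy data: (features, label). Not taken from TADM.
DATA = [((1.0, 0.0), 1), ((1.0, 1.0), 1), ((0.0, 1.0), 0),
        ((0.0, 0.0), 0), ((1.0, 1.0), 0), ((0.0, 1.0), 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def objective(w, sigma=None):
    """Negative log-likelihood, plus sum(w^2) / (2*sigma) if sigma is given."""
    f = 0.0
    for x, y in DATA:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        f -= math.log(p) if y == 1 else math.log(1.0 - p)
    if sigma is not None:
        f += sum(wi * wi for wi in w) / (2.0 * sigma)  # the "plus" in question
    return f

def gradient(w, sigma=None):
    """Gradient of objective() with respect to w."""
    g = [0.0] * len(w)
    for x, y in DATA:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        for j, xj in enumerate(x):
            g[j] += (p - y) * xj
    if sigma is not None:
        for j, wj in enumerate(w):
            g[j] += wj / sigma  # derivative of the penalty term
    return g

def minimize(sigma=None, steps=5000, lr=0.1):
    """Plain gradient descent on objective(); f is minimized, not maximized."""
    w = [0.0, 0.0]
    for _ in range(steps):
        g = gradient(w, sigma)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

def norm(w):
    return math.sqrt(sum(wi * wi for wi in w))

w_plain = minimize()           # no prior
w_prior = minimize(sigma=0.5)  # small variance, i.e. a strong prior
print("no prior:  ", w_plain, "norm", norm(w_plain))
print("with prior:", w_prior, "norm", norm(w_prior))
```

If the sign were flipped (f -= pen) while f is still being minimized, the optimizer would be rewarded for large weights, which is the opposite of what a Gaussian prior is meant to do.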
  