L2 regularization

Barbara
2009-04-25
2013-04-09
  • Barbara

    Barbara - 2009-04-25

    Dear all,

    I wonder why L2 regularization (a Gaussian prior over the parameters) is implemented in TADM (see probs.cc) as follows, where f is the log-likelihood, i.e. LL(p,q) = - sum_i p(x_i) log q(x_i):

    L2 (ridge) penalty: f = f + sum(x^2) / 2*sigma

    Why is it f *plus* the regularization term (sum(x^2) / 2*sigma) instead of
    LL *minus* it (as described in papers, among others Chen & Rosenfeld 1999)? What is the explanation for the plus in the implementation?

    If LL were sum_i p(x_i) log q(x_i), then the plus would make sense, but it is - sum_i p(x_i) log q(x_i), so shouldn't it actually be:

    f = f - sum(x^2) / 2*sigma

    (i.e. shouldn't line 312 in probs.cc read f -= pen; instead of f += pen;?)

    Thanks for any help!

    Barbara

     
    • Miles Osborne

      Miles Osborne - 2009-04-26

      It has been a long time since I looked at this, but if I remember correctly it does the right thing.  You just need to check the signs of the various terms.
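
      For reference, the sign check can be written out as below. This is only a sketch: it assumes that f is the negative log-likelihood (as the question states) and that the optimizer minimizes f. The penalty is written with a variance sigma^2, which is the usual textbook form; the exact scaling used in probs.cc is not confirmed here.

```latex
\hat{x}
  = \arg\max_{x}\Bigl[\,\log P(D \mid x) \;-\; \sum_j \frac{x_j^2}{2\sigma^2}\Bigr]
  = \arg\min_{x}\Bigl[\,\underbrace{-\log P(D \mid x)}_{f} \;+\; \sum_j \frac{x_j^2}{2\sigma^2}\Bigr]
```

      Under those assumptions, "LL minus the penalty" when maximizing and "f plus the penalty" when minimizing describe the same optimum, so f += pen would be consistent with the papers.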

      You can easily verify this by running the code with and without the prior: with the prior (and with a small variance), the weights all get smaller, as expected.

      Miles
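
      The check suggested above (fit with and without the prior, then compare the weights) can be sketched on a toy problem. This is not TADM code: it is a minimal pure-Python logistic model, and the names objective_and_grad, fit, norm, sigma2 and the toy data are all hypothetical. It assumes the objective f is the negative log-likelihood and is minimized by plain gradient descent.

```python
import math


def objective_and_grad(w, data, sigma2=None):
    """Return (f, grad), where f is the NEGATIVE log-likelihood of a
    logistic model. If sigma2 is given, the L2 penalty
    sum(w^2) / (2 * sigma2) is ADDED to f (assumed convention)."""
    f = 0.0
    grad = [0.0] * len(w)
    for x, y in data:  # y is 0 or 1
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        f -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
        for j, xj in enumerate(x):
            grad[j] += (p - y) * xj
    if sigma2 is not None:
        # Minimizing f, so the penalty enters with a plus sign.
        f += sum(wi * wi for wi in w) / (2.0 * sigma2)
        for j, wj in enumerate(w):
            grad[j] += wj / sigma2
    return f, grad


def fit(data, sigma2=None, steps=2000, lr=0.1):
    """Minimize f by fixed-step gradient descent, starting from zero."""
    w = [0.0] * len(data[0][0])
    for _ in range(steps):
        _, g = objective_and_grad(w, data, sigma2)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w


def norm(w):
    return math.sqrt(sum(wi * wi for wi in w))


# Toy, non-separable data: (features, label); the first feature is a bias.
data = [
    ([1.0, 2.0], 1), ([1.0, 1.0], 1), ([1.0, -0.5], 1),
    ([1.0, 0.5], 0), ([1.0, -1.0], 0), ([1.0, 1.5], 0),
]

w_plain = fit(data)              # no prior
w_prior = fit(data, sigma2=0.1)  # Gaussian prior with a small variance

print("without prior:", w_plain, "norm", norm(w_plain))
print("with prior:   ", w_prior, "norm", norm(w_prior))
```

      If the plus sign is right for a minimized objective, the weight norm with the prior should come out smaller than without it. Flipping the penalty to a minus in this sketch would instead reward large weights.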

       
