
Negative training and validation error and nan cost

Yishan
2016-03-05
  • Yishan

    Yishan - 2016-03-05

    Hi all,

    I'm running a multiclass classification task with CURRENNT, and I keep getting negative training and validation errors and a nan loss, like this:

    Starting training...
    
     Epoch | Duration |  Training error  | Validation error |    Test error    | New best 
    -------+----------+------------------+------------------+------------------+----------
         1 |    256.1 |-42.46%      -nan | -0.63%      -nan |                  |  no    
         2 |    256.3 |-43.04%      -nan | -0.63%      -nan |                  |  no    
         3 |    257.0 |-43.64%      -nan | -0.63%      -nan |                  |  no    
         4 |    253.2 |-40.52%      -nan | -0.63%      -nan |                  |  no    
    

    Does anyone know the reason for this? Thanks!

    Yishan

     

    Last edit: Yishan 2016-03-05
    • Qingsong Liu

      Qingsong Liu - 2016-03-05

      What's the value of the learning rate?
      Try a smaller value.

      --

      Qingsong Liu
      liuqs.ustc@gmail.com
      Univ. of Sci.& Tech. of China


       
      • Yishan

        Yishan - 2016-03-05

        Thanks, Qingsong! I found a problem in my label data, and now the negative errors have disappeared, but I still get a -nan loss. Do you think it's because the loss value is too large to display? I ask because when training stops, the validation error seems to have more than 20 digits.
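
On the "too large to display" question: in IEEE 754 floating point, a value that overflows becomes inf rather than nan; nan comes from an invalid operation such as inf - inf. The post does not say what the label problem was, so the label check below is only an assumption about one common cause (class indices outside the valid range). This is a minimal Python sketch, not CURRENNT code: `find_bad_labels` is a hypothetical helper, and it works on a plain list instead of CURRENNT's NetCDF input files.

```python
import math

# Overflow in floating-point arithmetic produces inf, not nan.
overflowed = 1e308 * 10.0

# nan appears when an invalid operation is applied, e.g. inf - inf.
invalid = overflowed - overflowed


def find_bad_labels(labels, num_classes):
    """Return (index, label) pairs whose label is not an int in [0, num_classes).

    Hypothetical helper for illustration; it assumes zero-based integer
    class indices, which may not match every data format.
    """
    bad = []
    for i, y in enumerate(labels):
        is_int = isinstance(y, int) and not isinstance(y, bool)
        if not is_int or not (0 <= y < num_classes):
            bad.append((i, y))
    return bad


if __name__ == "__main__":
    print("overflow is inf:", math.isinf(overflowed))
    print("inf - inf is nan:", math.isnan(invalid))
    print("bad labels:", find_bad_labels([0, 2, 5, -1, 1], num_classes=3))
```

If a check like this reports no bad labels, the nan is more likely to come from the numerics of training itself than from the targets.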

         
      • Yishan

        Yishan - 2016-03-05

        I tried a very small learning rate, but the loss is still nan. My data set is large, so I started with a relatively simple model. The training error after the first epoch is 89.68%, but after that it goes to 100%, like this:

        Epoch | Duration |  Training error  | Validation error |    Test error    | New best 
        -------+----------+------------------+------------------+------------------+----------
             1 |    637.1 | 89.68%      -nan |100.00%      -nan |                  |  no    
             2 |    638.2 |100.00%      -nan |100.00%      -nan |                  |  no    
             3 |    643.5 |100.00%      -nan |100.00%      -nan |                  |  no    
             4 |    648.4 |100.00%      -nan |100.00%      -nan |                  |  no    
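
When the cost is already nan in the first epoch even with a small learning rate, one thing that may be worth ruling out is non-finite or badly scaled values in the input features themselves. The sketch below is only an illustration under that assumption: `summarize_features` is a hypothetical helper that takes rows of floats already loaded into memory, and it does not read CURRENNT's NetCDF files.

```python
import math


def summarize_features(rows):
    """Count non-finite values and compute per-column mean and std.

    `rows` is a non-empty list of equal-length lists of floats.
    Non-finite entries (nan, inf) are counted and excluded from the statistics.
    """
    n_cols = len(rows[0])
    non_finite = 0
    sums = [0.0] * n_cols
    sq_sums = [0.0] * n_cols
    counts = [0] * n_cols

    for row in rows:
        for j, x in enumerate(row):
            if not math.isfinite(x):
                non_finite += 1
                continue
            sums[j] += x
            sq_sums[j] += x * x
            counts[j] += 1

    means = [s / c if c else float("nan") for s, c in zip(sums, counts)]
    # Population standard deviation; max(..., 0.0) guards against tiny
    # negative values caused by rounding.
    stds = [
        math.sqrt(max(q / c - m * m, 0.0)) if c else float("nan")
        for q, c, m in zip(sq_sums, counts, means)
    ]
    return non_finite, means, stds


if __name__ == "__main__":
    demo = [[1.0, 10.0], [3.0, float("nan")], [5.0, 30.0]]
    print(summarize_features(demo))
```

A non-zero count of non-finite values, or columns whose mean and standard deviation differ from the others by many orders of magnitude, would be a reason to clean or normalize the data before training again.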
        
         
  • Yishan

    Yishan - 2016-03-05

    Since I'm doing many-to-one training, as described in this blog, I changed the layer from blstm to lstm, and the training became reasonable. But when I ran it again with the same settings, it went back to nan.

     

    Last edit: Yishan 2016-03-05
