inf RMSE for collaborative filtering crossvalidate

Brought to you by: boulzh, mikegashler

Forum: Help

Creator: Anonymous

Created: 2014-04-22

Updated: 2014-04-25

Anonymous - 2014-04-22

Just a general question: any ideas what might cause RMSE, MSE, and MAE to all come out as "inf" when running crossvalidate with waffles_recommend?


Mike Gashler - 2014-04-22

That sounds like it is most likely due to a bug in one of my collaborative filtering implementations. (Sorry.) If you have a simple repro, I could debug it. If you want to debug it yourself, here is how I would do it:

In GCollaborativeFilter::trainAndTest in GClasses/GRecommender.cpp:174, just after this line,

double prediction = predict(size_t(pVec[0]), size_t(pVec[1]));

add some code to detect the problem,

if(prediction < -1e100 || prediction > 1e100) throw Ex("Unreasonable prediction");

then put a breakpoint on that last line, so you can find out when the problem first occurs. Then, restart your debugger, get back to the point where the problem first occurs, and step through the call to "predict" to see why it ends up making an unreasonable prediction.
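
For readers following along, here is a minimal standalone sketch of a slightly broader version of that check. It is not Waffles code: isUnreasonablePrediction and checkPrediction are hypothetical helper names, and std::runtime_error stands in for the Ex class used above. The extra std::isfinite test is an assumption about what you might also want to catch, since a NaN prediction would pass the two range comparisons unnoticed.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <sstream>
#include <stdexcept>

// Hypothetical helper (not part of Waffles): returns true when a prediction
// is NaN, infinite, or larger in magnitude than the given limit.
bool isUnreasonablePrediction(double prediction, double limit = 1e100)
{
    assert(limit > 0.0);
    return !std::isfinite(prediction) || std::fabs(prediction) > limit;
}

// Hypothetical wrapper: throws with the user and item indices in the message,
// so the first bad prediction is easy to locate under a debugger.
void checkPrediction(double prediction, std::size_t user, std::size_t item)
{
    if(isUnreasonablePrediction(prediction))
    {
        std::ostringstream os;
        os << "Unreasonable prediction " << prediction
           << " for user " << user << ", item " << item;
        throw std::runtime_error(os.str());
    }
}
```

A breakpoint on the throw line serves the same purpose as the breakpoint described above.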

Last edit: Mike Gashler 2014-04-25

Anonymous - 2014-04-23

Thanks for the help. I got it squared away.


Anonymous - 2014-04-24

One more quick question. How does NLPCA handle the case where there is a novel item? For example, during cross-validation, suppose an item is rated only once in the original set of ratings and that rating ends up in the test set, while the item number is still less than m_items. From my debugging, it looks like it returns a value less than 1e-100 (is that the UNKNOWN_REAL_VALUE?)


Mike Gashler - 2014-04-25

UNKNOWN_REAL_VALUE is defined in GClasses/GMatrix.h:

#define UNKNOWN_REAL_VALUE -1e308

NLPCA initializes item weights with small random values. Training causes these weights to be refined. If an item does not occur in the training portion of the data, then its weights will not be refined, so any predictions for that item will come out very close to zero, but not exactly zero.
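
To illustrate why untrained weights give a near-zero prediction, here is a simplified sketch. It is not the NLPCA implementation: it assumes a plain dot product between a user profile and an item weight vector, and dotPrediction, smallRandomWeights, and the 0.01 scale are all hypothetical choices made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical stand-in for a latent-factor predictor: the prediction is the
// dot product of a user profile and an item's weight vector.
double dotPrediction(const std::vector<double>& userProfile,
                     const std::vector<double>& itemWeights)
{
    assert(userProfile.size() == itemWeights.size());
    double sum = 0.0;
    for(std::size_t i = 0; i < userProfile.size(); i++)
        sum += userProfile[i] * itemWeights[i];
    return sum;
}

// Draws weights uniformly from [-scale, scale), mimicking a "small random
// values" initialization. The scale is an assumption for illustration.
std::vector<double> smallRandomWeights(std::size_t dims, std::mt19937& rng,
                                       double scale = 0.01)
{
    std::uniform_real_distribution<double> dist(-scale, scale);
    std::vector<double> weights(dims);
    for(double& w : weights)
        w = dist(rng);
    return weights;
}
```

With a user profile whose entries have magnitude at most 1, the prediction for an item that was never trained is bounded in magnitude by dims * scale, so it lands close to zero. That is a very different number from the -1e308 sentinel used for UNKNOWN_REAL_VALUE.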
