Hello,
Based on my understanding, when one chooses the 5-fold cross-validation option in RankLib, the data is broken into train/validation/test sets, and the final reported result is roughly the average of the test results across the folds.
I have a dataset with a training set and a test set. I would like to train the model using 5-fold cross-validation (creating only train/validation sets), but then I need to test the final model on my own test set, which is a separate file.
I cannot find any documentation explaining how to merge the 5 model files into one single file as the final model. Am I doing something wrong or missing something?
Thanks,
Mohammad
The average scoring for train/test data over all fold models is just a convenience summary of fold model performance. There is no actual merged model produced, and I am not certain how one would go about merging multiple models into one.
You can use the Analyzer to compare the performance of the various k-fold models you produced against your own test set.
See section 2.4 of https://sourceforge.net/p/lemur/wiki/RankLib%20How%20to%20use/ for information on using the RankLib Analyzer.
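One straightforward way to make that comparison is to score each fold model against your separate test file using RankLib's `-load`, `-test`, and `-metric2T` options (described in the wiki page above). Here is a minimal Python sketch that builds those command lines; the jar path, fold-model file names, test file name, and metric below are assumptions for illustration, not values from this thread:

```python
# Sketch: evaluate each k-fold model on a separate test set via RankLib.
# The jar path, fold-model file names, and metric are hypothetical examples.

def build_eval_command(model_path, test_path, metric="NDCG@10"):
    """Build the RankLib command line that scores one saved model
    against a held-out test file."""
    return [
        "java", "-jar", "RankLib.jar",  # path to your RankLib jar
        "-load", model_path,            # a model produced by one fold
        "-test", test_path,             # your own, separate test set
        "-metric2T", metric,            # metric to report on the test data
    ]

if __name__ == "__main__":
    import subprocess
    for i in range(1, 6):  # five fold models, e.g. models/f1.ca ... f5.ca
        cmd = build_eval_command(f"models/f{i}.ca", "my_test.txt")
        print(" ".join(cmd))
        # subprocess.run(cmd, check=True)  # uncomment to actually run
```

Since no merged model is produced, a common approach is to keep whichever fold model scores best on your own test set and use that as the final model.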
Thank you for the reply.
Then may I ask what the final result reported for k-fold cross-validation means?
The Averages are the train/test set scores averaged over the number of folds.
The Total is the sum of each fold's test score weighted by its number of test samples, divided by the total number of test samples.
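As a concrete illustration of the difference between the two figures (the per-fold scores and sample counts below are made-up numbers, not from any real RankLib run):

```python
# Illustration of the two summary figures reported after k-fold CV.
# The per-fold test scores and sample counts below are invented examples.

fold_test_scores = [0.40, 0.50, 0.45, 0.55, 0.60]  # one metric score per fold
fold_test_samples = [80, 120, 100, 90, 110]        # test samples in each fold

# "Averages": the plain mean of the per-fold scores.
average = sum(fold_test_scores) / len(fold_test_scores)

# "Total": each fold's score weighted by its sample count,
# divided by the total number of test samples.
total = (
    sum(s * n for s, n in zip(fold_test_scores, fold_test_samples))
    / sum(fold_test_samples)
)

print(round(average, 3))  # 0.5
print(round(total, 3))    # 0.505
```

The two numbers agree only when every fold's test partition has the same number of samples; otherwise the Total leans toward the folds with more test samples.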