Hello,
Based on my understanding, when one chooses the 5-fold cross-validation option in RankLib, the data is broken into train/validation/test sets, and the final reported result is roughly the average of the test results across the folds.
I have a dataset with a training set and a test set. I would like to train the model using 5-fold cross-validation (creating only train/validation sets), but then I need to test the final model on my own test set, which is a separate file.
I cannot find any documentation explaining how to merge the 5 model files into one single file as the final model. Am I doing something wrong or missing something?
Thanks,
Mohammad
The average scoring for train/test data over all fold models is just a convenience summary of fold model performance. There is no actual merged model produced, and I am not certain how one would go about merging multiple models into one.
You can use the Analyzer to compare the performance of the various k-fold models you produced against your own test set.
See section 2.4 of https://sourceforge.net/p/lemur/wiki/RankLib%20How%20to%20use/ for information on using the RankLib Analyzer.
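One straightforward way to make that comparison is to score each fold model against your separate test file using RankLib's `-load`, `-test`, and `-metric2T` options (described in the wiki page above). Here is a minimal Python sketch that builds those command lines; the jar path, fold-model file names, test file name, and metric below are assumptions for illustration, not values from this thread:

```python
# Sketch: evaluate each k-fold model on a separate test set via RankLib.
# The jar path, fold-model file names, and metric are hypothetical examples.

def build_eval_command(model_path, test_path, metric="NDCG@10"):
    """Build the RankLib command line that scores one saved model
    against a held-out test file."""
    return [
        "java", "-jar", "RankLib.jar",  # path to your RankLib jar
        "-load", model_path,            # a model produced by one fold
        "-test", test_path,             # your own, separate test set
        "-metric2T", metric,            # metric to report on the test data
    ]

if __name__ == "__main__":
    import subprocess
    for i in range(1, 6):  # five fold models, e.g. models/f1.ca ... f5.ca
        cmd = build_eval_command(f"models/f{i}.ca", "my_test.txt")
        print(" ".join(cmd))
        # subprocess.run(cmd, check=True)  # uncomment to actually run
```

Since no merged model is produced, a common approach is to keep whichever fold model scores best on your own test set and use that as the final model.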
Thank you for the reply.
Then may I ask what the final result reported for k-fold cross-validation means?
The Averages are the train/test set scores averaged over the number of folds.
The Total is the sum of each fold's test score weighted by its number of test samples, divided by the total number of test samples.
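As a concrete illustration of the difference between the two figures (the per-fold scores and sample counts below are made-up numbers, not from any real RankLib run):

```python
# Illustration of the two summary figures reported after k-fold CV.
# The per-fold test scores and sample counts below are invented examples.

fold_test_scores = [0.40, 0.50, 0.45, 0.55, 0.60]  # one metric score per fold
fold_test_samples = [80, 120, 100, 90, 110]        # test samples in each fold

# "Averages": the plain mean of the per-fold scores.
average = sum(fold_test_scores) / len(fold_test_scores)

# "Total": each fold's score weighted by its sample count,
# divided by the total number of test samples.
total = (
    sum(s * n for s, n in zip(fold_test_scores, fold_test_samples))
    / sum(fold_test_samples)
)

print(round(average, 3))  # 0.5
print(round(total, 3))    # 0.505
```

The two numbers agree only when every fold's test partition has the same number of samples; otherwise the Total leans toward the folds with more test samples.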