
k-fold Cross Validation

RankLib
Brian Yee
2017-08-21
  • Brian Yee

    Brian Yee - 2017-08-21

    If I'm reading the documentation for the -kcv param correctly, I can supply just one training_data.txt and it will be split appropriately for training and testing/validation. Is that correct? In that case, I would not have to define a separate test_data.txt or validation_data.txt.

    Is there any advantage to defining my own separate train/test/validate data sets? Why would one ever do that?

     
  • Lemur Project

    Lemur Project - 2017-08-22

    You can use the -kcv parameter for k-fold cross-validation, in which the data set is broken up into k parts, with one part reserved for testing and the others used for training. The parts are rotated so that every part is eventually used for both training and testing.
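
    For example, a five-fold run on a single file might look something like this (the LambdaMART ranker, the NDCG@10 metric, and the file/directory names are only illustrative; -kcvmd and -kcvmn tell RankLib where to save the k models and what to name them):

        java -jar RankLib.jar -train training_data.txt -ranker 6 -kcv 5 \
             -kcvmd models/ -kcvmn lm -metric2t NDCG@10 -metric2T NDCG@10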

    If you have one large data set, you can also use the -tvs and -tts arguments to split it into validation and test sets.
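
    For example (again, the ranker, metric, and file names are just placeholders, and it is worth checking RankLib's help output for exactly how the -tvs and -tts ratios are interpreted):

        java -jar RankLib.jar -train training_data.txt -ranker 6 \
             -tvs 0.8 -tts 0.75 -metric2t NDCG@10 -save mymodel.txt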

    Typically you want the bulk of your data for training, with less for validation and/or testing. Split values I've typically seen are 50/25/25 (train/validate/test), 60/20/20, and 70/30 (train/test).

    I don't know if there are any benefits to having separate train/validation/test data sets. I suppose if you have a set that was developed with great effort and great attention to detail (relevance judgments), you would want to keep it as a separate test set.

    Some voices feel that nothing is really gained by cross-validation, since it is no more precise in estimating prediction accuracy than random splits of all the data for building and testing the model.

     
  • Brian Yee

    Brian Yee - 2017-08-25

    So k-fold cross-validation is working and I end up saving k models, but aside from the console output, my application has no way to know which model performs best. Is there a way to save only the best model?

     
  • Lemur Project

    Lemur Project - 2017-08-28

    The kcv output should include a summary of each model created, in terms of the selected evaluation metric, although depending on the ranking algorithm used it can be difficult to read. Save the output so you can sort through it; the summary should be at the bottom.

    You can use the Evaluator to directly compare models, preferably against a baseline (which could be one of your models). This will give you direct comparisons, complete with statistical tests on the significance of the differences.

    This is a case where it might be good to have a separate test set that has never been seen by the models during their creation. Use it for the comparison tests. You can compare evaluation metrics on a per query basis if desired.
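
    As a rough sketch, assuming a held-out test_data.txt and two models saved from the kcv run (the file names and the output directory are placeholders), you could write per-query metric files with the Evaluator via -idv and then run the Analyzer on that directory against a baseline:

        java -jar RankLib.jar -load models/fold1.model -test test_data.txt -metric2T NDCG@10 -idv output/fold1.ndcg.txt
        java -jar RankLib.jar -load models/fold2.model -test test_data.txt -metric2T NDCG@10 -idv output/fold2.ndcg.txt
        java -cp RankLib.jar ciir.umass.edu.eval.Analyzer -all output/ -base fold1.ndcg.txt

    The Analyzer then reports how each run in the directory compares to the baseline, including significance tests on the differences.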

    You can create hard copies of the training, validation, and test sets used for the kcv runs using the FeatureManager.
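
    A minimal sketch, assuming the FeatureManager flags shown in the wiki's examples (the input file, output directory, and number of folds are placeholders):

        java -cp RankLib.jar ciir.umass.edu.features.FeatureManager -input training_data.txt -output folds/ -shuffle -k 5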

    See the RankLib Wiki page (https://sourceforge.net/p/lemur/wiki/RankLib%20How%20to%20use/) for an example.

     
