Let me start by saying thanks for creating this toolkit and sharing it with the world.
I think I hit a bug, however. With the downloadable binary, z-score normalization seems to break a saved model. For instance, note the different "MAP on test data" values (all of this is on the MQ2008 data, Fold1):
Without normalization it works fine:
Note that when I try to do the same normalization during testing, I get an error:
That error is fixed in the SVN version. However, compiling the trunk and using that yields the differing scores again:
Could it be that the z-score normalization is applied to the combination of training and testing instances?
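To make the question above concrete: a z-score is (x - mean) / stddev, so the normalized value of an instance depends on which instances the mean and standard deviation are computed over. The sketch below is only an illustration with a hypothetical `zscore` helper and made-up numbers; it does not show how RankLib itself computes its statistics.

```shell
# zscore VALUE SAMPLE...
# Prints the z-score of VALUE relative to the mean and population
# standard deviation of the SAMPLE values.
# Hypothetical helper for illustration; not part of RankLib.
zscore() {
  value=$1
  shift
  printf '%s\n' "$@" | awk -v x="$value" '
    { n++; sum += $1; sumsq += $1 * $1 }
    END {
      mean = sum / n
      var = sumsq / n - mean * mean
      # A constant sample has no spread; report 0 instead of dividing by zero.
      if (var <= 0) { print "0.000"; exit }
      printf "%.3f\n", (x - mean) / sqrt(var)
    }'
}
```

For example, `zscore 10 1 2 3` scores the value 10 against statistics taken from the sample 1, 2, 3 only, while `zscore 10 1 2 3 10` includes the value itself in the statistics. The two results differ, which is the kind of discrepancy that would appear if a model were trained with one set of statistics and evaluated with another.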
Added as a bug, https://sourceforge.net/p/lemur/bugs/221/ thanks for reporting it.
This issue has been fixed. Please do an update from trunk and let us know if it's really gone.
Yep, this is fixed now.
Sounds good. Thanks for confirming.
I'm still experiencing the same issue using this version: 2.1-patched-2
Please use a more recent version.
Current release version is RankLib-2.7.
I'm having the same problem with version 2.7, taken from here: https://sourceforge.net/projects/lemur/files/lemur/RankLib-2.7/
I can't reproduce the difference in scores between a saved normed model and the same normed model after loading. I am definitely not getting any sort of NullPointerException, as reported in the original problem from 2013.
I would not expect scores to be the same between non-normed saved models and normed data loaded into a non-normed model.
I have run some save/load tests using RankLib-2.3, the version in which the original problem was reportedly fixed, as well as the RankLib-2.7 jar from downloads and a current jar built from RankLib-2.8-SNAPSHOT sources.
Have I misunderstood the problem you reported, or failed to reproduce it properly?
Below is a listing of the testing I did based on what was done in the original error report as well as the Bug 221 report ( https://sourceforge.net/p/lemur/bugs/221/ ). I used MQ2008/Fold1 data for the runs.
// Downloaded RankLib-2.7
// Create the normed model
$ java -jar RankLib-2.7-download.jar -train train.txt -test test.txt -ranker 2 -norm zscore \
-silent -metric2t MAP -save download-2.7-save.txt
[+] General Parameters:
Training data: train.txt
Test data: test.txt
Feature vector representation: Dense.
Ranking method: RankBoost
Feature description file: Unspecified. All features will be used.
Train metric: MAP
Test metric: MAP
Feature normalization: zscore
Model file: download-2.7-save.txt
[+] RankBoost's Parameters:
Reading feature file [train.txt]: 0...
Reading feature file [train.txt]... [Done.]
(471 ranked lists, 9630 entries read)
Reading feature file [test.txt]: 0...
Reading feature file [test.txt]... [Done.]
(156 ranked lists, 2874 entries read)
MAP on test data: 0.453
Model saved to: download-2.7-save.txt
// Run loaded normed model
$ java -jar RankLib-2.7-download.jar -test test.txt -norm zscore \
-silent -metric2T MAP -load download-2.7-save.txt
[+] General Parameters:
Model file: download-2.7-save.txt
Feature normalization: zscore
Test metric: MAP
Model: RankBoost
Reading feature file [test.txt]: 0...
Reading feature file [test.txt]... [Done.]
(156 ranked lists, 2874 entries read)
MAP on test data: 0.453
// Current RankLib-2.8-SNAPSHOT
// Create normed model
$ java -jar RankLib-2.8-SNAPSHOT.jar -train train.txt -test test.txt -ranker 2 \
-norm zscore -silent -metric2t MAP -save RL2.8-save.txt
[+] General Parameters:
Training data: train.txt
Test data: test.txt
Feature vector representation: Dense.
Ranking method: RankBoost
Feature description file: Unspecified. All features will be used.
Train metric: MAP
Test metric: MAP
Feature normalization: zscore
Model file: RL2.8-save.txt
[+] RankBoost's Parameters:
Reading feature file [train.txt]: 0...
Reading feature file [train.txt]... [Done.]
(471 ranked lists, 9630 entries read)
Reading feature file [test.txt]: 0...
Reading feature file [test.txt]... [Done.]
(156 ranked lists, 2874 entries read)
MAP on test data: 0.453
Model saved to: RL2.8-save.txt
// Run loaded normed model
$ java -jar RankLib-2.8-SNAPSHOT.jar -test test.txt -norm zscore \
-silent -metric2T MAP -load RL2.8-save.txt
[+] General Parameters:
Model file: RL2.8-save.txt
Feature normalization: zscore
Test metric: MAP
Model: RankBoost
Reading feature file [test.txt]: 0...
Reading feature file [test.txt]... [Done.]
(156 ranked lists, 2874 entries read)
MAP on test data: 0.453
// RankLib-2.3, in which Bug 221 was reported fixed.
// Create normed model
$ java -jar RankLib-2.3.jar -train train.txt -test test.txt -ranker 2 -norm zscore \
-silent -metric2t MAP -save RL2.3-save.txt
[+] General Parameters:
Training data: train.txt
Test data: test.txt
Feature vector representation: Dense.
Ranking method: RankBoost
Feature description file: Unspecified. All features will be used.
Train metric: MAP
Test metric: MAP
Feature normalization: zscore
Model file: RL2.3-save.txt
[+] RankBoost's Parameters:
Reading feature file [train.txt]: 0...
Reading feature file [train.txt]... [Done.]
(471 ranked lists, 9630 entries read)
Reading feature file [test.txt]: 0...
Reading feature file [test.txt]... [Done.]
(156 ranked lists, 2874 entries read)
MAP on test data: 0.453
Model saved to: RL2.3-save.txt
// Run loaded normed model
$ java -jar RankLib-2.3.jar -test test.txt -norm zscore \
-silent -metric2T MAP -load RL2.3-save.txt
[+] General Parameters:
Model file: RL2.3-save.txt
Feature normalization: zscore
Test metric: MAP
Model: RankBoost
Reading feature file [test.txt]: 0...
Reading feature file [test.txt]... [Done.]
(156 ranked lists, 2874 entries read)
MAP on test data: 0.453
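The save/load comparisons above could also be checked mechanically rather than by eye. This is a minimal sketch, assuming each run's console output has been redirected to a file and contains a line of the form "MAP on test data: 0.453" as in the listings; the helper names `map_on_test` and `same_map` are hypothetical.

```shell
# Print the score from the last "MAP on test data: ..." line of a
# RankLib log read on stdin. Prints nothing if no such line exists.
map_on_test() {
  sed -n 's/^MAP on test data: *//p' | tail -n 1
}

# same_map LOG1 LOG2
# Succeeds only if both log files report a score and the scores match.
same_map() {
  a=$(map_on_test < "$1")
  b=$(map_on_test < "$2")
  [ -n "$a" ] && [ -n "$b" ] && [ "$a" = "$b" ]
}
```

With the training run saved to, say, `train.log` and the `-load` run saved to `load.log` (file names chosen here for illustration), `same_map train.log load.log` returns success when the reloaded model reproduces the score reported at training time. The comparison is a plain string match on the printed value, so it only detects differences visible at the precision RankLib prints.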