
#291 ERR@ seems to be broken in RankLib

Milestone: v1.x

Status: open

Owner: nobody

Labels: None

Priority: 1

Updated: 2016-11-18

Created: 2016-11-18

Creator: Leonid Boytsov

Private: No

Hi,

I have been working with the library for a while. I am very happy with the results: RankLib is an extremely useful library. However, when I switched to using ERR@20, things fell apart. It looks like:

1) ERR@K is computed incorrectly. For example, I get negative values.
2) Training using ERR@K doesn't work. I tried coordinate ascent and LambdaMART.
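
For reference, ERR@K under its standard definition (Chapelle et al., 2009) is the expected value of 1/rank at which a user stops, so a correct implementation should return a value in [0, 1]. Below is a minimal Python sketch of that definition. The function name `err_at_k` and the default `max_grade=4` are assumptions for illustration only; this is not RankLib's code.

```python
def err_at_k(grades, k, max_grade=4):
    """Expected Reciprocal Rank at cutoff k (standard definition).

    grades: relevance labels of the ranked documents, best-ranked first.
    max_grade: highest label on the relevance scale (assumed, not read from data).
    """
    err = 0.0
    p_reach = 1.0  # probability that the user reaches the current rank
    for rank, grade in enumerate(grades[:k], start=1):
        # Probability that the document at this rank satisfies the user.
        p_stop = (2 ** grade - 1) / 2 ** max_grade
        err += p_reach * p_stop / rank
        p_reach *= 1.0 - p_stop
    return err


if __name__ == "__main__":
    print(err_at_k([4, 0, 2, 1], k=20))
```

The [0, 1] bound holds only when every label is at most `max_grade`. If a label exceeds it, the stop probability becomes greater than 1 and the result can fall outside that range, including below zero.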

In the case of coordinate ascent, I get a very bad model. In the case of LambdaMART, I get an exception, which I paste below.

To reproduce the results, I attach a file with two features, each of which should receive a substantial weight. For example, a good coordinate ascent model would have weights of 0.7 and 0.3.

I also attach helper scripts that I used to train/test models just in case.

Note that training works with, e.g., NDCG@20, but not with ERR@20.

I used Oracle Java 8 on Linux.

Many thanks!

Exception:

Reading feature file [/home/ubuntu/sample.feat]... [Done.]
(9240 ranked lists, 138600 entries read)
Initializing... [Done]

Training starts...

iter | ERR@20-T | ERR@20-V |

1 | -20.9899 |
2 | Exception in thread "main" java.lang.NullPointerException
at ciir.umass.edu.learning.tree.RegressionTree.insert(RegressionTree.java:150)
at ciir.umass.edu.learning.tree.RegressionTree.fit(RegressionTree.java:64)
at ciir.umass.edu.learning.tree.LambdaMART.learn(LambdaMART.java:203)
at ciir.umass.edu.learning.RankerTrainer.train(RankerTrainer.java:43)
at ciir.umass.edu.eval.Evaluator.evaluate(Evaluator.java:730)
at ciir.umass.edu.eval.Evaluator.main(Evaluator.java:503)

1 Attachments

reprod_files.tar.gz

Discussion

Leonid Boytsov - 2016-11-18

PS: RankLib versions that I tried: 2.5 and 2.7.
I also noticed that all metric values (P@K, MAP, NDCG@K) seem to be multiplied by 10, though, of course, this doesn't affect the outcome of training.
