RankLib k-fold cross-validation (KCV) fails with certain data. The KCV process runs to completion but reports NaN for many training or test results. The failure occurs while processing the final fold.
It occurs irrespective of the ranking algorithm or training metric used, but only with specific data, so the problem is clearly data-driven.
The problem may be related to Bug #291, but it is not limited to the use of ERR@k as the training metric.
Depending on the arguments used, some runs fail with a NullPointer/ArrayIndexOutOfBounds exception.
Sample Problem Data
1 qid:6476 1:0.00000 2:0.00000 3:0.00000 4:0.00000 5:0.00000 6:0.00000 7:0.00000 8:0.69789 9:0.00000 10:0.00000 11:0.00000 12:0.67780 13:0.00000 14:0.12249
0 qid:6476 1:0.65229 2:0.61308 3:0.42410 4:0.36891 5:0.00000 6:0.88454 7:0.88275 8:0.90317 9:0.00000 10:0.88487 11:0.87646 12:0.85536 13:0.00000 14:0.13413
3 qid:6476 1:0.57395 2:0.50366 3:0.35114 4:0.27311 5:0.00000 6:0.87313 7:0.87661 8:0.75848 9:0.00000 10:0.86981 11:0.86382 12:0.72657 13:0.00000 14:0.14819
17 qid:6476 1:0.43631 2:0.38303 3:0.00000 4:0.00000 5:0.00000 6:0.00000 7:0.00000 8:0.57272 9:0.00000 10:0.00000 11:0.00000 12:0.59169 13:0.00000 14:0.16097
20 qid:6476 1:0.33554 2:0.29669 3:0.31379 4:0.29113 5:0.00000 6:0.81922 7:0.78361 8:0.73338 9:0.00000 10:0.82155 11:0.77441 12:0.66986 13:0.00000 14:0.17765
0 qid:6476 1:0.52996 2:0.47312 3:0.31961 4:0.25025 5:0.00000 6:0.88643 7:0.85118 8:0.75732 9:0.00000 10:0.88568 11:0.84646 12:0.71264 13:0.00000 14:0.20701
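Since the failure appears to depend on the data, it may help to summarize a data file before running KCV. The following is a minimal sketch in Python, not part of RankLib, that parses lines in the "label qid:ID index:value ..." layout shown above and reports the row count and label range per query. The function names parse_line and summarize are hypothetical, and the three sample rows are shortened copies of rows above (most feature columns dropped for brevity).

```python
from collections import defaultdict

def parse_line(line):
    """Parse one 'label qid:ID idx:value ...' line into (label, qid, features)."""
    tokens = line.split()
    label = int(tokens[0])
    qid = tokens[1].split(":", 1)[1]
    features = {}
    for tok in tokens[2:]:
        idx, val = tok.split(":", 1)
        features[int(idx)] = float(val)
    return label, qid, features

def summarize(lines):
    """Return {qid: (number of rows, min label, max label)}."""
    labels_by_qid = defaultdict(list)
    for line in lines:
        if line.strip():  # skip blank lines
            label, qid, _ = parse_line(line)
            labels_by_qid[qid].append(label)
    return {q: (len(v), min(v), max(v)) for q, v in labels_by_qid.items()}

# Shortened copies of rows from the sample data above.
sample = [
    "1 qid:6476 1:0.00000 2:0.00000 14:0.12249",
    "0 qid:6476 1:0.65229 2:0.61308 14:0.13413",
    "20 qid:6476 1:0.33554 2:0.29669 14:0.17765",
]
summary = summarize(sample)
print(f"{len(summary)} distinct query id(s)")
print(summary)
```

Comparing the number of distinct query ids with the -kcv value, and the maximum label with the -gmax value, are two quick checks worth making on the full file.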
Run Command
java -jar RankLib.jar -ranker 3 -train data/problem_data.txt -gmax 21 -kcv 5
Output
495 | 14 | NaN | | DAMN |
496 | 14 | NaN | | DAMN |
497 | 14 | NaN | | DAMN |
498 | 14 | NaN | | DAMN |
499 | 14 | NaN | | DAMN |
500 | 14 | NaN | | DAMN |
--------------------------------------------------------
Finished sucessfully.
ERR@10 on training data: NaN
---------------------------------
Summary:
ERR@10 | Train | Test
----------------------------------
Fold 1 | 0.1052 | NaN
Fold 2 | 0.1052 | NaN
Fold 3 | 0.1052 | NaN
Fold 4 | 0.1052 | NaN
Fold 5 | NaN | 0.2604
----------------------------------
Avg. | NaN | NaN
----------------------------------
Total | | NaN
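One possible explanation for this pattern (test NaN on folds 1 to 4, train NaN on fold 5) is that the data contains fewer queries than folds. The sketch below is an assumption, not verified against the RankLib source: it supposes each fold receives floor(n/k) queries as its test set and the last fold takes the remainder. The functions split_folds and mean_score are hypothetical, and n_queries=3 is an illustrative value, not the query count of the reporter's file.

```python
import math

def split_folds(n_queries, k):
    """Hypothetical splitter: floor(n/k) test queries per fold, remainder to the last fold."""
    size = n_queries // k
    folds = []
    start = 0
    for f in range(k):
        end = n_queries if f == k - 1 else start + size
        test = list(range(start, end))
        train = [i for i in range(n_queries) if i < start or i >= end]
        folds.append((train, test))
        start = end
    return folds

def mean_score(scores):
    """Mimic floating-point 0/0: averaging an empty set yields NaN instead of raising."""
    return sum(scores) / len(scores) if scores else float("nan")

folds = split_folds(n_queries=3, k=5)
for i, (train, test) in enumerate(folds, 1):
    print(f"Fold {i}: train={len(train)} queries, test={len(test)} queries")

# An empty query set averages to NaN under this assumption.
print(math.isnan(mean_score([])))
```

Under these assumptions the first four folds have empty test sets and the last fold has an empty training set, which would match the NaN positions in the summary above. If the actual file has at least as many queries as folds, this explanation does not apply.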
Some more sample data used when the bug appeared.