Hi everyone, I know that pointwise and pairwise algorithms have their own loss functions, for example one defined on the number of misclassified pairs, and as I expected the RankLib tutorial clearly states that the -metric2t parameter doesn't affect the results for them. But I'm actually observing the opposite: if I run MART several times with different training metrics as the parameter, I get different results. How is this possible?
Thank you very much
What do you mean by different results? Different metric improvements?
Not sure which metrics you are using, but to be fully valid, the MAP and NDCG metrics need a query relevance file covering all relevant documents, not just those inside a cutoff point.
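To illustrate why the cutoff matters: the ideal DCG in the NDCG denominator has to be computed from all relevant documents for the query, not only those the system returned within the cutoff. A minimal sketch (my own illustration, not RankLib's implementation):

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain over the top-k results."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked_gains, all_relevant_gains, k):
    """NDCG@k: the ideal DCG is computed from ALL relevant documents
    for the query, not just those returned within the cutoff."""
    ideal = dcg_at_k(sorted(all_relevant_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0

# A query with three relevant documents (gains 3, 2, 1), where the
# system returned only the two least relevant ones in its top 2.
returned = [1, 2]          # gains in ranked order
all_relevant = [3, 2, 1]   # full qrels for the query
print(ndcg_at_k(returned, all_relevant, 2))  # ≈ 0.531
```

If the ideal DCG were computed only from the returned documents, the same ranking would score noticeably higher, which is the distortion an incomplete qrels file introduces.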
Sure. I mean that I get different scores for my documents when running the same algorithm (e.g. MART) several times with different -metric2t values (e.g. NDCG, ERR, and MAP), even though this parameter shouldn't affect the results given the nature of GBRT. Or am I wrong?
Different metrics to optimize will produce at least slightly different models, and thus different scores. There is no normalization across the optimizations produced by different metrics that I am aware of.
The produced models are text files, so you can actually look at the thresholds and weights of the trees in each model. You will see that the models differ depending on which metric was used to train them.
It would therefore be normal to end up with different scores.
Important Note: -metric2t (e.g. NDCG, ERR, etc.) only applies to list-wise algorithms (AdaRank, Coordinate Ascent, and LambdaMART). Point-wise and pair-wise techniques (MART, RankNet, RankBoost), due to their nature, always use their internal RMSE / pair-wise loss as the optimization criterion. Thus, -metric2t has no effect on them.
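As an illustration of the pair-wise criterion mentioned in the note, here is a sketch of the general idea (not RankLib's actual implementation): the loss counts document pairs whose predicted score order contradicts their relevance order, and no retrieval metric like NDCG or ERR appears anywhere in it.

```python
from itertools import combinations

def misclassified_pairs(scores, labels):
    """Count pairs (i, j) where the more relevant document receives
    the lower model score -- the quantity a pair-wise ranker such as
    RankNet or RankBoost tries to reduce."""
    bad = 0
    for i, j in combinations(range(len(labels)), 2):
        if labels[i] != labels[j]:
            more, less = (i, j) if labels[i] > labels[j] else (j, i)
            if scores[more] < scores[less]:
                bad += 1
    return bad

# Scores from a hypothetical model vs. graded relevance labels:
# only the (0.2, 0.5) pair is ordered against its labels.
print(misclassified_pairs([0.9, 0.2, 0.5], [2, 1, 0]))  # → 1
```

Because this objective never consults -metric2t, two runs of a pure pair-wise ranker should not change their training behavior based on that flag, which is what the note claims.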
Tomorrow I'll look at the source to better understand what's happening. Thank you :)
Hi everyone,
I tried to run the RankNet [1] algorithm, and if I don't specify the training metric (via the -metric2t option), the default seems to be ERR@10, while if I do specify a training metric, the algorithm seems to use it.
So, I don't understand the note either.
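For reference, a sketch of the ERR@k measure that shows up as the default here, following the standard expected reciprocal rank definition (parameter names are my own, and this is an illustration, not RankLib's code — a tool can report such a metric during training while still optimizing a different internal loss):

```python
def err_at_k(gains, k, max_grade=4):
    """Expected reciprocal rank truncated at rank k.
    stop = (2^g - 1) / 2^max_grade is the probability the user is
    satisfied (stops) at a result with relevance grade g."""
    p_continue = 1.0  # probability the user reaches the current rank
    err = 0.0
    for rank, g in enumerate(gains[:k], start=1):
        stop = (2 ** g - 1) / (2 ** max_grade)
        err += p_continue * stop / rank
        p_continue *= 1 - stop
    return err

# A perfect (grade-4) result first, then a non-relevant and a fair one.
print(err_at_k([4, 0, 2], 3))
```

The metric only depends on the ranking a model produces; whether the training loop actually uses it as the optimization target is exactly the question raised in this thread.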
After calling this command:
with models trained with MART using different -metric2t values, I get different scores.
Then I don't understand this note.