I'm currently working with RankLib's coordinate ascent ranker (-r 25 -i 100 -metric2t RR@10), and the resulting distribution of correct-document ranks is very tail-heavy. We're looking for an answer to the issue both by trying different metrics and by tracing the code for any signs of feature preconditioning.
So a couple questions:
Assuming no -norm flag has been used, is there any preconditioning happening to the feature values before training?
Is there a specific reason why the RR@10 scorer breaks out of the correct-rank search loop after finding only the first correct answer?
I don't believe any preconditioning of feature values occurs. It isn't mentioned explicitly in the Metzler and Croft paper describing the algorithm. Also, the original paper used MAP as its evaluation metric.
Not sure what you mean by a tail-heavy distribution, though. A distribution of what?
You should try some feature normalization settings to see if that has much effect on the resulting model. Normalization is generally a good thing to do.
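For illustration, z-scoring (which I believe RankLib exposes via the -norm flag, e.g. -norm zscore or -norm sum, if I'm remembering the flag values right) just standardizes each feature across a query's candidate documents. A minimal sketch of the idea, not RankLib's actual code:

    import java.util.Arrays;

    public class ZScoreSketch {
        // Illustrative only: z-score each feature across one query's documents.
        // features[doc][f] holds the raw feature values.
        static void zscoreNormalize(double[][] features) {
            if (features.length == 0) return;
            int numFeatures = features[0].length;
            for (int f = 0; f < numFeatures; f++) {
                double mean = 0.0;
                for (double[] doc : features) mean += doc[f];
                mean /= features.length;
                double var = 0.0;
                for (double[] doc : features) var += (doc[f] - mean) * (doc[f] - mean);
                double std = Math.sqrt(var / features.length);
                // Center and scale; constant features collapse to zero.
                for (double[] doc : features) {
                    doc[f] = (std > 0) ? (doc[f] - mean) / std : 0.0;
                }
            }
        }

        public static void main(String[] args) {
            double[][] feats = { {1.0, 200.0}, {2.0, 400.0}, {3.0, 600.0} };
            zscoreNormalize(feats);
            System.out.println(Arrays.deepToString(feats));
        }
    }

The point is that features on wildly different scales can dominate the line search in coordinate ascent, so it's worth comparing each normalization setting against a held-out set.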
RR is typically used when there is only a single relevant document per query. The metric is just the reciprocal of the rank at which that document occurs, so there is no point in looking below that rank.
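To make that concrete, here is a minimal sketch of an RR@k computation (illustrative only, not RankLib's actual scorer). The loop can stop at the first relevant document because nothing ranked below it affects the score:

    public class RRSketch {
        // relevance[i] > 0 means the document at rank i+1 is relevant.
        static double reciprocalRankAtK(int[] relevance, int k) {
            int limit = Math.min(k, relevance.length);
            for (int i = 0; i < limit; i++) {
                if (relevance[i] > 0) {
                    return 1.0 / (i + 1); // first hit decides the score; stop here
                }
            }
            return 0.0; // no relevant document in the top k
        }

        public static void main(String[] args) {
            int[] ranking = {0, 0, 1, 0, 1}; // first relevant doc at rank 3
            System.out.println(reciprocalRankAtK(ranking, 10)); // prints 0.3333...
        }
    }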
Tail-heavy in the sense that a histogram of our post-rerank placements of the correct document shows at most 50% landing in the top ten and more than 50% landing somewhere at ranks 11 through N, but that's not a RankLib-specific issue.
We've tried both normalizing the features and using their z-scores. Normalization was the better of the two, but our current best results have come from the raw feature set.
Thanks for the information on where RR applies; hopefully we'll get better results from training with different metrics.