RankLib scoring output

Forum: RankLib

Creator: Darren L

Created: 2018-05-17

Updated: 2018-05-17

Darren L - 2018-05-17

Hi,

I am using my model to score documents for a ranking problem with the -score option.

I noticed that the output has three columns: the first is the query ID and the third is the relevance score. Is the second column the document ID? For example, with this data:

2 qid:12 1:0.4 2:0.7 #docid:0
3 qid:12 1:0.3 2:0.23 #docid:1
1 qid:12 1:0.45 2:0.52 #docid:2

With the result of:

12 0 0.787
12 1 0.43
12 2 0.752

Does the row whose second column is 0 (i.e. the first row) represent my document with docid 0? How does RankLib determine that value: from the comment, or from the order of the documents within each query?

Looking forward to your response. Cheers!

Lemur Project - 2018-05-17

RankLib doesn't know anything about document IDs. It just knows the ordering of the documents in the provided list of docs to be ranked. So 0 represents the first document in the list, 1 the second, etc. You'll notice for multiple query output, the ordering starts over again for each new query.

So to get document IDs into the TREC-style output, you'll need to map each list-order value back to its document ID so you know which documents the ranking refers to. It's probably best to replace the order values with doc IDs before you sort by score.
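A rough Python sketch of that mapping, for illustration only. It assumes each input line ends with a comment of the form #docid:<id>, as in the question, and that the score output has three whitespace-separated columns (query ID, position within the query, score). The function names are made up for this example and are not part of RankLib.

```python
from collections import defaultdict


def doc_ids_by_query(feature_lines):
    """Collect doc IDs per query, in input order, from trailing '#docid:<id>' comments."""
    ids = defaultdict(list)
    for line in feature_lines:
        line = line.strip()
        if not line:
            continue
        body, _, comment = line.partition("#")
        # The query ID is the token that starts with 'qid:'.
        qid = next(tok.split(":", 1)[1] for tok in body.split() if tok.startswith("qid:"))
        # Assumption: the comment looks like 'docid:<id>'; otherwise use it whole.
        comment = comment.strip()
        doc_id = comment.split(":", 1)[1] if ":" in comment else comment
        ids[qid].append(doc_id)
    return ids


def attach_doc_ids(score_lines, ids):
    """Replace the per-query position in each score row with the matching doc ID."""
    rows = []
    for line in score_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        qid, position, score = parts
        rows.append((qid, ids[qid][int(position)], float(score)))
    return rows


# Sample data taken from the question above.
features = [
    "2 qid:12 1:0.4 2:0.7 #docid:0",
    "3 qid:12 1:0.3 2:0.23 #docid:1",
    "1 qid:12 1:0.45 2:0.52 #docid:2",
]
scores = ["12 0 0.787", "12 1 0.43", "12 2 0.752"]

rows = attach_doc_ids(scores, doc_ids_by_query(features))

# Sort only after the IDs are attached: by query, then by descending score.
ranked = sorted(rows, key=lambda r: (r[0], -r[2]))
for qid, doc_id, score in ranked:
    print(qid, doc_id, score)
```

Because the position counter restarts for every query, the lookup is keyed on the query ID first and the position second.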

You can use the -indri format argument to obtain the document description in the output. If the description is the doc ID then you're all set. If no doc ID info is in the description, you're still stuck with input document order rather than ID.
