RankLib scoring output

RankLib
Darren L
2018-05-17
2018-05-17
  • Darren L

    Darren L - 2018-05-17

    Hi,

    I am using my trained model to score the documents of a ranking problem with the -score option.

    I noticed that the output consists of 3 columns: the first is the query ID and the third is the relevance score. Is the second column the document ID? For example, with this data:

    2 qid:12 1:0.4 2:0.7 #docid:0
    3 qid:12 1:0.3 2:0.23 #docid:1
    1 qid:12 1:0.45 2:0.52 #docid:2
    

    With the result of:

    12 0 0.787
    12 1 0.43
    12 2 0.752
    

    Does that mean the row whose second column is 0 (i.e. the first row) represents my document with docid 0? How does RankLib determine the docid: from the comment value, or from the order of the documents within each query?

    Looking forward to your response. Cheers!

     
  • Lemur Project

    Lemur Project - 2018-05-17

    RankLib doesn't know anything about document IDs. It just knows the ordering of the documents in the provided list of docs to be ranked. So 0 represents the first document in the list, 1 the second, etc. You'll notice for multiple query output, the ordering starts over again for each new query.

    So to get document IDs in TREC-style output, you'll need to map each list-order value back to its document ID so you know which documents the ranking refers to. It's probably best to replace the order values with doc IDs before you sort by score.

    You can use the -indri format argument to obtain the document description in the output. If the description is the doc ID then you're all set. If no doc ID info is in the description, you're still stuck with input document order rather than ID.
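
    The order-to-ID replacement described above could be sketched roughly as follows. This is only an illustration, not part of RankLib: the function names `read_doc_ids` and `rank_with_doc_ids` are made up for the example, and it assumes that the text after `#` on each feature line holds the document ID, that the score file has three whitespace-separated columns (query ID, list position, score), and that all lines for a given query ID appear together in the feature file.

```python
from collections import defaultdict


def read_doc_ids(feature_lines):
    """Collect the '#' comment of each document, grouped by qid in input order.

    Assumption: the comment after '#' is the document ID.
    """
    ids_by_qid = defaultdict(list)
    for line in feature_lines:
        line = line.strip()
        if not line:
            continue
        body, _, comment = line.partition("#")
        # The qid token looks like "qid:12"; keep only the part after the colon.
        qid = next(tok[4:] for tok in body.split() if tok.startswith("qid:"))
        ids_by_qid[qid].append(comment.strip())
    return ids_by_qid


def rank_with_doc_ids(feature_lines, score_lines):
    """Replace list positions with document IDs, then sort each query by score."""
    ids_by_qid = read_doc_ids(feature_lines)
    ranked = defaultdict(list)
    for line in score_lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        qid, position, score = parts[0], int(parts[1]), float(parts[2])
        # Position 0 is the first document listed for this qid, 1 the second, etc.
        ranked[qid].append((ids_by_qid[qid][position], score))
    for qid in ranked:
        ranked[qid].sort(key=lambda pair: pair[1], reverse=True)
    return dict(ranked)


# Data taken from the question above.
features = [
    "2 qid:12 1:0.4 2:0.7 #docid:0",
    "3 qid:12 1:0.3 2:0.23 #docid:1",
    "1 qid:12 1:0.45 2:0.52 #docid:2",
]
scores = ["12 0 0.787", "12 1 0.43", "12 2 0.752"]

result = rank_with_doc_ids(features, scores)
print(result)
```

    With the data from the question, this puts docid:0 first, docid:2 second, and docid:1 last for query 12. The replacement happens before sorting, since the position values only make sense in the original input order.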

     
