#300 Incorrect MAP evaluation

v3.x
closed
1
2018-06-22
2018-02-22
No

It looks like the MAP evaluation (and possibly other metrics) reports incorrectly high scores when relevance judgments exist for a query but no results were retrieved for that query.

For example, if you have the following results from batch-search:

test_query_1 Q0 relevant_doc_1 1 -5 galago

With these relevance judgments:

test_query_1    0   relevant_doc_1  1
test_query_2    0   relevant_doc_2  1

Galago eval reports a MAP of 1. If you insert a non-relevant document for test_query_2:

test_query_1 Q0 relevant_doc_1 1 -5 galago
test_query_2 Q0 dummy_doc 1 -5 galago

you get the correct MAP of 0.5.
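
To make the discrepancy concrete, here is a minimal Python sketch of MAP computed two ways: averaged over the queries that appear in the run file, and averaged over the queries that have relevant documents in the qrels. This is not Galago's code; the function names (`parse_run`, `parse_qrels`, `average_precision`, `mean_average_precision`) and the `over` parameter are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def parse_run(lines):
    """Parse TREC-style run lines: qid Q0 docid rank score label."""
    ranked = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 6:
            continue  # skip blank or malformed lines
        qid, _, docid, rank = parts[:4]
        ranked[qid].append((int(rank), docid))
    # Order each query's documents by rank.
    return {qid: [d for _, d in sorted(docs)] for qid, docs in ranked.items()}


def parse_qrels(lines):
    """Parse qrels lines: qid iteration docid relevance."""
    relevant = defaultdict(set)
    for line in lines:
        parts = line.split()
        if len(parts) < 4:
            continue
        qid, _, docid, rel = parts[:4]
        if int(rel) > 0:
            relevant[qid].add(docid)
    return dict(relevant)


def average_precision(ranking, relevant_docs):
    """Average precision of one ranked list against a set of relevant doc IDs."""
    hits = 0
    total = 0.0
    for position, docid in enumerate(ranking, start=1):
        if docid in relevant_docs:
            hits += 1
            total += hits / position
    return total / len(relevant_docs) if relevant_docs else 0.0


def mean_average_precision(run, qrels, over="run"):
    """over="run": average over queries seen in the run file.
    over="qrels": average over every query with relevant documents in qrels;
    a query with no retrieved results contributes an AP of 0."""
    source = run if over == "run" else qrels
    qids = [q for q in source if q in qrels]
    if not qids:
        return 0.0
    return sum(average_precision(run.get(q, []), qrels[q]) for q in qids) / len(qids)


qrels = parse_qrels([
    "test_query_1    0   relevant_doc_1  1",
    "test_query_2    0   relevant_doc_2  1",
])
run_without_q2 = parse_run(["test_query_1 Q0 relevant_doc_1 1 -5 galago"])
run_with_dummy = parse_run([
    "test_query_1 Q0 relevant_doc_1 1 -5 galago",
    "test_query_2 Q0 dummy_doc 1 -5 galago",
])

map_run_only = mean_average_precision(run_without_q2, qrels, over="run")
map_qrels = mean_average_precision(run_without_q2, qrels, over="qrels")
map_dummy = mean_average_precision(run_with_dummy, qrels, over="run")
print(map_run_only, map_qrels, map_dummy)
```

With the example data from this ticket, averaging over the run file alone gives 1.0, while averaging over the qrels queries, or adding the non-relevant document for test_query_2, gives 0.5.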

Discussion

  • Lemur Project

    Lemur Project - 2018-02-22
    • labels: evaluation --> evaluation, MAP
    • status: open --> accepted
    • assigned_to: Stephen Harding
     
  • Lemur Project

    Lemur Project - 2018-02-23

    MAP is averaged over the number of queries present in the submitted results rather than the total number of queries in the collection set (as indicated by the query count from the qrels file).

    NIST won't accept TREC submissions unless every query is present in the ranked lists. For non-TREC runs, however, a query set in which some queries return no results (as opposed to simply being left out of the results file) would definitely give misleading scores if one didn't pay attention to the query count in the ranked list file.

    So should Galago eval always use the query count from the qrels file rather than what it encounters? People do sometimes place only certain queries in a file to see how they do rather than the entire query set. But the --limit argument allows one to define a subset of query IDs to evaluate so one doesn't have to create special, subset query files for evaluation.

    Should Galago Eval simply print out a warning that a submitted query count was less than the collection query set count, as indicated by qrels, but still calculate MAP based on the query count it actually saw?

    Or perhaps some sort of generic batch-search retrieval output for a query with no results instead of just nothing, e.g.
    ...
    q44 Q0 no_results 1 -999.0 galago
    ...

    This last option might cause system complications, since the result would carry a non-existent doc ID.

    Any preferences or alternatives?
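
The warning option raised above could be sketched roughly as follows, assuming TREC-style run and qrels lines whose first whitespace-separated field is the query ID. This is only an illustration in Python, not Galago eval's behavior; `find_missing_queries` and `warn_if_incomplete` are hypothetical names.

```python
import sys


def find_missing_queries(run_lines, qrels_lines):
    """Return the sorted query IDs that are judged in qrels but absent from the run."""
    run_qids = {line.split()[0] for line in run_lines if line.strip()}
    qrels_qids = {line.split()[0] for line in qrels_lines if line.strip()}
    return sorted(qrels_qids - run_qids)


def warn_if_incomplete(run_lines, qrels_lines, stream=sys.stderr):
    """Print a warning when the run covers fewer queries than the qrels file."""
    missing = find_missing_queries(run_lines, qrels_lines)
    if missing:
        print(
            f"warning: {len(missing)} judged queries have no results in the run: "
            + ", ".join(missing),
            file=stream,
        )
    return missing


missing = warn_if_incomplete(
    ["test_query_1 Q0 relevant_doc_1 1 -5 galago"],
    [
        "test_query_1    0   relevant_doc_1  1",
        "test_query_2    0   relevant_doc_2  1",
    ],
)
```

A check like this leaves the score itself unchanged; it only makes the smaller query count visible.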

     
  • Michael Zarozinski

    I like the idea of emitting a "dummy document" from the batch search. Perhaps naming it something very obvious like "no_results_found_for_<queryid>"

     
  • Lemur Project

    Lemur Project - 2018-02-26

    OK, I'll check into doing that. Hopefully no "complications" with a bogus doc ID.

     
  • Lemur Project

    Lemur Project - 2018-03-06
    • status: accepted --> pending
     
  • Lemur Project

    Lemur Project - 2018-03-06

    Added --showNoResults parameter to Galago batch-search.

    The default parameter value is false, meaning batch-search will print nothing for a query that produces no hits. This way, people who are focusing on the performance of a subset of a query set while using the full query set relevance file will get evaluation results over the queries with hits rather than the entire set. Queries with no results are absent from the output.

    With --showNoResults set to true, a dummy document will be printed out for queries having no hits. A dummy document will appear as follows:

     <qid>  Q0  no_results_found  1  -999  <run_label>
    

    The run label is typically galago, but it can be changed if the batch-search --systemName parameter is defined and the --trec parameter is set to true.

    The dummy results ensure Galago eval will apply specified metrics over the entire query set and not only queries returning results.
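
For run files produced without --showNoResults, a similar effect can be approximated after the fact. The Python sketch below pads an existing run with dummy lines in the format shown above; it is a post-processing illustration rather than how batch-search implements the parameter, and `pad_run_with_dummies` is a hypothetical helper name.

```python
def pad_run_with_dummies(run_lines, query_ids, run_label="galago"):
    """Append one dummy result line for every query ID missing from the run.

    Dummy lines follow the format described in this ticket:
    <qid> Q0 no_results_found 1 -999 <run_label>
    """
    padded = [line for line in run_lines if line.strip()]
    seen = {line.split()[0] for line in padded}
    for qid in query_ids:
        if qid not in seen:
            padded.append(f"{qid} Q0 no_results_found 1 -999 {run_label}")
            seen.add(qid)  # guard against duplicate IDs in query_ids
    return padded


padded_run = pad_run_with_dummies(
    ["test_query_1 Q0 relevant_doc_1 1 -5 galago"],
    ["test_query_1", "test_query_2"],
)
for line in padded_run:
    print(line)
```

Because no_results_found never matches a judged document, each padded query contributes an average precision of 0 while still being counted in the mean.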

     
  • Lemur Project

    Lemur Project - 2018-06-22
    • status: pending --> closed
     
