Scoring function with fields

Retrieval
2009-10-31
2015-08-26
  • Florian Boudin

    Florian Boudin - 2009-10-31

    Hi, I am having some difficulty understanding exactly what the scoring
    function does when fields are used with restriction or context evaluation.
    Can you please explain exactly how the score is computed for:

    weight( 1.0 dog 0.5 train.title )

    and

    weight( 1.0 dog 0.5 train.(title) )

    Thanks

     
  • Florian Boudin

    Florian Boudin - 2009-11-02

    Thank you for your answer. Just to be sure that I have understood correctly,

    #weight(.) is computed as Sum(i=1,n) wi/W . log(bi)

    with n the number of query words, wi the explicit weight for each word, W the
    sum of weights and bi the belief

    bi is computed (assuming no smoothing, and writing F for the field FIELD) as:

    bi = count(w,F)/|d| for word.FIELD and
    bi = count(w,F)/|F| for word.(FIELD)

     
  • David Fisher

    David Fisher - 2009-11-02

    Where |F| is the length summed over all occurrences of F in the document.
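
    To make the two normalizations above concrete, here is a minimal Python sketch of the unsmoothed formulas exactly as stated in this thread. It is not Indri code: the toy document, the (field name, tokens) layout, and the function names are all assumptions made for illustration, and real Indri scores also include smoothing.

```python
import math

# Hypothetical toy document: a list of (field name, tokens) extents.
DOC = [
    ("title", ["train", "dog"]),
    ("body", ["the", "dog", "rides", "the", "train", "home"]),
]

def belief(doc, term, field=None, context=False):
    """Unsmoothed belief for `term`, following the formulas in the thread.

    field=None     -> count(w, d) / |d|
    context=False  -> word.FIELD   : count(w, F) / |d|
    context=True   -> word.(FIELD) : count(w, F) / |F|
    where |F| is summed over every occurrence of the field in the document.
    """
    doc_len = sum(len(tokens) for _, tokens in doc)
    if field is None:
        return sum(tokens.count(term) for _, tokens in doc) / doc_len
    extents = [tokens for name, tokens in doc if name == field]
    count = sum(tokens.count(term) for tokens in extents)
    field_len = sum(len(tokens) for tokens in extents)
    return count / (field_len if context else doc_len)

def weight_score(weighted_beliefs):
    """Sum over i of (wi / W) * log(bi), with W the sum of the weights.

    A belief of zero would raise a math domain error here, which is one
    reason smoothing is needed in practice.
    """
    total = sum(w for w, _ in weighted_beliefs)
    return sum(w / total * math.log(b) for w, b in weighted_beliefs)

# weight( 1.0 dog 0.5 train.title )
restricted = weight_score([(1.0, belief(DOC, "dog")),
                           (0.5, belief(DOC, "train", "title"))])

# weight( 1.0 dog 0.5 train.(title) )
in_context = weight_score([(1.0, belief(DOC, "dog")),
                           (0.5, belief(DOC, "train", "title", context=True))])
```

    On this toy document the only difference between the two queries is the denominator of the second belief (|d| = 8 versus |F| = 2), so the context form scores higher.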

     
  • liora

    liora - 2015-08-26

    Hi David,
    Regarding the above formulae, I have created demo documents to test the exact scoring function
    when I have a field FIELD (for instance, a sentence in a document).
    In each trial there is only a single document in the collection (for simplicity): doc1, doc2 and doc3 are each, in turn, the only document in the collection.

    doc1:
    FIELD small large /FIELD
    FIELD small large /FIELD
    FIELD blue red /FIELD

    doc2:
    FIELD small large /FIELD
    FIELD blue red /FIELD

    doc3:
    FIELD small large /FIELD
    FIELD small large gigantic /FIELD
    FIELD blue red /FIELD

    I am using the query:
    "#1 (small big).FIELD", which means that the background LM is the full collection.

    I compute the scores in the following manner:

    log[ (1 - collection_weight) * count(phrase,field)/|C| + collection_weight * count(phrase,collection)/|C| ]

    This is calculated for each field in a document, and the sum of these scores is the final document score.
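
    For anyone wanting to reproduce this by hand, the computation described above can be sketched in Python as follows. This is only a transcription of the formula as written in this post, not Indri's implementation. The value of collection_weight, the function names, and the normalize_by_field switch (which divides the first term by the field length instead of |C|) are assumptions added for illustration. The usage lines try both the query phrase and, as a further assumption, the phrase "small large" that actually occurs in the demo documents.

```python
import math

def count_phrase(tokens, phrase):
    """Count exact adjacent occurrences of `phrase` in `tokens` (an ordered window of width 1)."""
    phrase = list(phrase)
    n = len(phrase)
    return sum(1 for i in range(len(tokens) - n + 1) if tokens[i:i + n] == phrase)

def document_score(fields, phrase, collection_weight, normalize_by_field=False):
    """Sum over fields of the log of the smoothed phrase probability.

    `fields` is a list of token lists, one per FIELD extent. The document is
    assumed to be the whole collection, as in the trials described above.
    """
    coll_len = sum(len(f) for f in fields)                       # |C|
    coll_count = sum(count_phrase(f, phrase) for f in fields)    # count(phrase, collection)
    score = 0.0
    for f in fields:
        denom = len(f) if normalize_by_field else coll_len
        p = ((1 - collection_weight) * count_phrase(f, phrase) / denom
             + collection_weight * coll_count / coll_len)
        # A phrase that never occurs in the collection gives p == 0.
        score += math.log(p) if p > 0 else float("-inf")
    return score

doc2 = [["small", "large"], ["blue", "red"]]

# Hypothetical smoothing weight; the post does not state the value used.
LAMBDA = 0.5

as_written = document_score(doc2, ["small", "large"], LAMBDA)
by_field = document_score(doc2, ["small", "large"], LAMBDA, normalize_by_field=True)
unseen = document_score(doc2, ["small", "big"], LAMBDA)
```

    Comparing as_written and by_field for each demo document is one way to check which denominator a given implementation uses for the first term.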

    I get the same scores as Indri for doc1 and doc3, but not for doc2 (-1.26 and -1.39 respectively).
    Am I missing something?

    Thanks a lot!

     
    Last edit: liora 2015-08-26
