retrieving text only in trectext data

Retrieval
Raven
2011-10-04
2012-09-27
  • Raven

    Raven - 2011-10-04

    Hi,

    How do I restrict retrieval to the text only (the part of the document
    between the <TEXT> tags), so that content in the metadata fields that
    also matches the query is ignored?

    Regards,
    Rani

     
  • David Fisher

    David Fisher - 2011-10-04

    Index the text field as a named field. In your indexing parameters, use:

    <field><name>text</name></field>

    and then do extent retrieval, using:

    #combine[text](your query goes here)
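
    For illustration, the two steps might look like this as complete
    parameter files. This is a sketch only: the paths and the result count
    are placeholders, the query is assumed to use the field-restricted form
    #combine[text](...), and the printPassages setting is optional. The
    first file would be passed to IndriBuildIndex and the second to
    IndriRunQuery.

    ```xml
    <!-- build-params.xml: index the TEXT element as a named field -->
    <parameters>
      <index>/path/to/index</index>            <!-- placeholder path -->
      <corpus>
        <path>/path/to/trectext/files</path>   <!-- placeholder path -->
        <class>trectext</class>
      </corpus>
      <field><name>text</name></field>
    </parameters>
    ```

    ```xml
    <!-- query-params.xml: rank extents of the text field only -->
    <parameters>
      <index>/path/to/index</index>            <!-- placeholder path -->
      <count>10</count>                        <!-- placeholder result count -->
      <query>#combine[text](your query goes here)</query>
      <printPassages>true</printPassages>
    </parameters>
    ```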

    Review the source code for IndriRunQuery at the point where passages are
    printed (enabled with the -printPassages=true parameter) to see how to
    extract the content of the text field from the retrieval results and the
    parsed documents.
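
    The extraction step described above can be sketched roughly as follows.
    This is a stand-alone illustration, not the IndriRunQuery source. It
    assumes that each result carries a token extent (begin inclusive, end
    exclusive) and that the parsed document carries a byte range for every
    token; check the actual Indri types before relying on this. TokenSpan
    and extractExtent are hypothetical names used only here.

    ```cpp
    #include <cstddef>
    #include <stdexcept>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for the per-token byte range of a parsed document.
    struct TokenSpan {
        std::size_t byteBegin;  // offset of the token's first byte
        std::size_t byteEnd;    // offset one past the token's last byte
    };

    // Returns the raw document text covered by tokens [tokBegin, tokEnd).
    // Assumption: extents are expressed in token positions, so they have to
    // be mapped to byte offsets before slicing the original text.
    std::string extractExtent(const std::string& docText,
                              const std::vector<TokenSpan>& tokens,
                              std::size_t tokBegin,
                              std::size_t tokEnd) {
        if (tokBegin >= tokEnd || tokEnd > tokens.size()) {
            throw std::out_of_range("invalid token extent");
        }
        const std::size_t from = tokens[tokBegin].byteBegin;   // first token's start
        const std::size_t to = tokens[tokEnd - 1].byteEnd;     // last token's end
        if (from > to || to > docText.size()) {
            throw std::out_of_range("token offsets outside document text");
        }
        return docText.substr(from, to - from);
    }
    ```

    With the real API, the extent would come from the retrieval result and
    the token offsets and text from the corresponding parsed document.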

     
