Indri Query Flow

David Fisher

Introduction

The following set of pages introduces what happens "under the hood" when a simple query is executed using IndriRunQuery or the runQuery() method of the QueryEnvironment class in the API. These pages take you through the gritty details from start to finish by showing what happens at each stage of query processing.

The following example will go through (in excruciating detail) the parsing and evaluation of the simple query: "#combine( dog )".

The Details

The eight basic steps involved in running a query are:

  1. [Parser Creation] - the parser object is prepared to parse the query.
  2. [Query Parsing] - the query is actually parsed and a parse tree is created.
  3. [Extent Restriction] - if the query utilizes any fields, the parse tree is annotated so it knows to restrict its lookups and scoring to those extents.
  4. [Raw Scoring Node Extraction] - the raw scoring nodes are extracted from the parse tree.
  5. [Gathering Statistics] - counts for the nodes in the parse tree are gathered and fed back into the tree to be used for smoothing and other statistics.
  6. [Application of Smoothing Parameters] - any smoothing parameters are applied to the individual nodes.
  7. [Scored Query Evaluation] - a belief network is created and each document in the index is evaluated and, where applicable, scored.
  8. [Sort and Restrict Results Set] - the result set is sorted and restricted to its final size.
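
To make the eight steps above more concrete, here is a minimal toy sketch in Python. It is not Indri's implementation. The function names (tokenize, parse_query, run_query), the regex-based parser, the choice of Dirichlet smoothing with mu = 2500, the floor count for unseen terms, and the averaging of log beliefs for #combine are all assumptions made for illustration. It also scores every document and does no stemming, which a real index may handle differently.

```python
import math
import re
from collections import Counter

# The four sample documents from this page, keyed by DOCNO.
DOCS = {
    "SIMPLE_DOCNO_1": "The quick brown fox jumps over the lazy dog.",
    "SIMPLE_DOCNO_2": "Cats are smarter than dogs. You can't get eight cats "
                      "to pull a sled through snow.",
    "SIMPLE_DOCNO_3": "People who hate cats, will come back as mice in their "
                      "next life.",
    "SIMPLE_DOCNO_4": "If dogs could talk, it would take a lot of the fun out "
                      "of owning one.  ~Andy Rooney",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens (no stemming)."""
    return re.findall(r"[a-z']+", text.lower())


def parse_query(query):
    """Steps 1-2 (toy version): split '#op( t1 t2 )' into (op, [terms])."""
    match = re.fullmatch(r"\s*#(\w+)\(\s*(.*?)\s*\)\s*", query)
    if match is None:
        raise ValueError("unsupported query: %r" % query)
    operator, body = match.groups()
    return operator, body.split()


def run_query(query, docs, k=10, mu=2500.0):
    operator, terms = parse_query(query)
    if operator != "combine" or not terms:
        raise ValueError("this sketch only handles non-empty #combine queries")

    # Step 3: the query uses no fields, so there is nothing to restrict.
    # Step 4: the raw scoring nodes are simply the query terms.

    # Step 5: gather collection-wide counts for the terms.
    doc_tokens = {docno: tokenize(text) for docno, text in docs.items()}
    counts = Counter(t for tokens in doc_tokens.values() for t in tokens)
    collection_length = sum(len(tokens) for tokens in doc_tokens.values())

    # Step 6: derive the background probability used for smoothing.
    # Unseen terms get a floor count of 0.5 (an assumption of this sketch)
    # so that the logarithm below stays defined.
    background = {t: max(counts[t], 0.5) / collection_length for t in terms}

    # Step 7: score each document; #combine averages the log beliefs.
    scored = []
    for docno, tokens in doc_tokens.items():
        tf = Counter(tokens)
        beliefs = [
            math.log((tf[t] + mu * background[t]) / (len(tokens) + mu))
            for t in terms
        ]
        scored.append((docno, sum(beliefs) / len(beliefs)))

    # Step 8: sort by score, best first, and keep the top k results.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


for docno, score in run_query("#combine( dog )", DOCS, k=3):
    print(docno, round(score, 4))
```

Because this sketch does no stemming, only SIMPLE_DOCNO_1 contains the exact token "dog", so it ranks first. An index built with a stemmer would likely also match "dogs" in the other documents.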

Sample Data Set

For illustrative purposes, the following very basic TREC-formatted dataset will be used throughout the discussion of the query flow:

 <DOC>
   <DOCNO>SIMPLE_DOCNO_1</DOCNO>
   <TEXT>
     The quick brown fox jumps over the lazy dog.
   </TEXT>
 </DOC>
 <DOC>
   <DOCNO>SIMPLE_DOCNO_2</DOCNO>
   <TEXT>
     Cats are smarter than dogs. You can't get eight cats to pull a sled through snow.
   </TEXT>
 </DOC>
 <DOC>
   <DOCNO>SIMPLE_DOCNO_3</DOCNO>
   <TEXT>
     People who hate cats, will come back as mice in their next life.
   </TEXT>
 </DOC>
 <DOC>
   <DOCNO>SIMPLE_DOCNO_4</DOCNO>
   <TEXT>
     If dogs could talk, it would take a lot of the fun out of owning one.  ~Andy Rooney
   </TEXT>
 </DOC>
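
For readers who want to experiment with this data outside of Indri, the following minimal Python sketch reads documents in the layout shown above into a dictionary keyed by DOCNO. The function name parse_trec and the regex-based approach are assumptions for illustration only; in practice the file would be indexed with Indri's own tools, not parsed this way.

```python
import re

# Two of the sample documents, in the same TREC-style layout as above.
SAMPLE = """
<DOC>
  <DOCNO>SIMPLE_DOCNO_1</DOCNO>
  <TEXT>
    The quick brown fox jumps over the lazy dog.
  </TEXT>
</DOC>
<DOC>
  <DOCNO>SIMPLE_DOCNO_3</DOCNO>
  <TEXT>
    People who hate cats, will come back as mice in their next life.
  </TEXT>
</DOC>
"""


def parse_trec(text):
    """Return {docno: body text} for each <DOC> block in the input."""
    docs = {}
    for block in re.findall(r"<DOC>(.*?)</DOC>", text, flags=re.DOTALL):
        docno = re.search(r"<DOCNO>(.*?)</DOCNO>", block, flags=re.DOTALL)
        body = re.search(r"<TEXT>(.*?)</TEXT>", block, flags=re.DOTALL)
        if docno is None or body is None:
            continue  # skip blocks that lack a DOCNO or TEXT element
        docs[docno.group(1).strip()] = body.group(1).strip()
    return docs


for docno, body in parse_trec(SAMPLE).items():
    print(docno, "->", body)
```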
