The following set of pages introduces what happens "under the hood" when a simple query is executed using IndriRunQuery or the runQuery() method of the QueryEnvironment class in the API. These pages take you through the gritty details from start to finish, showing what happens at each stage of query processing.
The following example goes through (in excruciating detail) the parsing and evaluation of the simple query "#combine( dog )".
Running a query involves eight basic steps, each covered on its own wiki page; those pages are linked at the end of this page.
For illustrative purposes, the following very basic TREC-formatted dataset will be used throughout the discussion of the query flow:
<DOC>
<DOCNO>SIMPLE_DOCNO_1</DOCNO>
<TEXT>
The quick brown fox jumps over the lazy dog.
</TEXT>
</DOC>
<DOC>
<DOCNO>SIMPLE_DOCNO_2</DOCNO>
<TEXT>
Cats are smarter than dogs. You can't get eight cats to pull a sled through snow.
</TEXT>
</DOC>
<DOC>
<DOCNO>SIMPLE_DOCNO_3</DOCNO>
<TEXT>
People who hate cats, will come back as mice in their next life.
</TEXT>
</DOC>
<DOC>
<DOCNO>SIMPLE_DOCNO_4</DOCNO>
<TEXT>
If dogs could talk, it would take a lot of the fun out of owning one. ~Andy Rooney
</TEXT>
</DOC>
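To get a feel for the raw numbers that the query "#combine( dog )" depends on, the sample collection can be inspected with a small standalone script. The following is only a rough sketch and does not use Indri: the names parse_trec and term_statistics are hypothetical helpers, the tokenizer is a naive lowercase regular expression, and no stemming is applied. Under those assumptions, "dogs" does not count as a match for "dog"; whether a real Indri index would match it depends on how the index was built (for example, whether a stemmer was configured).

```python
import re
from collections import Counter

# The sample TREC-formatted collection used throughout this discussion.
TREC_TEXT = """
<DOC>
<DOCNO>SIMPLE_DOCNO_1</DOCNO>
<TEXT>
The quick brown fox jumps over the lazy dog.
</TEXT>
</DOC>
<DOC>
<DOCNO>SIMPLE_DOCNO_2</DOCNO>
<TEXT>
Cats are smarter than dogs. You can't get eight cats to pull a sled through snow.
</TEXT>
</DOC>
<DOC>
<DOCNO>SIMPLE_DOCNO_3</DOCNO>
<TEXT>
People who hate cats, will come back as mice in their next life.
</TEXT>
</DOC>
<DOC>
<DOCNO>SIMPLE_DOCNO_4</DOCNO>
<TEXT>
If dogs could talk, it would take a lot of the fun out of owning one. ~Andy Rooney
</TEXT>
</DOC>
"""

# One match per <DOC> element: group 1 is the DOCNO, group 2 is the TEXT body.
DOC_RE = re.compile(
    r"<DOC>\s*<DOCNO>(.*?)</DOCNO>\s*<TEXT>(.*?)</TEXT>\s*</DOC>",
    re.DOTALL,
)


def parse_trec(text):
    """Return {docno: [tokens]} using a naive tokenizer (an assumption,
    not Indri's actual parser): lowercase, no stemming, no stopping."""
    docs = {}
    for docno, body in DOC_RE.findall(text):
        docs[docno.strip()] = re.findall(r"[a-z0-9']+", body.lower())
    return docs


def term_statistics(docs, term):
    """Collect simple occurrence counts for one term over the collection."""
    per_doc = {docno: Counter(tokens)[term] for docno, tokens in docs.items()}
    return {
        "term_count": sum(per_doc.values()),                     # total occurrences
        "doc_count": sum(1 for c in per_doc.values() if c > 0),  # docs containing the term
        "per_doc": per_doc,                                      # occurrences per document
    }


docs = parse_trec(TREC_TEXT)
stats = term_statistics(docs, "dog")
for docno, count in stats["per_doc"].items():
    print(docno, count)
```

With this tokenizer, only SIMPLE_DOCNO_1 contains the exact term "dog"; the other three documents contribute a count of zero.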
Wiki: Application of Smoothing Parameters
Wiki: Extent Restriction
Wiki: Gathering Statistics
Wiki: Parser Creation
Wiki: Query Parsing
Wiki: Raw Scoring Node Extraction
Wiki: Scored Query Evaluation
Wiki: Sort and Restrict Results Set
Wiki: Technical Details