From: David M. <ma...@ha...> - 2012-08-29 12:19:23
|
I'm trying to figure out the search processing in Solr::initSearchQuery() vs. the processing in Solr::buildQuery() and would like to ask someone to check whether the following analysis is correct. This analysis only deals with user searches and completely ignores overrideQueries.

Point of reference is current master in <https://github.com/dmj/vf2/>, which was synced with the VF2 main repository this morning.

Solr\Result\PerformSearch() passes a `query' argument to Solr::search(). This query is either an overrideQuery (ignored in this analysis) or the return value of calling Solr::buildQuery() with the user search.

Solr::buildQuery() processes the user search in some cases. Solr::initSearchQuery() takes the `query' argument and decides to process the argument in some cases, too.

Question: Does VF process the user search twice?

Answer: No (?).

Solr::buildQuery() does not process the user search in

<https://github.com/dmj/vf2/blob/master/module/VuFind/src/VuFind/Connection/Solr.php#L666>

After analyzing the code I would conclude that this line is executed in only two cases:

1. A simple search

   In the case of a simple search the search type is stored in $params['index'], not $params['field'].

2. A single search in an advanced search with no search type selected and no default search type

If we can rule out case 2 as a legitimate use case: a simple search is simply returned as-is (save some normalization), while an advanced search is processed (searchspecs etc.).

Compare with Connection\Solr::initSearchQuery(). If my analysis is true, then the core of this function deals with simple searches only.

<https://github.com/dmj/vf2/blob/master/module/VuFind/src/VuFind/Connection/Solr.php#L802>

The if condition on Line 802 always returns FALSE for an advanced search because the query string for an advanced search created in Solr::buildQuery() always counts as an advanced query string for Solr::isAdvanced().

This leaves the else clause in L835ff; this block is executed if the query string is an already processed advanced search -or- a simple search with advanced search syntax. In both cases we uppercase ranges and booleans if necessary (which does nothing for an advanced search because this was already done in buildQuery()), and decide whether we need to modify the highlighting setup and create an advanced query string.

<https://github.com/dmj/vf2/blob/master/module/VuFind/src/VuFind/Connection/Solr.php#L835>

The if in Line 847 is only true if the search is a simple search. `handler' is set by \Search\Solr\Results::performSearch() to the return value of Search\Base\Params::getSearchHandler(), which in the default implementation returns the search type only if the search is a simple search. Thus the inner block of the if only affects a simple search w/ advanced search syntax.

Conclusion: initSearchQuery() uses an already processed advanced search as-is and processes simple searches; buildQuery() uses simple searches as-is and processes advanced searches.

Best and thanks in advance,
-- David
--
David Maus
Herzog August Bibliothek - D-38299 Wolfenbuettel
Phone: +49-5331-808-317
Email: ma...@ha...
PGP Key 0x0CC2E093512F7385
Fingerprint 1AD2 EE67 224F 18C5 EA55 98AD 0CC2 E093 512F 7385
|
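The division of labour described in this analysis can be sketched roughly as follows. This is a toy model, not VuFind's code: the function names (`looksAdvanced`, `sketchBuildQuery`, `sketchInitSearchQuery`), the array shapes, and the regular expression are all assumptions made for illustration.

```php
<?php
// Toy model of the two stages. All names and data shapes are hypothetical
// and do not reproduce VuFind's Connection\Solr class.

/** Rough stand-in for an "advanced syntax" check on a query string. */
function looksAdvanced(string $query): bool
{
    // Booleans, parentheses, field prefixes or ranges count as advanced syntax.
    return (bool) preg_match('/\b(AND|OR|NOT)\b|[():]|\[.+ TO .+\]/', $query);
}

/** Stage 1: flatten an advanced search; pass a simple search through. */
function sketchBuildQuery(array $search): string
{
    if (isset($search['lookfor'])) {
        // Simple search: returned as-is, apart from trimming.
        return trim($search['lookfor']);
    }
    // Advanced search: each clause becomes field:(terms), joined by the operator.
    $parts = [];
    foreach ($search['clauses'] as $clause) {
        $parts[] = $clause['field'] . ':(' . trim($clause['lookfor']) . ')';
    }
    return implode(' ' . $search['join'] . ' ', $parts);
}

/** Stage 2: decide how much work is left for a given query string. */
function sketchInitSearchQuery(string $query, ?string $handler): array
{
    if (!looksAdvanced($query)) {
        // Plain simple search: handler settings are applied here.
        return ['q' => $query, 'handler' => $handler, 'mode' => 'simple'];
    }
    if ($handler !== null) {
        // Simple search that happens to contain advanced syntax.
        return ['q' => $query, 'handler' => $handler, 'mode' => 'simple-advanced-syntax'];
    }
    // Already flattened advanced search: used as-is.
    return ['q' => $query, 'handler' => null, 'mode' => 'as-is'];
}
```

In this model a query string is processed by exactly one of the two stages, which is the property the analysis above is trying to establish for the real code.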
From: Demian K. <dem...@vi...> - 2012-08-29 13:11:34
|
I believe that your analysis is correct, but perhaps it would be helpful to understand the original intent of the code as you refactor.

The main purpose of Solr::buildQuery is to flatten the advanced search arrays into something that can serve as a real Solr query. The main purpose of Solr::initSearchQuery is to take a query string (never an array), apply handler settings if necessary, and normalize it according to VuFind configuration.

The functionality of these two methods overlaps in confusing ways for a couple of reasons:

1.) Solr::buildQuery is recursive. It essentially breaks an advanced query down into a set of basic queries and lumps them all together. Although it is primarily concerned with advanced searches, it needs to normalize basic queries because it has to normalize each chunk of the advanced query.

2.) Solr::initSearchQuery is meant to be more general-purpose than Solr::buildQuery. It wants a well-formed Solr query of some sort, but it doesn't care whether it came from buildQuery or someplace else (i.e. the overrideQuery). Therefore it needs to do some normalization of its own without making any assumptions about the source of the query.

I'm sure this can be simplified, but I don't have a brilliant plan for how to do it better. If nothing else, a good first step would be to break out some support methods to make it more readable (i.e. create a normalizeCapitalization() method to replace the duplicate configuration checks and calls to the SolrUtils::capitalize* methods). Perhaps it would also make sense to rename buildQuery to something like flattenAdvancedSearch to make its purpose more clear. It might even make sense to adjust the calling search object so that more knowledge about parameter names and the VuFind application's behavior is external to the Solr class... though I'm not sure how practical that would be.
I hope this is helpful, but if there are any specific points of your analysis you would like me to address in more detail, let me know and I'll take a closer look at the code.

- Demian
|
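One way the suggested normalizeCapitalization() support method might look is sketched below. Only the method name comes from the message above; the config keys (`capitalizeBooleans`, `capitalizeRanges`) and the two regex-based helpers are simplified assumptions and are not the SolrUtils::capitalize* implementations.

```php
<?php
// Hypothetical sketch. The helpers below are simplified stand-ins and do not
// reproduce VuFind's SolrUtils::capitalize* methods; the config keys are made up.

/** Uppercase and/or/not outside of double-quoted phrases. */
function sketchCapitalizeBooleans(string $query): string
{
    // Split into quoted and unquoted chunks, keeping the quoted ones.
    $chunks = preg_split('/("[^"]*")/', $query, -1, PREG_SPLIT_DELIM_CAPTURE);
    foreach ($chunks as $i => $chunk) {
        if ($chunk === '' || $chunk[0] === '"') {
            continue; // leave phrases untouched
        }
        $chunks[$i] = preg_replace_callback(
            '/\b(and|or|not)\b/i',
            function (array $m) {
                return strtoupper($m[1]);
            },
            $chunk
        );
    }
    return implode('', $chunks);
}

/** Uppercase the "to" keyword inside [a to b] or {a to b} ranges. */
function sketchCapitalizeRanges(string $query): string
{
    return preg_replace('/([\[{]\s*\S+)\s+to\s+(\S+\s*[\]}])/i', '$1 TO $2', $query);
}

/** Single place for the configuration checks that are currently duplicated. */
function normalizeCapitalization(string $query, array $config): string
{
    if (!empty($config['capitalizeBooleans'])) {
        $query = sketchCapitalizeBooleans($query);
    }
    if (!empty($config['capitalizeRanges'])) {
        $query = sketchCapitalizeRanges($query);
    }
    return $query;
}
```

With a helper of this shape, both buildQuery and initSearchQuery could call one method instead of each repeating the configuration checks. Applying it twice to the same string changes nothing, which matches the observation that the second pass does nothing for an already processed advanced search.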
From: David M. <ma...@ha...> - 2012-08-30 04:09:25
|
At Wed, 29 Aug 2012 13:11:26 +0000, Demian Katz wrote:
>
> I believe that your analysis is correct, but perhaps it would be
> helpful to understand the original intent of the code as you
> refactor. The main purpose of Solr::buildQuery is to flatten the
> advanced search arrays into something that can serve as a real Solr
> query. The main purpose of Solr::initSearchQuery is to take a query
> string (never an array), apply handler settings if necessary, and
> normalize it according to VuFind configuration.
>
> ...

Thanks for the fast reply. This description fits my observations; to rephrase it:

  Solr::buildQuery() reduces an advanced search to a simple search
  Solr::initSearchQuery() operates on a simple search

where `simple search' is a pair of (search string . search handler).

overrideQueries are simple searches. E.g. the programmatically created overrideQuery

  id:(foo OR bar OR baz) . null

is identical to a hypothetical search

  id:(foo OR bar OR baz)

entered by the user, except for the search handler (null in the override, something in a user-generated search).

Wrt refactoring this looks good. The current plan is to move the entire process of building up the Solr query to a QueryBuilder. E.g.

  $qb = new QueryBuilder();
  $qb->setSearchSpecsFile(FILE);
  $qb->setUserQuery(USERQUERY);
  ...
  $solrQuery = $qb->buildQuery();

Best,
-- David
--
David Maus
Herzog August Bibliothek - D-38299 Wolfenbuettel
Phone: +49-5331-808-317
Email: ma...@ha...
PGP Key 0x0CC2E093512F7385
Fingerprint 1AD2 EE67 224F 18C5 EA55 98AD 0CC2 E093 512F 7385
|
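A minimal sketch of what such a QueryBuilder could look like, treating a simple search as the pair (search string . search handler). Only the method names setUserQuery() and buildQuery() come from the message above. The class name, the setSearchSpecs() stand-in for setSearchSpecsFile(), the shape of the specs array, and the way fields are combined are all assumptions for illustration.

```php
<?php
// Hypothetical sketch: class name, spec format and field expansion are
// assumptions, not the QueryBuilder that was planned for VuFind.

final class SketchQueryBuilder
{
    /** @var array handler name => list of Solr field names */
    private $specs = [];
    /** @var string */
    private $queryString = '';
    /** @var string|null */
    private $handler = null;

    /** Stand-in for setSearchSpecsFile(): takes parsed specs, not a file. */
    public function setSearchSpecs(array $specs): void
    {
        $this->specs = $specs;
    }

    /** A simple search is the pair (search string . search handler). */
    public function setUserQuery(string $queryString, ?string $handler = null): void
    {
        $this->queryString = trim($queryString);
        $this->handler = $handler;
    }

    public function buildQuery(): string
    {
        // No handler (e.g. an override query) or unknown handler: use as-is.
        if ($this->handler === null || !isset($this->specs[$this->handler])) {
            return $this->queryString;
        }
        // Otherwise search each field configured for the handler.
        $clauses = [];
        foreach ($this->specs[$this->handler] as $field) {
            $clauses[] = $field . ':(' . $this->queryString . ')';
        }
        return implode(' OR ', $clauses);
    }
}

$qb = new SketchQueryBuilder();
$qb->setSearchSpecs(['Author' => ['author', 'author2']]);

$qb->setUserQuery('id:(foo OR bar OR baz)'); // override-style: handler is null
$override = $qb->buildQuery();

$qb->setUserQuery('maus', 'Author');         // user search with a handler
$user = $qb->buildQuery();
```

In this sketch the override query and the user search go through the same code path, and the handler is the only thing that distinguishes them.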
From: Demian K. <dem...@vi...> - 2012-08-30 12:45:15
|
> Thanks for the fast reply. This description fits my observations; to rephrase
> it:
>
> Solr::buildQuery() reduces an advanced search to a simple search
> Solr::initSearchQuery() operates on a simple search

Exactly, it sounds like we are on the same page here.

- Demian
|