Hi Demian,

Thanks for your summary!
Have you considered Solr's local params syntax, which lets you hand different parts of a query to dedicated query parsers?
That way, dismax could be used for the part which doesn't work properly with eDismax, while eDismax stays the default query parser.
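
To make the idea concrete, here is a minimal sketch of how such a mixed query string could be assembled. It assumes the outer string is parsed by a parser that understands Solr's `_query_` pseudo-field (the standard Lucene parser does; whether eDismax as the outer parser behaves the same way in a given Solr version would need to be checked). The helper name `nested_clause` and the field names `title` and `subject` are made up for illustration.

```python
def nested_clause(parser, text, **local_params):
    """Build a `_query_:"{!parser key='value'}text"` clause.

    Local-param values are wrapped in single quotes so they may contain
    spaces; values that themselves contain a single quote are not handled.
    """
    parts = ["!" + parser]
    for key, value in local_params.items():
        parts.append(f"{key}='{value}'")
    body = "{" + " ".join(parts) + "}" + text
    # The whole clause sits inside double quotes, so escape \ and ".
    body = body.replace("\\", "\\\\").replace('"', '\\"')
    return f'_query_:"{body}"'


# Hypothetical split: plain terms go to eDismax, the negated part to dismax.
q = " AND ".join([
    nested_clause("edismax", "fruit salad", qf="title subject"),
    nested_clause("dismax", "apples oranges -bananas", qf="title subject"),
])
print(q)
```

The result would be sent as the `q` parameter; each quoted clause is parsed by the parser named inside its `{!...}` prefix.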

To be frank: personally I don't like this syntax (to me it looks a bit crazy), but perhaps it would let us combine the advantages of both parsers while avoiding their disadvantages.

The best introduction to this syntax, with examples, that I have found is chapter 7 of
http://www.manning.com/grainger/

Perhaps an idea?

Günter 



On 10/29/2013 08:45 PM, Demian Katz wrote:

So, to review:

 

The main problem with Extended Dismax is that it doesn’t properly apply the default operator when a NOT or - clause is used.

 

So if you enter:

    apples oranges -bananas

or

    apples oranges NOT bananas

you would expect your search results to be the same as those for:

    apples AND oranges NOT bananas

but instead you get:

    apples OR oranges NOT bananas

Very confusing.
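
The difference is easy to see on a toy example. The sketch below is not Solr code; it only models the two interpretations with Python sets over a made-up four-document "index".

```python
# Toy "index": document id -> set of words it contains (made-up data).
DOCS = {
    1: {"apples", "oranges"},
    2: {"apples"},
    3: {"oranges", "bananas"},
    4: {"apples", "oranges", "bananas"},
}


def matching(required, optional, excluded):
    """Return ids of documents that contain every `required` word,
    at least one `optional` word (if any are given), and no `excluded` word."""
    hits = []
    for doc_id, words in sorted(DOCS.items()):
        if not required <= words:
            continue
        if optional and not (optional & words):
            continue
        if excluded & words:
            continue
        hits.append(doc_id)
    return hits


# apples AND oranges NOT bananas (what the user expects)
expected = matching({"apples", "oranges"}, set(), {"bananas"})
# apples OR oranges NOT bananas (what actually happens)
actual = matching(set(), {"apples", "oranges"}, {"bananas"})
print(expected, actual)  # [1] [1, 2]
```

Document 2 mentions only apples, yet it shows up in the OR interpretation.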

 

On today’s dev call, we discussed the possibility of detecting the - or NOT operators and then falling back to the old Lucene code to get around this limitation. Alas, it turns out to be more complicated than that.

 

First of all, the idea of using the fallback code was a mistake: it currently does not handle NOT properly. That is not surprising, because it’s not real DisMax. It creates a whole bunch of queries and ORs them together, so you will frequently get results back that include the term you are attempting to exclude. There is no easy fix for this, aside from writing our own DisMax query generator in PHP, which would be an exercise in madness.
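
A toy model of why ORing several queries together defeats NOT. This is a sketch under an assumption: it supposes the fallback builds one "term NOT excluded" query per field and ORs them, which may not be exactly what the real code does. The documents and field names are made up.

```python
# Made-up records with two searchable fields each.
DOCS = {
    1: {"title": {"apples"}, "subject": {"fruit"}},
    2: {"title": {"apples"}, "subject": {"bananas"}},
    3: {"title": {"apples", "bananas"}, "subject": {"fruit"}},
}
FIELDS = ("title", "subject")


def per_field_or(term, excluded):
    """One 'term NOT excluded' query per field, ORed together."""
    return [
        doc_id
        for doc_id, doc in sorted(DOCS.items())
        if any(term in doc[f] and excluded not in doc[f] for f in FIELDS)
    ]


def whole_record(term, excluded):
    """What the user means: term anywhere, excluded word nowhere."""
    return [
        doc_id
        for doc_id, doc in sorted(DOCS.items())
        if any(term in doc[f] for f in FIELDS)
        and not any(excluded in doc[f] for f in FIELDS)
    ]


print(per_field_or("apples", "bananas"))   # [1, 2]
print(whole_record("apples", "bananas"))   # [1]
```

Record 2 leaks through because its title clause matches on its own, even though "bananas" appears in another field.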

 

Another interesting discovery is that the basic DisMax handler does process the - operator appropriately, so while a current instance of VuFind will break with “apples oranges NOT bananas”, it will yield correct results for “apples oranges -bananas”. So this is definitely a regression if we move to eDisMax, though maybe not a significant one, since library users are much more likely to use the broken NOT syntax than the working - syntax.

 

This all leaves me even more uncertain about the best road forward – switching to eDismax breaks something that is already broken, just in a different way. If the Solr team fixes the underlying problem that is causing this behavior, then we’ll be in great shape. In the meantime, it seems we have these options:

 

1.)    Stick with the status quo, but add the option to turn on eDismax if desired

2.)    Switch to eDismax, on the assumption that the benefits outweigh the drawbacks

3.)    Write some sort of crude query parser to insert AND operators into queries containing NOT or -. We can probably make the most common cases work fairly easily, but doing it correctly would require a lot of effort, and that may be a waste of time given that this is a workaround for a bug and not something that we need in an ideal world.

4.)    Write code to use the regular Dismax handler instead of eDismax for queries containing the - operator and no other operators. This would give optimal functionality for a small number of edge cases, which is not worthwhile in my opinion, but maybe worth mentioning.
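
To give a feel for how crude options 3 and 4 would be, here is a rough sketch in Python (VuFind itself is PHP, so this is illustration only). The function names are hypothetical. It handles only whitespace-separated terms and uppercase AND/OR/NOT, and it deliberately leaves queries with quotes or parentheses alone.

```python
OPERATORS = {"AND", "OR", "NOT"}


def is_minus_term(token):
    return token.startswith("-") and len(token) > 1


def insert_ands(query):
    """Option 3: if the query contains NOT or -term, join bare terms with AND
    and rewrite -term as NOT term. Anything else is returned unchanged."""
    tokens = query.split()
    negated = any(t == "NOT" or is_minus_term(t) for t in tokens)
    if not negated or any(c in query for c in '"()'):
        return query
    out = []
    for tok in tokens:
        if is_minus_term(tok):
            out.extend(["NOT", tok[1:]])
        elif tok in OPERATORS:
            out.append(tok)
        else:
            # Bare term: add AND unless an operator already precedes it.
            if out and out[-1] not in OPERATORS:
                out.append("AND")
            out.append(tok)
    return " ".join(out)


def choose_parser(query):
    """Option 4: plain dismax only when - is the sole operator in the query."""
    tokens = query.split()
    only_minus = any(is_minus_term(t) for t in tokens) and not (
        OPERATORS & set(tokens)
    )
    return "dismax" if only_minus else "edismax"


print(insert_ands("apples oranges -bananas"))    # apples AND oranges NOT bananas
print(choose_parser("apples oranges -bananas"))  # dismax
```

Explicit operators typed by the user are respected, so "apples OR oranges NOT bananas" passes through unchanged. Phrases, grouping, field prefixes and lowercase operators are exactly the cases that would make a correct version expensive.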

 

Thoughts? Opinions?

 

I’d really like to get this wrapped up, but the best option is not obvious.

 

- Demian




_______________________________________________
Vufind-tech mailing list
Vufind-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/vufind-tech


-- 
Universität Basel
Universitätsbibliothek
Günter Hipler
Projekt SwissBib
Schoenbeinstrasse 18-20
4056 Basel, Schweiz
Tel.: + 41 (0)61 267 31 12 Fax: ++41 61 267 3103
E-Mail guenter.hipler@unibas.ch
URL: www.swissbib.org  / http://www.ub.unibas.ch/