Currently, lucy selects the first N words from a query,
where N is a compile-time constant (for efficiency
reasons). A better strategy would be to select the N
most selective words (by inverse document frequency),
with a cutoff to avoid excessive parsing of very long
queries. Phrases and ANDs could be assumed to be
more selective than individual words. They could also
be included on a first-come basis, given that this
simplifies things and we can only estimate their IDF
anyway.
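
A rough sketch of the IDF-based selection described
above, in Python rather than lucy's own code. The names
MAX_TERMS, PARSE_CUTOFF, idf and select_terms are
hypothetical, as are the smoothed IDF formula and the
doc_freq lookup table; none of them come from lucy.
Phrases and ANDs are not handled here.

```python
import math

MAX_TERMS = 3      # hypothetical stand-in for lucy's compile-time N
PARSE_CUTOFF = 16  # hypothetical cap on how many query words are examined


def idf(term, doc_freq, num_docs):
    """Smoothed inverse document frequency; rarer terms score higher.

    Words missing from doc_freq count as frequency 0, so they are
    treated as maximally selective (an assumption of this sketch).
    """
    return math.log((num_docs + 1) / (doc_freq.get(term, 0) + 1))


def select_terms(query, doc_freq, num_docs,
                 max_terms=MAX_TERMS, cutoff=PARSE_CUTOFF):
    """Return up to max_terms of the most selective words in query."""
    # Only look at the first `cutoff` words so long queries stay cheap.
    words = query.lower().split()[:cutoff]
    # Drop duplicates while keeping first-occurrence order.
    unique = list(dict.fromkeys(words))
    # sorted() is stable, so words with equal IDF keep query order.
    ranked = sorted(unique, key=lambda w: idf(w, doc_freq, num_docs),
                    reverse=True)
    return ranked[:max_terms]


# Made-up document frequencies for a 1000-document collection.
doc_freq = {"the": 1000, "of": 950, "history": 120,
            "byzantine": 4, "coinage": 9}
print(select_terms("the history of byzantine coinage", doc_freq, 1000))
```

With these made-up frequencies, first-N selection would
keep "the", "history" and "of", while the sketch keeps
the three rarest words instead. Sorting only the words
inside the cutoff bounds the cost for very long queries.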