From: Juri L. <ju...@ex...> - 2022-03-18 15:14:13
More recent versions of eXist, I believe since 5.3.0, allow you to have
separate analyzers for indexing and querying. Indexing without stop-words
but querying with them is a good combination. That way non-phrase searches
disregard stop-words, as one usually wants, but phrase searches can still
find them because the index includes them. It is at least worth a try.

On 18. March 2022 at 13:57:25, Joe Wicentowski (jo...@gm...) wrote:
> Hi Alem,
>
> The words "to" and "the" are included in Lucene's default list of
> stopwords. They're dropped when indexing documents and ignored when
> querying them. To ensure these words are indexed, you have to either
> customize the list of stopwords or explicitly eliminate all stopwords
> by adding the following <param> element to the <analyzer> element in
> your .xconf file:
>
> <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer">
>     <!-- Specify stop words - or remove them entirely -->
>     <param name="stopwords"
>         type="org.apache.lucene.analysis.util.CharArraySet">
>         <!--<value>the</value>-->
>     </param>
> </analyzer>
>
> For the list of words treated as stopwords by default in Lucene, see
> https://markmail.org/message/wxmtjzbskgrj2cug. Place any stopwords you
> want to keep in <value> elements. By omitting all <value> elements (as
> shown above), you remove all stopwords. For the current eXist docs
> on configuring Lucene's stopword facility, see
> https://exist-db.org/exist/apps/doc/lucene#conf.
>
> Joe
>
> On Thu, Mar 17, 2022 at 5:59 PM Areki, Alem <aa...@ri...> wrote:
>
>> Hi all,
>>
>> I am having a problem with Lucene phrase searches that have stop
>> words between the phrase keywords.
>>
>> - The title of the article to be searched: “*Where to watch: Catch
>>   the Spiders in March Madness*”
>> - A search phrase like “*March Madness*” works, but phrases like
>>   “*Where to Watch*” or “*Catch the Spiders*” don’t work.
>> - Searching for a phrase doesn’t work when it has stopwords in it.
>>
>> Does anyone have any clue why I am having this issue?
>>
>> Thanks
>>
>> Alem
>>
>> --
>>
>> Alem T. Areki
>>
>> Senior Web Developer – Web Services
>>
>> University of Richmond
>>
>> _______________________________________________
>> Exist-open mailing list
>> Exi...@li...
>> https://lists.sourceforge.net/lists/listinfo/exist-open
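
To make the "index without stop-words, query with them" suggestion at the top of this message more concrete, here is a small toy model in Python. It is not eXist or Lucene code: the function names, the short stop-word set, and the two sample documents are all made up for illustration. It also assumes that a stop word dropped at query time leaves a positional gap inside a phrase; how a real analyzer and query parser treat that case may differ.

```python
import re

# Illustrative stop-word set only; NOT Lucene's default list.
STOPWORDS = {"to", "the", "in", "a", "of"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def index_analyzer(text):
    """Index-time analysis: keep every token, stop words included."""
    return list(enumerate(tokenize(text)))


def query_analyzer(text):
    """Query-time analysis: drop stop words but keep the original
    positions, so a dropped word leaves a gap (toy assumption)."""
    return [(pos, tok) for pos, tok in enumerate(tokenize(text))
            if tok not in STOPWORDS]


def build_index(docs):
    """Build a positional index: token -> {doc_id -> set of positions}."""
    index = {}
    for doc_id, text in docs.items():
        for pos, tok in index_analyzer(text):
            index.setdefault(tok, {}).setdefault(doc_id, set()).add(pos)
    return index


def term_search(index, query):
    """Non-phrase (OR) search: stop words in the query are ignored."""
    hits = set()
    for _, tok in query_analyzer(query):
        hits |= set(index.get(tok, {}))
    return hits


def phrase_search(index, phrase):
    """Phrase search: remaining terms must occur at the same relative
    positions in the document as they do in the phrase."""
    terms = query_analyzer(phrase)
    if not terms:
        return set()
    first_pos, first_tok = terms[0]
    hits = set()
    for doc_id, positions in index.get(first_tok, {}).items():
        for start in positions:
            if all(start + (pos - first_pos) in index.get(tok, {}).get(doc_id, ())
                   for pos, tok in terms[1:]):
                hits.add(doc_id)
                break
    return hits


docs = {
    1: "Where to watch: Catch the Spiders in March Madness",
    2: "The weather in March",
}
index = build_index(docs)

print(phrase_search(index, "Where to Watch"))     # {1}
print(phrase_search(index, "Catch the Spiders"))  # {1}
print(term_search(index, "the"))                  # set()
print(term_search(index, "the march"))            # {1, 2}
```

In this toy the word "the" is present in the index for both documents, yet a plain search for it returns nothing, while both phrases from the original question match the article title.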