From: Areki, A. <aa...@ri...> - 2022-03-18 20:08:23
Much appreciation and thank you, Joe.

The following param works in existdb 2.1:

    <param name="stopwords" type="java.util.Set">
        <!--<value>a</value>-->
    </param>

Thanks again
ALEM

--
Alem T. Areki
Senior Web Developer – Web Services
University of Richmond
phone: 804-289-8899 | Fax: 804-289-8988

Please send technical questions or requests to we...@ri... and not to individuals if at all possible.

Thank you for your email! Someone from Web Services will contact you regarding your request by the end of the next business day.

To submit technical support requests or defect reports please use the SpiderTechNet System: https://spidertechnet.richmond.edu/TDClient/1955/Portal/Requests/ServiceCatalog?CategoryID=12767
For other Technology Questions and Issues visit: https://spidertechnet.richmond.edu/

From: Joe Wicentowski <jo...@gm...>
Date: Friday, March 18, 2022 at 3:00 PM
To: "Areki, Alem" <aa...@ri...>
Cc: "exi...@li..." <exi...@li...>
Subject: Re: [Exist-open] Lucene search issue with existed 2.1

External Email: Use caution in opening links, attachments, and buying gift cards.

Hi Alem,

The customized stopwords feature I described was added in eXist 1.4.3, so it should work in 2.1 as well. See this commit:
https://github.com/eXist-db/exist/commit/744d104e8bacb17da5d54cc0d17d484601346867

Take a close look, though, at the @type attribute on the <param> element in the examples added in that commit:
https://github.com/eXist-db/exist/commit/744d104e8bacb17da5d54cc0d17d484601346867#diff-35d71d352dd07d7f78245dc1936346e5a2db7a1908b593eda31f4e4f427d1b8dR341-R364

Notice that it was <param type="java.util.Set">. In contrast, in the current documentation I linked to, it's <param type="org.apache.lucene.analysis.util.CharArraySet">. So perhaps for 2.1 you should use "java.util.Set".

Joe

On Fri, Mar 18, 2022 at 10:38 AM Areki, Alem <aa...@ri...> wrote:

Thanks, Joe. It works with my local existdb version 4.1 but not version 2.1.

thanks

--
Alem T. Areki

From: Joe Wicentowski <jo...@gm...>
Date: Friday, March 18, 2022 at 8:57 AM
To: "Areki, Alem" <aa...@ri...>
Cc: "exi...@li..." <exi...@li...>
Subject: Re: [Exist-open] Lucene search issue with existed 2.1

Hi Alem,

The words "to" and "the" are included in Lucene's default list of stopwords. They're dropped when indexing documents and ignored when querying them. To ensure these words are indexed, you have to either customize the list of stopwords or explicitly eliminate all stopwords, by adding the following <param> element to the <analyzer> element in your .xconf file:

    <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer">
        <!-- Specify stop words - or remove them entirely -->
        <param name="stopwords" type="org.apache.lucene.analysis.util.CharArraySet">
            <!--<value>the</value>-->
        </param>
    </analyzer>

For the list of words treated as stopwords by default in Lucene, see https://markmail.org/message/wxmtjzbskgrj2cug

Place any words you still want treated as stopwords in <value> elements. By omitting all <value> elements (as shown above, where the only one is commented out), you remove all stopwords.

For the current eXist docs on configuring Lucene's stopword facility, see https://exist-db.org/exist/apps/doc/lucene#conf

Joe

On Thu, Mar 17, 2022 at 5:59 PM Areki, Alem <aa...@ri...> wrote:

Hi all,

I am having a problem with Lucene phrase searches when stop words fall between the keywords of the phrase.

* The title of the article to be searched: “Where to watch: Catch the Spiders in March Madness”
* A search phrase like “March Madness” works, but phrases like “Where to Watch” or “Catch the Spiders” don’t work.
* The phrases that don’t work have stopwords in between them.

Does anyone have any clue why I am having this issue?

Thanks
Alem

--
Alem T. Areki
Senior Web Developer – Web Services
University of Richmond

_______________________________________________
Exist-open mailing list
Exi...@li...
https://lists.sourceforge.net/lists/listinfo/exist-open
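
For readers landing on this thread, the outcome can be sketched as a complete collection configuration for eXist 2.1. Only the <analyzer> and <param> elements are taken from the messages in this thread; the surrounding <collection>/<index>/<lucene> wrapper follows eXist's usual collection.xconf layout, and the indexed element name (title) is an assumption for illustration only. On current eXist versions, as Joe's messages indicate, the type attribute would be org.apache.lucene.analysis.util.CharArraySet instead of java.util.Set.

```xml
<collection xmlns="http://exist-db.org/collection-config/1.0">
    <index>
        <lucene>
            <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer">
                <!-- Empty stopword set: no <value> children, so words such as
                     "to" and "the" are indexed instead of being dropped.
                     type="java.util.Set" is what was reported to work on 2.1. -->
                <param name="stopwords" type="java.util.Set"/>
            </analyzer>
            <!-- Hypothetical: index the element that holds the article title. -->
            <text qname="title"/>
        </lucene>
    </index>
</collection>
```

Documents that were stored before the configuration changed were indexed with the old stopword list, so the collection will most likely need to be reindexed before phrases such as “Where to Watch” start matching.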