From: Hungerburg <pc...@my...> - 2010-07-30 13:38:44
|
I extracted lucene-snowball-2.9.2.jar from the lucene zip contrib area and put it into EXIST-HOME/extensions/indexes/lucene/lib/

In /db/system.../collection.xconf I added the line

    <analyzer id="snowball" class="org.apache.lucene.analysis.snowball.SnowballAnalyzer"/>

When saving the file, even before reindexing from the admin-client, the message below is printed to exist.log:

    2010-07-30 15:32:13,895 [eXistThread-21] WARN (CollectionConfiguration.java [read]:162) - Exception while instantiating analyzer class org.apache.lucene.analysis.snowball.SnowballAnalyzer: org.apache.lucene.analysis.snowball.SnowballAnalyzer
    org.exist.util.DatabaseConfigurationException: Exception while instantiating analyzer class org.apache.lucene.analysis.snowball.SnowballAnalyzer: org.apache.lucene.analysis.snowball.SnowballAnalyzer

I think the extra jar is in the right place. Do I have to do something else to make it available?

--
peter |
From: Mike F. <mik...@ya...> - 2010-07-30 14:07:20
|
Peter,

From my experience with this, you need to use the org.exist.indexing.lucene.LuceneIndex class because it is a wrapper for the eXist application. However, both in your xconf and in the org.exist.indexing.lucene.LuceneIndex class you can indicate that you want to use this analyzer.

In your collection.xconf you can call this analyzer in the <analyzer> element.

In the org.exist.indexing.lucene.LuceneIndex class you need to comment out the Lucene StandardAnalyzer (the default analyzer) and put this one in its place. Import the class, put the jar containing the analyzer (org.apache.lucene.analysis.snowball.SnowballAnalyzer) in the eXist/lib/extensions folder, and recompile.

You need to make sure that your xconf and LuceneIndex.java refer to the same analyzer. LuceneIndex.java is located at eXist/extensions/indexes/lucene/src/org/exist/indexing/lucene/LuceneIndex.java.

    package org.exist.indexing.lucene;

    import org.apache.lucene.analysis.snowball.SnowballAnalyzer;

    if (defaultAnalyzer == null)
        // defaultAnalyzer = new StandardAnalyzer();
        defaultAnalyzer = new org.apache.lucene.analysis.snowball.SnowballAnalyzer();

When both the LuceneIndex and the xconf file are pointing to the same analyzer, everything will be wonderful.
-mike |
From: Wolfgang M. <wol...@gm...> - 2010-07-30 15:13:56
|
> In the org.exist.indexing.lucene.LuceneIndex class you need to comment out
> the Lucene StandardAnalyzer (as the default analyzer used) and put this one
> in it.

This should not be necessary. I'm not sure why you had to do that. However, in addition to collection.xconf, you can also configure the analyzer in the main conf.xml:

    <module id="lucene-index" class="org.exist.indexing.lucene.LuceneIndex" buffer="32">
        <analyzer class="..."/>
    </module>

Peter, putting the module jar into EXIST-HOME/extensions/indexes/lucene/lib/ should indeed make it available on the classpath.

Wolfgang |
From: Mike F. <mik...@ya...> - 2010-07-30 15:21:06
|
Wolfgang,

I appreciate the information on this. I must have missed it in some of the documentation.

Sincerely,
-mike |
From: Hungerburg <pc...@my...> - 2010-07-30 18:41:28
|
On 2010-07-30 17:13, Wolfgang Meier wrote:
>> In the org.exist.indexing.lucene.LuceneIndex class you need to comment out
>> the Lucene StandardAnalyzer (as the default analyzer used) and put this one
>> in it.
>
> This should not be necessary. I'm not sure why you had to do that.
> However, in addition to collection.xconf, you can also configure the
> analyzer in the main conf.xml:
>
> <module id="lucene-index" class="org.exist.indexing.lucene.LuceneIndex"
> buffer="32">
> <analyzer class="..."/>
> </module>
>
> Peter, putting the module jar into
> EXIST-HOME/extensions/indexes/lucene/lib/ should indeed make it
> available on the classpath.
>
> Wolfgang

I think it is available on the classpath: there is no not-found exception, but an instantiation exception. I added the import statement in LuceneIndex.java and recompiled. The message remains and no index is created:

    Caused by: java.lang.InstantiationException: org.apache.lucene.analysis.snowball.SnowballAnalyzer
        at java.lang.Class.newInstance0(Class.java:340)
        at java.lang.Class.newInstance(Class.java:308)
        at org.exist.indexing.lucene.AnalyzerConfig.configureAnalyzer(AnalyzerConfig.java:43)

I will give in. It was very easy to get pdf-images and hyphenation patterns into fop; this is a tougher one.

I also tried to pass some more arguments to the analyzer in collection.xconf:

    <analyzer id="snowball" class="org.apache.lucene.analysis.snowball.SnowballAnalyzer" stemmer="German" locale="de"/>

regards

--
peter |
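Note on the stack trace in this message: Class.newInstance() raises java.lang.InstantiationException when the target class has no no-argument constructor, which is what one would see if SnowballAnalyzer only offers constructors that take arguments. The Java sketch below reproduces that behaviour in isolation. NamedAnalyzer and tryNoArgInstantiation are hypothetical names made up for this illustration; they are not part of Lucene or eXist.

```java
class NoDefaultConstructorDemo {

    // Hypothetical stand-in for an analyzer whose only constructor takes a String.
    public static class NamedAnalyzer {
        private final String name;

        public NamedAnalyzer(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }

    // Calls Class.newInstance() the way the trace suggests and reports
    // the simple name of the exception raised, or "none" on success.
    @SuppressWarnings("deprecation")
    public static String tryNoArgInstantiation(Class<?> clazz) {
        try {
            clazz.newInstance();
            return "none";
        } catch (InstantiationException | IllegalAccessException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // No no-argument constructor available: instantiation fails.
        System.out.println(tryNoArgInstantiation(NamedAnalyzer.class));
        // String has a public no-argument constructor: instantiation succeeds.
        System.out.println(tryNoArgInstantiation(String.class));
    }
}
```

If this is indeed the cause, the jar is being found on the classpath and the failure is about how the class is constructed, not about where the jar lives.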
From: Mike F. <mik...@ya...> - 2010-07-30 18:57:58
|
Peter,

You shouldn't need the import statement in LuceneIndex.java. All you need is the <analyzer> element child in the <module>. But it looks like it is a separate jar.

http://www.docjar.com/docs2web/s.jsp?q=org.apache.lucene.analysis.snowball.SnowballAnalyzer&t=j&start=0

Try downloading the jar and put it in the lib directory and recompile.

-mike |
From: Mike F. <mik...@ya...> - 2010-07-31 23:51:39
|
Peter,

Ok, it looks like you will need to modify the code of LuceneIndex or AnalyzerConfig to get this analyzer to work.

I also found the full 2.4.1 jar here:

http://www.java2s.com/Code/Jar/JKL/Downloadlucenesnowball241jar.htm

I de-compiled the jar and used jad to recreate the java files (attached). SnowballAnalyzer has NO default constructor, but requires a String (as you have indicated) to complete the call to other classes. But the AnalyzerConfig only reads two attributes of the <analyzer> element: @id and @class.

What would be nice is to have the AnalyzerConfig parse something like:

    <analyzer class="org.apache.lucene.analysis.snowball.SnowballAnalyzer">
        <property name="name" value="German"/>
    </analyzer>

But since that is not the case, there are lots of ways you could go about this. The easiest (a hack) is to import the SnowballAnalyzer into LuceneIndex. I was able to get this to work:

LuceneIndex.java

    import org.apache.lucene.analysis.snowball.SnowballAnalyzer;

    if (defaultAnalyzer == null)
        // defaultAnalyzer = new StandardAnalyzer();
        defaultAnalyzer = new org.apache.lucene.analysis.snowball.SnowballAnalyzer("German");

conf.xml

    <module id="lucene-index" class="org.exist.indexing.lucene.LuceneIndex" buffer="32"/>

collection.xconf

    <analyzer class="org.apache.lucene.analysis.snowball.SnowballAnalyzer"/>

This is a hack! I also wrote some code into the AnalyzerConfig. It compiled without error, but always threw an instantiation error on startup. Maybe someone on the list can do better.
-mike

conf.xml and collection.xconf

    <analyzer stemmer="German" class="org.apache.lucene.analysis.snowball.SnowballAnalyzer"/>

Code I wrote into the AnalyzerConfig class

    private final static String STEMMER_ATTRIBUTE = "stemmer";

    String stemmer = config.getAttribute(STEMMER_ATTRIBUTE);
    if (className.equalsIgnoreCase("SnowballAnalyzer")) {
        java.lang.reflect.Constructor con = clazz.getConstructor(new Class[] {String.class});
        return (Analyzer) con.newInstance(new Object[] {new String(stemmer)});
    } else {
        return (Analyzer) clazz.newInstance();
    }

    catch exceptions:
        java.lang.reflect.InvocationTargetException
        NoSuchMethodException
|
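A possible reason the AnalyzerConfig attempt still failed, assuming className holds the fully qualified value of the class attribute: equalsIgnoreCase("SnowballAnalyzer") would then never match, so the code would always fall through to the no-argument clazz.newInstance() call. The Java sketch below sidesteps the name comparison and instead picks the constructor based on whether a stemmer value was supplied. It is an illustration only, not the eXist implementation: StemmerAwareFactory, StemmingAnalyzer and PlainAnalyzer are hypothetical stand-ins, and reading the stemmer attribute from the configuration is left out.

```java
import java.lang.reflect.Constructor;

class StemmerAwareFactory {

    // Hypothetical stand-in for an analyzer with only a (String) constructor.
    public static class StemmingAnalyzer {
        private final String stemmer;

        public StemmingAnalyzer(String stemmer) {
            this.stemmer = stemmer;
        }

        public String getStemmer() {
            return stemmer;
        }
    }

    // Hypothetical stand-in for an analyzer with a no-argument constructor.
    public static class PlainAnalyzer {
        public PlainAnalyzer() {
        }
    }

    // Instantiates className reflectively. If a stemmer name is given, the
    // public (String) constructor is used; otherwise the no-argument one.
    public static Object create(String className, String stemmer) {
        try {
            Class<?> clazz = Class.forName(className);
            if (stemmer != null && !stemmer.isEmpty()) {
                Constructor<?> con = clazz.getConstructor(String.class);
                return con.newInstance(stemmer);
            }
            return clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // Covers ClassNotFoundException, NoSuchMethodException,
            // InstantiationException, IllegalAccessException and
            // InvocationTargetException.
            throw new IllegalStateException("Cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        Object a = create(StemmingAnalyzer.class.getName(), "German");
        System.out.println(((StemmingAnalyzer) a).getStemmer());

        Object b = create(PlainAnalyzer.class.getName(), null);
        System.out.println(b.getClass().getSimpleName());
    }
}
```

Branching on the presence of the attribute rather than on the class name means any analyzer with a single String constructor could be configured the same way, at the cost of a less specific error when the attribute is supplied for a class that does not accept it.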