Re: [Hebmorph-thinktank] Help need with HdbMorph Setup on Solr
From: Shai <sh...@dr...> - 2012-01-09 10:06:51
Hi,
I am now re-testing HebMorph for Hebrew search in Apache Solr.
I use apache-solr-1.4.1, and it seems to work with the latest HebMorph commit,
eb403a6ad63bfc0dc18cf100dc3f256a4a6eb8af
(even when compiled against Lucene 3.0.2), though I haven't tested it fully yet.
I ended up with the following config for fieldType text in schema.xml.
I would be happy to hear which configurations others use, and whether this one
is fully configured to work properly
(or whether I need additional filters/tokenizers/analyzers and so on):
<fieldType name="text" class="solr.TextField">
  <analyzer type="index"
            class="org.apache.lucene.analysis.hebrew.MorphAnalyzer">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query"
            class="org.apache.lucene.analysis.hebrew.MorphAnalyzer">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>
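
A note on the config above, offered as a sketch rather than a verified setup: as far as I can tell, when an <analyzer> element names a class, Solr 1.4.1 instantiates that class as the whole analysis chain and does not apply the nested tokenizer and filter elements. If that holds, a field type like the following should behave the same way. The name text_he is only a placeholder, and the behaviour is worth confirming on the Solr admin analysis page before relying on it.

```xml
<!-- Sketch: Hebrew field type that relies on MorphAnalyzer alone.
     Assumption: nested tokenizer/filter elements are not applied when
     the analyzer element has a class attribute.
     "text_he" is a hypothetical name, not taken from this thread. -->
<fieldType name="text_he" class="solr.TextField">
  <analyzer type="index"
            class="org.apache.lucene.analysis.hebrew.MorphAnalyzer"/>
  <analyzer type="query"
            class="org.apache.lucene.analysis.hebrew.MorphAnalyzer"/>
</fieldType>
```

If the stop word, synonym, or word delimiter filters are actually needed, they would have to be applied some other way than as children of an analyzer that already names a class.
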
On Thu, Nov 24, 2011 at 11:29 PM, Itamar Syn-Hershko <it...@co...> wrote:
> I'm not really sure what to tell you. I never used HebMorph with Solr, but
> I know some people did (
> http://lucene.472066.n3.nabble.com/using-HebMorph-td1826534.html),
> possibly with earlier versions.
>
> In Java, a ClassCastException sometimes occurs when the jar wasn't built
> correctly.
>
> Sorry I can't be of more help atm.
>
> On Thu, Nov 24, 2011 at 6:59 PM, Manoj Damodaran <mda...@at...> wrote:
>
>> Itamar,
>>
>> I gave up on making it work with Lucene 2.9.3 (Solr 1.4.1) and tried to
>> compile HebMorph for other Solr versions, but none of them work.
>>
>> Solr     Lucene
>> 1.4.1    2.9.3
>> 3.1.0    3.1.0
>> 3.2.0    3.2.0
>> 3.3.0    3.3.0
>> 3.4.0    3.4.0
>>
>> Lucene 3.0.2 is not bundled with any Solr release. I am getting the runtime
>> exception below:
>>
>> 24-Nov-2011 16:58:39 org.apache.solr.schema.IndexSchema readAnalyzer
>> SEVERE: Cannot load analyzer: org.apache.lucene.analysis.hebrew.MorphAnalyzer
>> java.lang.ClassCastException: class org.apache.lucene.analysis.hebrew.MorphAnalyzer
>>     at java.lang.Class.asSubclass(Unknown Source)
>>     at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:828)
>>     at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:62)
>>     at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:450)
>>     at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:435)
>>     at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
>>
>> Has anyone had success running HebMorph on Solr? What version did they
>> use?
>>
>> Thanks,
>> Manoj
>>
>> *From:* ita...@gm... [mailto:ita...@gm...]
>> *On Behalf Of *Itamar Syn-Hershko
>> *Sent:* 23 November 2011 07:39 PM
>> *To:* Manoj Damodaran
>> *Cc:* heb...@li...
>> *Subject:* Re: [Hebmorph-thinktank] Help need with HdbMorph Setup on Solr
>>
>> MorphAnalyzer is compiled against 3.0.2, and the API might have changed.
>> Can you try looking at the project history? I think it was 2.9.3 not long
>> ago; that should get you going.
>>
>> On Wed, Nov 23, 2011 at 2:44 PM, Manoj Damodaran <mda...@at...>
>> wrote:
>>
>> Itamar,
>>
>> Thanks for the quick response.
>>
>> I would like to make it work with Lucene 2.9.3 (Solr 1.4.1) if possible,
>> as upgrading Solr will bring other complications for me. I changed the
>> ant build script to use <property name="lucene-version" value="2.9.3" />.
>> Solr now loads the Lucene 2.9.3 libs, but I still get the same runtime error
>> when loading MorphAnalyzer.
>>
>> Thanks,
>> Manoj
>>
>> *From:* ita...@gm... [mailto:ita...@gm...]
>> *On Behalf Of *Itamar Syn-Hershko
>> *Sent:* 22 November 2011 18:45
>> *To:* Manoj Damodaran
>> *Cc:* heb...@li...
>> *Subject:* Re: [Hebmorph-thinktank] Help need with HdbMorph Setup on Solr
>>
>> That is probably because the Java version of HebMorph is compiled against
>> Lucene 3.0.2. Try changing that, or using a compatible version of Solr, and
>> let me know how it goes.
>>
>> On Tue, Nov 22, 2011 at 7:57 PM, Manoj Damodaran <mda...@at...>
>> wrote:
>>
>> Hi,
>>
>> I am trying to use HebMorph to do Hebrew search with Solr in our
>> application. HebMorph looks quite promising, but I am having difficulty
>> making it work.
>>
>> I am not able to make Solr use HebMorph. I am able to build the jar files
>> and have put them in the lib folder. When I change the schema to add a field
>> type that uses lucene.analysis.hebrew.MorphAnalyzer, I get the run-time
>> exception shown below. Any idea what is going wrong? I am running Solr
>> 1.4.1 (Lucene 2.9.3).
>>
>> Nov 22, 2011 5:38:51 PM org.apache.solr.common.SolrException log
>> SEVERE: java.lang.ClassCastException: org.apache.lucene.analysis.hebrew.MorphAnalyzer cannot be cast to org.apache.lucene.analysis.Analyzer
>>     at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:759)
>>     at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:58)
>>     at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:429)
>>
>> Thanks,
>> Manoj
>>
>> _______________________________________________
>> Hebmorph-thinktank mailing list
>> Heb...@li...
>> https://lists.sourceforge.net/lists/listinfo/hebmorph-thinktank
>>