hebmorph-thinktank Mailing List for HebMorph
Status: Pre-Alpha
Brought to you by:
synhershko
| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2010 | | | | | | 1 | | 3 | | 1 | 3 | 14 |
| 2011 | 3 | | 3 | | 15 | 17 | | | | | 6 | |
| 2012 | 5 | | | | | | | | | | | |
| 2013 | | | | 4 | 3 | | | | | | | |
| 2014 | | | | | 1 | | 1 | | | | | |
| 2017 | | 1 | | | | | | | | | | |
| 2018 | | | | | 1 | | | | | | | |
|
From: layal a. <lab...@gm...> - 2018-05-21 08:50:20
|
Hi,

I am developing an application that supports indexing and searching multi-language texts, including Hebrew, using the Solr engine. After a lot of searching, I found that HebMorph is the best plugin to use for the Hebrew language.

My problem is that HebMorph seems to handle Hebrew stopwords differently from Solr:

- With Solr (any language): when I search for a stopword, the returned results don't include any of the stopwords existing in the query.
- Whereas when I search for Hebrew terms (after plugging HebMorph into Solr following [this link][1]), the returned results include all the stopwords existing in the query.

1. Is this the normal behavior for HebMorph? If yes, how can I alter it? If no, what should I change?
2. HebMorph doesn't support synonyms (I read in the documentation that this is future work). Is there a way to use synonyms for Hebrew the way Solr supports them for other languages (i.e. by adding the proper filter in solrconfig and pointing it to the synonyms file)?

Thanks in advance for your help. |
|
From: Shuli B <sh...@gm...> - 2017-02-06 10:22:32
|
Hi,

I'm using the HebMorph code to develop a desktop search utility over all text files. First of all, it works very well.

I would also like to add Aramaic files; that language is similar to Hebrew in a few properties. I'm trying to understand the code and find where to add support for that language. Note that I don't need it to be perfect, only the concept.

I have tried and didn't manage. I'd be very happy if somebody could point me to the relevant files I need to work on, or maybe even a little more.

Thanks |
|
From: elyashiv <ely...@gm...> - 2014-07-23 07:19:00
|
Hi,

I started looking into using HebMorph for a project of mine. Following the instructions in http://code972.com/blog/2013/08/129-hebrew-search-with-elasticsearch-and-hebmorph I downloaded the source for HebMorph.

I searched for the class 'ReusableAnalyzerBase', but couldn't find it. I searched both the Lucene source and the HebMorph source, using the following commands:

find . -name '*ReusableAnalyzerBase*'
grep -r 'ReusableAnalyzerBase'

Both returned no results. I also searched the jars using "jar tf" and couldn't find anything there. Can anyone help?

Also, can anyone point me to documentation on HebMorph? I tried searching for some, but couldn't find anything.

Thanks in advance,
Elyashiv. |
|
From: Tzeviya <tze...@gm...> - 2014-05-06 17:34:40
|
Hi, I'm looking for a Hebrew stemmer (a stemmer, *not* a lemmatizer). Does HebMorph do the job? Poking around the website, it seems to be an indexer...? (sorry, I'm completely unfamiliar with this system and with lucene). If HebMorph is indeed the stemmer I'm looking for, are there any installation instructions anywhere? Thank you. |
|
From: Brett L. <bl...@gm...> - 2013-05-03 16:52:08
|
This is fantastic, really appreciate it. I will let you know how it works out for me.

On Fri, May 3, 2013 at 3:30 AM, Itamar Syn-Hershko <it...@co...> wrote:
> There you go:
> http://www.code972.com/blog/2013/05/hebrew-search-with-elasticsearch-using-hebmorph
>
> On Thu, May 2, 2013 at 5:38 AM, Brett Lockspeiser <bl...@gm...> wrote:
>> Hi,
>>
>> I'd like to use hebmorph with elasticsearch. I'm completely new to
>> elasticsearch and lucene -- don't even know where to start. Anyone have any
>> tips or pointers to get me going?
>>
>> Thanks,
>> Brett |
|
From: Itamar Syn-H. <it...@co...> - 2013-05-03 10:30:43
|
There you go:
http://www.code972.com/blog/2013/05/hebrew-search-with-elasticsearch-using-hebmorph

On Thu, May 2, 2013 at 5:38 AM, Brett Lockspeiser <bl...@gm...> wrote:
> Hi,
>
> I'd like to use hebmorph with elasticsearch. I'm completely new to
> elasticsearch and lucene -- don't even know where to start. Anyone have any
> tips or pointers to get me going?
>
> Thanks,
> Brett |
|
From: Brett L. <bl...@gm...> - 2013-05-02 02:38:58
|
Hi, I'd like to use hebmorph with elasticsearch. I'm completely new to elasticsearch and lucene -- don't even know where to start. Anyone have any tips or pointers to get me going? Thanks, Brett |
|
From: Itamar Syn-H. <it...@co...> - 2013-04-22 08:00:20
|
Wherever. Use

Loader.loadDictionaryFromHSpellData(new File(somePath + "hspell-data-files"), true);

to load the radix once into a static instance, and create MorphAnalyzer instances passing them the radix. That's the easiest way of working with it.

On Mon, Apr 22, 2013 at 2:00 AM, Efraim Feinstein <efr...@gm...> wrote:
> Hi,
>
> On 04/21/2013 12:58 PM, Itamar Syn-Hershko wrote:
>> I'm loading the hspell files from the file system and never loaded them
>> from the jar so I might have broken the code path that worked with it. You
>> are welcome to fix it and send a pull request, I'm afraid I won't have time
>> to investigate this myself in the near future.
>
> I'm not attached to using a Jar. I'll ask a simpler question: where are
> the hspell-data-files expected to be for the code to function as-is? Simply
> dropping the directory into the same dir as the code doesn't do it. Do I
> need to add the directory to the classpath manually?
>
> Thanks,
>
> --
> ---
> Efraim Feinstein
> Lead Developer
> Open Siddur Project
> http://opensiddur.net
> http://wiki.jewishliturgy.org |
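The advice above (load the dictionary once, then share it) can be sketched in plain Java. This is only an illustration, not HebMorph code: `LoadOnce`, `DICTIONARY`, and `LOADS` are hypothetical names, and a bare `Object` stands in for the radix so the example runs without HebMorph on the classpath. In a real setup the supplier body would presumably be the `Loader.loadDictionaryFromHSpellData(...)` call quoted in the message.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class SharedDictionaryDemo {

    /** Loads a value on first use and hands the same instance to every later caller. */
    static final class LoadOnce<T> {
        private final Supplier<T> loader;
        private T value;

        LoadOnce(Supplier<T> loader) {
            this.loader = loader;
        }

        synchronized T get() {
            if (value == null) {
                value = loader.get(); // the expensive step runs only here
            }
            return value;
        }
    }

    // Counts how many times the expensive load actually ran (for the demo only).
    static final AtomicInteger LOADS = new AtomicInteger();

    // Hypothetical stand-in for the hspell radix: a plain Object instead of the
    // real dictionary, so that no HebMorph classes are needed to run this.
    static final LoadOnce<Object> DICTIONARY = new LoadOnce<>(() -> {
        LOADS.incrementAndGet();
        return new Object();
    });

    public static void main(String[] args) {
        Object first = DICTIONARY.get();
        Object second = DICTIONARY.get();
        System.out.println("same instance: " + (first == second)); // same instance: true
        System.out.println("loads: " + LOADS.get());               // loads: 1
    }
}
```

Every analyzer created afterwards would receive `DICTIONARY.get()` instead of triggering its own load, which is the point of keeping the radix in a static holder.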
|
From: Efraim F. <efr...@gm...> - 2013-04-21 23:00:37
|
Hi,

On 04/21/2013 12:58 PM, Itamar Syn-Hershko wrote:
> I'm loading the hspell files from the file system and never loaded
> them from the jar so I might have broken the code path that worked with
> it. You are welcome to fix it and send a pull request, I'm afraid I
> won't have time to investigate this myself in the near future.

I'm not attached to using a Jar. I'll ask a simpler question: where are the hspell-data-files expected to be for the code to function as-is? Simply dropping the directory into the same dir as the code doesn't do it. Do I need to add the directory to the classpath manually?

Thanks,

--
---
Efraim Feinstein
Lead Developer
Open Siddur Project
http://opensiddur.net
http://wiki.jewishliturgy.org |
|
From: Itamar Syn-H. <it...@co...> - 2013-04-21 19:58:58
|
I'm loading the hspell files from the file system and never loaded them from the jar, so I might have broken the code path that worked with it. You are welcome to fix it and send a pull request; I'm afraid I won't have time to investigate this myself in the near future.

On Sun, Apr 21, 2013 at 10:51 PM, Efraim Feinstein <efr...@gm...> wrote:
> Hi,
>
> I recently tried a newer commit of hebmorph than the very old one I had
> been using and I'm having issues with the classpath loading of
> hspell-data-files.
>
> As far as I can tell, they used to be included in lucene.hebrew.jar, but
> are now not included by default in either the built hebmorph-core or
> hebmorph-lucene. I tried making a new jar containing hspell-data-files
> and putting it in a directory in the classpath. As far as I can tell,
> Loader.java, line 35 now succeeds in finding a URL
> (jar:file:/usr/local/opensiddur/extensions/indexes/lucene/lib/hspell-data-files.jar!/hspell-data-files),
> but line 59 fails because both hspellFolder.exists() and
> hspellFolder.isDirectory() return False.
>
> Is there anything special I need to do to make sure the
> hspell-data-files can be found? (Note: I'm not usually a Java coder, so
> it might be something very simple)
>
> I am using commit 9298a8c71a62af06cac7a8001066b6387020a6b2 (Jan 9, 2013)
> because it is the last one before the change to Lucene 4 and other
> dependencies still use Lucene 3.
>
> Thanks for any help you can give!
>
> --
> ---
> Efraim Feinstein
> Lead Developer
> Open Siddur Project
> http://opensiddur.net
> http://wiki.jewishliturgy.org |
|
From: Efraim F. <efr...@gm...> - 2013-04-21 19:51:55
|
Hi,

I recently tried a newer commit of hebmorph than the very old one I had been using and I'm having issues with the classpath loading of hspell-data-files.

As far as I can tell, they used to be included in lucene.hebrew.jar, but are now not included by default in either the built hebmorph-core or hebmorph-lucene. I tried making a new jar containing hspell-data-files and putting it in a directory in the classpath. As far as I can tell, Loader.java, line 35 now succeeds in finding a URL (jar:file:/usr/local/opensiddur/extensions/indexes/lucene/lib/hspell-data-files.jar!/hspell-data-files), but line 59 fails because both hspellFolder.exists() and hspellFolder.isDirectory() return False.

Is there anything special I need to do to make sure the hspell-data-files can be found? (Note: I'm not usually a Java coder, so it might be something very simple)

I am using commit 9298a8c71a62af06cac7a8001066b6387020a6b2 (Jan 9, 2013) because it is the last one before the change to Lucene 4 and other dependencies still use Lucene 3.

Thanks for any help you can give!

--
---
Efraim Feinstein
Lead Developer
Open Siddur Project
http://opensiddur.net
http://wiki.jewishliturgy.org |
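The failure reported above is consistent with a general Java rule: `java.io.File` only addresses paths on the file system, so a `jar:` URL never passes `exists()` or `isDirectory()`, although the entries behind it can still be read as streams. The sketch below demonstrates this with a throwaway jar. The entry name `example.txt` and the helper names are made up for illustration, and nothing here is taken from HebMorph's `Loader.java`.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;

class JarResourceDemo {

    /** Builds a temporary jar holding one file under hspell-data-files/. */
    static File buildJar() throws IOException {
        File jar = File.createTempFile("hspell-data-files", ".jar");
        jar.deleteOnExit();
        try (OutputStream out = Files.newOutputStream(jar.toPath());
             JarOutputStream jos = new JarOutputStream(out)) {
            jos.putNextEntry(new JarEntry("hspell-data-files/example.txt"));
            jos.write("data".getBytes(StandardCharsets.UTF_8));
            jos.closeEntry();
        }
        return jar;
    }

    /** True only when the URL points at a real directory on disk. */
    static boolean isPlainDirectory(URL url) {
        return "file".equals(url.getProtocol()) && new File(url.getPath()).isDirectory();
    }

    /** Reads a resource through its URL; this works for file: and jar: URLs alike. */
    static String readAll(URL url) throws IOException {
        try (InputStream in = url.openStream()) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws IOException {
        String base = "jar:" + buildJar().toURI() + "!/hspell-data-files";
        URL folder = URI.create(base).toURL();
        URL entry = URI.create(base + "/example.txt").toURL();

        // The jar: URL is not a file-system path, so File cannot see it.
        System.out.println(new File(folder.getPath()).exists());
        System.out.println(isPlainDirectory(folder));
        // The entry itself is still readable as a stream.
        System.out.println(readAll(entry));
    }
}
```

Under this reading, the two practical options are the ones already raised in the thread: keep hspell-data-files as a plain directory on disk, or change the loading code to read streams rather than `File` objects.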
|
From: Itamar Syn-H. <it...@co...> - 2012-01-31 18:17:16
|
That is expected. הדירה is being mapped to two index terms: דירה and הדירה$ - the latter is used to boost exact matches. So when searching you will only get hits for tokens which are mapped to דירה as well. דיור is another lemma, and is saved to the index as דיור.

In the .NET sources there is a tool called VisualHebMorph; it should help you understand what maps into what. The general idea is that a lemma is _not_ a stem. I discuss this at more length in my blog.

Itamar.

On Tue, Jan 31, 2012 at 8:11 PM, Shai <sh...@dr...> wrote:
> Hi
>
> I am trying to test searches,
> using lucene java with MorphAnalyzer and HebrewMultiFieldQueryParser
>
> I want to know if this is the correct behavior -
>
> I index a sentence- הדירה יקרה
> (for example)
>
> but when searching for דיור
> it doesn't give any result
>
> when I search for דירה / והדירה / דירתן / דירות
> it does give a result (as probably expected)
>
> Thanks |
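The mapping described above can be imitated with a toy in-memory index. This is a sketch under loud assumptions: the four-word lexicon is hand-written rather than taken from HebMorph's dictionary, the class and method names are hypothetical, and the boosting of exact matches is left out. The Hebrew words are written as Unicode escapes so that the file compiles the same way under any source encoding.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class LemmaIndexDemo {
    static final String DIRA = "\u05D3\u05D9\u05E8\u05D4";         // דירה
    static final String HADIRA = "\u05D4\u05D3\u05D9\u05E8\u05D4"; // הדירה
    static final String DIROT = "\u05D3\u05D9\u05E8\u05D5\u05EA";  // דירות
    static final String DIYUR = "\u05D3\u05D9\u05D5\u05E8";        // דיור

    // Hand-written toy lexicon (word -> lemma); an assumption, not HebMorph data.
    static final Map<String, String> LEMMA = Map.of(
            DIRA, DIRA,
            HADIRA, DIRA,
            DIROT, DIRA,
            DIYUR, DIYUR);

    // Inverted index: index term -> ids of the documents containing it.
    static final Map<String, Set<Integer>> INDEX = new HashMap<>();

    /** Index terms for one token: its lemma, plus the exact form marked with '$'. */
    static List<String> indexTerms(String token) {
        return List.of(LEMMA.getOrDefault(token, token), token + "$");
    }

    static void add(int docId, String... tokens) {
        for (String token : tokens) {
            for (String term : indexTerms(token)) {
                INDEX.computeIfAbsent(term, k -> new HashSet<>()).add(docId);
            }
        }
    }

    /** In this toy, a query word matches through its lemma only. */
    static Set<Integer> search(String word) {
        return INDEX.getOrDefault(LEMMA.getOrDefault(word, word), Set.of());
    }

    public static void main(String[] args) {
        add(1, HADIRA);
        System.out.println(search(DIROT)); // [1]  (same lemma as the indexed word)
        System.out.println(search(DIYUR)); // []   (a different lemma)
    }
}
```

The second query comes back empty for the reason given in the reply: the two words look related, but they resolve to different lemmas, and only lemmas (plus the `$`-marked exact form) are stored.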
|
From: Shai <sh...@dr...> - 2012-01-31 18:11:53
|
Hi

I am trying to test searches, using lucene java with MorphAnalyzer and HebrewMultiFieldQueryParser.

I want to know if this is the correct behavior -

I index a sentence - הדירה יקרה (for example)

but when searching for דיור it doesn't give any result

when I search for דירה / והדירה / דירתן / דירות it does give a result (as probably expected)

Thanks |
|
From: Shai <sh...@dr...> - 2012-01-10 09:56:34
|
I think that with the current configuration it doesn't load the hspell for each query, because the query takes 2 ms (for a very small number of documents).

On Tue, Jan 10, 2012 at 10:52 AM, Itamar Syn-Hershko <it...@co...> wrote:
> Basically, MorphAnalyzer uses a custom tokenizer and some of its own
> filters, so I'm not sure if it's a good idea to define other ones like you
> did. The snowball one is definitely not helpful here.
>
> Also, you want to make sure MorphAnalyzer doesn't get recreated on each
> query. I'm not sure if and how this could be done with SOLR, but it's
> crucial as loading the hspell dictionary takes about 2 seconds... |
|
From: Itamar Syn-H. <it...@co...> - 2012-01-10 08:52:35
|
Basically, MorphAnalyzer uses a custom tokenizer and some of its own filters, so I'm not sure if it's a good idea to define other ones like you did. The snowball one is definitely not helpful here.

Also, you want to make sure MorphAnalyzer doesn't get recreated on each query. I'm not sure if and how this could be done with SOLR, but it's crucial as loading the hspell dictionary takes about 2 seconds...

On Mon, Jan 9, 2012 at 12:06 PM, Shai <sh...@dr...> wrote:
> Hi
> I am re-testing HebMorph now for using with hebrew searches in apache-solr
> I use apache-solr-1.4.1
> and it seems to work with the latest HebMorph commit id
> eb403a6ad63bfc0dc18cf100dc3f256a4a6eb8af
> (even when compiled with lucene 3.0.2)
>
> it seems to work but I didn't test it fully yet
>
> I end up with something like this config for fieldType text in schema.xml -
> I will be happy to know the configurations others use and if its fully
> configured to work properly
> (if i need to use additional filters/tokenizers/analyzers and so on...)
>
> <fieldType name="text" class="solr.TextField">
>   <analyzer type="index"
>             class="org.apache.lucene.analysis.hebrew.MorphAnalyzer">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="true"
>             />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English"
>             protected="protwords.txt"/>
>   </analyzer>
>   <analyzer type="query"
>             class="org.apache.lucene.analysis.hebrew.MorphAnalyzer">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="true"
>             />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>     <filter class="solr.SnowballPorterFilterFactory" language="English"
>             protected="protwords.txt"/>
>   </analyzer>
> </fieldType> |
|
From: Shai <sh...@dr...> - 2012-01-09 10:06:51
|
Hi
I am re-testing HebMorph now for use with Hebrew searches in apache-solr.
I use apache-solr-1.4.1
and it seems to work with the latest HebMorph commit id
eb403a6ad63bfc0dc18cf100dc3f256a4a6eb8af
(even when compiled with lucene 3.0.2)
It seems to work, but I haven't tested it fully yet.
I ended up with something like this config for fieldType text in schema.xml -
I would be happy to know the configurations others use and whether it's fully
configured to work properly
(whether I need to use additional filters/tokenizers/analyzers and so on...)
<fieldType name="text" class="solr.TextField">
  <analyzer type="index"
            class="org.apache.lucene.analysis.hebrew.MorphAnalyzer">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="1"
            catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
  <analyzer type="query"
            class="org.apache.lucene.analysis.hebrew.MorphAnalyzer">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory"
            ignoreCase="true"
            words="stopwords.txt"
            enablePositionIncrements="true"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1" catenateWords="0"
            catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"
            protected="protwords.txt"/>
  </analyzer>
</fieldType>
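A note on the configuration above, offered as an untested sketch: as far as the Solr 1.4 source suggests (IndexSchema.readAnalyzer), an <analyzer> element that carries a class attribute is instantiated directly from that class, and its nested <tokenizer> and <filter> elements are not read. If that holds, the stopword, synonym, word-delimiter and Snowball settings above would have no effect, and the field type could be reduced to the fragment below. The field type name text_he is hypothetical, and the fragment assumes MorphAnalyzer has a public no-argument constructor.

```xml
<!-- Sketch only: "text_he" is a hypothetical name, not taken from HebMorph's docs. -->
<fieldType name="text_he" class="solr.TextField">
  <!-- A single analyzer with no "type" attribute is used for both indexing
       and querying. The class produces the whole token stream itself, so no
       nested tokenizer or filter elements are listed. -->
  <analyzer class="org.apache.lucene.analysis.hebrew.MorphAnalyzer"/>
</fieldType>
```

If Solr-side filters (stopwords, synonyms) are needed, they would have to be combined with HebMorph through tokenizer and filter factories rather than through the analyzer class; whether HebMorph ships such factories is not stated in this thread.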
|
|
From: Itamar Syn-H. <it...@co...> - 2011-11-24 21:29:07
|
I'm not really sure what to tell you. I never used HebMorph with Solr, but I know some people did (http://lucene.472066.n3.nabble.com/using-HebMorph-td1826534.html), possibly with earlier versions. Java's ClassCastException sometimes occurs when the jar isn't compiled correctly. Sorry I can't be of more help atm. |
|
From: Manoj D. <mda...@at...> - 2011-11-24 16:59:53
|
Itamar,
I gave up on making it work with Lucene 2.9.3 (Solr 1.4.1) and tried to
compile HebMorph for other Solr versions, but none of them work.
Solr    Lucene
1.4.1   2.9.3
3.1.0   3.1.0
3.2.0   3.2.0
3.3.0   3.3.0
3.4.0   3.4.0
Lucene 3.0.2 is not bundled with any Solr. I am getting the runtime
exception below:
24-Nov-2011 16:58:39 org.apache.solr.schema.IndexSchema readAnalyzer
SEVERE: Cannot load analyzer: org.apache.lucene.analysis.hebrew.MorphAnalyzer
java.lang.ClassCastException: class org.apache.lucene.analysis.hebrew.MorphAnalyzer
at java.lang.Class.asSubclass(Unknown Source)
at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:828)
at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:62)
at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:450)
at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:435)
at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
Has anyone had success running HebMorph on Solr? If so, which version did you
use?
Thanks,
Manoj
|
|
From: Itamar Syn-H. <it...@co...> - 2011-11-23 19:38:40
|
MorphAnalyzer is compiled against Lucene 3.0.2, and the API might have changed. Can you try looking at the project history? I think it was on 2.9.3 not long ago; that should get you going. |
|
From: Manoj D. <mda...@at...> - 2011-11-23 12:44:22
|
Itamar,
Thanks for the quick response.
I would like to make it work with Lucene 2.9.3 (Solr 1.4.1) if possible,
as upgrading Solr would bring other complications for me. I changed
the ant build script to use <property name="lucene-version"
value="2.9.3" />. Now Solr loads the Lucene 2.9.3 libs, but I still get the
same runtime error when loading MorphAnalyzer.
Thanks,
Manoj
|
|
From: Itamar Syn-H. <it...@co...> - 2011-11-22 18:45:25
|
That is probably because HebMorph is compiled against Lucene 3.0.2 in the Java version. Try changing that, or using a compatible version of Solr, and let me know how it goes. |
|
From: Manoj D. <mda...@at...> - 2011-11-22 18:23:14
|
Hi,
I am trying to use HebMorph to do Hebrew search with Solr in our
application. HebMorph looks quite promising, but I am having difficulty
making it work.
I am not able to make Solr use HebMorph. I am able to build the jar files
and have put them in the lib folder. When I change the schema to add a
field type that uses lucene.analysis.hebrew.MorphAnalyzer, I get the
run-time exception shown below. Any idea what is going wrong? I am
running Solr 1.4.1 (Lucene 2.9.3).
Nov 22, 2011 5:38:51 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.ClassCastException: org.apache.lucene.analysis.hebrew.MorphAnalyzer cannot be cast to org.apache.lucene.analysis.Analyzer
at org.apache.solr.schema.IndexSchema.readAnalyzer(IndexSchema.java:759)
at org.apache.solr.schema.IndexSchema.access$100(IndexSchema.java:58)
at org.apache.solr.schema.IndexSchema$1.create(IndexSchema.java:429)
Thanks,
Manoj
|
|
From: Itamar Syn-H. <it...@co...> - 2011-06-28 00:04:41
|
Hi all,
Recently we had a conversation here about HebMorph's licensing. My position was that although my selection of GPLv2 was incorrect, I didn't see an urgent need to change it, since I trusted people would act in good faith. Unfortunately, days after I made that statement I was proven wrong.
This made me do some reading, and it turns out the licensing choice was even worse than I thought. Among other things, it is also completely incompatible with Lucene/Solr's license, and that isn't good at all.
After a lot of thought and discussion, I decided to change HebMorph's license. Effective immediately, including my next push to the GitHub repo, HebMorph's license is AGPLv3. This resolves all issues coming from GPLv2, and hopefully will remove any future debates and license infringements.
I explained the intention behind this, and a lot more, at length here: http://www.code972.com/blog/2011/06/some-words-on-hebmorphs-licensing/.
Due to the unintended conflict of licenses, any previous versions of HebMorph have to move to the new license. Any OSS projects affected by this can approach me privately.
Comments welcome,
Itamar. |
|
From: Itamar Syn-H. <it...@co...> - 2011-06-10 13:47:52
|
Hi,
On 10/06/2011 16:09, Efraim Feinstein wrote:
> I think we're not actually disagreeing here. The words that set me off
> on this line of questioning are "commercial use," and the two cases I
> was working off were:
> (1) a commercial user who does release full source code.
> (2) a commercial user who uses hebmorph internally but never releases
> source code or binaries outside the organization
> Do you think they're *required* to pay a licensing fee or requested to
> donate to the project? If the former, that's not what the GPL says, if
> the latter, that's OK.
For most commercial solutions, releasing their source code is a death sentence - or so they think. So they'd rather pay than release it. If a company decides to release their sources and not pay for a commercial license - that's fine by me. So I'm aligned with GPL even by your interpretation of it.
> If it's non-free, it becomes unusable to or undistributable by other
> free/open source projects. I don't think it actually is, although I
> think GPL v2-only may cause a binary distribution issue w/Lucene. I know
> you're restricted by hspell. It's unclear to me whether hspell is
> intended to be GPLv2 only or GPL (any version), since the COPYING file
> is GPLv2, but the README never mentions a version number (which
> conventionally means "any version").
That's something worth checking, although IIRC I looked into that and it wasn't an issue.
>> What will be something that is not allowed under the GPL?
> Releasing binaries of the actual software or a derivative work of the
> software without releasing corresponding source code; Releasing any copy
> of the software (source or binary) without a copy of the license
Exactly what I'm aiming for. I want OSS to be free to use HebMorph as they like, but for ALL commercial solutions relying on HebMorph to negotiate a license or release their sources.
Itamar. |
|
From: Itamar Syn-H. <it...@co...> - 2011-06-10 13:41:29
|
Hi,
On 10/06/2011 15:55, Efraim Feinstein wrote:
> I'll dig up some unfiltered examples over the weekend. As I said, the
> interface I'm using is eXist, so I'm not sure exactly where the
> extraneous results are coming from.
>
> What would be useful to help debug it?
Sorry but I'm not familiar with eXist. Perhaps try using the score - that can give you an idea of how far off the scoring was - or Lucene's Explanation objects. Usually common sense and an understanding of the inner workings are the best debugger... Common culprits are the tf/idf algorithm, which takes document length into account, so short documents score higher (not what you want for Tanach), and lemma density (similar words, different meanings, and ambiguity). There are some configurations that fine-tune HebMorph searches, and I'll be blogging about those next week.
> I was actually pleasantly surprised at how *well* it works with
> Biblical Hebrew, considering that it is based on modern Hebrew
> spelling and grammar. It certainly does much better than any other
> analyzers. What would it take to add to the dictionary? Although I
> don't have lots of time to work on this, we do have a reasonably
> complete public domain biblical dictionary (that is, word list + parts
> of speech). It wouldn't help with the unique biblical grammatical
> forms, non-Academia spelling, or Aramaic, but it could get us a bit
> farther along.
Actually I recently added simple external dictionary support (to the .NET version), and it proved to be very useful for one of our users. It didn't use POS though. I'll be blogging about that as well, and I suggest we continue this discussion after I have that published.
Itamar. |
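The document-length effect mentioned in the reply above can be made concrete. This is a sketch under an assumption: that the index uses Lucene's stock DefaultSimilarity of that era (2.9/3.x), with nothing overridden by HebMorph or eXist. The symbols N (number of documents), df(t) (documents containing term t) and n_d (number of terms in the field of document d) are notation introduced here, not taken from the thread.

```latex
\[
\mathrm{tf}(t, d) = \sqrt{\mathrm{freq}(t, d)}, \qquad
\mathrm{idf}(t) = 1 + \ln\frac{N}{\mathrm{df}(t) + 1}, \qquad
\mathrm{lengthNorm}(d) = \frac{1}{\sqrt{n_d}}
\]
% Worked example: a 4-term verse gets lengthNorm = 1/\sqrt{4} = 0.5,
% while a 100-term passage gets 1/\sqrt{100} = 0.1.
```

Under these assumptions a single match in the 4-term verse is weighted roughly five times as heavily as the same match in the 100-term passage, before the other scoring factors are applied. Lucene also stores the norm in a single byte, so the actual ratio is only approximate.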