Re: [Hebmorph-thinktank] hebmorph behavior
From: Itamar Syn-H. <it...@co...> - 2012-01-31 18:17:16
That is expected. הדירה is mapped to two index terms: דירה and הדירה$. The latter is used to boost exact matches. So when searching, you will only get hits for tokens that also map to דירה. דיור is a different lemma, and it is stored in the index as דיור.

The .NET sources include a tool called VisualHebMorph, which should help you see what maps to what.

The general idea is that a lemma is _not_ a stem. I discuss this at more length on my blog.

Itamar.

On Tue, Jan 31, 2012 at 8:11 PM, Shai <sh...@dr...> wrote:
> Hi
>
> I am trying to test searches, using Lucene Java with MorphAnalyzer and
> HebrewMultiFieldQueryParser.
>
> I want to know if this is the correct behavior:
>
> I index a sentence, for example: הדירה יקרה
>
> But when I search for דיור, it doesn't give any results.
>
> When I search for דירה / והדירה / דירתן / דירות, it does give results
> (as probably expected).
>
> Thanks
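
To make the mapping described in the reply concrete, here is a minimal Python sketch of lemma-based indexing. It is a toy model only: the `TOY_LEMMAS` table, the function names, and the fallback for unknown words are assumptions made for illustration, and none of it is HebMorph's or Lucene's actual API or dictionary. It only mirrors the behavior stated above, where each token yields its lemma plus an exact-match term ending in `$`.

```python
# Toy model of lemma-based indexing. The lemma table below is hand-written
# for illustration; it is NOT HebMorph's dictionary or API.
TOY_LEMMAS = {
    "דירה": "דירה",
    "הדירה": "דירה",
    "והדירה": "דירה",
    "דירתן": "דירה",
    "דירות": "דירה",
    "דיור": "דיור",  # a different lemma, even though it looks related
}

EXACT_MARKER = "$"  # suffix marking the exact surface form, as in the reply


def lemma_of(token):
    """Return the toy lemma; unknown words fall back to themselves (assumption)."""
    return TOY_LEMMAS.get(token, token)


def index_terms(token):
    """Each token yields two terms: its lemma and the exact form plus '$'."""
    return [lemma_of(token), token + EXACT_MARKER]


def build_index(docs):
    """Build a term -> set of document ids mapping from {doc_id: text}."""
    index = {}
    for doc_id, text in docs.items():
        for token in text.split():
            for term in index_terms(token):
                index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query_word):
    """Look up the query word by its lemma, as the analyzer would at query time."""
    return index.get(lemma_of(query_word), set())


index = build_index({1: "הדירה יקרה"})
hits_same_lemma = search(index, "דירות")   # shares the lemma דירה
hits_other_lemma = search(index, "דיור")   # its own lemma, so no match
```

In this sketch, `hits_same_lemma` contains document 1 while `hits_other_lemma` is empty, which matches the behavior reported in the question. A stemmer that reduced both words to a shared root would have matched דיור as well, and that is the difference between a lemma and a stem.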