From: David M. <ma...@ha...> - 2013-08-29 07:38:49
Hi,

At Thu, 29 Aug 2013 09:26:17 +0200,
Christian Dabrowski wrote:
>
> Hi Vufinders,
>
> is somebody here using stemming for the German language?
>
> In schema.xml I found that Solr uses stemming by default for English
> with this line in the index and query analyzer:
>
> <filter class="solr.SnowballPorterFilterFactory" language="English"/>
>
> Can I add another line with another language, just changing the language
> to what I need, for example "German", or do I have to add something else?
>
> During my internet search about this I found there is a problem with
> compound words, which are very common in German, like "Donaudampfschiff"
> (Danube steamship). Searching only for "Dampfschiff" would not
> succeed with a German search query then. There is another filter to solve
> this:
>
> <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
> dictionary="my_dictionary.txt" />
>
> But as far as I understood, this filter needs to be fed with a dictionary
> which has the compound words already in the basic form (split). Are
> there any experiences with it? Or does somebody have a German dictionary
> list that could be posted here?
>

I'm not sure if we should apply stemming at all, but I considered a less
aggressive stemmer for German titles: solr.GermanLightStemFilterFactory
or even solr.GermanMinimalStemFilterFactory.

Best,
-- David

>
> Thanks in advance,
> Christian
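
To make the lighter-stemmer idea concrete, here is a rough sketch of what such a field type might look like in schema.xml. It is untested against VuFind's actual schema. The field type name "text_de_light" is made up for illustration, and the compound filter's size parameters are guesses that would need tuning. The dictionary file name is the one from the quoted message, and it is assumed to list lowercase component words, one per line.

```xml
<!-- Hypothetical field type; the name "text_de_light" is not from VuFind's schema.xml. -->
<fieldType name="text_de_light" class="solr.TextField" positionIncrementGap="100">
  <!-- A single analyzer without a type attribute applies to both index and query. -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Optional: split compounds using a list of component words
         (assumed lowercase, e.g. "donau", "dampf", "schiff").
         The size limits below are illustrative guesses. -->
    <filter class="solr.DictionaryCompoundWordTokenFilterFactory"
            dictionary="my_dictionary.txt"
            minWordSize="5" minSubwordSize="4" maxSubwordSize="15"
            onlyLongestMatch="true"/>
    <!-- Normalizes umlauts and ß before stemming. -->
    <filter class="solr.GermanNormalizationFilterFactory"/>
    <!-- Less aggressive than the Snowball stemmer with language="German". -->
    <filter class="solr.GermanLightStemFilterFactory"/>
  </analyzer>
</fieldType>
```

For even less stemming, solr.GermanMinimalStemFilterFactory could be swapped in for the last filter. Either way, the index has to be rebuilt after changing the index-time analysis chain.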