From: Vitaly A. <vit...@gm...> - 2012-11-25 11:32:53
I checked the CJKAnalyzer source and found only a CJKTokenizer implementation in it.
Does this mean I need to write my own analyzer that uses CJKTokenizer?

Thanks,
Vitaly

On Thu, Nov 22, 2012 at 1:13 PM, Freiholz Manuel <M.F...@ca...> wrote:

> Hi,
>
> In my experience, creating N-grams is the best way to handle Asian texts.
> I think the basic CJKAnalyzers already do it this way.
>
> Manuel
>
> From: Vitaly Artemov [mailto:vit...@gm...]
> Sent: Thursday, 22 November 2012 11:23
> To: clu...@li...
> Subject: Re: [CLucene-dev] Creating CLucene Index in a Database;
> Support for Asian languages
>
> One more question about Asian languages:
> I know that word boundaries are a difficult issue in Asian languages.
> How do you tokenize Asian texts?
> Thank you, Vitaly
>
> On Thu, Nov 22, 2012 at 12:18 PM, Vitaly Artemov <vit...@gm...> wrote:
>
> Thank you for your fast reply.
> Can you please explain why filesystem storage is better than a database?
> We will use CLucene to index and search a huge amount of data.
> Vitaly
>
> On Thu, Nov 22, 2012 at 11:40 AM, Itamar Syn-Hershko <it...@co...> wrote:
>
> inline
>
> On Thu, Nov 22, 2012 at 11:15 AM, Vitaly Artemov <vit...@gm...> wrote:
>
> Hello all,
> I am starting to evaluate the CLucene engine for use in our product.
> I have 2 questions.
>
> 1. Is there a plan to add support (or does it already exist) for creating
> the index in a database instead of in memory or on the filesystem?
> I read that Java Lucene has it by providing the JdbcDirectory interface.
>
> Don't do that. Use the filesystem, it is much better in every respect.
>
> 2. I read in the FAQ that:
> "CLucene is not limited to English, nor any other language. To index text
> properly, you need to use an Analyzer appropriate for the language of the
> text you are indexing. CLucene's default Analyzers work well for English.
> There are a number of other Analyzers in "CLucene Sandbox", including
> those for Chinese, Japanese, and Korean."
> But the "CLucene Sandbox" link does not work for some reason. Can you point
> me to a list of the Analyzers?
>
> Take a look at CJKAnalyzer
>
> Thanks in advance, Vitaly
>
> ------------------------------------------------------------------------------
> Monitor your physical, virtual and cloud infrastructure from a single
> web console. Get in-depth insight into apps, servers, databases, vmware,
> SAP, cloud infrastructure, etc. Download 30-day Free Trial.
> Pricing starts from $795 for 25 servers or applications!
> http://p.sf.net/sfu/zoho_dev2dev_nov
> _______________________________________________
> CLucene-developers mailing list
> CLu...@li...
> https://lists.sourceforge.net/lists/listinfo/clucene-developers
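
For readers following the N-gram suggestion above, here is a minimal, self-contained sketch of overlapping bigram tokenization in C++. It is not CLucene's CJKTokenizer and does not use any CLucene API; the names `isCjk`, `isAsciiAlnum`, and `bigramTokens` are hypothetical, the code-point ranges are a rough approximation, and the input is assumed to be already decoded into UTF-32. It only illustrates the idea of indexing every adjacent pair of CJK characters instead of trying to find word boundaries.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: rough check for CJK code points.
// Assumption: these three ranges are enough for illustration only.
static bool isCjk(char32_t c) {
    return (c >= 0x4E00 && c <= 0x9FFF)    // CJK Unified Ideographs
        || (c >= 0x3040 && c <= 0x30FF)    // Hiragana and Katakana
        || (c >= 0xAC00 && c <= 0xD7AF);   // Hangul syllables
}

// Hypothetical helper: ASCII letters and digits form ordinary word tokens.
static bool isAsciiAlnum(char32_t c) {
    return (c >= U'0' && c <= U'9')
        || (c >= U'a' && c <= U'z')
        || (c >= U'A' && c <= U'Z');
}

// Splits text into tokens:
//  - a run of CJK characters becomes overlapping bigrams
//    (a run of length 1 is emitted as a single-character token);
//  - a run of ASCII letters/digits becomes one token;
//  - everything else is treated as a separator and skipped.
std::vector<std::u32string> bigramTokens(const std::u32string& text) {
    std::vector<std::u32string> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        if (isCjk(text[i])) {
            const std::size_t start = i;
            while (i < text.size() && isCjk(text[i])) ++i;
            if (i - start == 1) {
                tokens.push_back(text.substr(start, 1));
            } else {
                for (std::size_t k = start; k + 1 < i; ++k) {
                    tokens.push_back(text.substr(k, 2));
                }
            }
        } else if (isAsciiAlnum(text[i])) {
            const std::size_t start = i;
            while (i < text.size() && isAsciiAlnum(text[i])) ++i;
            tokens.push_back(text.substr(start, i - start));
        } else {
            ++i;  // separator: whitespace, punctuation, anything else
        }
    }
    return tokens;
}
```

With this sketch, a five-character CJK run yields four overlapping two-character tokens, and a query string tokenized the same way can match any of them. A real analyzer would also handle case folding, supplementary-plane characters, and token offsets, which are left out here.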