From: Ben v. K. <bva...@gm...> - 2008-07-02 18:31:27
Jim and Itamar have most of the answers for this. However, on the error
handling point, although the macros are there, this section of work was never
completed. Just so you don't go thinking that there's no work to do on that
section :-)

cheers
ben

2008/7/2 Wellmann, Harald <HWe...@ha...>:
> Sorry for not getting back sooner, I've been offline for a couple of days.
>
> Meanwhile, I've repeated my experiments with Lucene 1.9 and CLucene 0.9.20,
> and as far as I can see, these two versions are in fact interoperable (at
> least under Win XP with Visual Studio Express 2005 and Java 1.6).
>
> However, after downgrading to Lucene 1.9, I noticed several restrictions,
> and my overall impression is that these older versions of (C)Lucene do not
> meet our requirements, so we would have to wait for the CLucene version
> compatible to Lucene 2.3.2 to do anything useful with it...
>
> Some more details:
>
> 1) Index creation with Lucene 1.9 is dead slow compared to 2.3.2. I'm
> indexing about 6 million documents with 3 or 4 fields with total length of
> ~200 bytes per document. With 2.3.2, this takes 5 minutes. With 1.9, there
> is almost no visible progress in the standard configuration.
>
> After increasing the buffer size (IndexWriter.setMaxBufferedDocs(2000)),
> index creation still took 80 min.
>
> 2) What is worse, the support for prefix queries is poor. Even in Lucene
> 2.3.2, this is a bit difficult, since you may get a TooManyClauses exception
> during the internal rewriting of a prefix query "foo*". As a workaround, you
> can use a range query "[foo TO fop]". However, in Lucene 1.9/CLucene 0.9.20,
> these range queries are handled differently, they get expanded to a Boolean
> query with one clause for each term within the range, leading again to a
> TooManyClauses exception when there are more than 1024 matches. (Increasing
> this threshold is not an option, as we cannot estimate it beforehand.)
>
> 3) For our application, CLucene would be running in an embedded environment
> where C++ exception handling is disabled. Apparently, this use case has been
> considered in CLucene, seeing that most (but not all...) exception handling
> code is wrapped in macros, but there are no suitable macro definitions for
> the case #define _CL_DISABLE_NATIVE_EXCEPTIONS.
>
> 4) The query parser of CLucene 0.9.20 does not fully match the parser of
> Lucene 1.9. Given a range query "[foo TO fop]", CLucene interprets "TO" as
> the end of the range and apparently ignores "fop", so I had to rewrite my
> query as "[foo fop]".
>
> 5) Regarding the stability of the 2_3_2 branch: Yes, the test suite does
> pass, but since I cannot open my index created by Java Lucene, the obvious
> conclusion is that the test coverage is not good enough... Maybe someone
> from the development team could have a look into that sizeof() vs. strlen()
> issue that I pointed out earlier in this thread.
>
> Moreover, if it is a design goal of CLucene to be interoperable with a
> given version of Java Lucene, I would recommend to add some test cases with
> CLucene searching an index created by Java Lucene and Java Lucene searching
> an index created by CLucene, using all sorts of complex queries.
>
> 6) Does anybody have a feeling how long the 2.3.2 compatible release will
> take? 4 weeks, 4 months, or some time next year....? This information would
> be helpful to us for evaluating whether CLucene is a viable solution for our
> use cases and whether we should invest any resources in adapting CLucene or
> even contributing to it.
>
> Best regards,
>
> Harald
>
>
> -----Original Message-----
> From: clu...@li... on behalf of
> Itamar Syn-Hershko
> Sent: Thu 26.06.2008 21:27
> To: clu...@li...
> Cc:
> Subject: Re: [CLucene-dev] CLucene/Lucene compatibility
>
> Harald,
>
> CLucene 0.9.20 is 100% working for JLucene 1.9 and below.
> Jim's changes made it work with 2.1 indices as well, and 2.3 should be
> available soon enough (I think Jim's changes did not cover 2.3 indices in
> full). Overall I'd define the 2_3_2 branch stable, since until Ben's most
> recent commits no major code changes were made, and the CL framework is
> very well built. Now with Ben's latest commits, and work we put on updating
> specific classes, more tests should be written to test CLucene at its peaks.
> Once this is done, it will be stable again and only will require the test of
> time before we will make it the main release.
>
> Once Ben's changes are proved to work 100%, Jim and I will commit more 2.3
> specific changes, that should allow you to read 2.3 indices with CL pretty
> soon hopefully.
>
> Itamar.
>
> *******************************************
> innovative systems GmbH Navigation-Multimedia
> Management: Edwin Summers - Kevin Brown - Regis Baudot
> Registered office: Hamburg - Court of registration: Hamburg HRB 59980
>
> *******************************************
> This e-mail may contain confidential and/or privileged information. If you
> are not the intended recipient (or have received this e-mail in error)
> please notify the sender immediately and delete this e-mail. Any
> unauthorized copying, disclosure or distribution of the contents in this
> e-mail is strictly forbidden.
> *******************************************
> _______________________________________________
> CLucene-developers mailing list
> CLu...@li...
> https://lists.sourceforge.net/lists/listinfo/clucene-developers