From: Itamar Syn-H. <it...@di...> - 2008-01-15 22:04:38
I know what this is; I've read the file format spec page. The question is whether CLucene in its current version can identify the values it needs automatically without this segments file. Also, does it support so-called "generations" yet, or is that a v2.x feature?

As for the rest of the points:

1. The Arabic stemmer isn't much help, thanks anyway.
2. How would I execute a search for words in all fields (as opposed to h1:foo OR MsoNormal:foo)? Would *:foo work? This is crucial, since as I see it now the Hebrew stemmer/analyzer would expand each word into a couple of options (due to the morphological complexity).

Also, I found this note in the CLucene FAQ: "Note: Leading wildcards (e.g. *ook) are not supported by the QueryParser (although CLucene could handle them)." What does this mean? How would I go about enabling this support in my queries? Again, due to the structure of the Hebrew language, this would remove the need for several stemming operations.

Thanks!
Itamar.

_____

From: clu...@li... [mailto:clu...@li...] On Behalf Of James Weir
Sent: Tuesday, January 15, 2008 11:27 PM
To: clu...@li...
Subject: Re: [CLucene-dev] CLucene questions

I know that. I was asking whether there is a way I could get rid of the segments file.

Don't remove this file. It is the master record for your index; think of it as a table of contents. Even if you optimize down to one segment, you'll need this file. In file format 1.x it is called "segments", and in 2.x indices there are two files, "segments.gen" and "segments_XXX". segments.gen merely records the value XXX used in the other filename. I believe you could, if you absolutely wanted to, get rid of "segments.gen" and Lucene will infer the most current generation value of your index.

-Jim
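
To illustrate the last point, here is a minimal sketch of how the current generation could be inferred from a directory listing when "segments.gen" is missing. It is not CLucene or Lucene code: the function names parseGeneration and latestGeneration are hypothetical, and it assumes (as I understand the Lucene 2.x format) that the XXX suffix of "segments_XXX" is the generation written in base 36, with a bare "segments" file treated as generation 0.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: returns the generation encoded in a segments file
// name, or -1 if the name is not a segments file.
// Assumption: the suffix after "segments_" is a base-36 number.
long long parseGeneration(const std::string& name) {
    static const std::string prefix = "segments_";
    if (name == "segments") {
        return 0;  // 1.x-style file name, treated here as generation 0
    }
    if (name.size() <= prefix.size() ||
        name.compare(0, prefix.size(), prefix) != 0) {
        return -1;  // e.g. "segments.gen" or "_0.cfs"
    }
    const std::string suffix = name.substr(prefix.size());
    for (unsigned char c : suffix) {
        if (!std::isalnum(c)) {
            return -1;  // reject signs, whitespace and other stray characters
        }
    }
    try {
        std::size_t used = 0;
        const long long gen = std::stoll(suffix, &used, 36);
        return used == suffix.size() ? gen : -1;
    } catch (const std::exception&) {
        return -1;  // suffix too large to fit in a long long
    }
}

// Hypothetical helper: scans a list of file names and returns the highest
// generation found, or -1 if the list contains no segments file at all.
long long latestGeneration(const std::vector<std::string>& files) {
    long long best = -1;
    for (const std::string& f : files) {
        const long long gen = parseGeneration(f);
        if (gen > best) {
            best = gen;
        }
    }
    return best;
}
```

Given a listing containing "segments_2" and "segments_a", latestGeneration would pick the latter, since "a" is 10 in base 36. A real implementation would also have to cope with files that are still being written, which this sketch ignores.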