Re: [CLucene-dev] CLucene questions

I know what this file is; I've read the file format spec page. The question
is whether the current version of CLucene can determine the values it needs
automatically without the segments file. Also, does it support the so-called
"generations" yet, or is that a v2.x feature?

As for the rest of the points:
1. The Arabic stemmer isn't much help, but thanks anyway.
2. How would I execute a search for a word in all fields (as opposed to
h1:foo OR MsoNormal:foo)? Would *:foo work? This is crucial because, as I
see it now, the Hebrew stemmer/analyzer would expand each word into several
options (due to the morphological complexity).
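
To make the alternative concrete, here is a minimal sketch of expanding one
term into the explicit per-field form by hand. The function name
expandAcrossFields is hypothetical, not part of CLucene, and the term is
assumed to need no escaping. Java Lucene ships a MultiFieldQueryParser for
this purpose; whether a given CLucene version has a port of it is worth
checking in its headers.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds "f1:term OR f2:term OR ..." for a list of
// field names. It does no escaping, so the term must not contain query
// syntax characters such as ':' or '('.
std::string expandAcrossFields(const std::vector<std::string>& fields,
                               const std::string& term) {
    std::string query;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i > 0) {
            query += " OR ";
        }
        query += fields[i] + ":" + term;
    }
    return query;
}
```

The resulting string can be handed to the ordinary query parser, at the cost
of one clause per field for every stemmed variant of the word.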

Also, I found this note in the CLucene FAQ: "Note: Leading wildcards (e.g.
*ook) are not supported by the QueryParser (although CLucene could handle
them)." What does this mean? How would I go about enabling this support in
my queries? (Again, due to the structure of the Hebrew language, this would
save several stemming operations.)
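
For context, a leading wildcard means no term prefix is known, so every term
in the field's dictionary has to be tested against the pattern. The sketch
below shows that scan on a plain in-memory term list. wildcardMatch and
expandWildcard are hypothetical names, not CLucene functions. In Java Lucene
the usual route around the parser restriction is to construct a
WildcardQuery object directly; I assume the CLucene port mirrors this, but
that should be checked against the headers of the version in use.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical matcher: '*' matches any run of characters (including none),
// '?' matches exactly one character, everything else matches literally.
bool wildcardMatch(const std::string& pattern, const std::string& text) {
    const std::size_t none = std::string::npos;
    std::size_t p = 0, t = 0;
    std::size_t starP = none;  // position of the last '*' seen in pattern
    std::size_t starT = 0;     // text position that '*' is currently tried at
    while (t < text.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            starP = p++;       // remember the star, first try matching nothing
            starT = t;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == text[t])) {
            ++p;
            ++t;
        } else if (starP != none) {
            p = starP + 1;     // backtrack: let the star absorb one more char
            t = ++starT;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') {
        ++p;                   // trailing stars match the empty string
    }
    return p == pattern.size();
}

// Scans the whole term list, which is what a leading wildcard forces.
std::vector<std::string> expandWildcard(const std::string& pattern,
                                        const std::vector<std::string>& terms) {
    std::vector<std::string> hits;
    for (const std::string& term : terms) {
        if (wildcardMatch(pattern, term)) {
            hits.push_back(term);
        }
    }
    return hits;
}
```

The full scan is the reason leading wildcards are expensive on a large
index, which may be why the parser rejects them by default.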

Thanks!

Itamar.

  _____  

From: clu...@li...
[mailto:clu...@li...] On Behalf Of James Weir
Sent: Tuesday, January 15, 2008 11:27 PM
To: clu...@li...
Subject: Re: [CLucene-dev] CLucene questions

I know that. I was asking whether there is a way I could get rid of the
segments file. 

Don't remove this file. It is the master record for your index; think of it
as a table of contents. Even if you optimize down to one segment, you'll
still need this file. In file format 1.x it is called "segments", and in 2.x
indices there are two files: "segments.gen" and "segments_XXX". segments.gen
merely contains the value XXX used in the other filename. I believe you
could, if you absolutely wanted to, get rid of "segments.gen", and Lucene
will infer the most current generation value of your index.
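
As a rough illustration of that inference, the sketch below picks the
highest generation from a directory listing. It assumes the Java Lucene 2.x
convention that XXX is the generation written in base 36 (digits 0-9, then
lowercase a-z) and that the old-format "segments" file counts as generation
0. The function names are hypothetical and this is not CLucene's actual
code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: returns the generation encoded in a file name, or -1
// if the name is not a segments file. Overflow on absurdly long suffixes is
// not handled.
long long generationFromFileName(const std::string& name) {
    if (name == "segments") {
        return 0;  // pre-2.x name, treated here as generation 0
    }
    const std::string prefix = "segments_";
    if (name.size() <= prefix.size() ||
        name.compare(0, prefix.size(), prefix) != 0) {
        return -1;  // also rejects "segments.gen" and a bare "segments_"
    }
    long long generation = 0;
    for (std::size_t i = prefix.size(); i < name.size(); ++i) {
        const char c = name[i];
        int digit;
        if (c >= '0' && c <= '9') {
            digit = c - '0';
        } else if (c >= 'a' && c <= 'z') {
            digit = c - 'a' + 10;
        } else {
            return -1;  // not a base-36 digit
        }
        generation = generation * 36 + digit;
    }
    return generation;
}

// Returns the highest generation found in the listing, or -1 if none.
long long latestGeneration(const std::vector<std::string>& fileNames) {
    long long best = -1;
    for (const std::string& name : fileNames) {
        best = std::max(best, generationFromFileName(name));
    }
    return best;
}
```

Note that the comparison must be numeric: as strings, "segments_a" and
"segments_2" sort differently from their generation values 10 and 2.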

-Jim