From: Neal R. <ne...@ri...> - 2002-01-08 22:44:39
Does anyone have an answer for Question 1?

The 3.2 Beta3 release TODO page marks Question 5 (field-based searching) as 'out-standing'. Who is working on that?

I also notice that the 3.2 Beta4 snapshot from Dec 30 does not have an updated TODO.html file (it is dated Feb 15). Is there a browsable cvsweb of the code somewhere?

Thanks

--
Neal Richter
Knowledgebase Developer
RightNow Technologies, Inc.
Customer Service for Every Web Site

---------- Forwarded message ----------
Date: Fri, 4 Jan 2002 12:23:00 -0700 (MST)
From: Neal Richter <ne...@ri...>
To: htd...@li...
Subject: [htdig-dev] Incremenal Index Efficiency, Unicode, & 2GIG limit

Hello again,

I've got a couple more questions.

1. Is there any need to rebuild the index from scratch periodically? Some commercial search engines use incremental indexing and recommend that when the incremental portion of the index reaches a given size (say 20%), the entire index be rebuilt.

2. Is it possible to turn stemming off for particular languages at run time? We have our own stemming tools (Porter algorithm).

3. (Unicode) Is the index (the core of the index code) capable of multibyte searching? For example, if a fully escaped version of a Japanese or other multibyte document was indexed and then searched with a properly escaped query, would valid matches occur? (Exclude any UI or upper-level code in your thinking here.)

4. (2 Gig Limit) Some of the archives will be a million+ documents in size, with an average document length exceeding 2K. Other than using XFS or JFS, is the solution in this case to use multiple index files?

5. Is there a way to add a 'field' to the index? I.e., multiple documents share a source-id, and a query is given to return the documents with that source-id. This could be accomplished implicitly by making the source-id some special alphanumeric string (DJ23KJD823) that is indexed along with the document text, but this has a small probability of giving false-positive search results.

Thanks for your help!
--
Neal Richter
Knowledge base Developer
Right Now Technologies, Inc.
Customer Service for Every Web Site
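
To make the false-positive concern in question 5 concrete, here is a minimal sketch of the implicit workaround using a toy in-memory inverted index. It is not ht://Dig code: the functions `build_index` and `search`, the sample documents, and the second source-id `QX77PLM019` are all hypothetical and exist only for illustration.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercased word to the set of document ids containing it.

    docs maps doc_id -> (source_id, text). This is a toy stand-in for a
    word index, not the ht://Dig index format.
    """
    index = defaultdict(set)
    for doc_id, (source_id, text) in docs.items():
        # The workaround: index the source-id as if it were an ordinary word.
        for word in text.lower().split() + [source_id.lower()]:
            index[word].add(doc_id)
    return index


def search(index, word):
    """Return the sorted ids of documents whose word set contains `word`."""
    return sorted(index.get(word.lower(), set()))


# Hypothetical sample data: documents 1 and 2 share a source-id, while
# document 3 belongs to another source but mentions that id in its body.
docs = {
    1: ("DJ23KJD823", "resetting a forgotten password"),
    2: ("DJ23KJD823", "changing the account email address"),
    3: ("QX77PLM019", "ticket notes mention code DJ23KJD823 in passing"),
}

index = build_index(docs)
hits = search(index, "DJ23KJD823")
```

Here `hits` contains document 3 as well as documents 1 and 2, because the token-based lookup cannot tell a source-id from body text that happens to contain the same string. A real field would keep the two apart.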