From: xaxx <xa...@id...> - 2004-07-20 04:43:38
Is there anyone there?

All the site updates appear to end in 2002, and there does not appear to be a clear explanation of the search algorithm or the source code for the project anywhere on the site.

We have been using ht://Dig for our professional website and there are some very peculiar results which our webmaster cannot explain - largely because he does not understand the program.

I have been asked to unravel the problem, and I can be patient as needed, but not if there is no one minding the store anymore.

Please let me know how to get a copy of at least the htsearch module, and some decent documentation of the algorithm it implements.

Thanks,

Sincerely,
William A. Hoffman III
From: Lachlan A. <lh...@us...> - 2004-07-22 12:40:15
Greetings William,

On Tue, 20 Jul 2004 02:48 pm, xaxx wrote:
> Is there anyone there?

Yes, we're here :)

> All the site updates appear to end in 2002,
> and there does not appear to be a clear explanation of the search
> algorithm or the source code for the project anywhere on the site.

Things have been a bit slow of late, especially the documentation, but we're still working on it all.

> We have been using ht://Dig for our professional website and there
> are some very peculiar results which our webmaster cannot explain -
> largely because he does not understand the program.
>
> I have been asked to unravel the problem, and I can be patient as
> needed, but not if there is no one minding the store anymore.
>
> Please let me know how to get a copy of at least the htsearch
> module, and some decent documentation of the algorithm it
> implements.

Since you've been using it for some time, I assume you're using version 3.1.6. The source is at <http://www.htdig.org/files/htdig-3.1.6.tar.gz>.

The search code is in the .../htsearch subdirectory. Search for "factor" in parser.cc and display.cc for the ranking calculations. However, in 3.1.x, a lot of the ranking is done by htdig itself, so you'll probably need to look there if you are tracking down an anomaly. Search for "factor" in htcommon/DocumentRef.cc, htcommon/WordList.cc, and htdig/Retriever.cc to find some relevant code.

I am not aware of any documentation of the actual weighting algorithm. Perhaps once you have unravelled your mystery, you may be the best qualified person to write it!

I'll be happy to help out as I can, but I don't know the 3.1 code very well.

Cheers,
Lachlan

--
lh...@us...
ht://Dig developer DownUnder (http://www.htdig.org)
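
For anyone following the pointers above, the search for "factor" can be scripted. The following is only a rough sketch in Python, not part of ht://Dig. The function name `find_factor_lines` is hypothetical, and it assumes that parser.cc and display.cc sit directly under htsearch/ and that you pass in the top-level directory of the unpacked source tree yourself.

```python
import re
import sys
from pathlib import Path

# Files named in the reply above, relative to the top of the unpacked
# source tree. The htsearch/ prefix for parser.cc and display.cc is an
# assumption based on "the search code is in the .../htsearch subdirectory".
RANKING_FILES = [
    "htsearch/parser.cc",
    "htsearch/display.cc",
    "htcommon/DocumentRef.cc",
    "htcommon/WordList.cc",
    "htdig/Retriever.cc",
]


def find_factor_lines(source_root, pattern=r"factor"):
    """Return (relative_path, line_number, text) for every matching line."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for rel in RANKING_FILES:
        path = Path(source_root) / rel
        if not path.is_file():
            # The assumed layout did not hold for this file; skip it.
            continue
        # latin-1 decodes any byte sequence, so odd characters in old
        # source files cannot abort the scan.
        with path.open(encoding="latin-1") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append((rel, number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Usage: python find_factor.py /path/to/unpacked/source
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    for rel, number, text in find_factor_lines(root):
        print(f"{rel}:{number}: {text}")
```

The output is a plain file:line listing, which should be enough to jump to each place a weighting factor is read or applied. Files that are missing from the tree are skipped silently, so an empty result most likely means the directory layout differs from the one assumed here.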