From: Daniel R. <da...@fr...> - 2005-02-23 17:00:23
|
We installed htdig to search an HTML catalog located at:

http://www.jorvet.com/products/css/

In general things seem to work well, and sometimes the search returns good results. Indexing the files in verbose mode returned no errors or problems. But, for example, if you search for the word 'syringes', the search results list files that don't contain the word:

http://www.jorvet.com/cgi_bin/htsearch.cgi

Searching for a product number seems to work well almost every time. I'm not sure what to do to tune the results.

Thanks in advance. |
From: Olivier V. <ova...@ch...> - 2005-02-23 17:42:10
|
Have you tried to reset your database?

Daniel Reed wrote:

>We installed htdig to search an HTML catalog located at:
>
>http://www.jorvet.com/products/css/
>
>In general things seem to work well, and some times the search returns
>good results. Indexing the files in verbose mode returned no errors or
>problems. But for example if you search for the word 'syringes' the
>search results list files that don't contain the word:
>
>http://www.jorvet.com/cgi_bin/htsearch.cgi
>
>Searching for a product number seems to work well most every time. Not
>sure what to do to tune the results.
>
>Thanks in advance.
>
>_______________________________________________
>ht://Dig general mailing list: <htd...@li...>
>ht://Dig FAQ: http://htdig.sourceforge.net/FAQ.html
>List information (subscribe/unsubscribe, etc.)
>https://lists.sourceforge.net/lists/listinfo/htdig-general |
From: Mike C. <mi...@mi...> - 2005-02-23 17:46:56
|
On Wed, 23 Feb 2005 10:00:17 -0700 (MST) Daniel Reed <da...@fr...> wrote:

> In general things seem to work well, and some times the search returns
> good results. Indexing the files in verbose mode returned no errors or
> problems. But for example if you search for the word 'syringes' the
> search results list files that don't contain the word:

I get fairly good results on syringe & variants, and good results on other search terms. There is just one document I've seen that comes up each time that syringe is searched, which is Catalog_13. This _does_ contain "needle" several times, and I wonder if needle has been declared as a synonym for syringe at some time, or if the wording of the page has been changed since indexing?

BTW, that fixed positioning for every element on the page does not come out at all well in Mozilla 1.7.

Mike

--
Mike Causer
Email - mailto:mi...@mi...
GPG KeyID 1C2DDA07
WWW - http://www.mikecauser.com
Flood the fen again! - Wicken Fen enlargement - http://www.wicken.org.uk |
From: Jim <li...@yg...> - 2005-02-24 20:23:33
|
On Wed, 23 Feb 2005, Daniel Reed wrote:

> In general things seem to work well, and some times the search returns
> good results. Indexing the files in verbose mode returned no errors or
> problems. But for example if you search for the word 'syringes' the
> search results list files that don't contain the word:

Did you check the file itself? htdig will pick up and index some terms that don't appear in the rendered page (e.g. meta tags). It might also pick up other unexpected terms if there are any missing or misplaced closing tags.

In addition, I believe htdig will add the link descriptions found in other indexed documents to the documents they point to. So, for example, if one of the pages you refer to is linked to by some other page with a link description like "Proper handling of syringes", then htdig would add "proper", "handling", and "syringes" to the terms associated with that page.

Jim |
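The link-description behaviour described above can be illustrated with a small toy model. This is only a sketch of the general idea, not htdig's actual implementation; the page names, the `build_index` function, and the data layout are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


def build_index(pages):
    """Build a toy inverted index: word -> set of page names.

    `pages` maps a page name to a dict with the page's own "text" and a
    list of outgoing "links" given as (target_page, link_description).
    Hypothetical structure, not htdig's database format.
    """
    index = defaultdict(set)
    for name, page in pages.items():
        # Words that really occur in the page body.
        for word in page["text"].lower().split():
            index[word].add(name)
        # Words from link descriptions are credited to the link TARGET,
        # which is how a page can match a word it never contains.
        for target, description in page["links"]:
            if target in pages:
                for word in description.lower().split():
                    index[word].add(target)
    return index


# Hypothetical two-page site: page_b never mentions "syringes" itself.
pages = {
    "page_a.html": {
        "text": "Disposable syringes in several sizes",
        "links": [("page_b.html", "Proper handling of syringes")],
    },
    "page_b.html": {
        "text": "Hypodermic needle assortment",
        "links": [],
    },
}

index = build_index(pages)
print(sorted(index["syringes"]))  # ['page_a.html', 'page_b.html']
```

In this model page_b.html is returned for 'syringes' only because page_a.html links to it with that word in the link description, which would match the symptom reported at the start of the thread if the catalog pages link to each other this way.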
From: Mike C. <mi...@mi...> - 2005-02-25 16:53:22
|
On Thu, 24 Feb 2005 13:22:58 -0700 (MST) Jim <li...@yg...> wrote:

> Did you check the file itself? htdig will pick up and index some terms
> that don't appear in the rendered page (e.g. meta tags).

There were no meta tags at all, and that reminds me of something I forgot to say.

Daniel, you can easily improve the results by adding meta keywords to the HTML pages. It would also help if the title of each page were something more meaningful than "Catalog_237" ("Small Animal Feeding Tube", for example), because htdig (and most other search software) attaches much importance to the title.

If you do not have sufficient influence over the content of the pages, then you should look at tuning text_factor to be much more significant relative to title_factor and keywords_factor.

Mike |
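To see why the relative size of these factors matters, here is a rough toy model of field-weighted scoring. It is not htdig's ranking formula: the `score` and `rank` functions, the page names, and the weight values are all assumptions made up for illustration; only the attribute names text_factor, title_factor and keywords_factor come from the message above.

```python
def score(doc, term, factors):
    """Sum (field weight * occurrences of term in that field)."""
    term = term.lower()
    total = 0.0
    for field, weight in factors.items():
        words = doc.get(field, "").lower().split()
        total += weight * words.count(term)
    return total


def rank(docs, term, factors):
    """Return page names ordered from best to worst score."""
    return sorted(docs, key=lambda name: score(docs[name], term, factors),
                  reverse=True)


# Hypothetical pages: page_a has an uninformative title but relevant body
# text, page_b has the word in its title only.
docs = {
    "page_a.html": {"title": "Catalog_237",
                    "text": "small animal feeding tube and feeding syringes"},
    "page_b.html": {"title": "feeding accessories",
                    "text": "bowls and bottles"},
}

# Illustrative weights only, not htdig defaults.
title_heavy = {"title": 100, "text": 1}
text_heavy = {"title": 1, "text": 10}

print(rank(docs, "feeding", title_heavy))  # ['page_b.html', 'page_a.html']
print(rank(docs, "feeding", text_heavy))   # ['page_a.html', 'page_b.html']
```

With titles like "Catalog_237" carrying no useful words, a title-heavy weighting rewards whichever pages happen to have the term in the title, while a text-heavy weighting lets the body content decide the order.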