From: Todd H. <th...@me...> - 2002-10-03 19:57:42
Hi,

I am noticing this too in the 3.2.0b4 snapshot. Even PDFs with only one occurrence of the search term seem to rank much higher than HTML files with many occurrences. We can't seem to get to the bottom of this.

Todd

Todd Hooge
Director of Website Development
Metamend Software & Design Ltd.
http://www.metamend.com/

-----Original Message-----
From: htd...@li... [mailto:htd...@li...]On Behalf Of Gilles Detillieux
Sent: Thursday, October 03, 2002 12:38 PM
To: Ted Stresen-Reuter
Cc: htdig
Subject: Re: [htdig] pdf and word files ranked higher than html files

According to Ted Stresen-Reuter:
> Any idea why pdf and Word files are consistently ranked higher than html
> files (which have keyword meta tags, TITLE tags, and H1 tags with closer
> matches)?

Not really, but you're not the first person to complain about it. I think in the past it's usually boiled down to the fact that the word appears many more times in the text of the PDF or Word document than in the HTML files. Is this still with a recent 3.2.0b4 snapshot, or have you gone back to 3.1.6 now?

Another scoring quirk in 3.1.x is that words near the start of a document are ranked higher than words near the end. Mind you, meta tags, titles and h1 tags tend to be near the start, so they should be ranked high in 3.1.x.

--
Gilles R. Detillieux              E-mail: <gr...@sc...>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
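
To make the two effects Gilles describes concrete (more occurrences raise the score, and in 3.1.x earlier words count for more), here is a toy sketch in Python. It is not ht://Dig's actual scoring code; the function `toy_score`, the parameter `start_bonus`, and the linear fall-off are all assumptions invented for illustration.

```python
def toy_score(words, term, start_bonus=1.0):
    """Toy relevance score (hypothetical, NOT ht://Dig's formula).

    Every occurrence of `term` adds a base weight of 1.0 plus a bonus
    that shrinks linearly the later the word appears in the document.
    """
    n = len(words)
    score = 0.0
    for i, word in enumerate(words):
        if word.lower() == term:
            # Bonus is start_bonus at the first word and approaches 0 at the end.
            score += 1.0 + start_bonus * (1.0 - i / n)
    return score


# A short HTML-like page: the term appears once, right at the start.
html_doc = "htdig search engine intro text about indexing".split()

# A long PDF-like document: the term appears five times, spread through the text.
pdf_doc = (("filler " * 50) + "htdig ").split() * 5

print(toy_score(html_doc, "htdig"))
print(toy_score(pdf_doc, "htdig"))
```

Under these assumptions the long document still wins, because five late occurrences add up to more than one early occurrence with its position bonus. That matches the explanation quoted above, though it would not account for a PDF with a single occurrence outranking an HTML page with many.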