From: Jessica B. <jes...@ya...> - 2002-10-02 13:15:29
I have two files on my test site being indexed.
fruit.html
pineapple.html
There's a word, "large", on fruit.html. "large" does
NOT appear anywhere within pineapple.html; however,
when I run htsearch against the index, both documents
show as a match. In fact, the base_score is very high
for the query "large" on the document pineapple.html.
If I index pineapple.html alone, the query "large"
yields no results. So there is definitely some type
of relationship between the two documents and the word
"large". htdump revealed the relationship by
outputting this:
3 u:http://jtest/pineapple.htm t:pineapples a:0
m:1033563111 s:5824 H: Pineapple trees h:
l:1033563111 L:12 b:5 c:1 g:0 e:
n: S: d:large A:Stump^ATrunk
The key part is "d:large". According to the htdump
doc page online, the "d" element is defined as: "The
text of links pointing to this document. (e.g. <a
href="docURL">description</a>)"
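For anyone who wants to check the same field in their own dump, here is a minimal Python sketch that splits one htdump document record into its fields. It assumes the fields are tab-separated in the actual dump file (the mail wrapped them onto several lines), and parse_htdump_record is a hypothetical helper, not part of ht://Dig.

```python
def parse_htdump_record(line):
    """Split one htdump document record into (doc_id, {field: value}).

    Assumption: fields are separated by tabs and each field is a
    single-letter name, a colon, then the value.
    """
    doc_id, *fields = line.rstrip("\n").split("\t")
    record = {}
    for field in fields:
        # Split on the first colon only, so URLs keep their "://".
        key, _, value = field.partition(":")
        record[key] = value.strip()
    return int(doc_id), record


# The record quoted above, rebuilt with tabs between the fields.
sample = "\t".join([
    "3", "u:http://jtest/pineapple.htm", "t:pineapples", "a:0",
    "m:1033563111", "s:5824", "H: Pineapple trees", "h:",
    "l:1033563111", "L:12", "b:5", "c:1", "g:0", "e:", "n:", "S:",
    "d:large", "A:Stump^ATrunk",
])

doc_id, record = parse_htdump_record(sample)
print(doc_id, record["u"], record["d"])
```

The "d" entry is the one to look at when a document matches a word that it does not contain.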
fruit.html, the other HTML file indexed, contains a
hyperlink to pineapple.html that looks like this:
<a href="pineapple.html"
onMouseOver="MM_showHideLayers('Pine_apple','','show','large','','hide','apple','','hide','tomato','')"><img
border="0" src="images/pineapple.jpg" width="80"
height="60"></a>
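To show where "large" actually sits in that link, here is a small sketch using Python's standard html.parser. It only inspects the markup quoted above; it says nothing about how htdig's own parser handles the tag. AnchorInspector is a hypothetical helper name.

```python
from html.parser import HTMLParser

# The link from fruit.html, copied as it appears above.
LINK = """<a href="pineapple.html"
onMouseOver="MM_showHideLayers('Pine_apple','','show','large','','hide','apple','','hide','tomato','')"><img
border="0" src="images/pineapple.jpg" width="80"
height="60"></a>"""


class AnchorInspector(HTMLParser):
    """Collect the attributes and the visible text of an <a> element."""

    def __init__(self):
        super().__init__()
        self.in_anchor = False
        self.anchor_attrs = {}
        self.anchor_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.in_anchor = True
            # html.parser lowercases attribute names.
            self.anchor_attrs = dict(attrs)

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_anchor = False

    def handle_data(self, data):
        if self.in_anchor:
            self.anchor_text.append(data)


inspector = AnchorInspector()
inspector.feed(LINK)
inspector.close()

visible_text = "".join(inspector.anchor_text).strip()
in_handler = "large" in inspector.anchor_attrs.get("onmouseover", "")
print(repr(visible_text), in_handler)
```

The anchor has no visible text at all (only an image), and the only place "large" occurs is inside the onMouseOver JavaScript argument list.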
What configuration attribute must be set to zero to
keep that extra anchor text from being indexed and
factored into pineapple.html's word list? I've
basically tried setting all "documented" factors to
zero (including backlink).
I used the latest htdig-3.2.0b4-092902 as well as a
January 2002 release -- both behave the same way.