From: Jessica B. <jes...@ya...> - 2002-10-02 14:34:45
Actually, to add to the problem I'm running into: it
doesn't appear to be the text WITHIN the anchor tag
itself. Rather, there is text between the opening <A>
and the closing </A> that contains the word "large".
The question remains, though: how do I stop text
between these tags from being treated as relevant to
the linked document at all?
Thanks,
-Jes
--- Jessica Biola <jes...@ya...> wrote:
> I have two files on my test site being indexed.
>
> fruit.html
> pineapple.html
>
> The word "large" appears on fruit.html. It does NOT
> appear anywhere within pineapple.html, yet when I
> run htsearch on the index, both documents show up as
> matches. In fact, the base_score is very high for
> the query "large" on the document pineapple.html.
>
> If I index pineapple.html alone, the query "large"
> yields no results. So there is definitely some type
> of relationship between the two documents and the
> word "large". htdump revealed the relationship by
> outputting this:
>
> 3 u:http://jtest/pineapple.htm t:pineapples a:0
> m:1033563111 s:5824 H: Pineapple trees h:
> l:1033563111 L:12 b:5 c:1 g:0 e:
>
> n: S: d:large A:Stump^ATrunk
>
> The key part is "d:large". According to the htdump
> doc page online, the "d" element is defined as:
> "The text of links pointing to this document. (e.g.
> <a href="docURL">description</a>)"
>
> fruit.html, the other HTML file indexed, contains a
> hyperlink to pineapple.html that looks like this:
>
> <a href="pineapple.html"
> onMouseOver="MM_showHideLayers('Pine_apple','','show','large','','hide','apple','','hide','tomato','')"><img
> border="0" src="images/pineapple.jpg" width="80"
> height="60"></a>
>
> What configuration attribute must be set to zero to
> stop that extra anchor text from being indexed and
> factored into pineapple.html's word list? I've
> basically tried setting all "documented" factors to
> zero (including backlink).
>
> I used the latest htdig-3.2.0b4-092902 as well as a
> January 2002 release -- both behave the same way.
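
The htdump check described above can be scripted. The following is a
minimal Python sketch, not part of ht://Dig. It assumes that each htdump
document record is a single line of tab-separated key:value fields, and
that multiple link descriptions in the d: field are separated by the
control character shown as ^A in the output above. Both are assumptions
based on the record quoted here. The function names parse_record and
docs_described_by are hypothetical.

```python
def parse_record(line):
    """Split one htdump document record into a {key: value} dict.

    Assumption: fields are tab-separated and look like "key:value".
    """
    fields = {}
    for field in line.rstrip("\n").split("\t"):
        # Split on the first colon only, so URLs keep their own colons.
        key, sep, value = field.partition(":")
        if sep:  # skip fields without a colon, such as the leading doc ID
            fields[key.strip()] = value.strip()
    return fields


def docs_described_by(lines, word):
    """Return URLs whose link-description field (d:) contains `word`."""
    word = word.lower()
    hits = []
    for line in lines:
        rec = parse_record(line)
        # Assumption: several descriptions are joined with \x01 (^A).
        descriptions = rec.get("d", "").split("\x01")
        if any(word in d.lower().split() for d in descriptions):
            hits.append(rec.get("u"))
    return hits


# A shortened version of the record quoted above, with tabs restored.
sample = "3\tu:http://jtest/pineapple.htm\tt:pineapples\td:large\tA:Stump\x01Trunk"
print(docs_described_by([sample], "large"))
```

Run against every line of a document dump, this would list each URL
that picks up the query word only through link descriptions rather than
through its own text.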