From: Jessica B. <jes...@ya...> - 2002-10-02 22:07:04
Thanks Gilles and Geoff for your responses. Indeed, description_factor
is the setting that I overlooked, confusing it with
meta_description_factor.

--- Gilles Detillieux <gr...@sc...> wrote:
> According to Jessica Biola:
> > Actually, to append to this problem I'm running into,
> > it doesn't appear to be the text WITHIN the anchor tag
> > -- rather, there is text between the beginning <A> and
> > the ending </A> of the anchor tag that contains the
> > word "large".
>
> Ah! Well that certainly explains why I was unable to reproduce the
> problem earlier, based on your first message. Yes, indexing text
> between the <A ...> and </A> tags as link description text for the
> referenced document is normal behaviour for htdig.
>
> > The question still is, though, how do I completely
> > wipe out any reference to text between these tags as
> > relevant to the linked document?
>
> As Geoff pointed out, description_factor is the one that controls the
> score that will be applied to this text. Unfortunately, the best you
> can do is drop the score of these documents so they appear at the end
> of the results instead of at the start. In the future, there will be
> some sort of option for removing matches with a 0 or small score.
>
> > > There's a word, "large" on fruit.html. "large" does
> > > NOT appear anywhere within pineapple.html, however,
> > > when I htsearch on the index, both documents show as a
> > > match. In fact, the base_score is very high for the
> > > query "large" on the document pineapple.html.
> > >
> > > If I index pineapple.html alone, the query "large"
> > > yields no results. So there is definitely some type
> > > of relationship between the two documents and the word
> > > "large". htdump revealed the relationship by
> > > outputting this:
> ...
> > > What configuration attribute must I set to zero in order
> > > to stop that extra anchor text from being indexed and
> > > factored into pineapple.html's word list? I've basically
> > > tried setting all "documented" factors to zero (including
> > > backlink).
> > >
> > > I used the latest htdig-3.2.0b4-092902 as well as a
> > > January 2002 release -- both behave the same way.
>
> By the January 2002 release, do you mean the 3.1.6 release of Jan 31/02,
> or one of the January snapshots of 3.2.0b4? Either way, the behaviour
> of description_factor will be similar, the main difference being that
> with 3.1.x, you need to reindex after changing this factor, whereas with
> 3.2.x you don't.
>
> --
> Gilles R. Detillieux              E-mail: <gr...@sc...>
> Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
> Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)
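
For reference, here is a minimal sketch of the setting discussed in this thread, assuming an ordinary htdig configuration file in the usual "attribute: value" form. Only description_factor is named in the thread as the relevant attribute; the surrounding comments and the decision to leave meta_description_factor untouched are illustrative assumptions, not a recommended configuration.

```
# htdig.conf fragment (illustrative sketch, not a complete configuration)

# Weight applied to link description text, i.e. the text found between
# <A ...> and </A> in other documents that link to this one.
# Per the thread, setting this to 0 does not remove the match; it only
# lowers the score so such documents sort toward the end of the results.
description_factor:	0

# meta_description_factor is a separate attribute and does not control
# anchor text, so it is deliberately not set here.
```

As noted in the quoted reply, with 3.1.x the index has to be rebuilt after changing this factor, whereas with 3.2.x it does not.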