From: Budd, S. <s....@im...> - 2003-01-29 10:32:47
Another way to capture user clicks on results-page documents would be to set up an htdig web server and have all the links in the results pages pass through it on the way to the target page, perhaps via a cgi-bin script or a redirect.

How to adjust the weight of a hit based on this information is rather problematic, because the weight ascribed to a particular URL in the database would depend on the search string (boolean? phrase?) that produced the result -- a large matrix! Perhaps the mere fact that the URL was clicked on would be enough information; in that case a usage database would be updated. Another difficulty: if the site is reindexed, what happens to the URL IDs vs. frequency of use? Perhaps use the MD5 of the URLs as a key?

-----Original Message-----
From: Kev Shepherd [mailto:K.S...@bo...]
Sent: Tuesday, January 28, 2003 11:20 PM
To: htd...@li...
Subject: [htdig] Knowledge re-use suggestion

Hi,

I'm new to the list (but not to HtDig), so forgive me if this has been suggested or discussed previously; I did have a quick search through the archives.

I have been thinking about how a search tool could adopt a knowledge-management approach, having just completed a thesis on that topic. I have been running HtDig for years, and a number of times I have thought about harnessing what people search for as symptoms of desirable knowledge in an organisation. Looking through the logs, there seemed to be valuable information buried away.

I recently modified my HtDig configuration so that I could use PHP4 search and results forms. I began to think that I could now change the results URLs into PHP links and capture which links were followed into a log. The log would simply contain the URL, the keyword(s) used, and today's date.
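[Editor's note: the click-through logger described above could be sketched roughly as follows. The thread proposes PHP4 or a cgi-bin redirect; this illustration uses Python instead, and every name in it (`log_click`, the log layout, the query parameters) is a hypothetical assumption, not part of htdig.]

```python
# Sketch: results links point at a redirector such as
#   /cgi-bin/click?url=<target>&words=<search terms>
# which logs the click and then sends the browser on to the target.
# The log is keyed by the MD5 of the URL, which stays stable across
# reindexing, unlike htdig's internal document IDs.

import hashlib
import time

def log_click(log, url, words, now=None):
    """Record one click in `log`, a dict keyed by MD5(url)."""
    key = hashlib.md5(url.encode("utf-8")).hexdigest()
    entry = log.setdefault(key, {"url": url, "hits": 0, "last": 0.0, "words": []})
    entry["hits"] += 1                                  # frequency of use
    entry["last"] = now if now is not None else time.time()  # recency
    entry["words"].append(words)                        # search terms used
    # A real CGI script would now emit:  "Location: %s\r\n\r\n" % url
    return key

usage = {}
log_click(usage, "http://example.org/doc.html", "boolean phrase")
log_click(usage, "http://example.org/doc.html", "htdig weighting")
```

Keying on MD5(url) rather than the per-search-string matrix keeps the usage database small, at the cost of losing which query led to each click (though the `words` list retains it for later analysis).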
I believe it might be useful to merge information from these "viewed URL" logs back into the HtDig database, to raise the priority of documents that have been viewed previously. Moreover, the raised priority should be based on recency, since the value of information becomes dated. For example, documents viewed recently might be given a higher weighting, with that weighting diminishing over (say) 100 days. I know there will be users who click on every link until they find what they wanted, but I suspect that on average the weightings would lead to higher initial win rates.

While I am competent at PHP, I don't think I could tackle a patch for HtDig, so I'll throw this idea open for discussion.

Regards,
Kev.

_______________________________________________
htdig-general mailing list <htd...@li...>
To unsubscribe, send a message to <htd...@li...> with a subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
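[Editor's note: the recency weighting suggested above, a per-view boost that diminishes over a 100-day window, could look like the following Python sketch. The linear decay, the constants, and all function names are illustrative assumptions; htdig itself has no such mechanism.]

```python
# Sketch: each past click contributes a boost to a document's score,
# decaying linearly from BASE_BOOST to zero over WINDOW_DAYS days.
# A just-viewed document ranks higher; one viewed 100+ days ago gets
# no boost at all, so stale popularity fades out on its own.

WINDOW_DAYS = 100.0  # the "(say) 100 days" from the suggestion above
BASE_BOOST = 1.0     # maximum extra weight for a just-viewed document

def view_boost(days_since_view):
    """Extra weight contributed by one past click, aged the given number of days."""
    if days_since_view >= WINDOW_DAYS:
        return 0.0
    return BASE_BOOST * (1.0 - days_since_view / WINDOW_DAYS)

def adjusted_score(base_score, view_ages_days):
    """Combine the engine's own relevance score with decayed boosts from past views."""
    return base_score + sum(view_boost(a) for a in view_ages_days)
```

For example, a document with base score 2.0 that was viewed today, 50 days ago, and 200 days ago would score 2.0 + 1.0 + 0.5 + 0.0 = 3.5. An exponential decay would work just as well; the point is only that the boost must go to zero so "users who click every link" cannot permanently distort the ranking.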