From: Geoff H. <ghu...@ws...> - 2001-10-17 15:11:46
On Wed, 17 Oct 2001, Quim Sanmarti wrote:

> > That's when I began to wonder if anybody in the htdig
> > development community had looked into implementing 'bayesian'
> > searching, or if htdig could do 'it', hence my vague post.
>
> The htdig databases are positively not prepared to deal with such
> techniques, IMHO. They are not intended to. The power of htdig is based in
> 'classical' boolean queries.

I don't know that I'd call them "classical" anymore, but I'd agree that
I'd design a different word database backend if I wanted to do Bayesian
queries. But I think you can get pretty high quality results without
this. AFAIK, Google doesn't use Bayesian queries, and most people see
Google as the target.

> > My theory was that most new market trends (worth paying
> > attention to) are usually already, or quickly will be, reflected in the open
> > source development community.
>
> My perception is eventually the inverse. Open-source has been traditionally
> being bound to research and innovation. It's now being used by companies as
> an innovation channel, so that market trends emerge later from there...

This depends a lot on the development effort. Certainly gcc has some true
innovation and is a great example of getting truly fantastic people
together. I doubt you could ever afford to pay for all the development on
gcc.

In this case, I think parts of ht://Dig could be used for research
purposes, and I think there are several research-grade algorithms that
could be implemented without too much effort (n-gram fuzzy algorithms come
to mind). On the other hand, the number of active contributors to the
project right now is extremely low, so I think we'd need an infusion of
"fresh brains" before this could happen.

As Quim pointed out as well, there are other packages which attempted to
tackle Bayesian searches.

--
-Geoff Hutchison
Williams Students Online
http://wso.williams.edu/