On Wed, 17 Oct 2001, Quim Sanmarti wrote:
> > That's when I began to wonder if anybody in the htdig
> > development community had looked into implementing 'bayesian'
> > searching, or if htdig could do 'it', hence my vague post.
> The htdig databases are positively not prepared to deal with such
> techniques, IMHO. They are not intended to. The power of htdig is based on
> 'classical' boolean queries.
I don't know that I'd call them "classical" anymore, but I'd agree that
I'd design a different word database backend if I wanted to do Bayesian
queries. But I think you can get pretty high quality results without
this--AFAIK, Google doesn't use them and most people see them as the
> > My theory was that most new market trends (worth paying
> > attention to) are usually already, or quickly will be, reflected in the open
> > source development community.
> My perception is eventually the inverse. Open source has traditionally
> been bound to research and innovation. It's now being used by companies as
> an innovation channel, so that market trends emerge later from there...
This depends a lot on the development effort. Certainly gcc has some true
innovation and is a great example of getting truly fantastic people
together--I doubt you could ever afford to pay for all the development on
In this case, I think parts of ht://Dig could be used for research
purposes and I think there are several research-grade algorithms that
could be implemented without too much effort (n-gram fuzzy algorithms come
to mind).
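
To make the n-gram idea concrete, here is a rough sketch of character-trigram
fuzzy matching. It is only an illustration in Python, not ht://Dig code: the
function names, the "$" padding character, the Dice similarity measure, and
the 0.5 threshold are all assumptions chosen for the example.

```python
def ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded with '$'."""
    pad = "$" * (n - 1)
    padded = f"{pad}{word.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def similarity(a, b, n=3):
    """Dice coefficient over the two words' n-gram sets (0.0 to 1.0)."""
    grams_a, grams_b = ngrams(a, n), ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def fuzzy_matches(query, vocabulary, threshold=0.5, n=3):
    """Return vocabulary words scoring at or above threshold, best first."""
    scored = [(similarity(query, word, n), word) for word in vocabulary]
    return [word for score, word in sorted(scored, reverse=True)
            if score >= threshold]


if __name__ == "__main__":
    # Hypothetical word list standing in for an indexed vocabulary.
    words = ["search", "searching", "bayesian"]
    print(fuzzy_matches("serch", words))
```

A real backend would presumably keep an index from each n-gram to the words
containing it, so that a query only scores candidates sharing at least one
n-gram instead of scanning the whole vocabulary.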
On the other hand, the number of active contributors to the project right
now is extremely low and so I think we'd need an infusion of "fresh
brains" before this could happen.
As Quim pointed out as well, there are other packages which attempted to
tackle Bayesian searches.
Williams Students Online