I'm running a 200,000-document search engine (plus spider) on Webware. All the
documents are MiddleKit objects, the backend is MySQL, and I don't think the
full-text index that MySQL 4.0 supplies is good enough. So I am planning to
use Lupy, a Python port of Jakarta Lucene, a full-text index package.
I googled and searched the Wiki and the archive of this mailing list without
finding any references. So before starting, I'd like some of your thoughts on:
* Has anybody attempted this before?
* Has anybody attempted (and succeeded in) integrating other full-text indexing
packages into Webware?
* Does anybody have bad / good / valuable experiences using Lupy? And how about
the full-text index in MySQL 4.1 or 5.0?
* Does anybody have estimates / educated guesses on how the search performance
of a Lupy full-text index might compare to MySQL's?
* My first thought would be to achieve the integration by building a
"Lupy-enabled" subclass of the MiddleKit.Run.ObjectStore class. Based on
settings in the object model, some attributes of some objects would then be
full-text-indexed in a certain way by Lupy whenever the attributes changed. To
use the index, a new method similar to store.fetchObjectsOfClass would be
added. Finally, a command-line migration tool (to add a full-text index to an
existing ObjectStore or rebuild an index) would be needed.
* Is there enough interest so that I should make this available once I get it
Lupy is a full-text indexer and search engine written in Python. It is a
port of Jakarta Lucene 1.2 to Python.
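To make the ObjectStore subclass idea above a bit more concrete, here is a rough
sketch of the shape I have in mind. It uses neither MiddleKit nor Lupy: BaseStore
is a made-up stand-in for the real ObjectStore, ToyIndex is a toy in-memory
inverted index standing in for a Lupy index, and every name except
fetchObjectsOfClass (save, fetchObjectsMatching, rebuildIndex, indexedAttrs) is
hypothetical. The real classes will certainly look different.

```python
import re
from collections import defaultdict


class ToyIndex:
    """Toy inverted index standing in for Lupy (NOT the Lupy API)."""

    def __init__(self):
        self._postings = defaultdict(set)      # term -> set of object keys
        self._termsByKey = {}                  # object key -> set of terms

    def update(self, key, text):
        # Drop the old postings for this object, then add the new ones.
        for term in self._termsByKey.pop(key, ()):
            self._postings[term].discard(key)
        terms = set(re.findall(r"\w+", text.lower()))
        for term in terms:
            self._postings[term].add(key)
        self._termsByKey[key] = terms

    def search(self, query):
        # AND query: an object matches only if it contains every query term.
        terms = re.findall(r"\w+", query.lower())
        if not terms:
            return set()
        result = set(self._postings.get(terms[0], ()))
        for term in terms[1:]:
            result &= self._postings.get(term, set())
        return result


class BaseStore:
    """Hypothetical stand-in for MiddleKit.Run.ObjectStore (not the real API)."""

    def __init__(self):
        self._objects = {}                     # (className, serialNum) -> attrs

    def save(self, className, serialNum, attrs):
        self._objects[(className, serialNum)] = dict(attrs)

    def fetchObjectsOfClass(self, className):
        keys = sorted(k for k in self._objects if k[0] == className)
        return [self._objects[k] for k in keys]


class FullTextStore(BaseStore):
    """Store that keeps a full-text index in sync with selected attributes."""

    def __init__(self, indexedAttrs):
        super().__init__()
        # Assumed to come from the object model, e.g. {"Document": ["title"]}.
        self._indexedAttrs = indexedAttrs
        self._index = ToyIndex()

    def _indexObject(self, key, attrs):
        names = self._indexedAttrs.get(key[0], [])
        if names:
            text = " ".join(str(attrs.get(name, "")) for name in names)
            self._index.update(key, text)

    def save(self, className, serialNum, attrs):
        # Re-index whenever an object is saved, i.e. its attributes may have changed.
        super().save(className, serialNum, attrs)
        self._indexObject((className, serialNum), attrs)

    def fetchObjectsMatching(self, className, query):
        # Counterpart to fetchObjectsOfClass, restricted to full-text hits.
        keys = sorted(k for k in self._index.search(query) if k[0] == className)
        return [self._objects[k] for k in keys]

    def rebuildIndex(self):
        # What the command-line migration tool would call for an existing store.
        self._index = ToyIndex()
        for key, attrs in self._objects.items():
            self._indexObject(key, attrs)


store = FullTextStore({"Document": ["title", "body"]})
store.save("Document", 1, {"title": "Webware notes", "body": "MiddleKit and MySQL"})
store.save("Document", 2, {"title": "Lucene port", "body": "full-text search in Python"})
hits = store.fetchObjectsMatching("Document", "python search")
```

The point of the sketch is only the division of labour: saving an object
re-indexes its configured attributes, searching goes through one extra fetch
method, and rebuilding walks the existing objects. Where exactly the real
ObjectStore offers a hook for "attributes changed" is something I still have to
find out.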
-- Martin Virtel
aim / yahoo messenger mvftd
tel. +49 177 242 2889