I just read your db schema.
I hate to break it to you, but a relational database is
not going to be able to index millions of documents and
provide satisfactory search response times together with
relevance ranking. The reason? RDBMSs write and access data randomly, i.e. in on-disk tree structures. -Any- fast search implementation needs to perform sequential I/O as opposed to random I/O.
If you are serious, I suggest you look at Lucene, Muscat and Mifluz, which to my knowledge are the only open source, industrial-strength full-text indexing libraries.
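To make the sequential-access point concrete, here is a rough sketch of an inverted index with simple relevance ranking. It is only an illustration and not how any of the libraries above are actually implemented: the function names, the whitespace tokenizer and the plain tf * idf scoring are all assumptions of mine, and everything lives in memory. The point is that each query term maps to one posting list that is read front to back, rather than being looked up row by row.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Map each term to a posting list of (doc_id, term_frequency).

    `docs` is a dict of doc_id -> text (hypothetical input shape).
    Posting lists are kept sorted by doc_id so they can be scanned in order.
    """
    index = defaultdict(list)
    for doc_id in sorted(docs):
        # Naive tokenizer (assumption): lowercase and split on whitespace.
        counts = Counter(docs[doc_id].lower().split())
        for term, tf in counts.items():
            index[term].append((doc_id, tf))
    return index


def search(index, n_docs, query):
    """Return doc ids ranked by a simple tf * idf score, best match first."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, [])
        if not postings:
            continue
        # Terms that appear in fewer documents carry more weight.
        idf = math.log(n_docs / len(postings))
        # One forward pass over the posting list for this term.
        for doc_id, tf in postings:
            scores[doc_id] += tf * idf
    # Highest score first; ties are broken by doc id.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

On disk, each posting list would be stored as one contiguous run, so answering a query is a handful of sequential reads, one per term.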
Regards
Peter Marelas
Something lightweight like MySQL may not be able to, but enterprise systems like Sybase sure can if everything is done properly. I have databases with tens of millions of records, and it works just fine.