[PyIndexer] Thoughts on MySQL Implementation
Status: Pre-Alpha
Brought to you by:
cduncan
From: Casey D. <cas...@ya...> - 2001-12-16 17:00:54
A couple of observations:

The index population seems pretty inefficient to me in sqlindexer. You are essentially sending one SQL INSERT per word when something is indexed. I think it would be worthwhile to test an implementation that uses a single LOAD DATA statement when the text is indexed. This would involve writing the rows to be inserted into textindex to a temporary file and then using LOAD DATA to batch load it into MySQL. See the following for docs on LOAD DATA:
http://www.mysql.org/documentation/mysql/bychapter/manual_Reference.html#LOAD_DATA

As for the search end, it seems to me that app-side processing will be best for positional matches, such as for phrases. I'd imagine such processing should eventually be coded in C, but it should be acceptable in Python if done efficiently (like using Python arrays, IISets or some such). You might want to check out this Guido essay for some ideas here:
http://www.python.org/doc/essays/list2str.html

I agree that storing a document count for each word could help with optimizing, since you could start with the smallest dataset first and prune it from there. Perhaps IISets could be used to get UNION/INTERSECT functionality efficiently if MySQL can't do it for you.

As for partial indexes, they may help, especially by reducing memory paging and improving cache usage. Here are the docs for that:
http://www.mysql.org/documentation/mysql/bychapter/manual_Reference.html#CREATE_INDEX

It would probably be worthwhile to test times for a relatively large partial-index size (32 bytes?) vs a small one (4 or 8 bytes?). Sometimes less is more 8^). It's near impossible to tell which would be faster on a given architecture without real-world testing.

BTW: Have you seen MySQL's built-in full-text indexing support?
http://www.mysql.org/documentation/mysql/bychapter/manual_Reference.html#Fulltext_Search

Not sure whether it does everything we need, but it's probably worth a look...

-Casey
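To make the LOAD DATA idea above a bit more concrete, here is a rough sketch of the file-writing half. It is only an illustration: the column names (word, doc_id, position), the whitespace tokenizer and both function names are assumptions of mine, not what sqlindexer actually uses, and the statement is only built as a string, never sent to a server.

```python
def write_load_file(doc_id, text, path):
    """Write one tab-separated row per word occurrence; return the row count.

    Hypothetical row layout: word, doc_id, position.
    """
    count = 0
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        # split() drops all whitespace, so words cannot contain tabs or
        # newlines; backslashes are doubled because LOAD DATA treats a
        # backslash as an escape character by default.
        for position, word in enumerate(text.lower().split()):
            word = word.replace("\\", "\\\\")
            f.write("%s\t%d\t%d\n" % (word, doc_id, position))
            count += 1
    return count


def load_data_statement(path, table="textindex"):
    """Build the single batch-load statement for the file written above.

    Assumes the path contains no quotes or backslashes.
    """
    return (
        "LOAD DATA LOCAL INFILE '%s' INTO TABLE %s "
        "FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n' "
        "(word, doc_id, position)" % (path, table)
    )
```

The statement returned by load_data_statement() would then be executed once per indexed document (or once per batch of documents) in place of the per-word INSERTs. Whether that is actually faster is something only a timing test against the real tables can tell.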
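For the search side, the smallest-set-first pruning and the app-side phrase check could look something like the sketch below. Plain Python sets stand in for IISets here, len() of each set stands in for a stored per-word document count, and the in-memory layout (word -> {doc_id: [positions]}) and function names are made up for the example; in practice the data would come back from the textindex queries.

```python
def intersect_smallest_first(postings, words):
    """Intersect the doc-id sets for all words, smallest set first.

    postings maps word -> set of doc ids (hypothetical layout).
    """
    sets = sorted((postings.get(w, set()) for w in words), key=len)
    if not sets:
        return set()
    result = set(sets[0])
    for s in sets[1:]:
        result &= s
        if not result:
            break  # nothing left to prune, so stop early
    return result


def phrase_match(positions, words):
    """Return doc ids in which the words occur consecutively, in order.

    positions maps word -> {doc_id: [word positions]} (hypothetical layout).
    """
    doc_sets = {w: set(positions.get(w, {})) for w in words}
    hits = set()
    for doc in intersect_smallest_first(doc_sets, words):
        # Position sets for the second and later words of the phrase.
        later = [set(positions[w][doc]) for w in words[1:]]
        for start in positions[words[0]][doc]:
            if all(start + i + 1 in later[i] for i in range(len(later))):
                hits.add(doc)
                break
    return hits
```

For example, with "new" at positions 0 and 5 and "york" at position 6 in some document, phrase_match(positions, ["new", "york"]) reports that document because 5 and 6 are adjacent. The same logic could be moved to C later if the Python version turns out to be the bottleneck.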