From: <rg...@sd...> - 2003-10-20 19:22:48
>>>>> "amigo" == amigo <am...@ma...> writes:

    amigo> Thanks Ben, I got it now !
    amigo> The library compiled fine here and the demo works good (it felt
    amigo> faster indexing the Reuters sample texts, though it could be
    amigo> just me)

It felt faster to me too, and faster than with just the changes I made. I have not looked to see what Ben might have beyond, or instead of, some of my patches, but it's better.

I just ran it all through valgrind once again. All of the errors I had previously noted are now gone.

Because indexing is faster, I went ahead and indexed some bigger material and did some searches. There are a few memory errors of the same type (mismatched delete[]/delete) in code that runs when you search. I did not uncover these before because indexing was slower then, so I didn't try as many things.

You can make more of these errors occur, and even crash the demo program, by entering very long search strings. Pounding in half a page of random keyboard hits, with no spaces, seems to do it. This is something I can find and patch as well.

I can make up another patch for those over the next few days, but it seems CLucene is definitely improving and getting a shakeout.

My efforts on CLucene over the next two weeks or so will be something like this:

1) Finish tracking down and submitting patches for any remaining memory errors, and see those ingested by Ben and everyone else.

2) Turn attention to memory leaks, which are not reported in my current runs, and do the same for them.

3) Track down any instances of inaccuracy in CLucene, that is, the reports people have made of not getting the right number of hits.

For the last, I may look into something automated, perhaps a modification of the demo program that could index a body of text and then churn through a dictionary, searching for each term with both grep and CLucene and making sure the results match.

I left it indexing overnight, and it seems to have handled a job of more than one gigabyte OK.

--Rob
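
The cross-check described under item 3 (look up every dictionary term two ways and make sure the results match) might be sketched roughly as below. This is only an illustration under stated assumptions: it calls neither CLucene nor grep. A small in-memory inverted index stands in for the index under test, and a brute-force scan of every document stands in for grep. All names here (tokenize, buildIndex, scanForTerm, findMismatches) are hypothetical, and dictionary terms are assumed to be lowercase.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Split text into lowercase alphanumeric tokens.
// A simple stand-in for a real analyzer; not CLucene's.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            current.push_back(static_cast<char>(std::tolower(ch)));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}

// Hypothetical stand-in for the index under test: term -> ids of documents containing it.
using InvertedIndex = std::map<std::string, std::set<std::size_t>>;

InvertedIndex buildIndex(const std::vector<std::string>& docs) {
    InvertedIndex index;
    for (std::size_t id = 0; id < docs.size(); ++id) {
        for (const std::string& term : tokenize(docs[id])) {
            index[term].insert(id);
        }
    }
    return index;
}

// Reference answer: scan every document for the term as a whole word.
// This plays the role of grep in the comparison.
std::set<std::size_t> scanForTerm(const std::vector<std::string>& docs,
                                  const std::string& term) {
    std::set<std::size_t> hits;
    for (std::size_t id = 0; id < docs.size(); ++id) {
        for (const std::string& token : tokenize(docs[id])) {
            if (token == term) {
                hits.insert(id);
                break;
            }
        }
    }
    return hits;
}

// Return the dictionary terms for which the index and the scan disagree.
std::vector<std::string> findMismatches(const std::vector<std::string>& docs,
                                        const InvertedIndex& index,
                                        const std::vector<std::string>& dictionary) {
    std::vector<std::string> mismatches;
    for (const std::string& term : dictionary) {
        std::set<std::size_t> indexed;
        auto it = index.find(term);
        if (it != index.end()) indexed = it->second;
        if (indexed != scanForTerm(docs, term)) mismatches.push_back(term);
    }
    return mismatches;
}
```

A real harness would replace the stand-in index lookup with an actual CLucene search and the scan with an actual grep run over the same files. Comparing sets of matching documents, rather than only hit counts, also shows which documents were missed when the two disagree.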