From: Geoff H. <ghu...@ws...> - 2002-08-28 04:10:14
Hi,

I had a brief brainstorm on my run today about profiling the indexing. Obviously htword/mifluz performance still needs to improve significantly. But another slowdown relative to 3.1 comes from the way 3.2 treats hopcounts.

To ensure that restricting indexes by hopcount works correctly, the "queue" for URLs is really a priority queue: URLs with lower hopcounts move up the heap. Of course this requires some sorting and some overhead. Right now, I don't think this needs to happen *unless* we're restricting indexing based on hopcount.

So the proposal is that when we're not restricting by hopcount, the Server objects would switch back to the previous system (i.e. no sorting). I think this should shave a few percent off of indexing.

Does this seem like an OK idea? Can anyone come up with an example where this would be a Bad Idea(tm)?

-Geoff
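
For illustration only, here is a rough sketch of what the proposal might look like: a URL queue that keeps heap ordering only when hopcount restriction is switched on, and otherwise falls back to plain first-in, first-out order. This is not the actual ht://Dig code. `QueuedUrl`, `HigherHopcount`, `UrlQueue` and the `restrict_by_hopcount` flag are all hypothetical names, and it assumes the "previous system" was a simple FIFO.

```cpp
#include <cassert>
#include <deque>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a queued URL; not the real ht://Dig type.
struct QueuedUrl {
    std::string url;
    int hopcount;
};

// Comparator that puts the lowest hopcount at the top of the heap.
struct HigherHopcount {
    bool operator()(const QueuedUrl& a, const QueuedUrl& b) const {
        return a.hopcount > b.hopcount;
    }
};

// Hypothetical per-server queue: sorted only when hopcount limits apply.
class UrlQueue {
public:
    explicit UrlQueue(bool restrict_by_hopcount)
        : sorted_(restrict_by_hopcount) {}

    void push(QueuedUrl u) {
        if (sorted_) {
            heap_.push(std::move(u));       // O(log n) heap insert
        } else {
            fifo_.push_back(std::move(u));  // O(1), no sorting
        }
    }

    QueuedUrl pop() {
        assert(!empty());
        if (sorted_) {
            QueuedUrl u = heap_.top();      // lowest hopcount first
            heap_.pop();
            return u;
        }
        QueuedUrl u = std::move(fifo_.front());  // oldest entry first
        fifo_.pop_front();
        return u;
    }

    bool empty() const {
        return sorted_ ? heap_.empty() : fifo_.empty();
    }

private:
    bool sorted_;
    std::priority_queue<QueuedUrl, std::vector<QueuedUrl>, HigherHopcount> heap_;
    std::deque<QueuedUrl> fifo_;
};
```

With `UrlQueue q(false)`, URLs come back in the order they were pushed. With `UrlQueue q(true)`, the URL with the lowest hopcount comes back first, though the order among URLs with equal hopcounts is not guaranteed.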