From: Tim M. <ti...@ti...> - 2003-10-16 17:50:28
|
On Wed, 15 Oct 2003 23:17:58 -0400, ke...@xo... (Kenneth C. Schalk) wrote: > On Mon, Oct 13, 2003 at 03:19:21PM -0700, Allan Heydon wrote: > Without altering the data structures used to store dependencies, the > process of eliminating redundant dependencies is going to be pretty > expensive (~O(n^2) in the size of the dependency set). Sorting them > by path first may make it a little better on average, but probably not > a lot. (The fact that my modified VCacheStats, which was essentially > doing exactly this over every entry in the cache, took as long as it > did illustrates the problem pretty well.) Roughly speaking, a dependency is redundant if there is another dependency that's a prefix of it, right? I know it's more complicated than that because of dependency types, but let's ignore that for a moment. It seems like putting the dependencies into a tree as they're generated could make the elimination much faster. The tree arcs would be labelled by arcs of the dependency paths. As you're inserting a new path P into the structure, it's easy to recognize when there's already a path in there that is a prefix of P, or when P is a prefix of some path(s) already in the structure. In the latter case you delete all of the substructure that's rooted at P's end node. The constants in doing this are enough bigger than simply doing string compares that it's unclear how much it would win, but asymtotically it should be much better. I think. -- Tim Mann ti...@ti... http://www.tim-mann.org/ |