From: Andreas K. <and...@ac...> - 2009-06-26 18:04:43
|
Andreas Kupries wrote: > > 812: proc ::struct::graph::op::createTwoSquaredGraph {G} { > > 813: > > 814: set H [struct::graph] > > 815: set copyG [struct::graph] > > 816: > > 817: foreach v [$G nodes] { > > 818: $H node insert $v > > 819: $copyG node insert $v > > 820: } > > 821: > > 822: foreach e [$G arcs] { > > 823: set v [$G arc source $e] > > 824: set u [$G arc target $e] > > 825: > > 826: $copyG arc insert $v $u [list $v $u] > > 827: $copyG arc setweight [list $v $u] 1 > > 828: } > > 829: > > 830: foreach v [$copyG nodes] { > > 831: foreach u [$copyG nodes] { > > 832: if { [distance $copyG $v $u] <= 2 && ![$H arc exists [list $u $v]] > && ($u ne $v)} { I thought about this a bit more, on the way home yesterday, and sleeping over it. Not to be meant as criticism, the code here feels like a direct translation of the mathematical definition of G**2 graphs. It basically works by generating all pairs of nodes and then testing the distance between them, where the distance code is particular inefficient by using a dijkstra which computes the data for all nodes, and then throws 99% of it away. We can do it better ... In essence, generate the node pairs in such a way that a distance check is not required, because the way the generation works ensures that the distance <= 2 condition holds true. For this we have to unravel the definition of distance a bit, i.e. what does distance <= 2 mean ? We can analyze this easier by splitting it into 2 cases, i.e. distance == 1, and distance == 2. So, distance (u,v)== 1 means ? That node u can be reached from v in one hop <=> u and v are connected by an edge in G (a) <=> u is a "neighbour of v in G". (b) (X) In other words, the G**@ graph has all edges of the original graph, and we can generate them either through translating (a) or (b) into code, i.e. (Ad a) for all e edge in G, with end nodes u, v add edge (u,v) to G**2 (Ad b) for all nodes u in G for all nodes v neighbour of u in G add edge (u,v) to G**2, if u != v Of these two fragments I prefer (b), for reasons explained when I am dealing with distance 2. (Ad X) As the input graph is considered unidirectional its neighbourhood is all adjacent nodes, whether conected by incoming or outgoing arcs. Now, distance(u,v) == 2 means ? That node u can be reached from v in two hops <=> It exists a node z in G connected to both u and v with edges in G. (c) <=> u is a neighbour of a neighbour of v in G, but not a neighbour of v, nor v. (d) Definition (c) doesn't translate that well to code, but (d) does. (Ad d) for all nodes u in G for all nodes z neighbour of u in G for all nodes v neighbour of z in G add edge (u,v) to G**2 if u != v and no edge (u,v) existing already. The if condition at the end takes care of the "but not ..." terms in (d). This also looks very similar to the code for (b), indeed the two fragements can be merged into one fragment which takes of both at the same time. Which is why I prefered fragment (b) here. what we are in essence doing is to execute the two steps of dijkstra needed to find the nodes at distances 1 and 2 for all nodes, but no more. As this directly generates the nodes at the distance we are interested in the distance test becomes irrelevant. It is always true, by construction. While it doesn't look like the mathematical definition any longer, and as such needs more comments to make the relation to it clear, it should also be more efficient than the naive code. Now, looking at the overall algorithm,, as described in the other mail, i.e one level higher, about the series of G(i) of G(i)**2 we are apparently searching from G(1), G(2), ... on up one important question to ask is Do we have to compute all the G(i), G(i)**2 from the ground up ? Complementary phrasing: Can we generate the G(i), G(i)**2 incrementally from the previous graph G(i-1), G(i-1)**2 ? In our case the answer is yes, we can do this incrementally. Assuming G(i-1), G(i-1)**2 exist, then G(i) = G(i-1) + edge Ei. G(i)**2 = (G(i-1)**2 + Ei + additional edges. These additional edges can be computed from Ei! Let Ei = (u,v). Then the nodes z in neighbours(u,G)-v are distance 2 to v and z in neighbours(v,G)-u are distance 2 to u Easy to ocmpute and add. And the induction is started with G(0) == G(0)**2 == The graph containing just the nodes of the input graph, and no edges. The upshot of this is, we need only create one graph X in the code, which we initialize to contain G(0)**2, and during the loop iterating over our sorted edges we then extend this graph in each round using the above rules with Ei and distance-2 edges to contain G(i)**2. Instead of createTwoSquaredGraph a better helper procedure will be extendTwoSquaredGraph Andreas |