From: Michał A. <ant...@gm...> - 2009-06-25 15:11:27
|
Hello. I updated some tests for Max-Cut ( testing subprocedures seperately ). And now TSP. After some rethinking I came up with such solution to handle those uncomplete graphs: - first step filling graph with edges to have a complete graph. At the same time saving set of original Edges. Weights on new edges are set to Inf. [ In practise algorithm is avoiding them anyway, so maybe it's even not needed to take that step ]. - further, algorithm goes the same, until creating Hamilton Cycle ( procedure findHamiltonCycle ). - what we were given was the set of nodes ( Hamilton cycle ) given based on found Eulerian tour. But when graph is not complete edges based on tour might not exist. So, at each step of adding next node to the result ( Hamilton cycle ), algorithm checks, if edge between given nodes exists. If it exists, it's allright - we don't need to change nothing. If not - we have to search for the shortest path between current two nodes and add this path to solution. - what I described assumes that we accept doubled occurances of nodes in solution. Going back to definition of the problem - it says that there shouldn't be such situations ( TSP problem is finding shortest Hamilton Cycle, with only one occurance of each vertex ). So, now it's time to decide, what to do: give that solution I describe above, or in such case return error that it's not Hamilton's graph ( don't know if I wrote it well, but I mean graph without proper Hamilton cycle ). Implementation is at SVN now. I debugged it step by step and it worked for examples I was testing it, but still there can be some cases needed consideration - I am on it. When I ran Christofides for it's Tight Example it gave me the solution that reaches maximum approximation factor ( for graph with 7 nodes, where OPT* was 6, it found solution with cost 9 ). It still needs some upgrades ( like probably variable originalEdges will be erased, as we need at that step whole original graph anyway...and other), so you don't have to check implementation very strictly now, I just wanted to show the main idea and next step in work. [*] By OPT and ALG, I used to name OPTimal solution and solution given by ALGorithm. Best Regards, Michał Antoniewski |
From: Andreas K. <and...@ac...> - 2009-06-25 16:54:57
|
Michał Antoniewski wrote: > > Hello. > > I updated some tests for Max-Cut ( testing subprocedures seperately ). Ok, will review. > And now TSP. > > After some rethinking I came up with such solution to handle those > uncomplete graphs: > - first step filling graph with edges to have a complete graph. At the > same time saving set of original Edges. Weights on new edges are set to > Inf. [ In practise algorithm is avoiding them anyway, so maybe it's even > not needed to take that step ]. > - further, algorithm goes the same, until creating Hamilton Cycle ( > procedure findHamiltonCycle ). > - what we were given was the set of nodes ( Hamilton cycle ) given based > on found Eulerian tour. But when graph is not complete edges based on > tour might not exist. So, at each step of adding next node to the result > ( Hamilton cycle ), algorithm checks, if edge between given nodes > exists. If it exists, it's allright - we don't need to change nothing. > If not - we have to search for the shortest path between current two > nodes and add this path to solution. > > - what I described assumes that we accept doubled occurances of nodes in > solution. Going back to definition of the problem > it says that there > shouldn't be such situations ( TSP problem is finding shortest Hamilton > Cycle, with only one occurance of each vertex ). My understanding of the above is that this (only one occurence) is or can be true only for complete graphs. Given that we started from an incomplete graph I cannot see us able to avoid duplication in some cases. Namely when the hamilton cycle computed in the regular manner would be forced to use edges of infinite weight ... Wait, triangle inequality (TIE) ... This gives us something we can reason with ... A graph with edges of infinite weight will not satisfy this inequality in general (*), and so they are not really suitable for the metric-tsp algorithms. Example: Assume nodes a b c, and edges a-b weight X, a-c weight Y, with X, Y not infinite, and edge b-c weight +Inf Then |b-c| > |b-a|+|a-c|, therefore TIE is violated. And any incomplete graph we make complete in this manner (adding +Inf edges) will have at least one such triangle as in the example above. Which means, our making the graph complete in this manner makes it automatically also unsuitable as input to the metric algorithms. So if we apply them anyway we should expect anomalies like a solution with duplicated nodes, or a solution which goes through an edge of +Inf weight. Does that make sense ? (Ad *) One exception is if all edges have the same weight of +Inf. Another set of exceptions are all graphs where the edges of finite weights are scattered around so that no two such share a node. Then all triangles in the completed graph have at least 2 edges of +Inf weight and satisfy TIE. But for these graphs the min spanning tree has to contain edges of +Inf weight. Before completion the graph was a forest, a set of connected pairs of nodes without other connections. > So, now it's time to > decide, what to do: give that solution I describe above, or in such case > return error that it's not Hamilton's graph ( don't know if I wrote it > well, but I mean graph without proper Hamilton cycle ). Given that we are extending the algorithm for use on graphs it is technically not suited for per my reasoning above returning a solution which is not quite satisfying the original constraints (each node only once) is not really an error IMVHO (in my very humble opinion). We will have to make this clear in the docs however. > Implementation is at SVN now. Ok, review added to work queue ... > I debugged it step by step and it worked > for examples I was testing it, but still there can be some cases needed > consideration - I am on it. When I ran Christofides for it's Tight > Example it gave me the solution that reaches maximum approximation > factor ( for graph with 7 nodes, where OPT* was 6, it found solution > with cost 9 ). > It still needs some upgrades ( like probably variable originalEdges will > be erased, as we need at that step whole original graph anyway...and > other), so you don't have to check implementation very strictly now, I > just wanted to show the main idea and next step in work. > [*] By OPT and ALG, I used to name OPTimal solution and solution given > by ALGorithm. Andreas. |
From: Michał A. <ant...@gm...> - 2009-06-25 18:16:53
|
As not much time left to midterm-evaluation - another update with some subprocedures for K Center problem and a little description of the problem. We are given undirected and completed graph G, where edge weights satisfy triangle inequality (Metric). Let the k be positive integer. Let V be set of nodes and E be set of edges for G. For any set S, which is a subset of V and node v (v is a element of V), let connect(v, S) be the min cost of the edge, which connects node v with any node in S. The goal is to find such S, that |S| = k, and maxv { connect(v,S) } is as small as possible. In other words, we are searching for such subset of V, that it will be made from k nodes, and the max value of connect function will be as small as possible. Generally, how algorithm works: - first step is to sort edges E = {e1.....em} with their costs in not decreasing order: cost(e1) <= .... <= cost(em) - now we construct m graphs Gi like this: Gi = (V, Ei). So, set of nodes remains the same, but we use only Ei edges. e.g. E5 = {e1,e2,e3,e4,e5} - procedure createGi - each graph Gi much be two squared -> Gi^2. Gi^2 is such graph that is has an edge between nodes u and v, if and only if, the distance between those nodes (in edges, not weights!) is not greater than 2. In other words, we can go from u to v using max 2 edges. ---> procedure createTwoSquaredGraph - for each Gi^2 we seek for maximum independent set Mi. Just to clear: finding maximal is of course NP hard, we seek for maximum set. - let the j be the min index, such that |Mj| <= k. - return Mj. So this is how it works in general. I think with more code and updates things will get more clear, but for now just a quick look at what it is planned to achieve. Best Regards, Michal Antoniewski |
From: Lars H. <Lar...@re...> - 2009-06-27 14:38:04
|
Michał Antoniewski skrev: > - for each Gi^2 we seek for maximum independent set Mi. Just to clear: > finding maximal is of course NP hard, we seek for maximum set. You've got maximum and maximal the wrong way round here. Finding one maximal (w.r.t set inclusion) independent set is easy (add vertices until you've covered the graph). Finding one that achieves the maximum cardinality is hard. But now that I think about it... is this really what you want? The next steps being > - let the j be the min index, such that |Mj| <= k. > > - return Mj. it would seem more natural to seek a set which is somehow minimum here (as that would make it easier to meet this condition) than to pick the maximum. Indeed, given that > The goal is to find such S, that |S| = k, and maxv { connect(v,S) } is as > small as possible. it seems more like you're after a minimal covering than a maximal independent set. Going to the square complicates the picture somewhat, but it still looks to me as though your algorithm won't work. Consider the case that Gi is the path u--v--w. The square of this is the complete graph on these vertices, so a maximal independent set contains one vertex. If you pick u or w, then you won't cover all of Gi, which seems bad (it can be argued that connect(w,{u}) is infinite). Also, there is the brute force approach of simply iterating over all k-subsets S of V, computing maxv { connect(v,S) } for all. For small fixed k this seems faster than what you're about to do, so it might be considered as an alternative algorithm for solving the problem as stated. (It could of course be that this range of parameters if fairly uninteresting.) Lars Hellström |
From: Andreas K. <and...@ac...> - 2009-06-29 18:30:45
|
Lars Hellström wrote: > Michał Antoniewski skrev: >> - for each Gi^2 we seek for maximum independent set Mi. Just to clear: >> finding maximal is of course NP hard, we seek for maximum set. > > You've got maximum and maximal the wrong way round here. Finding one > maximal (w.r.t set inclusion) independent set is easy (add vertices > until you've covered the graph). Finding one that achieves the maximum > cardinality is hard. Yes. I just googled for weighted metric k-center and found the PDF http://www.corelab.ntua.gr/courses/approx-alg/material/k-center.pdf which seems to be the basis of Michal description, it is very similar. And it also describes it as finding maximal, not maximum, the latter being NP-hard. > > But now that I think about it... is this really what you want? The next > steps being > >> - let the j be the min index, such that |Mj| <= k. >> >> - return Mj. > > it would seem more natural to seek a set which is somehow minimum here > (as that would make it easier to meet this condition) than to pick the > maximum. Indeed, given that The pdf explains a bit more, maybe that helps. >> The goal is to find such S, that |S| = k, and maxv { connect(v,S) } is as >> small as possible. > > it seems more like you're after a minimal covering than a maximal > independent set. Going to the square complicates the picture somewhat, > but it still looks to me as though your algorithm won't work. > > Consider the case that Gi is the path u--v--w. The square of this is > the complete graph on these vertices, so a maximal independent set > contains one vertex. If you pick u or w, then you won't cover all of > Gi, which seems bad (it can be argued that connect(w,{u}) is infinite). > > Also, there is the brute force approach of simply iterating over all > k-subsets S of V, computing maxv { connect(v,S) } for all. For small > fixed k this seems faster than what you're about to do, so it might be > considered as an alternative algorithm for solving the problem as > stated. (It could of course be that this range of parameters if fairly > uninteresting.) Andreas. |
From: Lars H. <Lar...@re...> - 2009-06-29 22:38:51
|
Andreas Kupries skrev: > Lars Hellström wrote: >> Michał Antoniewski skrev: >>> - for each Gi^2 we seek for maximum independent set Mi. Just to clear: >>> finding maximal is of course NP hard, we seek for maximum set. >> >> You've got maximum and maximal the wrong way round here. Finding one >> maximal (w.r.t set inclusion) independent set is easy (add vertices >> until you've covered the graph). Finding one that achieves the maximum >> cardinality is hard. > > Yes. I just googled for > weighted metric k-center > and found the PDF > > http://www.corelab.ntua.gr/courses/approx-alg/material/k-center.pdf > > which seems to be the basis of Michal description, it is very similar. Very similar indeed! Should probably be provided as reference. > And it also describes it as finding maximal, not maximum, the latter > being NP-hard. > >> >> But now that I think about it... is this really what you want? The >> next steps being >> >>> - let the j be the min index, such that |Mj| <= k. >>> >>> - return Mj. >> >> it would seem more natural to seek a set which is somehow minimum here >> (as that would make it easier to meet this condition) than to pick the >> maximum. Indeed, given that > > The pdf explains a bit more, maybe that helps. Yes, it does. The algorithm approximates the minimal cost from _below_ rather than above, as far as the selection of edges is concerned; I hadn't expected that. Also, I hadn't fully noticed the "G must be a complete graph" condition... This, of course, means any nonempty set of vertices is dominating (= a vertex cover), so one can't get it wrong, only suboptimal. Concerning that "complete graph" condition, though... Stylistically, it's a bit backwards to have a command that takes a graph as input, if only complete graphs are valid input. (Stretching a bit, it's kind of like having a command accepting a list of numbers as input, with the constraint that the numbers must all be equal to 1.) Even if the implementation very naturally exercises [struct::graph]s, one might want to take the time to think about whether a [struct::graph] is the most natural form for the input. If the implementation needs to preprocess the graph so that it is metrically complete, then maybe the output from the command that computes all shortest distances would be more practical, but if it can do the completion implicitly (which seems plausible for this algorithm; add uw at cost c(uv)+c(vw) to the queue once uv and vw have been processed) then a [struct::graph] as input becomes much more appropriate. In the latter case, the command documentation should take the "distance in general graph" point of view, rather than speaking about costs of individual edges. Lars Hellström |
From: Andreas K. <and...@ac...> - 2009-06-30 18:06:44
|
Lars Hellström wrote: > Andreas Kupries skrev: >> Lars Hellström wrote: >>> Michał Antoniewski skrev: >>>> - for each Gi^2 we seek for maximum independent set Mi. Just to clear: >>>> finding maximal is of course NP hard, we seek for maximum set. >>> You've got maximum and maximal the wrong way round here. Finding one >>> maximal (w.r.t set inclusion) independent set is easy (add vertices >>> until you've covered the graph). Finding one that achieves the maximum >>> cardinality is hard. >> Yes. I just googled for >> weighted metric k-center >> and found the PDF >> >> http://www.corelab.ntua.gr/courses/approx-alg/material/k-center.pdf >> >> which seems to be the basis of Michal description, it is very similar. > > Very similar indeed! Should probably be provided as reference. Assuming that the link stays up. >> The pdf explains a bit more, maybe that helps. > > Yes, it does. The algorithm approximates the minimal cost from _below_ Right. > rather than above, as far as the selection of edges is concerned; I > hadn't expected that. Also, I hadn't fully noticed the "G must be a > complete graph" condition... This, of course, means any nonempty set of > vertices is dominating (= a vertex cover), so one can't get it wrong, > only suboptimal. Interesting. Didn't know that. > > Concerning that "complete graph" condition, though... Stylistically, > it's a bit backwards to have a command that takes a graph as input, if > only complete graphs are valid input. > (Stretching a bit, it's kind of > like having a command accepting a list of numbers as input, with the > constraint that the numbers must all be equal to 1.) Even if the > implementation very naturally exercises [struct::graph]s, one might > want to take the time to think about whether a [struct::graph] is the > most natural form for the input. If the implementation needs to > preprocess the graph so that it is metrically complete, then maybe the > output from the command that computes all shortest distances would be > more practical, but if it can do the completion implicitly (which seems > plausible for this algorithm; add uw at cost c(uv)+c(vw) to the queue > once uv and vw have been processed) then a [struct::graph] as input > becomes much more appropriate. That sounds like a good way of completing a graph. Better than just adding Inf edges which kills the metric, often. > In the latter case, the command > documentation should take the "distance in general graph" point of > view, rather than speaking about costs of individual edges. Andreas |
From: Andreas K. <and...@ac...> - 2009-06-25 23:00:53
|
Andreas Kupries wrote: > Michał Antoniewski wrote: >> Hello. >> >> I updated some tests for Max-Cut ( testing subprocedures seperately ). > > Ok, will review. > 515: #create complete graph > 516: foreach v [$oddTGraph nodes] { > 517: foreach u [$oddTGraph nodes] { > 518: if { ![$oddTGraph arc exists $u$v] & ($u ne $v) This triggered when I was reading the max-cut code. Tcl does short-cut evaluation of conditions. In other words, the code above will currently always look for a loop-edge for u == v. It is better to write the simple condition u != v first, and when it returns false the more complex 'arc exists' check is never done. Also note that using & is likely a bug. The &-operator is the bitwise-and. You likely want the &&-operator, the 'logical and'. if { ($u ne $v) && ![$oddTGraph arc exists $u$v] } { > 554: > 555: proc ::struct::graph::op::findHamiltonCycle {G originalEdges originalGraph} { > 556: > 557: isEulerian? $G tourvar > 558: lappend result [$G arc source [lindex $tourvar 0]] > 559: > 560: lappend tourvar [lindex $tourvar 0] > 561: > 562: foreach i $tourvar { > 563: set u [$G arc target $i] > 564: > 565: if { $u ni $result } { > 566: set va [lindex $result end] > 567: set vb $u > 568: > 569: if { ("$va $vb" in $originalEdges) || ("$vb $va" in $originalEdges) } { [list $va $vb] etc. in all places (i.e. createCompleteGraph). The use of "" still allows the construction of colliding edge names. Ah, I note that the code in line 643 actually uses [list] to create arc names. So, no collisions, but still a bug. Because you have to use [list] here as well, when searching for the arcs. With the current code I can give you node names which become arc names you will not find. Example nodes 'a b' and 'c', the arc name constructed by [list] from that is '{a b} c', and you are searching for 'a b c' because of "..." instead of list. An optimization to consider in the future is to make the case of a complete graph as input fast again. I.e. createCompleteGraph could note when it did not add any arcs, and this information can be used here to avoid the searches by 'in'. An alternative providing the same information would be to record the arcs which were added instead of the originals, and then replace the condition with the complement, i.e. 'arc ni addedArcs'. A third alternative would be to special case this procedure, i.e. check if no arcs were added first, and use the original code and loop in that case, without checking at each step. > 570: lappend result $u > 571: } else { > 572: > 573: set path [dict get [struct::graph::op::dijkstra $G $va] $vb] > 574: > 575: #reversing the path > 576: set path [lreverse $path] > 577: #cutting the start element > 578: lremove path [lindex $path 0] Cutting the start element is easier by using the builtin command 'lrange'. set path [lrange $path 1 end] Even a direct use of 'lreplace' is easier than to run lindex, then lsearch (inside of lremove), and then lreplace. We know it is in index 0, no search is needed. The builtins lindex, lrange, lreplace are all index based. > 579: > 580: #adding the path and the target element > 581: lappend result $path The result is a list of nodes. What is added here is not a node as element, but a list of node, making the result a list of (nodes and lists of nodes). Likely meant is lappend result {*}$path which puts the nodes in the path into the result as independent nodes, and not as single list (list expansion operator). > 582: lappend result $vb > 583: } > 584: } > 585: } > 586: > 587: set path [dict get [struct::graph::op::dijkstra $originalGraph [lindex $result 0]] [lindex $result end]] > 588: set path [lreverse $path] > 589: > 590: lremove path [lindex $path 0] See above at line 578. > 591: if { $path ne {} } { path is a list, should be checked using a list operation. 'ne' is a string operation. if {[llength $path]} ... > 592: lappend result $path See above at line 581. > 593: } > 594: lappend result [$G arc source [lindex $tourvar 0]] > 595: return $result > 596: } > 597: > 628: proc ::struct::graph::op::createCompleteGraph {G originalEdges} { > 629: > 630: upvar $originalEdges st > 631: set st {} > 632: foreach e [$G arcs] { > 633: set v [$G arc source $e] > 634: set u [$G arc target $e] > 635: > 636: lappend st "$v $u" Use [list] instead of "...", see above at line 569. > 637: } > 638: > 639: foreach v [$G nodes] { > 640: foreach u [$G nodes] { > 641: if { ("$v $u" ni $st) && ("$u $v" ni $st) && ($u ne $v) && ![$G arc exists [list $u $v]] } { Use [list] instead of "...", see above at line 569 as well Also, move the condition ($u ne $v) to the front, see my comments at line 518 about short-cut evaluation of conditions. > 642: $G arc insert $v $u [list $v $u] > 643: $G arc setweight [list $v $u] Inf Use of [list] is good. > 650: proc ::struct::graph::op::lreverse l { > 651: set result {} > 652: set i [llength $l] > 653: incr i -1 > 654: while {$i >= 0} { > 655: lappend result [lindex $l $i] > 656: incr i -1 > 657: } > 658: return $result > 659: } This procedure is not needed. Tcl 8.5 has a lreverse builtin command. > 780: #K Center Problem - 2 approximation algorithm > 781: #------------------------------------------------------------------------------------- > 782: # > 783: # > 784: # > 785: # > 786: # > 787: > 788: #subprocedure creating graph with given set of edges > 789: proc ::struct::graph::op::createGi {G E} { > 790: > 791: set Gi [struct::graph] > 792: > 793: foreach v [$G nodes] { > 794: $Gi node insert $v > 795: } > 796: > 797: foreach e $E { > 798: set v [$G arc source $e] > 799: set u [$G arc target $e] > 800: > 801: $Gi arc insert $v $u $e > 802: } > 803: > 804: return $Gi > 805: } > 806: > 807: #subprocedure creating from graph G two squared graph > 808: #G^2 - graph in which edge between nodes u and v exists, > 809: #if and only if, when distance (in edges, not weights) > 810: #between those nodes is not greater than 2 and u != v. > 811: > 812: proc ::struct::graph::op::createTwoSquaredGraph {G} { > 813: > 814: set H [struct::graph] > 815: set copyG [struct::graph] > 816: > 817: foreach v [$G nodes] { > 818: $H node insert $v > 819: $copyG node insert $v > 820: } > 821: > 822: foreach e [$G arcs] { > 823: set v [$G arc source $e] > 824: set u [$G arc target $e] > 825: > 826: $copyG arc insert $v $u [list $v $u] > 827: $copyG arc setweight [list $v $u] 1 > 828: } > 829: > 830: foreach v [$copyG nodes] { > 831: foreach u [$copyG nodes] { > 832: if { [distance $copyG $v $u] <= 2 && ![$H arc exists [list $u $v]] && ($u ne $v)} { My understanding is that copyG is simply an exact copy of G, just with arc weights forced to 1 for the distance calculation, right ? Move condition ($u ne $v) to the front to stop execution of the expensive distance check for u == v. I believe that we should move the 'arc exists' check before the distance computation as well. Right now we even compute distances which do not matter because of the other conditions. An optimization should be possible here ... The distance command is dijsktra underneath, which has approx complexity O(n**2) in the number of nodes (depending on the complexity of our prioqueue we may get O(n*log n + e) ... The double loop around the distance calls here is complexity O(n**2), for a total of O(n**4) complexity. If we run dijkstra for each node once, saving the distance results then we have O(n**3) complexity, with the double loop here reduced to O(n**2), not that it then matters. > 833: $H arc insert $v $u [list $v $u] > 834: } > 835: } > 836: } > 837: copyG is not deleted, memory leak. > 838: return $H > 839: } Andreas |
From: Andreas K. <and...@ac...> - 2009-07-02 16:44:55
|
Michał Antoniewski wrote: > Vertices Cover algorithm is availiable - implementation + tests. Looking forward to the commit ... I.e. this doesn't seem to be available in the repository yet. > After implementing standard version, I tested it and it reached max > approximation factor unlikely often. So, I came up with a bit of > improvement and now it finds far better solutions. > > Generally, algorithm works by picking an edge, adding source and target > nodes to solution and erasing those two nodes ( so, all edges adjecent > to them also disappear ). My idea, was to sort at the beginning pairs of > nodes (only those between which there is an edge ) with degrees and then > do the algorithm steps. This gives nice boost because, at first we add > those nodes to solution, that when erased the most of the edges will > disappear. Reading up on the definition ... Yes, that sounds sensible. > I extended also procedure sortEdges into sortGraph, which now takes > graph and sort option at input, and returns sorted dictionary ( edges > and weights or nodes and degrees, I think can be useful ). Maybe, it > will be good idea to put the switch there, and more option parameters > can appear to make this general sorting procedure... Just wondering....:) Hm. > Now I remembered that you've written your implementation for sorting > too, in one of previous mails..... I'm gonna review it tomorrow and do > the possible change. > Moreover, I think it could also be useful to implement procedure > "graphName node degrees" (in the same fashion the "graphName arc > weights" is constructed). If you're interested in it, we can add it e.g. > at the end of timeline ( when it turns out that at the end, there will > be other tasks to do, I can do it later, if you like... ). Hm. Could be convenient. On the other hand, this is also [$G node degrees $N] <=> [llength [$G node degree $N]] so the method would be mostly syntactical sugar ... Lets decide at the end. > Going back to Vertices Cover, there is also other way to implement it > (with the same approximation factor) using Greedy Max Matching (which is > implemented already). But I think that version will be less suitable for > other improvements that can be possibly done ( like that degree > improvement ). > > Tomorrow, I hope full K-Center and starting with flow algorithms. Andreas. |
From: Michał A. <ant...@gm...> - 2009-07-02 22:54:52
|
Another update available - Fully Working K-Center. A little description. KCenter procedure uses such subprocedures: sortEdges, extendTwoSquaredGraph and GreedyMaxIndependentSet. Generally, it goes like this: - it searches for solution by extending two-squared graph G(i)**2 with another edge during each iteration - the edges are in sorted sequence - for each found G(i)**2, it finds Mi - maximal independent set with greedy algorithm - if |Mi| <= k, the search is ended and solution can be returned When reviewing code, please go straightly to revision 31 - it's well commented, I think it will be fast to go through the code. About GreedyMaxIndependentSet: there are some tests provided, e.g. test concerning graph from wiki reference (so there is a graph picture at the same time:) ). I've also corrected bug that, I found in extendTwoSquaredGraph procedure. I'm going to add some test cases to cover that case. Best Regards, Michal Antoniewski |
From: Michał A. <ant...@gm...> - 2009-06-25 15:59:45
|
Regarding mails before: Crush and crash....yeah my mistake:) I went with wrong letter...:) thx:) >>Now that I am focused on this I do however see one possible problem with the naming. Consider two arcs X and Y, the first >>between the nodes 'a' and 'bc' and the second between 'ab' and 'c. >> X => $u$v = 'a' + 'bc' = 'abc' >> Y => $u$v = 'ab' + 'c' = 'abc' Oh, I accidentaly ommited that case... Right, the same problem was considering Bellman's Ford, but this time I did one bad assumption and lost it. Anyways, just changing keys to lists this time didn't do the trick, because there was a problem during execution of isEulerian? method. Fleury method ( used by Eulerian ) returns error, when it's given such graphs. I will have to take a deeper look into it, as firstly I wanted to end the new TSP implementation. Probably, it's not a big problem, will see.... >>This command makes the command ::struct::graph available as ::graph >>The test environment masked the bug. >>As reminder for myself to get rid of this >> https://sourceforge.net/tracker/?func=detail&aid=2811747&group_id=12883&atid=112883 Oh, right... so now it's obvious why tests went good everytime.... I'm glad that by accident it helped to find a bug. >>Now I am reminded of RosettaCode, a nice site for exmaple tasks implemented by lots of languages for comparison. >>Tcl's page is >> http://www.rosettacode.org/wiki/Category:Tcl >>And >> http://www.rosettacode.org/wiki/Tasks_not_implemented_in_Tcl Interesting! Actually, now after spending some time coding in Tcl, I'm a bit surprised, how it happened that Tcl was almost fully omitted during classes at my University.... We was using Tcl scripts for example, when doing some simulations on computer networks, but when it came to subjects about programming in script languages, there was nothing about Tcl...weird. Best Regards, Michal Antoniewski |
From: Andreas K. <and...@ac...> - 2009-06-25 16:18:56
|
Michał Antoniewski wrote: > Regarding mails before: > > Crush and crash....yeah my mistake:) I went with wrong letter...:) thx:) No problem :) Should you ever read fanfiction you will find that things can be very much worse when it comes to mangling the english language ... Even mistakes like canon/cannon role/roll etc. are quite harmless compared to some atrocities seen. > >>Now that I am focused on this I do however see one possible problem > with the naming. Consider two arcs X and Y, the first >>between the > nodes 'a' and 'bc' and the second between 'ab' and 'c. > >> X => $u$v = 'a' + 'bc' = 'abc' > >> Y => $u$v = 'ab' + 'c' = 'abc' > > Oh, I accidentaly ommited that case... Right, the same problem was > considering Bellman's Ford, but this time I did one bad assumption and > lost it. Anyways, just changing keys to lists this time didn't do the > trick, because there was a problem during execution of isEulerian? > method. Uh. That is then a bug in isEulerian?. We should be able to handle all types of node names. > Fleury method ( used by Eulerian ) returns error, when it's > given such graphs. Definitely sounding like a bug in the existing code. > I will have to take a deeper look into it as firstly > I wanted to end the new TSP implementation. Probably, it's not a big > problem, will see.... Please file a bug report for isEulerian?/Fleury about this (*) _if_ the search for the bug in Fleury takes to much of your time. (Ad *) At https://sourceforge.net/tracker/?group_id=12883&atid=112883 > >>This command makes the command ::struct::graph available as ::graph > >>The test environment masked the bug. > >>As reminder for myself to get rid of this > >>https://sourceforge.net/tracker/?func=detail&aid=2811747&group_id=12883&atid=112883 > <https://sourceforge.net/tracker/?func=detail&aid=2811747&group_id=12883&atid=112883> > > Oh, right... so now it's obvious why tests went good everytime.... I'm > glad that by accident it helped to find a bug. I'm glad as well. > >>Now I am reminded of RosettaCode, a nice site for exmaple tasks > implemented by lots of languages for comparison. > >>Tcl's page is > >> http://www.rosettacode.org/wiki/Category:Tcl > >>And > >> http://www.rosettacode.org/wiki/Tasks_not_implemented_in_Tcl > > Interesting! Actually, now after spending some time coding in Tcl, I'm a > bit surprised, how it happened that Tcl was almost fully omitted during > classes at my University.... We was using Tcl scripts for example, when > doing some simulations on computer networks, but when it came to > subjects about programming in script languages, there was nothing about > Tcl...weird. Tcl is sometimes called the best-kept secret among the scripting languages. Companies don't want you to know the secret of their success, you might be came as effective as them in writing solutions in this easily used language. This is a bit tongue-in-cheek, but also has a kernel of truth. Hopefully we, the community, manage to change that. Andreas |
From: Michał A. <ant...@gm...> - 2009-06-29 19:24:56
|
New update available. A little description, what's new. 1. I found out, what was wrong with Fleury procedure. struct::set subtract arcs $arc - that line (1630) doesn't delete any edges from current set of edges, when edge looks like this [list $u $v], so it was causing trouble. set arcs [$copy arcs] - I replaced it with that line, which straightly sets current set of edges used by algorithm from the copy of graph algorithm is working on. Actually, IMHO that variable "arcs" can be erased from that implementation. Now, all tests are passed and "ab c/a bc" problem is corrected in TSP implementations. 2. Christofides is using Greedy Max Matching implementation now. For graph which is a Tight Example, Max Matching finds right solution, and Christofides goes well too, reaching max approximation factor. I've created sorting set of edges for a graph as a subprocedure, as it can be useful. It is used in Max Matching to choose edges with lowest cost at first. It will be also useful for K-Center problems. Tests for both TSP problems are updated. I will quickly add some more for Christofides in next update. I'm placing tests for subprocedures in files, where tests for procedures which use them are....should I change it and create separate test files for them? And should they be tested treating like separate procedures? What I mean, is for example: is the assumption that weights on arcs in graph given at input properly set proper ( because previous procedure, which uses that subprocedure, checks that before ), or there should be at the beginning: $VerifyWeightsAreOk $G and appropriate tests for it. I've added also condition for checking, if given graph is connected. 3. Changes in implementation, considering remarks from previous mails for Max-Cut and TSP problems. Next coming, Vertices Cover and further implementations and corrections for Metric K-Center. Best Regards, Michal Antoniewski |
From: Andreas K. <and...@ac...> - 2009-06-30 18:06:43
|
Michał Antoniewski wrote: > New update available. A little description, what's new. > > 1. I found out, what was wrong with Fleury procedure. > > struct::set subtract arcs $arc - that line (1630) doesn't delete any > edges from current set of edges, when edge looks like this > [list $u $v], so it was causing trouble. Hm. A bug in the struct::set implementation ?! Can you send me your example where Fleury failed without your change ? I will then go through it and see what I can find what struct::set is doing ... Ah ... Now I see ... We are trying to subtract a single arc, but 'subtract' takes a _set_ as second argument. That is ok only as long as the $arc name doesn't contain whitespace. The moment it does the code tries to remove the parts of the name, considering them the elements, and they are not found, naturally. So, changing 'subtract' to 'exclude' is also a fix for this bug. That is the correct method for removing a single element from a set. The bug was in fleury, not in struct::set. Whew. Please file a bug about this at the tcllib bug tracker at sourceforge. A quick look over graphops.tcl showed me at least one other place where this is wrong, TarjanSub. I have to review the whole graphops.tcl to see if there are more. I will then also add test cases which would trigger the problem before the fix(es) got in. Your fix works, because arcs is in essence a duplicate of the set of arcs in copy, as you saw. > set arcs [$copy arcs] - I replaced it with that line, which > straightly sets current set of edges used by algorithm from the copy of > graph algorithm is working on. Actually, IMHO that variable "arcs" can > be erased from that implementation. In principle yes. In practice I (or Alejandro) believe that using the set is faster than to continuously query the copy for its arcs. I.e. trading off memory against speed. > Now, all tests are passed and "ab c/a bc" problem is corrected in TSP > implementations. Ok. > > 2. Christofides is using Greedy Max Matching implementation now. For > graph which is a Tight Example, Max Matching finds right solution, and > Christofides goes well too, reaching max approximation factor. Ok. > I've created sorting set of edges for a graph as a subprocedure, as it > can be useful. It is used in Max Matching to choose edges with lowest > cost at first. It will be also useful for K-Center problems. Ok. > Tests for both TSP problems are updated. I will quickly add some more > for Christofides in next update. Yes, saw them. > > I'm placing tests for subprocedures in files, where tests for procedures > which use them are....should I change it and create separate test files > for them? No need to change this, having them in the same file is ok. > And should they be tested treating like separate procedures? > What I mean, is for example: is the assumption that weights on arcs in > graph given at input properly set proper ( because previous > procedure, which uses that subprocedure, checks that before ), or there > should be at the beginning: > > $VerifyWeightsAreOk $G and appropriate tests for it. As it is a helper procedure for a specific caller there is no need to recheck the conditions caller has already checked. Even if we wish to make this helper available to users of the graph ops package it is better to write and export a small wrapper procedure to perform the necessary checks, and keep them out of the helper itself. > > I've added also condition for checking, if given graph is connected. Ok. > 3. Changes in implementation, considering remarks from previous mails > for Max-Cut and TSP problems. > Next coming, Vertices Cover and further implementations and corrections > for Metric K-Center. And now for the code of rev 26 ... > 528: if { $u != $v } { > 529: if { ![$oddTGraph arc exists [list $u $v]] } { > 530: $oddTGraph arc insert $v $u [list $v $u] > 531: $oddTGraph arc setweight [list $v $u] [distance $G $v $u] > 532: } > 533: } Note that when I said to pull the ($u ne $v) condition to the front I did not really meant to make it a separate if command, just to change the order in the existing command. That is what the shortcut evaluation of conditions is about, namely that the code if { ($u ne $v) && ![$oddTGraph arc exists [list $u $v]] } { ... } is internally evaluated exactly as you now wrote explicitly. We can now leave it as is however, your call. > 562: > 563: #Greedy Max Matching procedure, which finds maximal ( not maximum ) matching > 564: #for given graph G. It adds edges to solution, beginning from edges with the > 565: #lowest cost. > 566: > 567: proc ::struct::graph::op::GreedyMaxMatching {G} { > 568: > 569: set copyG [struct::graph] The copyG doesn't seem to be necessary. I see only two places where it is used: 572-574, adding the nodes, and then line 595, where it is destroyed. > 570: set maxMatch {} > 571: > 572: foreach v [$G nodes] { > 573: $copyG node insert $v > 574: } > 575: > 576: foreach e [sortEdges $G] { > 577: set v [$G arc source $e] > 578: set u [$G arc target $e] > 579: set neighbours [$G arcs -adj $v $u] > 580: set param 1 param seems to mean 'e has no adjacent arcs in the matching'. Maybe name it 'ok' ? Or 'noAdjacentArc' ? > 581: > 582: lremove neighbours $e > 583: > 584: foreach a $neighbours { > 585: if { $a in $maxMatch } { > 586: set param 0 The moment we have found one adjacent arc in the matching there is no need to check any other. I.e. we can break the loop. > 587: } > 588: } > 589: if { $param } { > 590: lappend maxMatch $e > 591: } > 592: } > 593: > 594: > 595: $copyG destroy > 596: return $maxMatch > 597: } > 599: #Subprocedure which for given graph G, returns the set of edges > 600: #sorted with their costs. > 601: proc ::struct::graph::op::sortEdges {G} { > 602: set weights [$G arc weights] > 603: > 604: set sortedEdges {} > 605: > 606: foreach val [lsort [dict values $weights]] { > 607: foreach x [dict keys $weights] { > 608: if { [dict get $weights $x] == $val } { > 609: set weights [dict remove $weights $x] > 610: lappend sortedEdges $x ;#[list $val $x] > 611: } > 612: } > 613: } > 614: > 615: return $sortedEdges > 616: } It took me a while to understand this procedure. Iteration over the sorted weights, and adding all arcs with that weight to the result, and removed from the input ... It should work ... For comparison, in such a situation I usually do the sorting like this set tmp {} foreach {arc weight} [$G arc weights] { lappend tmp [list $arc $weight] } set sortedEdges {} foreach item [lsort -index 1 $tmp] { lappend sortedEdges [lindex $item 0] } The first loop transforms a list of the form {a1 w1 a2 w2 ...} into the form {{a1 w1} {a2 w2} ...} Then we can use the -index option of lsort to sort the {aX wX} elements by the second element, i.e. the weights. At last, the second loop then pulls the arcs out of the sorted list. If we were using Tcl 8.6, which we don't, we could even write return [dict keys [lsort -stride 2 -index 1 [$G arc weights]]] Here the -stride 2 does the grouping of my first loop internally, just for the comparison. The result is the dictionary sorted by weight, and then we can pull the arcs as the keys of that list. > 918: if { $u != $v } { > 919: if { ![$H arc exists [list $u $v]] } { > 920: if { [distance $copyG $v $u] <= 2 } { > 921: $H arc insert $v $u [list $v $u] > 922: } > 923: } > 924: } See also my notes at 528-533, and the big mail where I reconstructed the whole thing. |
From: Michał A. <ant...@gm...> - 2009-06-29 22:24:51
|
>>A graph with a Hamilton cycle is said to be /Hamiltonian/, so I suppose the term you were looking for here is "non->>Hamiltonian". That's right. Thanks. >>You've got maximum and maximal the wrong way round here. Finding one maximal (w.r.t set inclusion) independent set is >>easy (add vertices until you've covered the graph). Finding one that achieves the maximum cardinality is hard. Oh, I've mistaken the endings...AL with UM...sorry for confusion. >>Lars Hellström wrote: >>You've got maximum and maximal the wrong way round here. Finding one maximal (w.r.t set inclusion) independent set is >>easy (add vertices until you've covered the graph). Finding one that achieves the maximum cardinality is hard. >>Yes. I just googled for >> weighted metric k-center >>and found the PDF >>http://www.corelab.ntua.gr/courses/approx-alg/material/k-center.pdf >>which seems to be the basis of Michal description, it is very similar. >>And it also describes it as finding maximal, not maximum, the latter being NP-hard. Actually, my main source is great book written by Vijay V. Vazirani - Approximation Algorithms. That pdf is indeed very similar to what I've written, I think it's possible it was based on the book I mentioned ( all naming, as far I can see, is the same ). I think, except that confusion with accidentally replacing the endings, the rest of description was allright. Best Regards, Michał Antoniewski |
From: Andreas K. <and...@ac...> - 2009-06-30 18:07:34
|
Michał Antoniewski wrote: > >>A graph with a Hamilton cycle is said to be /Hamiltonian/, so I > suppose the term you were looking for here is "non->>Hamiltonian". > > That's right. Thanks. > > >>You've got maximum and maximal the wrong way round here. Finding one > maximal (w.r.t set inclusion) independent set is >>easy (add vertices > until you've covered the graph). Finding one that achieves the maximum > cardinality is hard. > > Oh, I've mistaken the endings...AL with UM...sorry for confusion. Well, we are here to help :) > >>Lars Hellström wrote: > >>You've got maximum and maximal the wrong way round here. Finding one > maximal (w.r.t set inclusion) independent set is >>easy (add vertices > until you've covered the graph). Finding one that achieves the maximum > cardinality is hard. > > >>Yes. I just googled for > >> weighted metric k-center > >>and found the PDF > >>http://www.corelab.ntua.gr/courses/approx-alg/material/k-center.pdf > >>which seems to be the basis of Michal description, it is very similar. > >>And it also describes it as finding maximal, not maximum, the latter > being NP-hard. > > Actually, my main source is great book written by Vijay V. Vazirani - Googling... http://www.cc.gatech.edu/~vazirani/ Him I guess. > Approximation Algorithms. Yes, the book is listed on the page > That pdf is indeed very similar to what I've > written, I think it's possible it was based on the book I mentioned ( > all naming, as far I can see, is the same ). That is possible. The higher url http://www.corelab.ntua.gr/courses/approx-alg/ is unfortunately all greek to me (literally). I.e. if he is referencing the book as source, I can't find it, and the pdf doesn't list references > I think, except that confusion with accidentally replacing the endings, > the rest of description was allright. Yes, I thought so too, especially after reading the k-center.pdf slides. Andreas. |
From: Michał A. <ant...@gm...> - 2009-06-30 18:17:38
|
I updated new implementations of subprocedures for K-Center. Thanks for your description regarding creating twosquared graph for K-Center - very good idea, it frees from using dijkstra and path finding well. So, there is now procedure extendTwoSquaredGraph, which takes at input graph G(i-1)**2, graph G(i) and two nodes ( source and target of Ei - edge we are adding at current iteration ). As I understood, G(i) will be needed to find proper neighbours, when finding edges with distance == 2, for nodes u and v. So, creating proper G(i), before extendTwoSquaredGraph goes, will be just adding new edge Ei. There is also procedure createSquaredGraph, which counts two-squared graph from graph G given at input - implemented as you described, without any path finding algorithms. About my first implementation, I didn't erased it yet. Actually, I think it can be useful for creating procedure that count X-Squared graph for graph G and positive int X given at input, as it would need only adding one param at input and little changes in a code, if I'm not mistaken. Procedure creating X-Squared graphs might be useful new graph operation. Best Regards, Michal Antoniewski . |
From: Andreas K. <and...@ac...> - 2009-06-30 20:54:31
Attachments:
graphops.tcl.numbered
|
Micha? Antoniewski wrote: > I updated new implementations of subprocedures for K-Center. > > Thanks for your description regarding creating twosquared graph for > K-Center - very good idea, it frees from using dijkstra and path finding > well. > > So, there is now procedure extendTwoSquaredGraph, which takes at input > graph G(i-1)**2, graph G(i) and two nodes ( source and target of Ei - > edge we are adding at current iteration ). As I understood, G(i) will be > needed to find proper neighbours, when finding edges with distance == 2, > for nodes u and v. Right. I missed that in my description, so we need two graphs during the iteration, Gi and Gi**2. > So, creating proper G(i), before extendTwoSquaredGraph goes, will be > just adding new edge Ei. Right. > > There is also procedure createSquaredGraph, which counts two-squared > graph from graph G given at input - implemented as you described, > without any path finding algorithms. Ok. > About my first implementation, I didn't erased it yet. Actually, I think > it can be useful for creating procedure that count X-Squared graph for > graph G and positive int X given at input, as it would need only adding > one param at input and little changes in a code, if I'm not mistaken. > Procedure creating X-Squared graphs might be useful new graph operation. Yes, it might be useful ... For some X we can repeatedly square, i.e (G**2)**2 => G**4, etc. For general X I currently have no idea how to avoid the [distance] calculation. Review revision 28 > 913: #Subprocedure, which for graph given at input, returns a graph H, > 914: #which is a two-squared graph G. > 915: > 916: proc ::struct::graph::op::createSquaredGraph {G} { > 917: > 918: set H [struct::graph] > 919: foreach v [$G nodes] { > 920: $H node insert $v > 921: } > 922: > 923: foreach v [$G nodes] { > 924: foreach u [$G nodes -adj $v] { > 925: if { ($v != $u) && ![$H arc exists [list $v $u]] && ![$H arc exists [list $u $v]] } { > 926: $H arc insert $u $v [list $u $v] > 927: } > 928: foreach z [$G nodes -adj $u] { > 929: if { ($v != $z) && ![$H arc exists [list $v $z]] && ![$H arc exists [list $z $v]] } { > 930: $H arc insert $v $z [list $v $z] > 931: } > 932: } > 933: } > 934: } > 935: > 936: return $H > 937: } Yes, this is how I envisioned the creation of a squared graph from scratch. > 939: #subprocedure for Metric K-Center problem > 940: # > 941: #Input: > 942: #previousGsq - graph G(i-1)**2 > 943: #currentGi - graph G(i) > 944: #u and v - source and target of an edge added in this iteration > 945: # > 946: #Output: > 947: #Graph G(i)**2 used by next steps of K-Center algorithm > 948: > 949: proc ::struct::graph::op::extendTwoSquaredGraph {previousGsq currentGi u v} { > 950: > 951: #solution graph G(i)**2 > 952: set nextGsq [struct::graph] Why creating a new graph ? In the incremental approach I envisioned that the 'previousGsq' would be manipulated directly (in place) to become the "nextGsq", content wise. > 954: #setting right set of nodes for solution graph > 955: foreach _v [$previousGsq nodes] { > 956: $nextGsq node insert $_v > 957: } > 958: #setting edges from graph G(i-1)**2 for solution graph > 959: foreach e [$previousGsq arcs] { > 960: set _v [$previousGsq arc source $e] > 961: set _u [$previousGsq arc target $e] > 962: $nextGsq arc insert $_v $_u $e > 963: } The above copying of nodes and edges/arcs is not necessary for nextGsq == previousGsq. > 965: #adding new edge > 966: if { ![$nextGsq arc exists [list $v $u]] } { > 967: $nextGsq arc insert $u $v [list $u $v] > 968: } > 969: > 970: #adding new edges to solution graph: > 971: #here edges, where source is a $u node and targets are neighbours of node $u except for $v > 972: foreach x [$currentGi nodes -adj $u] { > 973: if { $x != $v } { > 974: if { ![$nextGsq arc exists [list $v $x]] && ![$nextGsq arc exists [list $x $v]] } { > 975: $nextGsq arc insert $u $x [list $v $x] > 976: } > 977: } > 978: } > 979: #here edges, where source is a $v node and targets are neighbours of node $v except for $u > 980: foreach x [$currentGi nodes -adj $v] { > 981: if { $x != $u } { > 982: if { ![$nextGsq arc exists [list $u $x]] && ![$nextGsq arc exists [list $x $u]] } { > 983: $nextGsq arc insert $v $x [list $u $x] > 984: } > 985: } > 986: } That looks all ok. Andreas |
From: Andreas K. <and...@ac...> - 2009-06-19 18:34:05
|
Andreas Kupries wrote: > Michał Antoniewski wrote: > >> Update 1 (Revision 12): New version of Adjacency List. >> >> What's new: >> - changed the implementation considering your regards - now undirected >> case is simpler >> - lsearch isn't used now --> lsort >> - and other lesser changes. >> >> You were of course right about implementation, now it looks much >> simpler. Sometimes you have to just look at things from a slightly different angle. Ok, yes, that looks much nicer. And it should be easier to understand as well. Now, I noticed that you changed the tests for adjlist a bit. It seems that the results did not change in their essence, but only in the order nodes were sorted in the various (sub)lists, right ? In other words the changed results are in the same equivalence class as the previous ones. There is no significant difference between {n1 {n2 n3} n2 n1 n3 n1} and {n2 n1 n1 {n3 n2} n3 n1}, just order, which is irrelevant for our dictionaries. This is another issue often seen with test suites. That we check results for exact identity, causing the rejection of perfectly correct results because they are not exactly identical to whatever the previous algorithm created as its result. This was particular seen in test suites, here in Tcllib, when the Tcl core went from 8.4 to 8.5. The hash table implementation of the core changed in that transition, causing the results of 'array get' and 'array names' to change, i.e. elements were returned in a different order. This tripped all the testsuites which were not prepared to handle the equivalence class of correct results and tested exactly instead. The solution we implemented was to 'lsort' and 'dictsort' (*) the results of the tested commands, projecting all results in an equivalence class to a canonical form for that class. This canonical form can then be tested for exactly. For the two adjlist algorithms you implemented it was simply details in the implementation which changed the order. I do not believe that we have to update the testsuite right now to take this possibility into account, however please be aware this in the future. Maybe not as easy to write a specialized sorting routine which collapses equivalence classes of results into a canonical form, but after such is done algorithm changes may not require testsuite changes because of leaked implementation details. (*) And custom sorting commands for more complex data structures. Andreas. |