From: Andreas K. <and...@ac...> - 2009-06-23 18:00:48
|
Andreas Kupries wrote: > And for me it is now time to look at the max-cut code you committed yesterday. > After a bit of regular work. > 572: #Maximum Cut - 2 approximation algorithm > 573: #---------------------------------------------------------------------- > 574: # > 575: # > 576: # > 577: # > 578: # > 579: #Reference: > 580: # > 581: > 582: proc ::struct::graph::op::MaxCut {G U V} { > 583: > 584: upvar $U _U > 585: upvar $V _V > 586: set nodeNumber [llength [$G nodes]] > 587: set _U {} > 588: set _V {} > 589: set counter 0 > 590: > 591: foreach v [lsort -dict [$G nodes]] { > 592: if { [expr {$counter % 2}] eq 0 } { if {($counter % 2) == 0} { > 593: lappend _U $v > 594: } else { > 595: lappend _V $v > 596: } > 597: incr counter > 598: } Actually, the whole block (lines 591-598) can be made simpler, by using an advanced form of the foreach command. We can iterate using multiple variables. Of the two possible forms we will want foreach {u v} [lsort -dict [$G nodes]] { lappend _U $u if {$v eq ""} continue lappend _V $v } For the list {a b c d e} the variables u and v are assigned u=a, v=b u=c, v=d u=e, v="" This should also explain the check done before extending the list _V. For a list of odd length the last round through the loop will leave the variable v empty. > 599: > 600: set val 1 > 601: set ALG 0 > 602: while {$val>0} { > 603: set val [cut $G _U _V [countEdges $G $_U $_V]] > 604: if { $val > $ALG } { > 605: set ALG $val > 606: } > 607: } > 608: return $ALG > 609: } > 610: > 611: #procedure replaces nodes between sets and checks if that change is profitable > 612: proc ::struct::graph::op::cut {G U V param} { > 613: > 614: upvar $U var1 > 615: upvar $V var2 I believe that it is more idiomatic to write proc ::struct::graph::op::cut {G Uvar Vvar param} { upvar $Uvar U upvar $Vvar V I.e. to make explicit that the arguments are variable names, in their names, and to name the variables they are linked to in the scope after them. 'var1' and 'var2' are very non-explanatory names. > 616: set _V {} > 617: set _U {} > 618: set value 0 > 619: > 620: set maxValue $param > 621: set _U [reset $var1] > 622: set _V [reset $var2] See my notes blow on 'reset', leaving mean to wonder why not just set _U $var1 set _V $var2 because that is what is currently happening with the current definition of 'reset'. > 623: > 624: foreach v [$G nodes] { > 625: > 626: if { [lsearch $_U $v] <0 } { As we are using Tcl 8.5 we can use the 'in' and 'ni' operators. The latter means 'not in'. They are applicable because we do not care about the exact position, only about the contained/!contained information. if { $v ni $_U } { ; # <=> v not in U ? > 627: lappend _U $v > 628: lremove _V $v > 629: set value [countEdges $G $_U $_V] > 630: } else { > 631: lappend _V $v > 632: lremove _U $v > 633: set value [countEdges $G $_U $_V] > 634: } > 635: > 636: if { $value > $maxValue } { > 637: set var1 [reset $_U] > 638: set var2 [reset $_V] > 639: set maxValue $value > 640: } else { > 641: set _V [reset $var2] > 642: set _U [reset $var1] > 643: } > 644: } In the above, is it necessary to have a single loop ? If not, we could maybe two loops, one over the nodes in V, the other over the nodes in U, obviating the need to perform the 'lsearch check'. The one trouble I see with my idea is that the changes (moves of nodes to the other set) made by the first loop would affect the second one. Or, to avoid this, more book-keeping is needed to know which nodes moved and can be ignored by the second loop. So at second glance maybe better not, it doesn't feel anymore as if the code would be simpler. Same complexity I guess. Ah well. Even so, some other thing to consider in the longer-term future would be an investigation into optimization of the countEdges calls in the loop. Because, as only one node N has moved, only its edges can affect the count (all outgoing edges of N which were to the same set are now to the other set (+), and all edges to the other set are now to the same set (-)). This would avoid a full recount. And also, actually allow us to compute the new value before moving the node at all. > 645: > 646: set value $maxValue > 647: > 648: if { $value > $param } { > 649: return $value > 650: } else { > 651: return 0 > 652: } > 653: } > 654: #Removing element from the list - auxiliary procedure > 655: proc ::struct::graph::op::lremove {listVariable value} { > 656: upvar 1 $listVariable var > 657: set idx [lsearch -exact $var $value] > 658: set var [lreplace $var $idx $idx] > 659: } Alternative to lremove is to add package require struct::list and then use struct::list delete Doesn't need to be done yet. If we have only one place where we need this it is more effective to stay with the local 'lremove' instead of pulling in another big package we use only a very small part of. > 660: #Auxiliary procedure to set again proper values of two cut sets > 661: proc ::struct::graph::op::reset {U} { > 662: set value {} > 663: foreach u $U { > 664: lappend value $u > 665: } > 666: return $value > 667: } This whole procedure seems to be equivalent to proc ::struct::graph::op::reset {U} { return $U } I see an iteration over the list U which is reconstructed in the variable 'value' and then returned. So value == U, thus we can return it the argument directly. And at this point I have to ask if the procedure is needed at all. > 668: #procedure counts edges that link two sets of nodes > 669: proc ::struct::graph::op::countEdges {G U V} { > 670: > 671: set value 0 > 672: > 673: foreach e [$G arcs] { > 674: set v [$G arc source $e] > 675: set u [$G arc target $e] > 676: > 677: if { [lsearch $U $v] >=0 && [lsearch $V $u] >=0 } { See comments regarding line 626. Operations 'in' and ni'. if {($v in $U) && ($u in $V)} { ; # e is arc U -> V > 678: incr value > 679: } > 680: if { [lsearch $U $u] >=0 && [lsearch $V $v] >=0 } { if {($u in $U) && ($v in $V)} { ; # e is arc V -> U > 681: incr value > 682: } > 683: } > 684: > 685: return $value > 686: } Hm. Lets try an alternate ... foreach u $U { foreach e [$G arcs -out $u] { set v [$G arc target $e] if {$v ni $V} continue incr value } } foreach v $V { foreach e [$G arcs -out $v] { set u [$G arc target $e] if {$u ni $U} continue incr value } } The main point to the alternate is that the iteration over U and V eliminates one lsearch/in/ni query, because we know that the node is in the set, and which ... Time complexity of the original: O(#arcs * #nodes). The #nodes because of the lsearch, which is linear. Time complexity of the alternate: O(#nodes * ?). The ? could be #arcs, as we go over all arcs, just scattered across the nodes. The time constants might be lower (because of the reduced number of lsearch/in ops). Andreas |