|
From: Vladimir Y. <vol...@cs...> - 2008-08-17 02:49:34
|
Hello, I'm currently working on making the implementation more efficient, particularly for sparse graphs. I modified pending/disjoint_sets.hpp to have fast and dirty implementation of "disjoint sets with weights" which are used by Tarjan's implementation to update edge weights when collapsing cycles. Currently the implementation updates every incoming edge to every strongly connected component vertex on a cycle being collapsed (this particularly inefficient when the graph becomes dense and, I think, leads to Omega(VE) complexity). Another thing that should be done is to use Boost's Fibonacci or Relaxed heaps as a container for edges so that we have O(E log V) time. Finally Fibonacci heaps implementation of O(E log V) algorithm can be later modified into O (V log V). Any comments, help, especially on correct declaration/interfacing with Boost is greatly appreciated - my Boost experience is tiny and writing function bodies is much easier for me than the headers. One question about the current implementation. Why are all data structures initialized with (max_vertex_idx +1) elements and Union Finds with 2*(max_vertex_idx +1)? What are these x2 and "+1" needed for? Should not max_vertex_idx be enough? Thanks, Vladimir |
|
From: Vladimir Y. <vol...@gm...> - 2008-08-17 02:56:22
|
Hello, I'm currently working on making the implementation more efficient, particularly for sparse graphs. I modified pending/disjoint_sets.hpp to have fast and dirty implementation of "disjoint sets with weights" which are used by Tarjan's implementation to update edge weights when collapsing cycles. Currently the implementation updates every incoming edge to every strongly connected component vertex on a cycle being collapsed (this particularly inefficient when the graph becomes dense and, I think, leads to Omega(VE) complexity). Another thing that should be done is to use Boost's Fibonacci or Relaxed heaps as a container for edges so that we have O(E log V) time. Finally Fibonacci heaps implementation of O(E log V) algorithm can be later modified into O (V log V). Any comments, help, especially on correct declaration/interfacing with Boost is greatly appreciated - my Boost experience is tiny and writing function bodies is much easier for me than the headers. One question about the current implementation. Why are all data structures initialized with (max_vertex_idx +1) elements and Union Finds with 2*(max_vertex_idx +1)? What are these x2 and "+1" needed for? Should not max_vertex_idx be enough? Thanks, Vladimir |
|
From: Ali T. <ali...@gm...> - 2008-08-18 15:13:07
|
On Sun, Aug 17, 2008 at 04:49, Vladimir Yanovsky <vol...@cs...>wrote: > I modified pending/disjoint_sets.hpp to have fast and dirty > implementation of "disjoint sets with weights" which are used by > Tarjan's implementation to update edge weights when collapsing cycles. > Currently the implementation updates every incoming edge to every > strongly connected component vertex on a cycle being collapsed (this > particularly inefficient when the graph becomes dense and, I think, > leads to Omega(VE) complexity). Actually, I'm quite sure that the time complexity is O(V^2) for the algorithm in the current implementation. You seem to have overlooked the fact that the sizes of the lists containing the incoming edges are at most |V|: Each list contains at most one outgoing edge for each strongly connected component. So we are not actually updating _every_ incoming edge, but at most |V| of them. A careful examination also reveals that the total number of list merge operations is O(V) (I think 2*|V| is the actual upper bound). Since each list contains at most |V| edges, the total time of merging the lists is O(V^2). Updating the weights before or during the merges clearly makes no difference in terms of time complexity. Still, I'm very interested to know if there is any dramatic change in the running time with your implementation of disjoint sets with weights. Have you run any tests? Another thing that should be done is to use Boost's Fibonacci or > Relaxed heaps as a container for edges so that we have O(E log V) > time. Finally Fibonacci heaps implementation of O(E log V) algorithm > can be later modified into O (V log V). Yes, I agree. I'm aware of the article describing this. I just haven't had any use for it myself since all my graphs are dense. I'll need some (continuous) time on my hands to be able to do that, though of course now I have a motivation since someone may actually use it. Any comments, help, especially on correct declaration/interfacing with > Boost is greatly appreciated - my Boost experience is tiny and writing > function bodies is much easier for me than the headers. A great resource is of course the boost users mailing list where more knowledgeable people can help with Boost matters. That would be my first choice of resource when dealing with Boost-specific issues. > One question about the current implementation. Why are all data > structures initialized with (max_vertex_idx +1) elements and Union > Finds with 2*(max_vertex_idx +1)? What are these x2 and "+1" needed > for? Should not max_vertex_idx be enough? > Since the vertex indices range from zero to (and including) max_vertex_idx, there are max_vertex_idx + 1 actual vertices. The disjoint set datastructure in Boost needs to know the maximum number of disjoint sets; in our case, the weakly and strongly connected components. The upper bound on the number of created components is 2*|V|. Cheers, /Ali |