From: Vladimir Y. <vol...@cs...> - 2008-08-19 05:48:12
No, I have not yet finished incorporating it into edmonds-alg - I still have to change the weight accesses to use property maps. I will post the results when ready.

Regarding the O(V^2) complexity, you are right. The number of list merge operations is |V|-1, as each merge decreases the number of remaining vertices (SCCs) by 1.

Btw, Tarjan decreases the weight of all incoming edges by the difference between the cycle edge's weight and the weight of the cheapest edge in the cycle. Subtracting the weight of the cycle edge from the weights of all edges that share its target should work just as well, and it saves one arithmetic operation per edge. By induction, this does not cause any edge weights to become negative. Moreover, the decrement can be performed as soon as we select a critical edge - without waiting for the cycle-collapse stage - by simply decreasing the weights of all incoming edges by its weight. The weights of the edges on the weakly connected components then become zero. This is faster than going through all the cycle edges to decrement them, thanks to cache locality.

Regarding the O(V log V) algorithm, it should be faster than the O(V^2) one even for dense graphs. I'll work on its implementation once we get O(E log V) ready - it is very similar, and the constants hidden in the O look low.

Do you have some toy example that creates a sparse graph of V vertices with initially no edges - something like the adjacency_list sketch below - so that I can add the edges myself and have your implementation run on it? This would help a lot with my understanding of the Boost-related stuff in the implementation.

Thanks,
Vlad

On Sun, Aug 17, 2008 at 04:49, Vladimir Yanovsky <vol...@cs...> wrote:

> I modified pending/disjoint_sets.hpp to have a fast and dirty
> implementation of "disjoint sets with weights", which are used by
> Tarjan's implementation to update edge weights when collapsing cycles.
> Currently the implementation updates every incoming edge to every
> strongly connected component vertex on a cycle being collapsed (this is
> particularly inefficient when the graph becomes dense and, I think,
> leads to Omega(VE) complexity).

Actually, I'm quite sure that the time complexity of the algorithm in the current implementation is O(V^2). You seem to have overlooked the fact that the sizes of the lists containing the incoming edges are at most |V|: each list contains at most one outgoing edge for each strongly connected component. So we are not actually updating _every_ incoming edge, but at most |V| of them. A careful examination also reveals that the total number of list merge operations is O(V) (I think 2*|V| is the actual upper bound). Since each list contains at most |V| edges, the total time for merging the lists is O(V^2). Updating the weights before or during the merges clearly makes no difference in terms of time complexity.

Still, I'm very interested to know if there is any dramatic change in the running time with your implementation of disjoint sets with weights. Have you run any tests? (One standard way to implement such a structure is sketched below.)

> Another thing that should be done is to use Boost's Fibonacci or
> Relaxed heaps as a container for edges so that we have O(E log V)
> time. Finally, the Fibonacci-heap implementation of the O(E log V)
> algorithm can later be modified into O(V log V).

Yes, I agree. I'm aware of the article describing this; I just haven't had any use for it myself, since all my graphs are dense. I'll need some (continuous) time on my hands to be able to do that, though of course now I have a motivation, since someone may actually use it.
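For reference, here is a minimal sketch of one standard way to implement such "disjoint sets with weights": a union-find augmented with lazy per-node weight offsets, so that shifting the weights of an entire set costs O(1). All the names are invented, and this is not necessarily how the modified pending/disjoint_sets.hpp does it:

#include <algorithm>
#include <vector>

// Union-find in which every element carries a weight and the weights of
// a whole set can be shifted by a constant in O(1), via a lazy offset
// stored at each node relative to its parent.  Sketch only.
class weighted_disjoint_sets
{
public:
    explicit weighted_disjoint_sets(std::size_t n, double initial = 0.0)
        : parent_(n), rank_(n, 0), delta_(n, 0.0), weight_(n, initial)
    {
        for (std::size_t i = 0; i < n; ++i)
            parent_[i] = i;
    }

    // Root of x's set; compresses the path, folding offsets downwards so
    // that the sum of deltas from any node to its root stays unchanged.
    std::size_t find(std::size_t x)
    {
        if (parent_[x] == x)
            return x;
        std::size_t root = find(parent_[x]);
        if (parent_[x] != root) {
            delta_[x] += delta_[parent_[x]];
            parent_[x] = root;
        }
        return root;
    }

    // Current weight of x = stored weight + offsets on the path to the root.
    double weight(std::size_t x)
    {
        std::size_t root = find(x);   // afterwards x is the root or its child
        return weight_[x] + delta_[x] + (x == root ? 0.0 : delta_[root]);
    }

    // Add c to the weight of every element of x's set, in O(1).
    void shift_set(std::size_t x, double c)
    {
        delta_[find(x)] += c;
    }

    // Union by rank; the child root's offset is re-based so that no
    // element's observed weight changes.
    void union_sets(std::size_t x, std::size_t y)
    {
        std::size_t a = find(x), b = find(y);
        if (a == b)
            return;
        if (rank_[a] > rank_[b])
            std::swap(a, b);
        if (rank_[a] == rank_[b])
            ++rank_[b];
        delta_[a] -= delta_[b];
        parent_[a] = b;
    }

private:
    std::vector<std::size_t> parent_, rank_;
    std::vector<double>      delta_, weight_;
};

In the Edmonds/Tarjan setting, shift_set is what would let one decrease the weights of all edges entering a collapsed component without touching each edge individually.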
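And regarding the toy-example question above, a minimal sketch of building a sparse directed graph with Boost's adjacency_list: V vertices up front, initially no edges, weights attached as an interior edge_weight property. The chosen type parameters (vecS/vecS/directedS) are one plausible configuration, not necessarily the one edmonds-alg expects:

#include <iostream>
#include <utility>
#include <boost/graph/adjacency_list.hpp>

int main()
{
    using namespace boost;

    // Sparse directed graph; weights stored as an interior property.
    typedef adjacency_list<vecS, vecS, directedS,
                           no_property,
                           property<edge_weight_t, double> > Graph;
    typedef property<edge_weight_t, double> Weight;

    const std::size_t V = 5;
    Graph g(V);                        // V vertices, initially no edges

    // Add weighted edges by hand.
    add_edge(0, 1, Weight(4.0), g);
    add_edge(1, 2, Weight(2.5), g);
    add_edge(0, 2, Weight(7.0), g);

    // The weight property map that an algorithm would be handed.
    property_map<Graph, edge_weight_t>::type w = get(edge_weight, g);

    typedef graph_traits<Graph>::edge_iterator edge_iter;
    for (std::pair<edge_iter, edge_iter> p = edges(g);
         p.first != p.second; ++p.first)
        std::cout << source(*p.first, g) << " -> " << target(*p.first, g)
                  << "  (weight " << w[*p.first] << ")\n";

    return 0;
}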
> Any comments, help, especially on correct declaration/interfacing with
> Boost is greatly appreciated - my Boost experience is tiny, and writing
> function bodies is much easier for me than the headers.

A great resource is of course the boost-users mailing list, where more knowledgeable people can help with Boost matters. That would be my first choice of resource when dealing with Boost-specific issues.

> One question about the current implementation. Why are all data
> structures initialized with (max_vertex_idx + 1) elements and Union
> Finds with 2*(max_vertex_idx + 1)? What are these x2 and "+1" needed
> for? Should not max_vertex_idx be enough?

Since the vertex indices range from zero to (and including) max_vertex_idx, there are max_vertex_idx + 1 actual vertices. The disjoint-set data structure in Boost needs to know the maximum number of disjoint sets - in our case, the weakly and strongly connected components. The upper bound on the number of components ever created is 2*|V|.

Cheers,
/Ali
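P.S. To make the sizing concrete, a small sketch of setting up Boost's disjoint_sets with room for 2*(max_vertex_idx + 1) elements; the numbers are illustrative only:

#include <vector>
#include <boost/pending/disjoint_sets.hpp>

int main()
{
    const std::size_t max_vertex_idx = 9;          // vertices 0..9, so |V| = 10
    const std::size_t n = 2 * (max_vertex_idx + 1); // room for every component
                                                    // ever created (<= 2*|V|)

    std::vector<std::size_t> rank(n);
    std::vector<std::size_t> parent(n);

    // Raw pointers serve as the rank and parent property maps.
    boost::disjoint_sets<std::size_t*, std::size_t*>
        ds(&rank[0], &parent[0]);

    for (std::size_t v = 0; v < n; ++v)
        ds.make_set(v);

    ds.union_set(0, 1);                 // merge two components
    std::size_t rep = ds.find_set(1);   // representative of the merged set
    (void) rep;
    return 0;
}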