From: Vladimir Y. <vol...@cs...> - 2008-08-19 05:48:12
No, I have not yet finished incorporating it into edmonds-alg - I still have to change the weight accesses to use property maps. I will post the results when ready.

Regarding the O(V^2) complexity, you are right. The number of list merge operations is |V|-1, as each merge decreases the number of remaining vertices (SCCs) by 1.

Btw, Tarjan decreases the weight of all incoming edges by the difference between the cycle edge's weight and the weight of the cheapest edge in the cycle. Subtracting the weight of the cycle edge from the weights of all edges that share its target should work just as well, and it saves one arithmetic operation per edge. By induction, this does not cause any edge weights to become negative. Moreover, the decrement can be performed as soon as we select a critical edge - without waiting for the cycle-collapse stage - by simply decreasing the weights of all incoming edges by its weight. The weights of the edges on the weakly connected components then become zero. This is faster than going through all the cycle edges to decrement them, thanks to cache locality.

Regarding the O(V log V) algorithm, it should be faster than the O(V^2) one even for dense graphs. I'll work on its implementation once we get O(E log V) ready - it is very similar, and the constants hidden in the O look low.

Do you have some toy example that creates a sparse graph of V vertices with initially no edges - something like the adjacency_list sketch below - so that I can add the edges myself and have your implementation run on it? This would help a lot with my understanding of the Boost-related stuff in the implementation.

Thanks,
Vlad

On Sun, Aug 17, 2008 at 04:49, Vladimir Yanovsky <vol...@cs...> wrote:

> I modified pending/disjoint_sets.hpp to have a fast and dirty
> implementation of "disjoint sets with weights", which are used by
> Tarjan's implementation to update edge weights when collapsing cycles.
> Currently the implementation updates every incoming edge to every
> strongly connected component vertex on a cycle being collapsed (this is
> particularly inefficient when the graph becomes dense and, I think,
> leads to Omega(VE) complexity).

Actually, I'm quite sure that the time complexity of the algorithm in the current implementation is O(V^2). You seem to have overlooked the fact that the sizes of the lists containing the incoming edges are at most |V|: each list contains at most one outgoing edge for each strongly connected component. So we are not actually updating _every_ incoming edge, but at most |V| of them. A careful examination also reveals that the total number of list merge operations is O(V) (I think 2*|V| is the actual upper bound). Since each list contains at most |V| edges, the total time for merging the lists is O(V^2). Updating the weights before or during the merges clearly makes no difference in terms of time complexity.

Still, I'm very interested to know if there is any dramatic change in the running time with your implementation of disjoint sets with weights. Have you run any tests? (One standard way to implement such a structure is sketched below.)

> Another thing that should be done is to use Boost's Fibonacci or
> Relaxed heaps as a container for edges so that we have O(E log V)
> time. Finally, the Fibonacci-heap implementation of the O(E log V)
> algorithm can later be modified into O(V log V).

Yes, I agree. I'm aware of the article describing this; I just haven't had any use for it myself, since all my graphs are dense. I'll need some (continuous) time on my hands to be able to do that, though of course now I have a motivation, since someone may actually use it.
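For reference, here is a minimal sketch of one standard way to implement such "disjoint sets with weights": a union-find augmented with lazy per-node weight offsets, so that shifting the weights of an entire set costs O(1). All the names are invented, and this is not necessarily how the modified pending/disjoint_sets.hpp does it:

#include <algorithm>
#include <vector>

// Union-find in which every element carries a weight and the weights of
// a whole set can be shifted by a constant in O(1), via a lazy offset
// stored at each node relative to its parent.  Sketch only.
class weighted_disjoint_sets
{
public:
    explicit weighted_disjoint_sets(std::size_t n, double initial = 0.0)
        : parent_(n), rank_(n, 0), delta_(n, 0.0), weight_(n, initial)
    {
        for (std::size_t i = 0; i < n; ++i)
            parent_[i] = i;
    }

    // Root of x's set; compresses the path, folding offsets downwards so
    // that the sum of deltas from any node to its root stays unchanged.
    std::size_t find(std::size_t x)
    {
        if (parent_[x] == x)
            return x;
        std::size_t root = find(parent_[x]);
        if (parent_[x] != root) {
            delta_[x] += delta_[parent_[x]];
            parent_[x] = root;
        }
        return root;
    }

    // Current weight of x = stored weight + offsets on the path to the root.
    double weight(std::size_t x)
    {
        std::size_t root = find(x);   // afterwards x is the root or its child
        return weight_[x] + delta_[x] + (x == root ? 0.0 : delta_[root]);
    }

    // Add c to the weight of every element of x's set, in O(1).
    void shift_set(std::size_t x, double c)
    {
        delta_[find(x)] += c;
    }

    // Union by rank; the child root's offset is re-based so that no
    // element's observed weight changes.
    void union_sets(std::size_t x, std::size_t y)
    {
        std::size_t a = find(x), b = find(y);
        if (a == b)
            return;
        if (rank_[a] > rank_[b])
            std::swap(a, b);
        if (rank_[a] == rank_[b])
            ++rank_[b];
        delta_[a] -= delta_[b];
        parent_[a] = b;
    }

private:
    std::vector<std::size_t> parent_, rank_;
    std::vector<double>      delta_, weight_;
};

In the Edmonds/Tarjan setting, shift_set is what would let one decrease the weights of all edges entering a collapsed component without touching each edge individually.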
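And regarding the toy-example question above, a minimal sketch of building a sparse directed graph with Boost's adjacency_list: V vertices up front, initially no edges, weights attached as an interior edge_weight property. The chosen type parameters (vecS/vecS/directedS) are one plausible configuration, not necessarily the one edmonds-alg expects:

#include <iostream>
#include <utility>
#include <boost/graph/adjacency_list.hpp>

int main()
{
    using namespace boost;

    // Sparse directed graph; weights stored as an interior property.
    typedef adjacency_list<vecS, vecS, directedS,
                           no_property,
                           property<edge_weight_t, double> > Graph;
    typedef property<edge_weight_t, double> Weight;

    const std::size_t V = 5;
    Graph g(V);                        // V vertices, initially no edges

    // Add weighted edges by hand.
    add_edge(0, 1, Weight(4.0), g);
    add_edge(1, 2, Weight(2.5), g);
    add_edge(0, 2, Weight(7.0), g);

    // The weight property map that an algorithm would be handed.
    property_map<Graph, edge_weight_t>::type w = get(edge_weight, g);

    typedef graph_traits<Graph>::edge_iterator edge_iter;
    for (std::pair<edge_iter, edge_iter> p = edges(g);
         p.first != p.second; ++p.first)
        std::cout << source(*p.first, g) << " -> " << target(*p.first, g)
                  << "  (weight " << w[*p.first] << ")\n";

    return 0;
}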
> Any comments, help, especially on correct declaration/interfacing with
> Boost is greatly appreciated - my Boost experience is tiny, and writing
> function bodies is much easier for me than the headers.

A great resource is of course the boost-users mailing list, where more knowledgeable people can help with Boost matters. That would be my first choice of resource when dealing with Boost-specific issues.

> One question about the current implementation. Why are all data
> structures initialized with (max_vertex_idx + 1) elements and Union
> Finds with 2*(max_vertex_idx + 1)? What are these x2 and "+1" needed
> for? Should not max_vertex_idx be enough?

Since the vertex indices range from zero to (and including) max_vertex_idx, there are max_vertex_idx + 1 actual vertices. The disjoint-set data structure in Boost needs to know the maximum number of disjoint sets - in our case, the weakly and strongly connected components. The upper bound on the number of components ever created is 2*|V|.

Cheers,
/Ali
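P.S. To make the sizing concrete, a small sketch of setting up Boost's disjoint_sets with room for 2*(max_vertex_idx + 1) elements; the numbers are illustrative only:

#include <vector>
#include <boost/pending/disjoint_sets.hpp>

int main()
{
    const std::size_t max_vertex_idx = 9;          // vertices 0..9, so |V| = 10
    const std::size_t n = 2 * (max_vertex_idx + 1); // room for every component
                                                    // ever created (<= 2*|V|)

    std::vector<std::size_t> rank(n);
    std::vector<std::size_t> parent(n);

    // Raw pointers serve as the rank and parent property maps.
    boost::disjoint_sets<std::size_t*, std::size_t*>
        ds(&rank[0], &parent[0]);

    for (std::size_t v = 0; v < n; ++v)
        ds.make_set(v);

    ds.union_set(0, 1);                 // merge two components
    std::size_t rep = ds.find_set(1);   // representative of the merged set
    (void) rep;
    return 0;
}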