From: Lars H. <Lar...@re...> - 2009-06-03 14:09:07
|
Andreas Kupries skrev: > Lars Hellström wrote: >> As an example, one kind of manipulation that seems useful (especially >> when you get to the flow algorithms, in a couple of weeks) would be an >> operation that splits every vertex in two, where one part receives the >> incoming edges and the other gets the outbound edges, with a new edge >> from in-part to out-part. (This is a classical trick for reducing flow >> problems with capacities on vertices to flow problems with capacities >> only on the edges.) > > I would put such an operation into a graph::ops procedure, and export it > from the ops package. I.e. a helper, but valuable enough to be a regular > public operation. Which of such helpers we then move to the foundation > class, i.e. struct::graph, can then be discussed separately. Well, although the operation is useful, I'd have to add that an operation which does exactly this feels very much like a lame special case of some more general "insert gadget" operation, and it is really this general operation that should be implemented. The catch is that there's no generally accepted formalism (that I know of) for describing the very intuitive operation of replacing every vertex by some gadget. (Actually, I have done some work on such matters -- see http://abel.math.umu.se/~lars/diamond/paper-gr.pdf, Section 4 -- but that approach is far from immediately applicable to struct::graph.) >>> > As it is in >>>> timeline I started with Bellman-Ford and Johnson's. >> >> In general, I think time is better spent implementing one algorith >> each for doing A, B, and C, than three different algorithms for doing >> A, but YMMV. > > The three distances algorithms (dijkstra, bellman-ford, johnson) have > different time complexities and limitations. As such also different > areas of application, depending on the graphs to be expected as inputs. Yes, but each of them solves maybe 80% of all graph distance problems users will pose. The question is then whether one prioritises the remaining 20% of this problem class A, or the first 80% of some other problem class B... > And automatic selection based on the input is often not possible. And usually a Bad Idea, even when it is possible. > Information like this, i.e. raw time complexities and limitations, and > then at the higher level which algorithm is better in what situation is > definitely something we have to put into the documentation. > > >> Also, I find the rule applied in struct::graph::op about naming >> commands after algorithms that is a bit curious. There's certainly a >> point in doing so when you've got many algorithms for doing the same >> thing, but on the other hand it renders code employing these commands >> more obscure. > > Maybe. It depends a bit on the audience. Assuming that the reader knows > graph theory I see now problem. Well, if one is regularly teaching a graph theory course that covers these algorithms, then I suppose one immediately knows Prim from Kruskal, but otherwise I'm not so sure. Often one remembers algorithms in vaguer terms, such as the edge- or vertex-oriented respectively algorithm for solving the problem in question -- when one remembers individual algorithms at all. A lot of the time, it is sufficient to recall for what problems a polynomial algorithm exists; sorting out the details can typically be delayed until one starts coding. There is also the issue that case conventions can make the names harder to recognise. I recognise "kruskal" and "dijkstra" as names even in all-lower-case, but found "prim" and "tarjan" far more mysterious on first sight. Another problem is that some persons have discovered several algorithms, and for those this naming scheme isn't unique anymore. > Another thing using code can do is to > define aliases mapping from, for example, 'distance', to 'dijkstra'. > That location would then also be a good place for the user to comment on > why a specific algorithm was chosen, see above about that. Yes. Furthermore it could be a good idea to have an overview section /before/ the long list of command descriptions. This section would be organised by topic (e.g.: paths, trees, network flows, connectivity, etc.) and mention what algorithms are available (preferably with hyperlinks to command descriptions). > The kruskal and prim implementations are from last year's GSoC. Have to > check who wrote the docs. Might have been me, as part of the integration > into tcllib (The '08 operations were written outside of Tcllib first). > > As such this is not something to be done by Michal. > Not during the GSoC period. I am fine if he wishes > to do that after. By then however, I might have fixed > things already. > > Can you (Lars) please go over the docs and file regular Tcllib bugs for > where you believe information is missing ? Have filed some earlier today. Could be more on the way. >> Also, regarding the API provided by struct::graph, I find it a bit >> curious that "weight" is singled out as a special property of arcs, >> with its own methods for setting, getting, and so on. I suspect a more >> practical approach for a higher level package like struct::graph::op >> would be that the weight was some attribute given to the arcs, and >> that a command for computing e.g. a shortest path would take as an >> argument (or option value) the name of the attribute to use as >> "weight" in this particular calculation. > > This was also a decision made during GSoC 2008. I am not recalling > details at the moment. There should be something in my mail archives > however as I was involved, although not as primary mentor. For this I > however should have some mails. Thinking a bit more about this lead me to the idea of dividing the struct::graph API into smaller functional groups, but that is getting so long that I'll write a separate mail about it. Lars Hellström |