From: Stefan F. (web.de) <ste...@we...> - 2010-06-08 20:59:11
|
This (again) a specific mail on revenue calculation, that addresses Alex directly, but I prefer to keep the discussion on the develop list to allow archiving. I used the Rails archive throughout the work on my implementation and dug out several valuable thoughts (a special thanks goes to John Tamplin's contributions, his pseudo code form one of the mails is pretty similar to a core part). I hope that not too many complain, but in fact nobody is forced to read it to the end ;-) Alex, I have a first phase two prototype working. In fact it was not a too difficult task, as I only create a multigraph with new edges, that define the new routes. Then I define "EdgeTravelSets", which define edges that are mutually exclusive. This is tracked by the calculator using reference counters, as each edge might be excluded by several others. The first results are promising so far and I believe I found the cause for the difference in number of evaluations compared to your implementation. The main issue is that on a two phase graph the selection of neighbors is different to the original one (by definition). The effect was that the search for the 4-train set (8, 5+5, 4D, 6E) took between 30 - 140 seconds (compared to 70 seconds without). At first glance it was amazing that the run time did strongly vary and the fact that the best route is found remarkebly late in the process sheds more light, why you believe that my revenue predicitor is of the magic type. I still sort the neighbors by value, but the ordering of the (potentially multiple) routes to them is random, as they are retrieved from a HashSet. The ordering of routes is important for the revenue predictor to work well. Finding a good solution quick is important. Without phase two optimization, the choice to visit the most valuable (neighboring) vertex is clearly the optimal approach, as each neigbor is one track piece away. Thus the "costs" of each neighbor is identical. If you have done a phase two optimization, the most valuable vertex can be quite far away, thus blocking several track pieces at the same time and ruling out a lot of other connections. Thus for phase two the cost of each neighboring vertex vary substantially. Accordingly the cost has to be considered: The number of edges excluded must be an additional important criteria to select the routes. Take an example: Is it better to run for a nearby 20 vertex (which excludes one other route) or the far away 50 vertex (which excludes three other routes and potentially bypasses the 10 vertex). There is no clear answer, but our preliminary results and my experiences with 18xx routes, implies that it is usually better to go for nearby stations first and save track first. Thus two approaches are possible: * Use Nb of edges excluded as a first criteria, value as a second only. (Thus minimize costs first, then maximize revenue). In that case you would run to the 20 first. * Or combine the two and use something like cost-modified value approach. For example choose the ratio between value and number of routes excluded (including the route selected). In that case you would run to the 50 first. (20/2 = 10, 50/4 = 12.5) I have implemented the ordering based on the first idea (cost-based) and the time is now stable an 24 seconds run for the 4 train set, thus running time decreased by factor 3. Number of evaluations one pass (70 seconds) 480961 evaluations, 2935133 predictions and 8828871 edges travelled. Number of evalutaions two pass with cost ordering (24 seconds) 475533 evaluations, 2910152 predictions and 2841394 edges travelled Number of evaluations two pass with value ordering (81 seconds) 1421981 evaluations, 7302480 predictions and 8297888 edges travelled It seems that edge travelling is the time limiting factor in my approach, but that is not surprising as evaluation and prediction is pretty cheap given the continuous update of the run values. Stefan To compare the validity of my approach, this are my figures about vertices and edges: In scenario A the phase two graph consists of 23 vertices and 57 edges/routes. There are several routes that overlap with 9 other routes (J3.1 -> J9.1, K4.1 -> J3.1,J9.1 -> M2.1,K4.1 -> M2.1). In scenario B the numbers are 20 vertices and 151 edges/routes. The route that overlaps with most others is E8->D5 (over 19 hexes) that shares track with 111 other routes. |