From: alexti <al...@sh...> - 2010-06-09 01:17:12
Stefan,

My comments are inline.

> I have a first phase two prototype working. In fact it was not too
> difficult a task, as I only create a multigraph with new edges that
> define the new routes.
>
> Then I define "EdgeTravelSets", which define edges that are mutually
> exclusive. This is tracked by the calculator using reference counters,
> as each edge might be excluded by several others.

I'm keeping a vector of exclusions for each edge. I haven't investigated
which approach is more efficient. In the test cases I ran I've never seen
this being a bottleneck, so my guess is that the storage method is
probably unimportant for performance.

> The first results are promising so far, and I believe I found the cause
> of the difference in the number of evaluations compared to your
> implementation. The main issue is that on a two-phase graph the
> selection of neighbors is different from the original one (by
> definition).
>
> The effect was that the search for the 4-train set (8, 5+5, 4D, 6E)
> took between 30 and 140 seconds (compared to 70 seconds without).

What did the time depend on? I always had very stable times for any
scenario we've tried. For the example above it was ~30 seconds, perhaps
with a 1 second spread.

> At first glance it was amazing that the run time varied so strongly,
> and the fact that the best route is found remarkably late in the
> process sheds more light on why you believe that my revenue predictor
> is of the magic type.
>
> I still sort the neighbors by value, but the ordering of the
> (potentially multiple) routes to them is random, as they are retrieved
> from a HashSet. The ordering of routes is important for the revenue
> predictor to work well. Finding a good solution quickly is important.

This is a very good point - I choose the edge iteration order more or
less randomly - I don't use any criteria to sort them. I had thought of
trying to make a more intelligent choice, but didn't come up with any
good criterion.
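For the record, here is roughly how I picture the reference-counter
bookkeeping for mutually exclusive edges. This is only a minimal sketch
under my own naming (Edge, ExclusionTracker, etc. are made-up names, not
the actual calculator code): travelling an edge increments a counter on
every edge it conflicts with, and an edge is available again only once
all of its excluders have been released.

```java
import java.util.ArrayList;
import java.util.List;

class Edge {
    final String name;
    // Edges that share track with this one and are therefore mutually exclusive.
    final List<Edge> conflicts = new ArrayList<>();
    // How many currently travelled edges exclude this one.
    int exclusionCount = 0;

    Edge(String name) { this.name = name; }

    boolean isAvailable() { return exclusionCount == 0; }
}

class ExclusionTracker {
    // Declare that two edges share track and are mutually exclusive.
    static void addExclusion(Edge a, Edge b) {
        a.conflicts.add(b);
        b.conflicts.add(a);
    }

    // Travelling an edge bumps the counter on every conflicting edge;
    // a counter rather than a flag is needed because each edge
    // might be excluded by several others at once.
    static void travel(Edge e) {
        for (Edge other : e.conflicts) other.exclusionCount++;
    }

    static void release(Edge e) {
        for (Edge other : e.conflicts) other.exclusionCount--;
    }
}
```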
> Without phase two optimization, the choice to visit the most valuable
> (neighboring) vertex is clearly the optimal approach, as each neighbor
> is one track piece away. Thus the "cost" of each neighbor is identical.
>
> If you have done a phase two optimization, the most valuable vertex can
> be quite far away, thus blocking several track pieces at the same time
> and ruling out a lot of other connections. Thus for phase two the costs
> of the neighboring vertices vary substantially. Accordingly the cost
> has to be considered: the number of edges excluded must be an
> additional important criterion for selecting the routes.
>
> Take an example: is it better to run for a nearby 20 vertex (which
> excludes one other route) or the far-away 50 vertex (which excludes
> three other routes and potentially bypasses the 10 vertex)? There is no
> clear answer, but our preliminary results and my experience with 18xx
> routes imply that it is usually better to go for nearby stations and
> save track first.
>
> Thus two approaches are possible:
>
> * Use the number of edges excluded as the first criterion, and value as
> a second only (thus minimize costs first, then maximize revenue). In
> that case you would run to the 20 first.
>
> * Or combine the two and use something like a cost-modified value
> approach. For example, choose the ratio between value and the number of
> routes excluded (including the route selected). In that case you would
> run to the 50 first. (20/2 = 10, 50/4 = 12.5)
>
> I have implemented the ordering based on the first idea (cost-based),
> and the time is now a stable 24-second run for the 4-train set, thus
> running time decreased by a factor of 3.

I like your idea, I think it's a good criterion. My intuitive feeling is
that (1) is the better approach.
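To make sure I understand the two orderings, here is how I would express
them as comparators (again just a sketch with hypothetical Route/value/
excluded names; "excluded" is the number of other routes the route rules
out). The first minimises exclusions and breaks ties by value; the
second sorts by the ratio value / (excluded + 1), the +1 counting the
selected route itself, which reproduces your 20/2 = 10 versus
50/4 = 12.5 example.

```java
import java.util.Comparator;

// Hypothetical route entry: "value" is the revenue of the target vertex,
// "excluded" the number of other routes this route rules out.
class Route {
    final int value;
    final int excluded;
    Route(int value, int excluded) { this.value = value; this.excluded = excluded; }
}

class RouteOrdering {
    // Approach 1: minimise cost first (fewest exclusions),
    // maximise value as the second criterion only.
    static final Comparator<Route> COST_FIRST =
        Comparator.comparingInt((Route r) -> r.excluded)
                  .thenComparing(Comparator.comparingInt((Route r) -> r.value).reversed());

    // Approach 2: cost-modified value, value / (excluded + 1);
    // the higher ratio sorts first.
    static final Comparator<Route> RATIO_FIRST =
        Comparator.comparingDouble((Route r) -> (double) r.value / (r.excluded + 1))
                  .reversed();
}
```

With the nearby 20 vertex (excludes 1 route) and the far-away 50 vertex
(excludes 3), COST_FIRST ranks the 20 first while RATIO_FIRST ranks the
50 first, matching the two cases above.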
I agree that in normal 18xx games the routes follow a specific pattern -
generally people try to connect valuable locations by the shortest route,
and long scenic routes between two cities/towns are usually a sign of a
well blocked map :)

> Number of evaluations, one pass (70 seconds):
> 480961 evaluations, 2935133 predictions and 8828871 edges travelled.
>
> Number of evaluations, two pass with cost ordering (24 seconds):
> 475533 evaluations, 2910152 predictions and 2841394 edges travelled.
>
> Number of evaluations, two pass with value ordering (81 seconds):
> 1421981 evaluations, 7302480 predictions and 8297888 edges travelled.
>
> It seems that edge travelling is the time-limiting factor in my
> approach, but that is not surprising, as evaluation and prediction are
> pretty cheap given the continuous update of the run values.

I want to try your approach to the edge ordering, but currently I have
taken my implementation apart, because I wanted to restore the vector
computation functionality (which I broke when I added the predictor) and
this is proving to be quite challenging (and, besides, the summer finally
came here :) )

> To compare the validity of my approach, these are my figures about
> vertices and edges:
>
> In scenario A the phase two graph consists of 23 vertices and 57
> edges/routes. There are several routes that overlap with 9 other routes
> (J3.1 -> J9.1, K4.1 -> J3.1, J9.1 -> M2.1, K4.1 -> M2.1).
>
> In scenario B the numbers are 20 vertices and 151 edges/routes. The
> route that overlaps with the most others is E8 -> D5 (over 19 hexes),
> which shares track with 111 other routes.

I think I had the same numbers.

Alex.