From: Stephen D. <sd...@gm...> - 2006-10-07 15:23:17
On 10/7/06, Zoran Vasiljevic <zv...@ar...> wrote:
>
> On 07.10.2006, at 00:39, Stephen Deasey wrote:
>
> >
> > But the call may take longer than the caller budgeted for, due to
> > all the hidden timeouts, which are additive.
> >
>
> True.
>
> > So the caller's time budget is 5 seconds, and that's what they pass
> > to -evaltimeout. But by default both the sendtimeout and
> > recvtimeout are 5 seconds. So the total time spent on a successful
> > call to ns_proxy eval could be over 15 seconds, which is 3x the
> > time budget.
> >
> > The time budget is a single target value. In the future, for the
> > majority of users, this is going to be set per URL. For /checkout
> > you may allow up to 60 seconds to serve the page before deciding
> > you're overloaded. For /ads you may give up much sooner. Your
> > server is busy and you need to shed load, so you shed the least
> > important traffic.
> >
> > For a single page with some time budget, which depends on the URL,
> > some of it may be used up in a call to ns_cache_eval before there
> > is a chance to call ns_proxy eval. I.e. the time budget is pretty
> > dynamic.
> >
> > I don't see how the multiple fine-grained settings of ns_proxy can
> > be used effectively in the simple case of a web page with a time
> > budget which runs multiple commands with timeouts.
>
> How would you handle this issue then?
> Given you have a 5 second budget to run the proxy command
> and the fact that running the command involves round-trips
> to some remote proxy on another host, how would you implement
> that? You have to send the data to the proxy and then you need
> to wait for all the data to come back. Would you break the
> potentially valid request because it needs 5.01 secs to get the
> last byte transferred back, just because your total limit was set
> to 5 seconds?
>
> If you can give a good solution, I will implement that immediately.


The caller doesn't have a time budget for executing the code in the
slave; they have a budget for sending the code, executing it, and
receiving the result. So yes, if it takes 5.01 secs with one byte
remaining, you fail. No crystal ball.

Exactly the same problem arises if you have an additional timeout of
1 sec for receiving the result. What if it takes 1.01 secs with one
byte remaining? Where do you draw the line?

The difference is that now you've implicitly stated that your time
budget is 6 secs, but you're less flexible because you've partitioned
it. Increasing the original time budget to 6 secs would have exactly
the same effect, but avoid spurious errors due to a timeout expiring
on one counter while time remains on the other.


> But if we make this change, then we need no counters of errors at
> various places because they will make no sense. Effectively
> we have a budget of -evaltimeout which is divided across all
> possible points where we will/must wait. A timeout expiring at
> any of these points has no meaningful information at all any more.
> Right?


No, you still need to count each error type. The caller of the code
can't do much at the time to solve the problem, but someone needs to
solve it, and to do that you need information.

For example, if callers were timing out while sending code to the
slave, you wouldn't tune the code in the slave to be faster, because
that's not the problem. If they were timing out in the mutex wait for
a slave handle, maybe the pool size needs to be increased.
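
To make the single-budget idea concrete, here's roughly what a caller
might do, assuming $pool and $script are already set. This is just a
sketch: the -timeout option and the trailing timeout argument to
ns_proxy eval are illustrative, not a settled API.

    # One page-level deadline shared by everything the page runs.
    set deadline [expr {[clock clicks -milliseconds] + 5000}]

    # ... ns_cache_eval etc. consume some unknown slice of it ...

    # Whatever is left is what the proxy call gets, end to end:
    # handle wait + send + eval + recv all count against it.
    set remaining [expr {$deadline - [clock clicks -milliseconds]}]
    if {$remaining <= 0} {
        error "time budget exhausted before the proxy call"
    }
    set handle [ns_proxy get $pool -timeout $remaining]
    set result [ns_proxy eval $handle $script $remaining]

No partitioning: if the send happens to be slow, the eval and recv
simply get less of what's left, and the caller's 5 seconds means
5 seconds.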
Perhaps the fact that *some* bytes have been received before the
receive timed out could be used to distinguish the case where the
slave successfully executes but the comm channel fails? The more
useful info we can gather the better, I think.
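
Caller-side bookkeeping could then be as simple as the sketch below.
The NSPROXY errorCode values here, including a distinct code for
"some bytes arrived, then timeout", are made up to illustrate the
idea, not something ns_proxy reports today.

    if {[catch {ns_proxy eval $handle $script $remaining} result]} {
        # Bin failures by where the deadline expired, so someone can
        # later tell "slave too slow" from "pool too small" from
        # "slave ran but the comm channel failed".
        switch -glob -- $::errorCode {
            "*EGetHandle*" { nsv_incr proxy_stats handle_timeouts }
            "*ESend*"      { nsv_incr proxy_stats send_timeouts }
            "*ERecvNone*"  { nsv_incr proxy_stats recv_timeouts }
            "*ERecvPart*"  { nsv_incr proxy_stats comm_failures }
            default        { nsv_incr proxy_stats other_errors }
        }
        # ... then propagate the error to the caller as usual ...
    }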