From: SourceForge.net <no...@so...> - 2008-04-16 08:28:55
Bugs item #1858818, was opened at 2007-12-27 07:43
Message generated for change (Comment added) made by shevek
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105095&aid=1858818&group_id=5095

Please note that this message will contain a full copy of the comment
thread, including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
Status: Open
Resolution: None
Priority: 7
Private: No
Submitted By: Roland Clobus (rclobus)
Assigned to: Roland Clobus (rclobus)
Summary: 0.11.4 Server assert (crash): state machine stack overflow

Initial Comment:
When I was about to move the robber, the server crashed due to an assert. It looks like the state machine has a stack overflow.

----------------------------------------------------------------------

>Comment By: Bas Wijnen (shevek)
Date: 2008-04-16 10:29

Message:
Logged In: YES
user_id=42389
Originator: NO

You are right that the server doesn't need to wait. I thought you were saying that because the server doesn't wait, the stack can overflow. I agree that it is a good idea, too, if the server doesn't wait. We're using TCP connections, so we may assume that messages are delivered.

To avoid problems like this one, not crashing would indeed not be enough by itself. I have an extra idea which can help: we add some stack_assert (stack, state) calls, which give an error or a warning (depending on whether debugging is compiled in) when the stack does not contain exactly one state, namely "state". In case of a warning (debugging off), a pop_all_and_goto (stack, state) is then performed to clean up. In case of an error, the program aborts. These calls can be inserted in places like "new turn" and "end trade". The result would be that such bugs are found at an early stage, while still not crashing people's running games.
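The stack_assert idea described above could be sketched roughly as follows. This is only an illustration, not the project's code: the StateStack type, the use of state names as strings, the STACK_MAX value (taken from the 16 slots visible in the stack dump in this thread) and the DEBUG_BUILD macro are all assumptions. Only the names stack_assert and pop_all_and_goto and their (stack, state) arguments come from the comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define STACK_MAX 16            /* assumed: the dump in this thread shows slots 0..15 */

/* Hypothetical stand-in for the server's per-client state stack. */
typedef struct {
    const char *slots[STACK_MAX];   /* a state is represented by its name here */
    int depth;                      /* number of states currently on the stack */
} StateStack;

/* Discard everything on the stack and leave exactly one state on it. */
static void pop_all_and_goto(StateStack *stack, const char *state)
{
    assert(stack != NULL);
    stack->depth = 0;
    stack->slots[stack->depth++] = state;
}

/* Check that the stack holds exactly one entry, which must be `state`.
 * Returns true if the stack was already consistent.
 * With DEBUG_BUILD (hypothetical macro) defined, an inconsistency aborts;
 * otherwise it is reported as a warning and the stack is repaired. */
static bool stack_assert(StateStack *stack, const char *state)
{
    assert(stack != NULL);
    if (stack->depth == 1 && strcmp(stack->slots[0], state) == 0)
        return true;
#ifdef DEBUG_BUILD
    fprintf(stderr, "ERROR: expected only %s on the stack, depth is %d\n",
            state, stack->depth);
    abort();
#else
    fprintf(stderr, "WARNING: expected only %s on the stack, depth is %d; cleaning up\n",
            state, stack->depth);
    pop_all_and_goto(stack, state);
#endif
    return false;
}
```

A call such as stack_assert (stack, "mode_idle") at "new turn" would then repair a stack that still holds leftover mode_wait_quote_exit entries in a normal build, and abort in a debugging build.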
----------------------------------------------------------------------

Comment By: Roland Clobus (rclobus)
Date: 2008-04-16 08:34

Message:
Logged In: YES
user_id=831677
Originator: YES

You are right that the server currently waits for the clients, but as far as I can see there is no need to do that. If the client is somehow slow, it will correctly react to new messages after it has updated its GUI state. If the client is buggy, it will respond with 'Unknown message in mode_<some_mode>', which is a sign that something has gone wrong on the client side of the connection.

If the server disconnected a client when the client was buggy (because it did not send an ack), the client would still be unable to reconnect, because the state machine at the server side is still full. It is not automatically repaired, and I think a reconnection will fail with a new stack overflow, which effectively has the same effect as the server stopping with an assert: an unplayable situation.

So I still think the server should ignore the ack, pop its state, and let the user at the client side review any error messages and decide whether a reconnect is required, which will resolve the problems at the client side.

----------------------------------------------------------------------

Comment By: Bas Wijnen (shevek)
Date: 2008-04-15 12:25

Message:
Logged In: YES
user_id=42389
Originator: NO

AFAIK the server should wait for clients to ack the trade-end. That is, the game should be stopped until all clients have responded. I think this is the only sensible way to deal with this situation. When the server and a client are not in sync, the one which is ahead must wait, or when things are really bad, the client must be disconnected.

That is a good idea in general: when the server gets a stack overflow on a client stack, it should kick the client out instead of crashing. That way, the client can reconnect, and the game can continue.
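As a rough sketch of the "kick the client out instead of crashing" idea above: a push that reports overflow to the caller and marks the client as disconnected, rather than asserting. The ClientSession type, its fields, client_push_state and the CLIENT_STACK_MAX value are hypothetical names for illustration only. Clearing the stack on overflow is an added assumption, meant to address the concern from the 2008-04-16 08:34 comment that a reconnect would otherwise meet the same full stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define CLIENT_STACK_MAX 16     /* assumed: the dump in this thread shows slots 0..15 */

/* Hypothetical per-client record kept by the server. */
typedef struct {
    const char *slots[CLIENT_STACK_MAX];  /* state names, bottom of stack first */
    int depth;                            /* number of states on the stack */
    bool connected;                       /* false once the client is kicked */
} ClientSession;

/* Push a state for this client. On overflow, do not crash the server:
 * report the problem, clear the stack so a reconnect starts clean,
 * and mark the client as disconnected. Returns false in that case. */
static bool client_push_state(ClientSession *client, const char *state)
{
    assert(client != NULL);
    if (client->depth >= CLIENT_STACK_MAX) {
        fprintf(stderr, "State stack overflow while pushing %s; disconnecting client\n",
                state);
        client->depth = 0;          /* assumption: repair before any reconnect */
        client->connected = false;
        return false;
    }
    client->slots[client->depth++] = state;
    return true;
}
```

The caller would check the return value and close the connection for that one client, leaving the other players' game running.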
----------------------------------------------------------------------

Comment By: Roland Clobus (rclobus)
Date: 2008-04-13 14:30

Message:
Logged In: YES
user_id=831677
Originator: YES

I think I've found the problem. The server expects all clients to acknowledge that the trade has ended. If one client (for some reason) does not acknowledge it, its state machine in the server will not be popped, and eventually the stack will overflow. I'm not sure what causes the client to refrain from sending the acknowledgement, but a possible fix could be to let the server ignore the acknowledgement from the client, and pop the state anyway. I'm working on this.

----------------------------------------------------------------------

Comment By: Roland Clobus (rclobus)
Date: 2008-03-06 22:36

Message:
Logged In: YES
user_id=831677
Originator: YES

I'm marking this bug as required for 0.11.4

----------------------------------------------------------------------

Comment By: Aaron (haqdiesel)
Date: 2008-03-06 17:11

Message:
Logged In: YES
user_id=1868077
Originator: NO

I had the same problem, about 12 rounds into a 5-player game, when one player tried to roll the dice. This is what the console server printed:

03:43:00 *ERROR* State stack overflow. Stack dump sent to standard error.
Stack 0: mode_idle
Stack 1: mode_wait_quote_exit
Stack 2: mode_wait_quote_exit
Stack 3: mode_wait_quote_exit
Stack 4: mode_wait_quote_exit
Stack 5: mode_wait_quote_exit
Stack 6: mode_wait_quote_exit
Stack 7: mode_wait_quote_exit
Stack 8: mode_wait_quote_exit
Stack 9: mode_wait_quote_exit
Stack 10: mode_wait_quote_exit
Stack 11: mode_wait_quote_exit
Stack 12: mode_wait_quote_exit
Stack 13: mode_wait_quote_exit
Stack 14: mode_wait_quote_exit
Stack 15: mode_wait_quote_exit

----------------------------------------------------------------------