Menu

#2720 At Ruins, expedition is lost; gives "The Server can not do that" message

Fixed_0.11.0
closed-fixed
nobody
5
2015-04-26
2014-10-20
Simon Small
No

A Scout investigates a Ruins. The result is the expedition disappears! (expected. Does not have to be a scout)

A message is shown, saying "The server can not do that."

The log file shows:
net.sf.freecol.server.model.ServerUnit csExploreLostCityRumour
INFO: Unit unit:6481 is exploring rumour EXPEDITION_VANISHES
Mon Oct 20 11:49:50 BST 2014
Thread: 29
net.sf.freecol.server.ai.AIUnit dispose
WARNING: Disposing of unit:6481 but owner is null!
Mon Oct 20 11:49:50 BST 2014
Thread: 29
net.sf.freecol.common.networking.DOMMessage clientError
WARNING: Not a net.sf.freecol.common.model.Unit: unit:6481
Mon Oct 20 11:49:50 BST 2014
Thread: 30
net.sf.freecol.common.networking.ServerAPI askExpecting
WARNING: ServerAPI. setDestination error, messageId: server.reject message: Not a net.sf.freecol.common.model.Unit: unit:6481
Mon Oct 20 11:49:50 BST 2014
Thread: 12

Discussion

  • Mike Pope

    Mike Pope - 2014-10-21

    This has been reported before, but no one has ever reported a reproducible case. "Expedition vanishes" LCRs work for me. Can you reproduce this failure from an earlier saved game?

    Also, it is usually better to provide the whole log file (use "Add attachments"). The message about unit:6481 is interesting, and makes me wonder what unit:6481 did prior to moving to the LCR. But I do not have that option.

     
  • Simon Small

    Simon Small - 2014-10-23

    LCR? Does this refer to the Ruins?

    It is difficult to repeat this outcome as the result of the exploration is random. Perhaps the code could be changed (in a users environment) to always give this result. I don't have the log file any more as it was overwritten when I started a new game. I'll take a copy next time something happens; I did not think the remaining log looked relevant

     
  • Mike Pope

    Mike Pope - 2014-10-23

    LCR? Does this refer to the Ruins?

    Lost City Rumour

    It is difficult to repeat this outcome as the result of the exploration is random.

    Not random, pseudo-random with saved state. If you save a game, investigate a rumour and get result X, then restoring the game and repeating the investigation will reliably yield result X again. You will get a different result only if you do something else to perturb the pseudo-random number generator's sequence. FreeCol is not completely deterministic due to races between threads, but user triggered events should be highly repeatable.

    So, do you have a saved game, especially one from just before moving to the LCR?

    I don't have the log file any more as it was overwritten when I started a new game. I'll take a copy next time something happens; I did not think the remaining log looked relevant

    Always err on the side of too much information. A quick look through the bug list will show that too little information is exasperatingly common.

     
    • Evgeny

      Evgeny - 2014-10-28

      Yes, it was https://sourceforge.net/p/freecol/bugs/2518/

      The point is:
      I could reproduce the same error every time by loading my game, and performing the sequence steps. But! Other testers could not reproduce the bug from the same saved game.

       
  • Simon Small

    Simon Small - 2014-10-24

    I have a vague memory that the random results in Col1 were different when replayed. The Freecol behaviour makes things a lot more reproducible :)

    I had a saved game that had a LCR close to the landing tile. I started Freecol then opened and played that Game a few times until I got the result I wanted. I attach the saved game from the start of that turn (Bug2720.fsg) and the log from the start of playing (Bug2720log.txt). You can open this game, then select the unit on the ship and land on the LCR tile. The error happens (I checked it happens a few times). I have also include the log file from starting Freecol, opening the saved game I have provided and playing the required turn (Bug2720xlog.txt)

    I started the game from a Windows Command window using the following command:

    \java\bin\java -Xmx512M -jar FreeCol.jar --freecol-data data --user-cache-directory play --user-config-directory play --user-data-directory play --windowed=1200x800 --log-level INFO

    I am using the downloaded Windows zip file, version 0.11.0, unpacked into it's own directory, with a new directory "play" added there to hold all the user files

    I hope it is useful

     

    Last edit: Simon Small 2014-10-24
  • Mike Pope

    Mike Pope - 2014-10-25

    OK, the saved game works normally for me, but comparing the whole log file suggests there is a race going on. Different Java implementations may be the ultimate cause of this one. Since you are comfortable with a command line, can I get you to add a "--debug COMMS" option and explore the rumour again. This will make FreeCol display all the client-server communications, which is quite voluminous and will make the game rather slow at times --- capture that somehow (does the Windows CLI use the '>' syntax?) and upload it here. I am now hopeful that will definitively reveal the source of this longstanding problem.

     
  • Simon Small

    Simon Small - 2014-10-25

    Slightly out of my knowledge area... I don't seem to be able to capture the debug output using '>' as this captures the usual output "[INFO] init()" etc.

    There is a lot of output to the command prompt window, which is too much for the scroll area of that window. I have attached what I can copy from there. Is the debug ouput sent to a different stream than STDOUT?

    I don't get the "Server cannot do that message with the --debug COMMS option, but get an exception "Not a net.sf.freecol.common.model.Unit: unit:6452" message instead

     
  • Mike Pope

    Mike Pope - 2014-10-25

    Is the debug ouput sent to a different stream than STDOUT?

    It uses stderr, or System.err in Java terminology. I use the "2>&1" syntax with unix shells to then redirect stderr to stdout, other methods include ">& <file>" and/or "2> <file>". However...

    I don't get the "Server cannot do that message with the --debug COMMS option, but get an exception "Not a net.sf.freecol.common.model.Unit: unit:6452" message instead

    OK, should have thought of that. If it is a race condition, COMMS debugging may well slow things down enough to prevent the race. So this is more evidence in support of that theory, but not a clincher, and even if you had captured the COMMS output the failure to trigger the bug would have made the test less convincing.

    The "not a Unit" message is harmless whining from one of the AIs, I have been meaning to work out why it is happening for some time, but because it does not break anything it is low priority. The "server can not do that" is the real problem.

    What we could try next is to push the debug level up to FINEST. That should at least give some indication of the server thread ordering which potentially could nail the problem. Unless of course the extra debug messages break the race again... if that happens, we are stuck with having to apply special purpose patches. You are ok with building from trunk?

     
  • Mike Pope

    Mike Pope - 2014-10-26
    • labels: --> Server Gameplay
    • status: open --> open-needs-info
     
  • Simon Small

    Simon Small - 2014-10-26

    I added "--debug FINEST" and "2>bug2720.log" and got a logfile...

    26-Oct-2014 09:44:15 net.sf.freecol.common.debug.FreeColDebugger setDebugModes
    WARNING: Unrecognized debug mode: FINEST
    Debug mode list () expected.

    However, the log file from the error message pop-up is attached (Bug2720FINEST.txt)

    I am mostly happy trying technical things, but might have to ask for clarification. I have trunk (from just before v0.11.0) building using NetBeans. It is an easy way for me to understand what I am doing

     
  • Mike Pope

    Mike Pope - 2014-10-26

    push the debug level up to FINEST

    Argh. Of course, I meant "push the log level up to FINEST" (i.e. instead of INFO). And to clarify further do not use any debug setting, which happened anyway because FINEST is not a debug level.

    Anyway, looking at the log file there are a few clues although its not as clear as COMMS output would be. I can see server threads handling "setDestination" successfully (18), then handling the move onto the LCR (19), then another failing setDestination (20), and the client thread (11) responding in turn to the server. So I have hunted around for places where setDestination could race, but remain perplexed. Nevertheless, I have picked the most likely source of the second setDestination call and explicitly guarded it against dead units in git. 9e3090e. Let us know if that fixes it. If it does the mystery then is why does the bug not occur for me... however I can live with bugs that mysteriously do not occur.

     
  • Mike Pope

    Mike Pope - 2015-04-26

    No further sightings of this over the last six months, good chance the above fix worked. Closing.

     
  • Mike Pope

    Mike Pope - 2015-04-26
    • status: open-needs-info --> closed-fixed
    • Group: Current --> Fixed_0.11.0
     

Log in to post a comment.