Menu

#4828 BS2: Game lockup in British Museum

Broken Sword 2
closed-fixed
5
2010-05-17
2010-03-24
digitall
No

Playing through BS2, when Nico talks to the Attendant in the British Museum, causes a interactive lockup.

ScummVM is not frozen and the debug console is available, but BS2 will not respond i.e. this looks like a script bug.

This is likely to be the same or similar issue to bug 2607089.

ScummVM 1.1.0pre48374 (Mar 23 2010 23:16:52)
Features compiled in: Vorbis FLAC MP3 ALSA RGB zLib FluidSynth
built on Linux x86_32 2.6.31 by GCC 4.3.4

Discussion

1 2 > >> (Page 1 of 2)
  • digitall

    digitall - 2010-03-24

    Savegame prior to bug

     
  • digitall

    digitall - 2010-03-24

    Console Output

     
  • digitall

    digitall - 2010-03-24

    Screen capture of lockup with debug overlay information activated

     
  • digitall

    digitall - 2010-03-24

    Have attached :
    * latest savegame prior to the lockup occurring,
    * a capture of the console output which like bug 2607089 shows a lot of "WARNING: No free slot in FX queue!", but I'm not sure that this is related to the issue seen, and
    * a screen capture of the lockup. Since the lockup did not affect the engine, I used the debug console to enable debug output information, which is shown.

    I think the line "Script waiting for attendant_71 (3029)" is the issue at hand.

     
  • Willem Jan Palenstijn

    • summary: Game lockup in British Museum --> BS2: Game lockup in British Museum
     
  • Torbjörn Andersson

    Does it always happen? I can't reproduce it at all.

     
  • digitall

    digitall - 2010-04-01

    I was probably not clear about the phrase "Playing through BS2".

    I have also been unable to replicate this by loading the savegame, but
    this occurred when I was playtesting BS2 and had played through almost the
    entire game from the start. I had been saving and I think I loaded from a savegame
    fairly near the beginning, but otherwise it was a straight play run.

    This bug looks like it might be caused by a resource leak of some kind, maybe related to
    "WARNING: No free slot in FX queue!" which then causes a script to get hung up, but
    replication is likely to require playing through the game from the start with a debugger attached to
    try to get the same resource exhaustion...

     
  • Torbjörn Andersson

    The strange thing is that I'm pretty sure the FX queue is cleared every time you enter a new room, so I don't know why it would fill up here. The only thing I can think of is that the resource manager (if memory serves me) expires resources when it runs out of memory. I guess it's possible, in some rare case, that it expires something that it shouldn't have, and that this would cause it to forget that a sound has already been added to the queue, or something silly like that. (I don't think there's any safeguard against adding the same sound more than once, and for all I know there may be cases where it's sensible to allow it.)

    I've added an "fxq" debugger command to the trunk, which prints the contents of the FX queue. Maybe that will come in use some day.

     
  • digitall

    digitall - 2010-04-02

    eriktorbjorn : Thanks for the "fxq" command.
    Will try playing through from start and see if I can replicate and see what "fxq" then gives.
    Will also give valgrind a try.
    Suspect this is going to be a slow one to get replication, but will try.

     
  • digitall

    digitall - 2010-04-02
    • assigned_to: nobody --> eriktorbjorn
     
  • digitall

    digitall - 2010-04-04

    Screen capture of lockup with debug output overlay Run #2

     
  • digitall

    digitall - 2010-04-04

    fxq output for Run #2 - Part #1

     
  • digitall

    digitall - 2010-04-04

    fxq output for Run #2 - Part #2

     
  • digitall

    digitall - 2010-04-04

    reslist output for Run #2

     
  • digitall

    digitall - 2010-04-04

    fxq output after savegame reload i.e. no bug

     
  • digitall

    digitall - 2010-04-04

    Have played through again from beginning with :
    ScummVM 1.2.0svn48496 (Apr 4 2010 00:27:34)
    Features compiled in: Vorbis FLAC MP3 ALSA RGB zLib FluidSynth

    And reliably hit the bug.

    The "WARNING: No free slot in FX queue!" are the cause. The fxq command shows that
    the FX queue has been exahusted of slots as upon entering the British Museum (As opposed
    to starting to talk to the attendant), three background effects are meant to be loaded.

    For some reason, when you play from the start, these three are loaded repeatedly i.e about
    10+ times exhausting the FX queue and causing the script problems afterwards.

    Loading a savegame in the British Museum or at Ketch's Landing in the previous scene,
    does not trigger this problem.. so replication still requires playing from a new game state.

    I have attached copies of the fxq and reslist output with the bug and fxq without.

     
  • Torbjörn Andersson

    I'm going to think out loud for a while...

    A game resource is whatever game data that's loaded from disk. It could be graphics, sound, scripts, etc.

    The engine is full of cases where it does something like "open resource, manipulate resource, close resource". However, closing a resource is not the same as removing it from memory. It simply tells the resource manager that it's done with the resource for now. The resource manager is expected to cache the resource for later use. In fact, some resources are stateful (e.g. script resources contain local variables) which means that if they were thrown away on closing and reloaded on opening, information would be lost. Incidentally, configuring the resource manager to not cache causes "No free slot in FX queue" warnings at the very beginning of the game.

    But obviously, resources cannot be cached forever or we'd run out of memory. The resource manager deals with this by keeping track of how much memory is allocated, which resources are open at the moment, and when it exceeds its memory quota it expires what it thinks are the oldest resources that aren't open at the time.

    So maybe the resource manager is expiring some stateful resource that belongs to the current room, and this causes things to go haywire. In that case, I can think of two possibilities:

    1) There's something wrong with the "expire the oldest resources" logic. However, I can't see anything obviously wrong at the moment.

    2) The game leaks resources, i.e. it opens some resource that it should have closed but doesn't. (There are resources that it legitimately should never close, of course. The global variables resource, for instance.) When I tried a random savegame, I had to enter three rooms before it started expiring anything, so normally it shouldn't expire anything museum-related while you're still there. But maybe if you play the game from the beginning, the museum is the point where the leak has accumulated to where it can no longer hold an entire room's resources in what memory remains?

    I'll have to play around a bit with the latter assumption. Not right now, though.

     
  • Torbjörn Andersson

    After playing through the early parts of the game, I didn't see any leaks in the list of loaded resources. The current implementation doesn't always evict the least recently opened resource first. Actually, it evicts the least recently *closed* resource, but I think that should still be ok because we can assume that resources are used while they're held open. (I believe it's primarily sound effects that are held open for long periods of time, but I think the sounds from the previous room are still thrown out before anything from the current room.)

    I'm not sure what to look for next. Perhaps checking if the memory usage is calculated properly. If it's not, and it thinks it's using more memory than it is, it will of course start throwing out resources much faster than before. But I don't see how that could happen...

     
  • digitall

    digitall - 2010-04-08

    Have run through to this bug with:
    ScummVM 1.2.0svn48496 (Apr 4 2010 00:27:34)
    Features compiled in: Vorbis FLAC MP3 ALSA RGB zLib FluidSynth

    running under Valgrind.

    This has revealed two memory errors, which should be corrected.

    The first is :
    ==8765== Conditional jump or move depends on uninitialised value(s)
    ==8765== at 0x8138334: Sword2::Logic::fnSetFrame(int*) (function.cpp:406)
    ==8765== by 0x813A328: Sword2::Logic::fnSpeechProcess(int*) (function.cpp:1356)
    ==8765== by 0x813C7F7: Sword2::Logic::runScript2(unsigned char*, unsigned char*, unsigned char*) (interpreter.cpp:620)
    ==8765== by 0x813E59D: Sword2::Logic::processSession() (logic.cpp:130)
    ==8765== by 0x81247A0: Sword2::Sword2Engine::gameCycle() (sword2.cpp:717)
    ==8765== by 0x8125E20: Sword2::Sword2Engine::run() (sword2.cpp:536)
    ==8765== by 0x8056470: runGame(PluginSubclass<MetaEngine> const*, OSystem&, Common::String const&) (main.cpp:217)
    ==8765== by 0x8056E19: scummvm_main (main.cpp:390)
    ==8765== by 0x80532C5: main (main.cpp:65)

    And occurs in Montfaulcon at the Cafe, during the "Meanwhile" non-interactive cutscene, with Nico.
    To replicate, load the attached savegame. Talk to the waiter about Labino etc. which triggers Labino
    to arrive. Talk to Labino, especially about the Pot, until the "meanwhile" cutscene is triggered.
    Note : Sorry, can't get a closer savegame as these sections during conversations are non-saveable points.

     
  • digitall

    digitall - 2010-04-08

    Savegame for replication of 1st valgrind error - "Meanwhile" cutscene

     
  • digitall

    digitall - 2010-04-08

    The second is :
    ==24377== Use of uninitialised value of size 4
    ==24377== at 0x816FCD4: Sword2::Screen::scaleImageGood(unsigned char*, unsigned short, unsigned short, unsigned short, unsigned char*, unsigned short, unsigned short, unsigned short, unsigned char*) (render.cpp:171)
    ==24377== by 0x8182170: Sword2::Screen::drawSprite(Sword2::SpriteInfo*) (sprite.cpp:704)
    ==24377== by 0x817B2EC: Sword2::Screen::processImage(Sword2::BuildUnit*) (screen.cpp:673)
    ==24377== by 0x817B84E: Sword2::Screen::drawSortFrames(unsigned char*) (screen.cpp:482)
    ==24377== by 0x817D3EC: Sword2::Screen::buildDisplay() (screen.cpp:315)
    ==24377== by 0x814B7AB: Sword2::Sword2Engine::run() (sword2.cpp:556)
    ==24377== by 0x80565CC: runGame(PluginSubclass<MetaEngine> const*, OSystem&, Common::String const&) (main.cpp:217)
    ==24377== by 0x8056F75: scummvm_main (main.cpp:390)
    ==24377== by 0x8053421: main (main.cpp:65)

    This error repeats 11 times, with lines in render.cpp as follows :
    171-3, 175-7, 163-5 and 167-9

    And occurs when you enter the Mining Company in Quaramonte.
    To replicate, load the attached savegame and enter the stairs to
    the mining company.

     
  • digitall

    digitall - 2010-04-08

    Savegame for replication of 2nd valgrind error - Mining Company, Quaramonte

     
  • digitall

    digitall - 2010-04-11

    Debug Patch for 2nd valgrind error - Mining Company, Quaramonte

     
  • digitall

    digitall - 2010-04-11

    I have taken a look at the second valgrind error in the Mining Company in Quaramonte.
    I have traced the source of the uninitalised usage to the backPtr accessing outside the screen area,
    when y = 161 and rd.top = 278.
    My debugging patch for this is attached. If run with valgrind, the output should be aligned.
    Note, this will only manifest with the highest image quality setting in the menu i.e. this triggers the use of scaleImageGood, rather than scaleImageFast which does not use backBuf.

    eriktorbjorn: I could clamp backBuf accesses, but I'm not sure that is the best solution for this.

     
  • Torbjörn Andersson

    Sorry about the delay. I've committed a fix (hopefully) for the scaler Valgrind warning, but I doubt that's the cause of this bug. From what I understand, it was only reading data where it shouldn't, not writing. (Even the bad reads may very well have been within the allocated buffer.)

    I haven't looked at the other warning yet.

     
1 2 > >> (Page 1 of 2)