Texture memory access problem with Dark (SS2)

  • kolya

    Hey Volca, this is Kolya from strangebedfellows.de. I see you're still working on OPDE, that's great! :)

    I wanted to ask you about a problem we have with the original DarkEngine of System Shock 2. Maybe you can shed a light on this:
    With the modification ADaoB 0.30 (http://www.strangebedfellows.de/index.php?topic=25) animated high resolution (256 x 256) monitor textures were added to System Shock 2.
    Now sometimes moving items in the inventory changes the object icon to a keycard (or nanites, CMs, or nothing) and then crashes the game.

    What we know about this
    Apparently it's a bug in the way SS2 handles memory for graphics. The game is freeing the memory for the cursors while it's still using it.
    We asked Telliamed about it and this is his analysis:

    The original error clearly states that it's getting a NULL pointer from
    struct IRes * __cdecl LoadPCX(char const *,char *,enum eShockLoadFlags)

    Which loads an image resource then does something with the palette. That's why I suspected TGAs, since it doesn't check whether or not the image actually has a palette before trying to use it.

    OTOH, the load could be failing deeper in the resource manager. I haven't spent much time on it.

    Well, if you have an idea what might be the cause and how to fix it, that would be cool.
    In any case, thank you for your time and keep up the good work on OPDE.

  • Filip Volejnik

    Hi Kolya,

    thanks for stopping by :)

    I took a really quick sneak peek at the thing you mention. Without looking at the assembly, I'd guess you're hitting some memory limit and the allocations in the Resource system are failing. This could be either some global limit, or maybe animated textures are stored in one piece and get bigger than the page size. Which of the two it is depends on whether you get this with a certain animated texture alone or whether it only starts showing after a few monitor textures get replaced.

    There is a config variable for the memory limit (mem_cap). You could try setting it to some insanely high value in cam.cfg (units should be 16384 bytes). Also, try looking at these shocked/dromed commands (hopefully those will work):

    heap_alloc_cap, heap_dump_stats, heap_dump_modules, heap_dump_blocks, heap_dump_all, heap blocks, heap_test

    If I'm not mistaken (I never tested them personally), those will write some text into the monolog. Try looking at "cap" and comparing it with the currently allocated memory.

    That's it, not sure if I'm going in the right direction but it should not hurt to verify this hypothesis anyway…
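
    To get a feel for the numbers involved, here is a minimal Python sketch of the unit conversion. It assumes the guess above is right and one mem_cap unit really is 16384 bytes, which has not been verified. The helper names are made up for illustration and are not part of the engine.

```python
# Assumption from the post above: one mem_cap unit is 16384 bytes (unverified).
MEM_CAP_UNIT_BYTES = 16384


def mem_cap_to_bytes(units: int, unit_bytes: int = MEM_CAP_UNIT_BYTES) -> int:
    """Convert a mem_cap value to bytes under the assumed unit size."""
    return units * unit_bytes


def bytes_to_mem_cap(target_bytes: int, unit_bytes: int = MEM_CAP_UNIT_BYTES) -> int:
    """Smallest mem_cap value that covers at least target_bytes."""
    return -(-target_bytes // unit_bytes)  # ceiling division


if __name__ == "__main__":
    # 256 MiB expressed in the assumed 16 KiB units.
    print(bytes_to_mem_cap(256 * 1024 * 1024))  # 16384
```

    If the unit turns out to be plain bytes instead, pass unit_bytes=1 and the same helpers still apply.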

  • kolya

    Thank you for your answer Volca, we'll look into that.
    As I learned today, Telliamed created a fix in the meantime, which prevents the crash. I haven't actually seen the fix yet, but I was told the patching process was rather awkward and the affected icons still get faulty colours. So checking the memory caps might still produce a better fix.

  • Nameless Voice

    This is quite old at this stage, but I see that Kolya didn't have a full description of the problem at the time.

    Here's an excerpt from an email I received from Telliamed where he explained what he'd found out about the error:

    I've been able to investigate this problem. It seems to be a race condition between the resource manager and the drawing of cursors. The problem is aggravated not only by having hi-res textures, but also by the screen size, as it appeared when I began using a widescreen mode. It doesn't happen in game, presumably because the memory requirements of Dromed are higher.

    As you may (or may not) know, resource loading is done asynchronously in a worker thread. One of the "features" of the resource manager is to unload resources that aren't in use. Presumably triggered by a maximum allocation limit.

    The Shock cursor icon is loaded and written to a global variable to be later used when drawing the cursor. If a resource purge happens before the cursor is drawn, it can collect the icon, and the global pointer is invalidated by the time it needs to be used.

    Disabling asynchronous resources in LG.INI didn't prevent the problem. Though it did change when the crash happens.

    I've been able to avoid the crash with a patch that increases the reference count of cursor icons. I can't use Release though, because the global pointers aren't always cleared when the cursor is removed. I can only assume that the resource manager is punting them regardless of the reference count. So there may be some memory leakage with this patch.
    Keep an eye on it and I'll try to find a better way. (If this is only an issue with inventory icons I can focus on those resources alone.)

    Another thing could be to increase the memory threshold. If it's as simple as a config variable that would be great.

    The patch he supplied seems to work sometimes, but not always, and is excessively complicated to install.

    I do have the source code for the patch if that would be helpful.
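
    The sequence described in the email can be modelled with a toy resource cache. This is only a sketch of the general idea (purging unreferenced resources versus holding a reference), not the Dark Engine's actual code. Every class and function name here is hypothetical, and the model is single-threaded, so it shows the ordering problem rather than a real thread race.

```python
class Resource:
    """Stand-in for a loaded image resource (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 0
        self.data = bytearray(b"pixels")  # stands in for decoded image data


class ResourceManager:
    """Toy cache: purge() drops every resource nobody holds a reference to."""

    def __init__(self):
        self._loaded = {}

    def load(self, name):
        res = self._loaded.get(name)
        if res is None:
            res = Resource(name)
            self._loaded[name] = res
        return res

    def lock(self, res):
        res.refcount += 1

    def purge(self):
        for name, res in list(self._loaded.items()):
            if res.refcount == 0:
                res.data = None  # simulate the memory being freed
                del self._loaded[name]


def draw_cursor(icon):
    """Fails if the icon's data was freed before the draw."""
    if icon.data is None:
        raise RuntimeError("cursor icon was purged before drawing")
    return len(icon.data)


def unsafe_sequence():
    """Store the icon in a 'global' without a reference, purge, then draw."""
    rm = ResourceManager()
    cursor_icon = rm.load("inventory_icon")
    rm.purge()
    try:
        draw_cursor(cursor_icon)
        return True
    except RuntimeError:
        return False


def patched_sequence():
    """Same sequence, but with the reference count raised before the purge."""
    rm = ResourceManager()
    cursor_icon = rm.load("inventory_icon")
    rm.lock(cursor_icon)
    rm.purge()
    try:
        draw_cursor(cursor_icon)
        return True
    except RuntimeError:
        return False


if __name__ == "__main__":
    print(unsafe_sequence(), patched_sequence())
```

    In this model the locked icon is never unlocked, which mirrors the possible leak mentioned in the email: the resource stays resident for as long as the reference is held.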

  • kolya

    Meanwhile I tested "mem_cap 16777216" in cam.cfg but without a change.
    The shocked commands "heap_alloc_cap" and "heap_dump_stats" produce the following output in monolog.txt; the other commands create no output.

    Set light_bright to 1
    HEY! Used all available anim_txt frames, maybe more for blin (20)
    HEY! Used all available anim_txt frames, maybe more for blout (20)
    WARNING: Memory allocation suspiciously high (25201624 bytes)!
    WARNING: Memory allocation suspiciously high (37753532 bytes)!
    WARNING: Memory allocation suspiciously high (50347412 bytes)!
    WARNING: Memory allocation suspiciously high (62915140 bytes)!
    allocCap total 70431k, cap 0k, init 0k, peak 70583k
    allocCap total 70429k, cap 0k, init 0k, peak 70583k
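
    For comparing "cap" with the allocated memory across many log lines, a small Python sketch like the following could pull the numbers out of monolog.txt. The regular expression is written against the two allocCap lines above only. Treating "cap 0k" as "no cap reported" is an assumption, since the thread does not say what a zero cap means.

```python
import re

# Pattern modelled on the allocCap lines shown above.
ALLOC_CAP_RE = re.compile(
    r"allocCap total (\d+)k, cap (\d+)k, init (\d+)k, peak (\d+)k"
)


def parse_alloc_cap(line):
    """Return the four allocCap fields as a dict, or None for other lines."""
    m = ALLOC_CAP_RE.search(line)
    if m is None:
        return None
    total, cap, init, peak = (int(g) for g in m.groups())
    return {"total_k": total, "cap_k": cap, "init_k": init, "peak_k": peak}


def over_cap(stats):
    """True if a non-zero cap is reported and the peak exceeds it.

    Assumption: a cap of 0 means no cap is reported, so nothing is flagged.
    """
    return stats["cap_k"] > 0 and stats["peak_k"] > stats["cap_k"]


if __name__ == "__main__":
    sample = "allocCap total 70431k, cap 0k, init 0k, peak 70583k"
    stats = parse_alloc_cap(sample)
    print(stats, over_cap(stats))
```
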
  • Filip Volejnik

    Okay, I'll be looking at this more.

    By default the memory cap seems to be limited to about 40 megabytes at most (if I read it correctly). This can be overridden by setting the "MemoryCap" variable in  section of lg.ini (some info about this file here: http://www.ttlg.com/FORUMS/showthread.php?t=110694). This variable should be given directly in bytes.

    If I find time, I'll try to reproduce the problems you mention and fix them via these settings, but please be sure to test this yourself - you may just be quicker than me :)