core dumps suppressed?

GnuCOBOL
pottmi
2022-05-12
2022-05-15
  • pottmi

    pottmi - 2022-05-12

    GnuCOBOL Crew,

    We are trying to get a GnuCOBOL program to coredump in addition to output
    the error message.

    This is the error message:

    libcob: /home/jimp/bregion1/bregion1/regress1/srcpp/cbl/GMSHR102.cbl:82:
    error: BASED/LINKAGE item 'STR1' has NULL address Last statement of
    "GMSHR102" was at line 82 of
    /home/jimp/bregion1/bregion1/regress1/srcpp/cbl/GMSHR102.cbl
    /home/jimp/bregion1/bregion1/regress1/srcpp/cbl/GMSHR102.cbl:82: attempt to
    reference invalid memory address (signal SIGSEGV)

    I would post some sample code, but this is all dependent on the OpenKicks
    runtime system, so I can't post the code.

    I am not asking for help solving that particular problem; I want to know
    how to turn on coredumps.

    We have coredumps enabled on the machine and can core dump other things.
    It is only GnuCOBOL running under OpenKicks that does not dump core.

    I suspect some runtime flag for GnuCOBOL but google is not turning up any
    information.

     
    • Brian Tiffin

      Brian Tiffin - 2022-05-12

      Try looking at ulimit, Michael. prompt$ ulimit -S -c.

      -c is max core size. -S is the soft limits (simplification: your process space), -H is the hard system limits. You probably want prompt$ ulimit -S -c unlimited. Or set permanently in /etc/security/limits.conf under the soft core entry. Again, usually 0, but can also be set to "unlimited". ulimit is a shell builtin, so use help ulimit for details. There usually isn't a man 1 ulimit page.

      You can get a hint at where core is stored from the pattern in /proc/sys/kernel/core_pattern, either by cat or with prompt$ sysctl kernel.core_pattern. You can set it with sudo sysctl -w key=value, but the value can be wonky and full of %letter replacements; see man 5 core for those.

      That is one, probable, setting to tweak. There may be others on your system depending on SELINUX and whatnot.
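
As a rough in-process counterpart to ulimit -S -c unlimited, the sketch below raises the soft core limit of the current process to its hard limit using the POSIX getrlimit/setrlimit calls. The function names raise_core_limit and print_core_limit are made up for this illustration, and the sketch only covers the limit itself, not core_pattern or SELinux settings.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper: raise the soft core-size limit to the hard limit.
 * Returns 0 on success, -1 on failure (errno is set by the failing call). */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    /* An unprivileged process may raise the soft limit up to the hard limit. */
    rl.rlim_cur = rl.rlim_max;
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Hypothetical helper: print the soft and hard core limits, like ulimit -c. */
void print_core_limit(FILE *out)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0) {
        perror("getrlimit");
        return;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        fprintf(out, "soft core limit: unlimited\n");
    else
        fprintf(out, "soft core limit: %llu bytes\n",
                (unsigned long long) rl.rlim_cur);
    if (rl.rlim_max == RLIM_INFINITY)
        fprintf(out, "hard core limit: unlimited\n");
    else
        fprintf(out, "hard core limit: %llu bytes\n",
                (unsigned long long) rl.rlim_max);
}
```

Calling raise_core_limit() early in a launcher and then print_core_limit(stderr) shows what the process actually sees, which can differ from the interactive shell if the program is started by a service or another runtime.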

      Cheers,
      Blue

       

      Last edit: Brian Tiffin 2022-05-12
      • pottmi

        pottmi - 2022-05-12

        We dump the value of ulimit in the code to confirm that the code sees that
        limit so we know it is set properly. We also have a pure C program that
        crashes to test that we have the system settings set.
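
A minimal sketch of such a pure C check, assuming a POSIX system: it forks a child that dies from SIGSEGV with the default disposition and reports whether the kernel flagged a core dump. The name crash_child is hypothetical, and WCOREDUMP is a common extension rather than part of POSIX, so it is guarded with #ifdef.

```c
#define _DEFAULT_SOURCE 1   /* exposes WCOREDUMP on glibc */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that is killed by SIGSEGV.
 * Returns the terminating signal number, 0 if the child exited normally,
 * or -1 on error. *dumped is set to 1 if a core dump was reported, else 0. */
int crash_child(int *dumped)
{
    int status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);   /* make sure no handler is installed */
        raise(SIGSEGV);
        _exit(1);                   /* only reached if the signal did not kill us */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    *dumped = 0;
#ifdef WCOREDUMP
    if (WIFSIGNALED(status) && WCOREDUMP(status))
        *dumped = 1;
#endif
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

If crash_child reports SIGSEGV with dumped set to 1 while the COBOL run produces no core, the system settings are fine and the difference lies in how the signal is handled inside the process.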

        As far as we can tell the only difference from when we can and can't get a
        core dump is when we call COBOL.

        Is there anything in the COBOL runtime system itself that
        suppresses coredumps?

        For instance, what does --debug do that might affect the runtime system?

         

        Last edit: Simon Sobisch 2022-05-15
        • Simon Sobisch

          Simon Sobisch - 2022-05-15

          The issue here is that libcob registers a signal handler for SIGSEGV, so the kernel does not apply its default action of creating a core dump (if that is configured).

          There are some possible solutions, a "starter" list:

          1. register a signal handler for SIGSEGV via cob_reg_sighnd from a C program that starts the module (instead of cobcrun / a compiled COBOL executable) - and ask for core creation there - but this will likely lead to the coredump pointing there, not the original COBOL place
          2. after COBOL initialization: reset the SIGSEGV handler from C or COBOL (the latter is likely unportable)
          3. comment out the parts handling SIGSEGV in libcob/common.c (cob_set_signal) - all SIGSEGV will then have the default behavior
          4. in many cases: don't have a SIGSEGV happen - by enabling all bounds checks (in all modules) with -fec=bounds (included with --debug); you'd then commonly reach the runtime check, which gives you a nice error message about the out-of-bounds read or write; without it you'd commonly get a SIGSEGV later (and the coredump may not help with debugging this issue, as the crash is possibly somewhere else entirely, possibly with the program that led to it already CANCELed). The COBOL runtime error handling includes a COBOL-centric dump (depending on which modules were compiled with which option for -fdump).
          5. provide a patch to add a new runtime configuration core_on_error
          6. provide a patch to add an environment variable that allows tuning which signal handlers are installed (so that option 3 from above becomes configurable; only before start, obviously)
          7. start the process under control of gdb or gdbserver - this should allow you to directly debug if an issue happens - or create a coredump from there
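
Option 2 from the list above could look roughly like the following from C, assuming a POSIX system. It uses only the standard sigaction call to put SIGSEGV back to the default action; the function name restore_default_segv is made up for this sketch, and where exactly to call it in an OpenKicks setup is an assumption that would need checking.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical helper: restore the default action for SIGSEGV, which is to
 * terminate the process and write a core file if the limits allow it.
 * Returns 0 on success, -1 on failure (errno is set by sigaction). */
int restore_default_segv(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_DFL;    /* drop whatever handler is installed now */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

The call has to come after the COBOL runtime has been initialized (for example after cob_init in a C launcher), because a handler installed during initialization would otherwise replace the default again. The trade-off is that the libcob error message and COBOL-centric dump for SIGSEGV are lost in exchange for the core file.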
           

          Last edit: Simon Sobisch 2022-05-15
