GnuCOBOL Crew,
We are trying to get a GnuCOBOL program to dump core in addition to printing
the error message.
This is the error message:
libcob: /home/jimp/bregion1/bregion1/regress1/srcpp/cbl/GMSHR102.cbl:82: error: BASED/LINKAGE item 'STR1' has NULL address
Last statement of "GMSHR102" was at line 82 of /home/jimp/bregion1/bregion1/regress1/srcpp/cbl/GMSHR102.cbl
/home/jimp/bregion1/bregion1/regress1/srcpp/cbl/GMSHR102.cbl:82: attempt to reference invalid memory address (signal SIGSEGV)
I would post some sample code, but it all depends on the OpenKicks
runtime system, so I can't.
I am not asking for help solving that particular problem; I want to know
how to turn on coredumps.
We have coredumps enabled on the machine and can get core dumps from other programs.
The only case that fails is GnuCOBOL running under OpenKicks.
I suspect there is some runtime flag for GnuCOBOL, but Google is not turning up any
information.
Try looking at ulimit, Michael: prompt$ ulimit -S -c.
-c is the maximum core file size, -S selects the soft limits (simplification: your process space), and -H selects the hard system limits. You probably want prompt$ ulimit -S -c unlimited. Or set it permanently in /etc/security/limits.conf under the soft core entry. The value is usually 0, but can also be set to "unlimited". ulimit is a shell builtin, so use help ulimit for details. There usually isn't a man 1 ulimit page.
You can get a hint at where the core file is stored from the pattern in /proc/sys/kernel/core_pattern, either with cat or with prompt$ sysctl kernel.core_pattern. You can set it with sudo sysctl -w key=value, but the value can be wonky and full of %letter replacements; see man 5 core for those.
That is one probable setting to tweak. There may be others on your system, depending on SELinux and whatnot.
Cheers,
Blue
Last edit: Brian Tiffin 2022-05-12
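For a process that cannot rely on the shell's ulimit setting (for example one started by another runtime), the same soft limit can be raised from C. This is only a minimal sketch, assuming Linux: the function names raise_core_limit and read_core_pattern are made up for illustration, while getrlimit, setrlimit and RLIMIT_CORE are the standard POSIX interface behind ulimit -c.

```c
#include <stdio.h>
#include <string.h>
#include <sys/resource.h>

/* Raise the soft core-size limit to the hard limit for this process
   and any children it starts afterwards.
   Returns 0 on success, -1 on failure. */
int raise_core_limit(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_CORE, &rl) != 0)
        return -1;
    rl.rlim_cur = rl.rlim_max;   /* the soft limit may go up to the hard limit */
    return setrlimit(RLIMIT_CORE, &rl);
}

/* Copy the kernel's core file pattern into buf (Linux only).
   Returns 0 on success, -1 if the file cannot be read. */
int read_core_pattern(char *buf, size_t len)
{
    FILE *fp = fopen("/proc/sys/kernel/core_pattern", "r");

    if (fp == NULL)
        return -1;
    if (fgets(buf, (int)len, fp) == NULL) {
        fclose(fp);
        return -1;
    }
    fclose(fp);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}
```

Calling raise_core_limit() early in main and printing the result of read_core_pattern() shows both what the process is allowed to dump and where the kernel would put it. If the hard limit itself is 0, this cannot help and the limit has to be changed outside the process.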
We dump the value of ulimit in the code to confirm that the code sees that
limit, so we know it is set properly. We also have a pure C program that
crashes, to test that the system settings are in place.
As far as we can tell, the only difference between getting and not getting a
core dump is whether we call COBOL.
Is there anything in the COBOL runtime system itself that
suppresses coredumps?
For instance, what does --debug do that might affect the runtime system?
Last edit: Simon Sobisch 2022-05-15
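A crash test of the kind described above might look roughly like the following. It is a sketch, not the poster's actual program: run_crash_child and report_crash are hypothetical names, and WCOREDUMP is not part of POSIX (it is available on Linux/glibc), hence the #ifdef.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that terminates itself with SIGSEGV under the default
   disposition. Stores the wait status in *status.
   Returns 0 on success, -1 if fork or waitpid fails. */
int run_crash_child(int *status)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGSEGV, SIG_DFL);   /* make sure no handler is installed */
        raise(SIGSEGV);             /* well-defined way to deliver SIGSEGV */
        _exit(1);                   /* not reached if the signal kills us */
    }
    return waitpid(pid, status, 0) == pid ? 0 : -1;
}

/* Print how the child ended and, where the platform supports it,
   whether the kernel reports that a core file was written. */
void report_crash(int status)
{
    if (WIFSIGNALED(status)) {
        printf("child killed by signal %d\n", WTERMSIG(status));
#ifdef WCOREDUMP
        printf("core dumped: %s\n", WCOREDUMP(status) ? "yes" : "no");
#endif
    } else {
        printf("child was not killed by a signal\n");
    }
}
```

Running this from the same environment that starts the COBOL program separates the two questions: if it reports "core dumped: yes", the system settings are fine and the difference must come from the runtime.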
The issue here is that libcob registers a signal handler for SIGSEGV, so the kernel does not apply its "default" action of creating a core dump, even if this is configured.
There are some possible solutions, a "starter" list:
1. Register a signal handler for SIGSEGV via cob_reg_sighnd from a C program that starts the module (instead of cobcrun / a compiled COBOL executable) and ask for core creation there. This will likely lead to the coredump pointing there, not at the original COBOL place.
2. After COBOL initialization, reset the SIGSEGV handler from C or COBOL (the latter is likely unportable).
3. Comment out the parts handling SIGSEGV in libcob/common.c (cob_set_signal); every SIGSEGV will then have the default behavior.
4. In many cases: don't have a SIGSEGV happen, by enabling all bounds checks (in all modules) with -fec=bounds (included with --debug). You'd then commonly reach the runtime check, which gives you a nice error message about the out-of-bounds read or write; without it you'd commonly get a SIGSEGV later (and the coredump may not help with debugging, as the crash is possibly somewhere else entirely, possibly with the program that led to it already CANCELed). The COBOL runtime error handling includes a COBOL-centric dump (depending on which modules were compiled with which option for -fdump).
5. Provide a patch to add a new runtime configuration core_on_error.
6. Provide a patch to add an environment variable that allows tuning the signal handlers installed (so option 3 above becomes configurable; only before start, obviously).
7. Start the process under control of gdb or gdbserver; this should allow you to debug directly if an issue happens, or to create a coredump from there.
Last edit: Simon Sobisch 2022-05-15
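The idea of resetting the SIGSEGV handler after initialization can be illustrated without libcob. In the sketch below, runtime_like_handler is a hypothetical stand-in for whatever handler a runtime installs (it is not libcob's code), and restore_default_segv and segv_in_child are made-up helper names. In a real program the reset would presumably go right after the runtime has been initialized, for example after cob_init() in a C main; that placement is an assumption, not something confirmed in this thread.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a runtime's SIGSEGV handler: it ends the
   process with a normal exit, so the kernel never writes a core file. */
static void runtime_like_handler(int sig)
{
    (void)sig;
    _exit(42);
}

/* Put SIGSEGV back to the kernel default so that a fault can dump core.
   Returns 0 on success, -1 on failure. */
int restore_default_segv(void)
{
    return signal(SIGSEGV, SIG_DFL) == SIG_ERR ? -1 : 0;
}

/* Fork a child that installs the stand-in handler, optionally restores
   the default disposition, then raises SIGSEGV.
   Returns 0 and fills *status on success, -1 on failure. */
int segv_in_child(int restore_default, int *status)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGSEGV, runtime_like_handler);   /* what the runtime would do */
        if (restore_default && restore_default_segv() != 0)
            _exit(2);
        raise(SIGSEGV);
        _exit(1);                                /* not reached in either case */
    }
    return waitpid(pid, status, 0) == pid ? 0 : -1;
}
```

With restore_default set to 0 the child exits normally through the handler and no core can be produced; with 1 it is killed by the signal, which is the precondition for the kernel to write a core file (subject to the ulimit and core_pattern settings discussed earlier in the thread).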