From: SourceForge.net <no...@so...> - 2007-01-05 09:51:28
Bugs item #1623151, was opened at 2006-12-27 16:09
Message generated for change (Settings changed) made by bigrixx
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=684730&aid=1623151&group_id=119701

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: Interpreter
Group: v3.1
>Status: Pending
>Resolution: Duplicate
Priority: 5
Private: No
Submitted By: Ari Uniksoki (ariunikoski)
Assigned to: Rick McGuire (bigrixx)
Summary: segmentation violation in large programs

Initial Comment:
We are developing a large infrastructure in ooRexx, version 3.1.1, on an x86 Linux box. We are using Red Hat enterprise server 3 update 2.

When I do rexx -v I get as output:
Open Object Rexx Interpreter Version 3.1.1 for LINUX
Build date: Nov 13 2006

The project consists of over 2700 lines of code, from approximately 50 routines using the ::ROUTINE command.

At some point we seem to cross a certain limit and any program we write crashes with a segmentation violation. If we arbitrarily remove one routine, or another, it will work - but eventually we will need all the routines to be available.

Please help ASAP - I have four programmers going around in circles and a deadline approaching...

Attached is a tar file with a sample that demonstrates the problem.

ttt.rexx is the main. When we run it, we get a segmentation violation.
ttt does *requires* of genned.rexx.
genned does *requires* of seclev.rexx.

Note that we get the segmentation violation even before the first line of code is executed.

This gives a reasonable simulation of our code - I think the business of 2 levels of requires is critical to the bug - see below.

Please note the following that occurred when I was constructing the sample:
a/ You will notice that in *genned* I have commented out code.
b/ If I do NOT include the *requires* to *seclev*, and I uncomment ALL of *genned*, it runs correctly.
c/ If I then add the *requires* to *seclev*, but *seclev* is a file consisting solely of
#!/usr/bin/rexx
I get the following output:
11 *-* ::requires *genned.rexx*
REX0005E: Error 5 running /home/jcl2ksh/ari/rexxcrash/ttt.rexx line 11: System resources exhausted
d/ This is why I had to comment out parts of genned. Then the code ran, but when I added *body* to seclev, it crashed on the segmentation violation.

----------------------------------------------------------------------

Comment By: Mark Miesfeld (miesfeld)
Date: 2006-12-28 11:40

Message:
Logged In: YES
user_id=191588
Originator: NO

Simple back trace from gdb:

Raven:/work/tools/work.ooRexx/bugs/bug.1623151 # gdb rexx
GNU gdb 6.4
...
This GDB was configured as "i686-pc-linux-gnu"...(no debugging symbols found)
Using host libthread_db library "/lib/libthread_db.so.1".

(gdb) set args ttt.rexx
(gdb) run
Starting program: /usr/bin/rexx ttt.rexx
...
[Thread debugging using libthread_db enabled]
[New Thread -1212119376 (LWP 26200)]
(no debugging symbols found)
...
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread -1212119376 (LWP 26200)]
0xb7f32f93 in RexxBehaviour::methodLookup () from /opt/ooRexx/lib/ooRexx/librexx.so.3
(gdb) where
#0 0xb7f32f93 in RexxBehaviour::methodLookup () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#1 0xb7eddd75 in RexxObject::messageSend () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#2 0xb7ee3ee9 in RexxString::hash () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#3 0xb7f36f3c in RexxHashTable::stringGet () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#4 0xb7f0fa95 in RexxSource::addVariable () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#5 0xb7f1226a in RexxSource::addText () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#6 0xb7f138b3 in RexxSource::subTerm () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#7 0xb7f142a7 in RexxSource::messageTerm () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#8 0xb7f143f9 in RexxSource::instruction () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#9 0xb7f14b18 in RexxSource::translateBlock () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#10 0xb7f16007 in RexxSource::directive () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#11 0xb7f166b8 in RexxSource::translate () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#12 0xb7f16968 in RexxSource::method () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#13 0xb7ed32f8 in RexxMethodClass::newRexxMethod () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#14 0xb7ed336b in RexxMethodClass::newRexxBuffer () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#15 0xb7f1bbac in SysRestoreProgram () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#16 0xb7f284cd in RexxActivation::loadRequired () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#17 0xb7f10c81 in RexxSource::processInstall () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#18 0xb7f29ea4 in RexxActivation::run () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#19 0xb7ed4775 in RexxMethod::call () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#20 0xb7f285c2 in RexxActivation::loadRequired () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#21 0xb7f10c81 in RexxSource::processInstall () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#22 0xb7f29ea4 in RexxActivation::run () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#23 0xb7ed4849 in RexxMethod::call () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#24 0xb7edcfc6 in RexxObject::shriekRun () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#25 0xb7f1abdf in SysRunProgram () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#26 0xb7f3ddb0 in RexxLocal::runProgram () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#27 0xb7ed446e in RexxMethod::run () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#28 0xb7edddd6 in RexxObject::messageSend () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#29 0xb7f3229f in RexxSendMessage () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#30 0xb7f1b1b3 in RexxStart () from /opt/ooRexx/lib/ooRexx/librexx.so.3
#31 0x08048996 in ?? ()
#32 0xb7c1f87c in __libc_start_main () from /lib/libc.so.6
#33 0x080487c1 in ?? ()
(gdb) quit

----------------------------------------------------------------------

Comment By: Rick McGuire (bigrixx)
Date: 2006-12-28 11:11

Message:
Logged In: YES
user_id=1125291
Originator: NO

Mark, any chance you can generate a call stack traceback of the point of the failure? Knowing the point where this method lookup is getting performed would be of enormous assistance.

Rick

----------------------------------------------------------------------

Comment By: Rick McGuire (bigrixx)
Date: 2006-12-28 10:33

Message:
Logged In: YES
user_id=1125291
Originator: NO

And it's already been getting some of my attention, but a bit handicapped by A) a nasty case of the flu, and B) current lack of a Linux system to try to reproduce/debug this.

The initial report seemed to point to a garbage collection problem, but I couldn't figure out why it would fail on Linux but not on Windows. However, the System Resources Exhausted error message I think provides a vital clue.
I'm guessing this program bumps up against the initial interpreter memory allocation, and attempts to extend the memory by allocating additional memory segments are failing for some reason. This is handled in the platform-specific layer, which would explain why it wouldn't fail on Windows. Now we just need to narrow this down a little.

Rick

----------------------------------------------------------------------

Comment By: Mark Miesfeld (miesfeld)
Date: 2006-12-28 10:32

Message:
Logged In: YES
user_id=191588
Originator: NO

The specific cause of the seg fault, when I run the program on a SuSE 10.1 box, is in RexxBehaviour::methodLookup. In that function, the problem is in this line:

    methodObject = (RexxMethod *)this->methodDictionary->stringGet(messageName);

At the point of the crash, the methodDictionary object pointer is corrupt. The function tests that the pointer is not OREF_NULL, and it isn't. But it is not a valid pointer. Some printout showing this:

methodLookup() this: 0xb78adce0 methodName ptr: 0xb78829fc methodName: UNINIT methodDictionary not null: 0xb78b5368
methodLookup() this: 0xb788746c methodName ptr: 0xb78d5868 methodName: RUN_PROGRAM methodDictionary not null: 0xb788e6a4
methodLookup() this: 0xb791ff78 methodName ptr: 0xb787119c methodName: == methodDictionary not null: 0x5
Segmentation fault
Raven:/work/tools/work.ooRexx/bugs/bug.1623151 #

----------------------------------------------------------------------

Comment By: Mark Miesfeld (miesfeld)
Date: 2006-12-28 10:22

Message:
Logged In: YES
user_id=191588
Originator: NO

Ari,

It was great that you provided the sample program. I'm going to put some notes on my initial investigation here. Then, hopefully, if Rick can take a look at this he won't have to duplicate my work.

1.) The segmentation fault is easily reproduced on my system with your example program.

  a. The exact same program that crashes on Linux does not crash on Windows; it runs fine.

  b. The Linux box I used is SuSE 10.1, so it seems to be a generic problem on Linux.

  c. The program behaves the same way (seg faults) under ooRexx 3.1.0 also.

2.) The program behaves on my system as you described, in that it will seg fault as written, but if you comment out some of the public routines, you will get the System resources exhausted message.

Unfortunately, I do not have near the understanding of, or experience with, the internals of the ooRexx interpreter that Rick does. I will continue to work on this, but it may take me a while.

When Rick has some free time, I'm sure he will give this his consideration. Ultimately, the solution is likely to come from him.

----------------------------------------------------------------------

Comment By: Mark Miesfeld (miesfeld)
Date: 2006-12-27 17:16

Message:
Logged In: YES
user_id=191588
Originator: NO

Ari,

Thanks for providing the additional information. I'll take a look at this, but I will have to do it tomorrow - I don't have a Linux box up and running here. Rick McGuire will have better insight into this than I and he might also comment.

----------------------------------------------------------------------
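
Note on the 10:32 comment: the printout there shows a methodDictionary value of 0x5 passing the "not OREF_NULL" test and then faulting on use. The sketch below illustrates why a null test alone cannot catch that kind of corruption. It is not ooRexx source: MethodDictionary, passesNullCheck, looksPlausible, requirePlausible and corruptPointer are hypothetical names, and the 4096-byte "first page" threshold is an assumption about typical Linux process layouts. The plausibility check is only a debugging heuristic for failing earlier and more clearly; it would not address whatever overwrote the pointer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the interpreter's dictionary type (not the real class).
struct MethodDictionary {
    int entries;
};

// The kind of guard described in the thread: it only rules out a null pointer.
bool passesNullCheck(const MethodDictionary *dict) {
    return dict != nullptr;
}

// Heuristic sanity check (an assumption, not interpreter code): a real object
// is never located in the first page of the address space and is aligned for
// its type. The pointer is inspected as an integer and never dereferenced.
bool looksPlausible(const MethodDictionary *dict) {
    const auto addr = reinterpret_cast<std::uintptr_t>(dict);
    return addr >= 4096 && addr % alignof(MethodDictionary) == 0;
}

// Debug-build guard: aborts with a readable message instead of a later segfault.
void requirePlausible(const MethodDictionary *dict) {
    assert(looksPlausible(dict) && "methodDictionary pointer looks corrupt");
}

// Builds a pointer holding the corrupt value from the printout (0x5).
const MethodDictionary *corruptPointer() {
    return reinterpret_cast<const MethodDictionary *>(
        static_cast<std::uintptr_t>(0x5));
}
```

With these helpers, passesNullCheck(corruptPointer()) is true while looksPlausible(corruptPointer()) is false, which matches the behaviour reported above: the guard is satisfied, and the crash happens only when the pointer is used.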