#2530 Tcl core dumps with 64bit builds on AIX

obsolete: 8.4.4
closed-invalid
5
2003-11-14
2003-11-13
No

If I build with

../libtcl8.4/unix/configure --enable-64bit

on AIX 5.1 using the IBM compiler, the build will complete
successfully, and produce a library and tclsh file. The tclsh
binary, however, coredumps upon execution of any command
(I do get a tcl prompt).

I've tested using shared and static library builds, with the
same results. I also get the same results using the non-
threaded cc or the cc_r compiler. This is using the latest
8.4.4 release, and I get identical behavior in 8.4.2.

Here is a stack trace in dbx after performing a make and
make install:

> ~/brlcad.sp3/bin/tclsh8.4
% ls
Segmentation fault (core dumped)
> dbx ~/brlcad.sp3/bin/tclsh8.4
Type 'help' for help.
reading symbolic information ...
[using memory image in core]

Segmentation fault in TclExecuteByteCode at 0x9000000007624a8
0x9000000007624a8 (TclExecuteByteCode+0x190) 88850000 lbz r4,0x0(r5)
(dbx) where
TclExecuteByteCode() at 0x9000000007624a8
TclCompEvalObj() at 0x9000000007675f4
TclObjInterpProc() at 0x90000000074a524
TclEvalObjvInternal() at 0x900000000750570
TclExecuteByteCode() at 0x9000000007628e0
TclCompEvalObj() at 0x9000000007675f4
TclObjInterpProc() at 0x90000000074a524
TclEvalObjvInternal() at 0x900000000750570
TclExecuteByteCode() at 0x9000000007628e0
TclCompEvalObj() at 0x9000000007675f4
TclObjInterpProc() at 0x90000000074a524
TclEvalObjvInternal() at 0x900000000750570
TclEvalObjvInternal() at 0x90000000075077c
TclExecuteByteCode() at 0x9000000007628e0
TclCompEvalObj() at 0x9000000007675f4
Tcl_EvalObjEx() at 0x9000000007522e0
Tcl_RecordAndEvalObj() at 0x9000000007a35e8
Tcl_Main() at 0x9000000007a3030
main() at 0x1000003f4

As getting this working is a high-priority need for our team,
I'd be happy to work with whoever can help to get things
running.

Cheers!
Sean

Discussion

  • Sean Morrison

    Sean Morrison - 2003-11-13

    Logged In: YES
    user_id=785737

    I just now found the other AIX bug reporting this exact same
    behavior, already posted under tracker 818630. Apologies for the
    duplication.

     
  • Sean Morrison

    Sean Morrison - 2003-11-13

    Logged In: YES
    user_id=785737

    Since the other report appears to have a myriad of issues, I'll
    continue with updates here. I've added --enable-symbols=all
    as msofer suggests to try to isolate the bug. After performing a
    make and make install using the following options:

    ../libtcl8.4/unix/configure --enable-64bit --prefix=/blah/blah/brlcad.sp3 --disable-shared --enable-symbols=all

    .. it WORKS. At least it doesn't core dump when I issue ls, pwd,
    etc. Just checked, and it also works for shared library builds too,
    using cc or cc_r. Very odd, indeed.. I'll be happy to keep
    investigating if you have anything else you'd like me to try, but I
    can at least move our project forward again. Thanks for the
    inadvertent solution!

     
  • Sean Morrison

    Sean Morrison - 2003-11-13
    • assigned_to: mdejong --> msofer
     
  • miguel sofer

    miguel sofer - 2003-11-13

    Logged In: YES
    user_id=148712

    I wouldn't put too much trust in that build, memory might be
    corrupted somewhere :(
    The fact that the bug disappears with debug and memory
    symbols makes this a difficult hunt. It will require your
    cooperation, as I can't repro here (no suitable platform
    available).

    First, let us subject your build to a more rigorous test:
    please do 'make test'
    and let us hope that it produces a nice core file ...

    If that does not work, we may later try to
    '--enable-symbols' instead of '--enable-symbols=all' (closer
    to the no-symbols build)

    Notes to self (pasted from the chat):
    tclguy miguel - I think it's because TCL_VARARGS isn't
    defined 100% correctly for AIX builds
    tclguy I haven't confirmed that though ...
    jenglish ... I thought the TCL_VARARGS problem was fixed in
    8.4.something
    miguel And: why would symbols change anything if that was
    the problem?
    tclguy you get stack corruption if it isn't right
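
The comparison proposed above (no symbols, `--enable-symbols`, `--enable-symbols=all`) could be scripted roughly as follows. This is only a sketch: the source path and the `--enable-64bit` and `--disable-shared` flags come from the report, while the `build-*` directory names, the `DRY_RUN` switch, the `try_variant` helper, and the `pwd` smoke test are assumptions made for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: build Tcl once per symbol setting and smoke-test each tclsh.
# SRC is the path used in the report; DRY_RUN, the build-* directories
# and the "pwd" smoke test are assumptions made for this example.
SRC=${SRC:-../libtcl8.4/unix}
DRY_RUN=${DRY_RUN:-1}

# try_variant LABEL [EXTRA_CONFIGURE_FLAG...]
# Configures and builds in build-LABEL, then feeds one command to tclsh.
try_variant() {
    label=$1
    shift
    dir="build-$label"
    flags="--enable-64bit --disable-shared"
    if [ $# -gt 0 ]; then
        flags="$flags $*"
    fi

    # Dry run: show the plan without touching the file system.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ mkdir -p $dir && cd $dir"
        echo "+ $SRC/configure $flags"
        echo "+ make"
        echo "+ echo pwd | ./tclsh"
        return 0
    fi

    # Resolve SRC before changing directory, since it may be relative.
    src_abs=$(cd "$SRC" && pwd) || return 1
    mkdir -p "$dir" || return 1
    (
        cd "$dir" || exit 1
        # $flags is left unquoted on purpose so it splits into words.
        "$src_abs/configure" $flags || exit 1
        make || exit 1
        # A crashing tclsh makes this pipeline exit non-zero.
        echo pwd | ./tclsh
    )
    status=$?
    echo "variant $label: exit status $status"
    return $status
}

try_variant nosym
try_variant sym --enable-symbols
try_variant symall --enable-symbols=all
```

Setting `DRY_RUN=0` in the environment performs the builds. A static build is used so that the freshly built `tclsh` can be run from the build directory without any library path setup.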

     
  • Sean Morrison

    Sean Morrison - 2003-11-13

    Logged In: YES
    user_id=785737

    More updates on the hunt. I was also able to successfully build
    using -O5 in conjunction with -g, and have tclsh seemingly run
    correctly. I did not expect that to work either, but am glad
    regardless. So, on to the tests..

    Before beginning, it's noteworthy that the build of the tests failed
    from the outset. They were not using the link flags that tcl used,
    yet they used the build flags. So, it builds 64bit objects and tries
    to link them as 32bit, resulting in an error. I simply modified the
    makefile and manually appended -b64 to the link, and everyone was happy.

    I then ran make test and only had 1 failure out of all of the tests:
    ---- Test generated error; Return code was: 1
    ---- Return code should have been one of: 0 2
    ==== fCmd-9.8 FAILED

    all.tcl: Total 10503 Passed 9636 Skipped 866 Failed 1

    The tests skipped include: dontCopyLinks, emptyTest,
    hasIsoLocale, knownBug, largefileSupport, localeRegexp,
    longIs32bit, macOnly, macOrWin, needPST, nonPortable, nonRoot,
    pcOnly, singleTestInterp, testthread, testwinclock, testwordend,
    umask2, unixOnly && testthread, unknownFailure,
    wideIntExpressions, wideIntegerUnparsed, win, winOnly, xdev

    So.. no core dump.. :/ I think that's good and bad news. I'm
    leaning towards a compiler bug myself, since everything seems to
    be okay once -g is added.. but it's still funky of course.

    More ideas?
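
The makefile workaround described above amounts to making sure the 64-bit link flag is present on the test link line. A minimal sketch of that idea follows. The `ensure_flag` helper is hypothetical, and passing the result to make through an `LDFLAGS` override is an assumption about the generated Makefile, not something confirmed in this report (the report only says the makefile was edited by hand). The script prints the suggested command rather than running it.

```shell
#!/bin/sh
# ensure_flag FLAG CURRENT_FLAGS
# Prints CURRENT_FLAGS with FLAG appended, unless FLAG is already present.
# Hypothetical helper, written for this example only.
ensure_flag() {
    flag=$1
    current=$2
    case " $current " in
        *" $flag "*) printf '%s\n' "$current" ;;
        *) printf '%s\n' "${current:+$current }$flag" ;;
    esac
}

# -b64 is the link flag mentioned in the report; LDFLAGS as the variable
# to override is an assumption.
LINK_FLAGS=$(ensure_flag -b64 "${LDFLAGS:-}")
echo "make LDFLAGS=\"$LINK_FLAGS\" test"
```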

     
  • miguel sofer

    miguel sofer - 2003-11-14
    • status: open --> closed-invalid
     
  • miguel sofer

    miguel sofer - 2003-11-14

    Logged In: YES
    user_id=148712

    Mhhh ... compiler bug sounds possible. As we can't repro and
    you seem satisfied, let me close this ticket now. Please
    reopen if/when new symptoms appear.

    The build error is a different bug. Could you please file a
    new bug ticket for it so that the area maintainer can fix
    it? I'd rather you file it, so that you get the automatic
    mail updates.