If I build with
../libtcl8.4/unix/configure --enable-64bit
on AIX 5.1 using the IBM compiler, the build will complete
successfully, and produce a library and tclsh file. The tclsh
binary, however, coredumps upon execution of any command
(I do get a tcl prompt).
I've tested using shared and static library builds, with the
same results. I also get the same results using the
non-threaded cc or the cc_r compiler. This is using the latest
8.4.4 release, and I get identical behavior in 8.4.2.
Here is a stack trace in dbx after performing a make and
make install:
> ~/brlcad.sp3/bin/tclsh8.4
% ls
Segmentation fault (core dumped)
> dbx ~/brlcad.sp3/bin/tclsh8.4
Type 'help' for help.
reading symbolic information ...
[using memory image in core]
Segmentation fault in TclExecuteByteCode at 0x9000000007624a8
0x9000000007624a8 (TclExecuteByteCode+0x190) 88850000 lbz r4,0x0(r5)
(dbx) where
TclExecuteByteCode() at 0x9000000007624a8
TclCompEvalObj() at 0x9000000007675f4
TclObjInterpProc() at 0x90000000074a524
TclEvalObjvInternal() at 0x900000000750570
TclExecuteByteCode() at 0x9000000007628e0
TclCompEvalObj() at 0x9000000007675f4
TclObjInterpProc() at 0x90000000074a524
TclEvalObjvInternal() at 0x900000000750570
TclExecuteByteCode() at 0x9000000007628e0
TclCompEvalObj() at 0x9000000007675f4
TclObjInterpProc() at 0x90000000074a524
TclEvalObjvInternal() at 0x900000000750570
TclEvalObjvInternal() at 0x90000000075077c
TclExecuteByteCode() at 0x9000000007628e0
TclCompEvalObj() at 0x9000000007675f4
Tcl_EvalObjEx() at 0x9000000007522e0
Tcl_RecordAndEvalObj() at 0x9000000007a35e8
Tcl_Main() at 0x9000000007a3030
main() at 0x1000003f4
As getting this working is a high-priority need for our team,
I'd be happy to work with whoever can help get things
running.
Cheers!
Sean
Logged In: YES
user_id=785737
I just now found the other AIX bug report describing this exact
same behavior, already posted under tracker 818630. Apologies for
the duplication.
Logged In: YES
user_id=785737
Since the other report appears to have a myriad of issues, I'll
continue with updates here. I've added --enable-symbols=all,
as msofer suggested, to try to isolate the bug. After performing a
make and make install using the following options:
../libtcl8.4/unix/configure --enable-64bit --prefix=/blah/blah/brlcad.sp3 --disable-shared --enable-symbols=all
.. it WORKS. At least it doesn't core dump when I issue ls, pwd,
etc. Just checked, and it also works for shared library builds too,
using cc or cc_r. Very odd, indeed. I'll be happy to keep
investigating if you have anything else you'd like me to try, but I
can at least move our project forward again. Thanks for the
inadvertent solution!
Logged In: YES
user_id=148712
I wouldn't put too much trust in that build, memory might be
corrupted somewhere :(
The fact that the bug disappears with debug and memory
symbols makes this a difficult hunt. It will require your
cooperation, as I can't repro here (no suitable platform
available).
First, let us subject your build to a more rigorous test:
please do 'make test'
and let us hope that it produces a nice core file ...
If that does not work, we may later try
'--enable-symbols' instead of '--enable-symbols=all' (closer
to the no-symbols build)
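One way to organize that comparison is to generate one configure command per symbols setting and build each in its own directory. The sketch below is a dry run that only prints the commands; the helper name configure_cmd is a hypothetical illustration and not part of the Tcl build system. The source path and options are the ones already quoted in this report.

```shell
#!/bin/sh
# Dry-run sketch: print one configure command per symbols variant.
# configure_cmd is a hypothetical helper, not part of Tcl.
configure_cmd() {
    # $1 is the symbols option to append, or empty for a no-symbols build
    printf '%s\n' "../libtcl8.4/unix/configure --enable-64bit${1:+ $1}"
}

# From the crashing build (no symbols) to the working one (=all)
for variant in "" "--enable-symbols" "--enable-symbols=all"; do
    configure_cmd "$variant"
done
```

Running each printed command in a separate build directory would show at which step the crash disappears.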
Notes to self (pasted from the chat):
tclguy: miguel - I think it's because TCL_VARARGS isn't
defined 100% correctly for AIX builds
tclguy: I haven't confirmed that though ...
jenglish: ... I thought the TCL_VARARGS problem was fixed in
8.4.something
miguel: And: why would symbols change anything if that was
the problem?
tclguy: you get stack corruption if it isn't right
Logged In: YES
user_id=785737
More updates on the hunt. I was also able to successfully build
using -O5 in conjunction with -g, and have tclsh seemingly run
correctly. I did not expect that to work either, but am glad
regardless. So, on to the tests..
Before beginning, it's noteworthy that the build of the tests failed
from the outset. It was not using the link flags that the Tcl build
used, yet it did use the compile flags. So it builds 64-bit objects and
tries to link them as 32-bit, resulting in an error. I simply modified
the makefile and manually appended -b64 to the link, and everyone was happy.
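The workaround amounts to making sure the link step carries -b64 whenever the objects are 64-bit. A minimal sketch of that idea follows. The helper ensure_b64 and the sample link command are made up for illustration; the actual Makefile variable that needs changing is not named in this report.

```shell
#!/bin/sh
# Hypothetical helper: append -b64 to a link command line unless it is
# already present as a separate word.
ensure_b64() {
    case " $1 " in
        *" -b64 "*) printf '%s\n' "$1" ;;
        *)          printf '%s\n' "$1 -b64" ;;
    esac
}

# Made-up link line, for illustration only
ensure_b64 "cc -o tcltest tclTest.o"
# prints: cc -o tcltest tclTest.o -b64
```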
I then ran make test and only had 1 failure out of all of the tests:
---- Test generated error; Return code was: 1
---- Return code should have been one of: 0 2
==== fCmd-9.8 FAILED
all.tcl: Total 10503 Passed 9636 Skipped 866 Failed 1
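As a quick sanity check on that summary line, the passed, skipped, and failed counts should add up to the total. A minimal sketch using the numbers reported above:

```shell
#!/bin/sh
# Counts copied from the all.tcl summary line in this report.
total=10503
passed=9636
skipped=866
failed=1

sum=$((passed + skipped + failed))
if [ "$sum" -eq "$total" ]; then
    echo "counts are consistent: $sum"
else
    echo "counts differ: $sum vs $total"
fi
# prints: counts are consistent: 10503
```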
The tests skipped include: dontCopyLinks, emptyTest,
hasIsoLocale, knownBug, largefileSupport, localeRegexp,
longIs32bit, macOnly, macOrWin, needPST, nonPortable, nonRoot,
pcOnly, singleTestInterp, testthread, testwinclock, testwordend,
umask2, unixOnly && testthread, unknownFailure,
wideIntExpressions, wideIntegerUnparsed, win, winOnly, xdev
So.. no core dump.. :/ I think that's good and bad news. I'm
leaning towards a compiler bug myself, since everything seems to
be okay once -g is added.. but it's still funky of course.
More ideas?
Logged In: YES
user_id=148712
Mhhh ... compiler bug sounds possible. As we can't repro and
you seem satisfied, let me close this ticket now. Please
reopen if/when new symptoms appear.
The build error is a different bug. Could you please file a
new bug ticket for it so that the area maintainer can fix
it? I'd rather you file it, so that you get the automatic
mail updates.