From: SourceForge.net <no...@so...> - 2013-01-31 15:59:55
Bugs item #3602706, was opened at 2013-01-30 13:23
Message generated for change (Comment added) made by nijtmans
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=110894&aid=3602706&group_id=10894

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: 49. Threading
Group: current: 8.5.13
Status: Open
Resolution: None
Priority: 9
Private: No
Submitted By: Thomas Perschak (tombert)
Assigned to: Andreas Kupries (andreas_kupries)
Summary: Segfault on tclsh startup

Initial Comment:
Somewhere between the check-ins of 2012-12-07 and 2013-01-30 a bug was introduced.
There is no need to run a script; simply starting tclsh crashes.
Sidenote: I also updated Tcl 8.6, and there is no crash there.
Happens on Windows 7, MinGW, gcc 4.6.2.

Here is the output of gdb:

Starting program: E:\CM.git\tcltk85\debug\bin\tclsh85g.exe
[New Thread 9456.0x1ec8]

Program received signal SIGSEGV, Segmentation fault.
TclThreadAllocObj () at ./../generic/tclThreadAlloc.c:570
570         cachePtr->firstObjPtr = objPtr->internalRep.otherValuePtr;

----------------------------------
thx for your support

----------------------------------------------------------------------

>Comment By: Jan Nijtmans (nijtmans)
Date: 2013-01-31 07:59

Message:
It looks like we must thank dkf that trunk didn't have the crash.
In the following commit he added a lot of IncrRefCount/DecrRefCount
pairs on "trunk":
<http://core.tcl.tk/tcl/info/d91c86d0da>
Those were never backported to core-8-5-branch.

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2013-01-31 07:49

Message:
Thanks! Will examine.

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2013-01-31 07:24

Message:
Yes, it has the bug.

I also checked my theory that this is a refCount bug by commenting
out all Tcl_DecrRefCount() calls (which makes all tests pass) and then
restoring them one by one until the tests fail again. It turns out that
the refCount goes wrong for part2Ptr in the function Tcl_GetVar2Ex.

A minimal fix is in the branch now. I'm not 100% sure this is the right
fix, but it works!

----------------------------------------------------------------------

Comment By: Don Porter (dgp)
Date: 2013-01-31 06:59

Message:
Opened new branch bug-3602706 for finding and fixing the bug.

Please confirm it demonstrates the problem. It Works For Me.

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2013-01-31 01:30

Message:
Looking at the changes in [8aca9a8e961fb172], I suspect that it
is related to refCounting objects. The segfault occurs in
the thread allocator, where objPtr points to invalid memory,
so somewhere the Tcl_IncrRefCount/Tcl_DecrRefCount calls don't
match. That's my current theory. (This might indeed be in
Win-specific code, which is fixed in trunk.)

The easiest fix would be to simply revert [8aca9a8e961fb172]
on core-8-5-branch and then merge-mark to trunk (so the improvement
is kept there). That gives some time to investigate this without
everyone (like me) being hindered by it.

----------------------------------------------------------------------

Comment By: Donal K. Fellows (dkf)
Date: 2013-01-31 01:27

Message:
The trigger for this crash is not on a codepath that my machine seems to
exercise (for whatever reason); I can't hunt it directly. I *SUSPECT* that
it is an unfixed location in the Win-specific code, but I have no true
justification for that other than a hunch.

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2013-01-31 01:22

Message:
When bisecting Bug #3601260, I arrived at exactly the same commit
as the one that started this problem!

Looks like they really are dups, even though the effect is different
with different MinGW compiler versions.

----------------------------------------------------------------------

Comment By: Donal K. Fellows (dkf)
Date: 2013-01-31 01:21

Message:
In the context of the 8aca9a8e961fb172 commit, do we have a backtrace at
the point of the crash? Are any problems hit in an --enable-symbols=mem
build?

----------------------------------------------------------------------

Comment By: Thomas Perschak (tombert)
Date: 2013-01-31 01:15

Message:
2013-01-13 23:15:10 8aca9a8e961fb172 BAD
2013-01-13 09:04:10 4c2509336a9c2b2c GOOD CURRENT

----------------------------------------------------------------------

Comment By: Thomas Perschak (tombert)
Date: 2013-01-30 22:40

Message:
It happens in the release version as well.
OK - I will "fossil bisect" into it ...

----------------------------------------------------------------------

Comment By: Jan Nijtmans (nijtmans)
Date: 2013-01-30 13:50

Message:
I suspect this is a dup of 3601260.

Does the following branch have the problem as well?
<http://core.tcl.tk/tcl/info/10b2640e48>

I wasn't able to create a stack trace.

Regards,
Jan Nijtmans

----------------------------------------------------------------------

Comment By: Andreas Kupries (andreas_kupries)
Date: 2013-01-30 13:31

Message:
Thomas, is it possible for you to 'fossil bisect' the issue until you
have the exact revision which causes the crash?

I remember we had recent issues with CPUID detection and code. Are you
by chance on a 64-bit platform?

From the naming of the tclsh I deduce that you have a debug build of some
kind, is that correct? If yes, does the crash happen for a non-debug
build also?

----------------------------------------------------------------------
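
The refCount theory discussed in this thread can be illustrated with a small, self-contained sketch. It is not the fix committed to the bug-3602706 branch and does not use the real Tcl data structures: ToyObj, ToyCache and all toy* functions are hypothetical names invented for illustration. It only shows why increments and decrements have to come in matched pairs when freed objects are recycled through a free list, as in the allocator frame shown in the gdb output.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4

/* Toy stand-in for a reference-counted value; NOT the real Tcl_Obj. */
typedef struct ToyObj {
    int refCount;
    int value;
    struct ToyObj *nextFree;    /* meaningful only while on the free list */
} ToyObj;

/* Toy free list, loosely analogous to a per-thread object cache. */
typedef struct ToyCache {
    ToyObj pool[POOL_SIZE];
    ToyObj *firstFree;
    int numFree;
} ToyCache;

/* Put every pool slot on the free list. */
void toyCacheInit(ToyCache *cache) {
    cache->firstFree = NULL;
    cache->numFree = 0;
    for (int i = 0; i < POOL_SIZE; i++) {
        cache->pool[i].refCount = 0;
        cache->pool[i].value = 0;
        cache->pool[i].nextFree = cache->firstFree;
        cache->firstFree = &cache->pool[i];
        cache->numFree++;
    }
}

/* Pop the head of the free list. If an object had been pushed back while
 * still in use, this pop would later hand out or follow a stale pointer. */
ToyObj *toyNewObj(ToyCache *cache, int value) {
    ToyObj *obj = cache->firstFree;
    if (obj == NULL) {
        return NULL;            /* pool exhausted */
    }
    cache->firstFree = obj->nextFree;
    cache->numFree--;
    obj->nextFree = NULL;
    obj->refCount = 0;          /* new objects start unowned */
    obj->value = value;
    return obj;
}

void toyIncrRefCount(ToyObj *obj) {
    obj->refCount++;
}

/* Drop one reference; the last owner returns the object to the free list. */
void toyDecrRefCount(ToyCache *cache, ToyObj *obj) {
    assert(obj->refCount > 0);  /* an unmatched decrement trips here */
    if (--obj->refCount == 0) {
        obj->nextFree = cache->firstFree;
        cache->firstFree = obj;
        cache->numFree++;
    }
}

/* Hypothetical helper that takes its own reference while it uses the key.
 * The caller must also hold a reference, otherwise the decrement below
 * would release the key while the caller still points at it. */
int toyLookup(ToyCache *cache, ToyObj *key) {
    toyIncrRefCount(key);
    int result = key->value * 2;
    toyDecrRefCount(cache, key);
    return result;
}
```

In this sketch a caller that wraps toyLookup() in its own toyIncrRefCount()/toyDecrRefCount() pair keeps the key alive for the whole call and releases it exactly once afterwards. Dropping either half of a pair is the kind of mismatch the thread suspects.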