#4457 segfault in test encoding-24.2 (8.5)

Milestone: obsolete: 8.5.7
Status: closed-fixed
Owner: Don Porter
Priority: 8
Updated: 2010-08-02
Created: 2009-11-03
Creator: Joe Mistachkin
Private: No

This crash occurs with Tcl 8.5 (core-8-5-branch, threaded debug build), compiled with:

-DTCL_MEM_DEBUG -DTCL_COMPILE_DEBUG
-DTCL_COMPILE_STATS -DTCL_THREADS=1 -DUSE_THREAD_ALLOC=1

==== encoding-24.2 EscapeFreeProc on open channels FAILED
==== Contents of test case:

viewable [exec [interpreter] $file]

---- Test generated error; Return code was: 1
---- Return code should have been one of: 0 2
---- errorInfo: ab$B8C$(DD%(Bg
child killed: segmentation violation
while executing
"exec [interpreter] $file"
("uplevel" body line 2)
invoked from within
"uplevel 1 $script"
---- errorCode: CHILDKILLED 1360 SIGSEGV {segmentation violation}
==== encoding-24.2 FAILED

stack trace:

> tcl85tg.dll!FreeEncoding(Tcl_Encoding_ * encoding=0x00aae0d0) Line 848 + 0x3 bytes C
tcl85tg.dll!EscapeFreeProc(void * clientData=0x00ab0360) Line 3398 + 0xc bytes C
tcl85tg.dll!FreeEncoding(Tcl_Encoding_ * encoding=0x00ab0790) Line 851 + 0x11 bytes C
tcl85tg.dll!TclFinalizeEncodingSubsystem() Line 670 + 0xc bytes C
tcl85tg.dll!Tcl_Finalize() Line 1113 C
tcl85tg.dll!Tcl_Exit(int status=0x00000000) Line 908 C
tcl85tg.dll!Tcl_ExitObjCmd(void * dummy=0x00000000, Tcl_Interp * interp=0x00a2c4b0, int objc=0x00000001, Tcl_Obj * const * objv=0x00a30250) Line 727 + 0x9 bytes C
tcl85tg.dll!TclEvalObjvInternal(Tcl_Interp * interp=0x00a2c4b0, int objc=0x00000001, Tcl_Obj * const * objv=0x00a30250, const char * command=0x00a980f8, int length=0x00000005, int flags=0x00000000) Line 3689 + 0x1d bytes C
tcl85tg.dll!TclEvalEx(Tcl_Interp * interp=0x00a2c4b0, const char * script=0x00a980b8, int numBytes=0x0000004a, int flags=0x00000000, int line=0x00000004, int * clNextOuter=0x00000000, const char * outerScript=0x00a980b8) Line 4387 + 0x21 bytes C
tcl85tg.dll!Tcl_EvalEx(Tcl_Interp * interp=0x00a2c4b0, const char * script=0x00a980b8, int numBytes=0x0000004a, int flags=0x00000000) Line 4043 + 0x1d bytes C
tcl85tg.dll!Tcl_FSEvalFileEx(Tcl_Interp * interp=0x00a2c4b0, Tcl_Obj * pathPtr=0x00a8b4d8, const char * encodingName=0x00000000) Line 1814 + 0x13 bytes C
tcl85tg.dll!Tcl_Main(int argc=0xffffffff, char * * argv=0x00a27f98, int (Tcl_Interp *)* appInitProc=0x004165d0) Line 441 + 0x11 bytes C
tcltest.exe!main(int argc=0x00000002, char * * argv=0x00a27f90) Line 102 + 0x15 bytes C
tcltest.exe!__tmainCRTStartup() Line 586 + 0x19 bytes C
tcltest.exe!mainCRTStartup() Line 403 C
kernel32.dll!7c817077()
[Frames below may be incorrect and/or missing, no symbols loaded for kernel32.dll]

Discussion

(Page 1 of 5)
  • Don Porter
    2009-11-04

    Is this a new failure? Since when?

     
    • labels: --> 10. Objects
    • assigned_to: nobody --> msofer
     
  • Cannot seem to repro on XPSP3, 8.5 HEAD.
    Please confirm procedure:
    (1) edit Makefile: COMPILE_DEBUG_FLAGS = -DTCL_MEM_DEBUG -DTCL_COMPILE_DEBUG -DTCL_COMPILE_STATS -DTCL_THREADS=1 -DUSE_THREAD_ALLOC=1
    (2) make clean;make;make test TESTFLAGS="-f encoding.test"

    You didn't run make genstubs, right?

    (As a side note, -DTCL_THREADS=1 is strange because the "=1" suggests the code tests it with #if, while in the code it is an #ifdef.)
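
To illustrate the side note about #if versus #ifdef, here is a minimal sketch. DEMO_THREADS is a hypothetical stand-in macro, not a real Tcl flag, and it is deliberately set to 0 to make the difference visible: #ifdef only asks whether the macro is defined, while #if evaluates its value.

```c
#include <assert.h>

/* Hypothetical flag, as if the compiler had been given -DDEMO_THREADS=0. */
#define DEMO_THREADS 0

/* Returns 1 when the #ifdef branch is compiled in. */
int DemoEnabledByIfdef(void)
{
#ifdef DEMO_THREADS
    return 1;   /* taken: the macro is defined, its value is ignored */
#else
    return 0;
#endif
}

/* Returns 1 when the #if branch is compiled in. */
int DemoEnabledByIf(void)
{
#if DEMO_THREADS
    return 1;
#else
    return 0;   /* taken: the macro's value is 0 */
#endif
}

/* Usage: both checks hold for the definition above. */
void DemoCheck(void)
{
    assert(DemoEnabledByIfdef() == 1);
    assert(DemoEnabledByIf() == 0);
}
```

So with an #ifdef test, "=1" and "=0" behave the same; only the presence of the definition matters.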

     
  • Cannot reproduce on OSX with 8.5 tip, compiled with gcc.

     
  • While I'd still like to reproduce the crash, I am seeing strange things.
    I added fprintf calls on entry to the encoding create and free routines, printing:

    encodingPtr (refcount) ProcessId ThreadId

    As you can see below, some encodings get freed again after their refcount has reached zero and they have already been freed for good (see the large refcounts).
    One nice thing is that all this is done from the same thread...

    ** NewEnc: 003E5ED0 / 3236 / 3844
    ** NewEnc: 003E5FB8 / 3236 / 3844
    ** NewEnc: 003E6E50 / 3236 / 3844
    **FreeEnc: 00000000(0) / 3236 / 3844
    **FreeEnc: 003E5FB8(2) / 3236 / 3844
    **FreeEnc: 003E4DE0(3) / 3236 / 3844
    ** NewEnc: 003E7E68 / 3236 / 3844
    **FreeEnc: 003E5ED0(2) / 3236 / 3844
    **FreeEnc: 003E4DE0(2) / 3236 / 3844
    **FreeEnc: 003E7E68(2) / 3236 / 3844
    **FreeEnc: 003E7E68(2) / 3236 / 3844
    **FreeEnc: 003E7E68(2) / 3236 / 3844
    **FreeEnc: 003E7E68(3) / 3236 / 3844
    **FreeEnc: 003E7E68(4) / 3236 / 3844
    **FreeEnc: 003E7E68(4) / 3236 / 3844
    ** NewEnc: 00AC78C8 / 3236 / 3844
    **FreeEnc: 003E5ED0(3) / 3236 / 3844
    **FreeEnc: 003E7E68(4) / 3236 / 3844
    ** NewEnc: 00AC7920 / 3236 / 3844
    **FreeEnc: 003E5ED0(3) / 3236 / 3844
    **FreeEnc: 003E7E68(4) / 3236 / 3844
    ** NewEnc: 00AE19D0 / 3236 / 3844
    **FreeEnc: 003E5ED0(3) / 3236 / 3844
    **FreeEnc: 003E7E68(4) / 3236 / 3844
    ** NewEnc: 00AE3000 / 3236 / 3844
    **FreeEnc: 003E5ED0(3) / 3236 / 3844
    **FreeEnc: 003E7E68(4) / 3236 / 3844
    ** NewEnc: 00AE2FA8 / 3236 / 3844
    **FreeEnc: 003E5ED0(3) / 3236 / 3844
    **FreeEnc: 003E7E68(4) / 3236 / 3844
    ** NewEnc: 00AE5C48 / 3236 / 3844
    **FreeEnc: 003E5ED0(3) / 3236 / 3844
    ** NewEnc: 00AE5BF0 / 3236 / 3844
    **FreeEnc: 003E5ED0(2) / 3236 / 3844
    **FreeEnc: 003E7E68(3) / 3236 / 3844
    **FreeEnc: 003E5FB8(3) / 3236 / 3844
    **FreeEnc: 003E7E68(2) / 3236 / 3844
    **FreeEnc: 00AC7920(1) / 3236 / 3844
    **FreeEnc: 00AE2FA8(1) / 3236 / 3844
    **FreeEnc: 003E7E68(1) / 3236 / 3844
    **FreeEnc: 00AE19D0(2) / 3236 / 3844
    **FreeEnc: 00AE19D0(1) / 3236 / 3844
    **FreeEnc: 00AE5BF0(1) / 3236 / 3844
    **FreeEnc: 00AC78C8(1) / 3236 / 3844
    **FreeEnc: 00AC7920(1633771873) / 3236 / 3844
    **FreeEnc: 00AE19D0(1633771873) / 3236 / 3844
    **FreeEnc: 00AE19D0(1633771872) / 3236 / 3844
    **FreeEnc: 00AE3000(1) / 3236 / 3844
    **FreeEnc: 00AE2FA8(1633771873) / 3236 / 3844
    **FreeEnc: 00AE5C48(1) / 3236 / 3844
    **FreeEnc: 003E4DE0(1) / 3236 / 3844
    **FreeEnc: 003E5ED0(1) / 3236 / 3844
    **FreeEnc: 003E5FB8(2) / 3236 / 3844
    **FreeEnc: 003E5FB8(1) / 3236 / 3844
    **FreeEnc: 003E6E50(1) / 3236 / 3844
    **FreeEnc: 003EC6C8(4) / 568 / 3212
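
A note on the large refcounts in the log above: 1633771873 is 0x61616161, that is, four bytes of 0x61 (ASCII 'a'), and 1633771872 is the same value after one decrement. This is what one would expect if the memory had been overwritten with a fill byte when it was freed. That the TCL_MEM_DEBUG allocator uses 0x61 as its fill byte is an assumption here, not something stated in this ticket. The arithmetic can be checked with a small sketch (the function names are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interprets four copies of a fill byte as a 32-bit integer.
 * Byte order does not matter because all four bytes are equal. */
int32_t DemoFillAsInt32(unsigned char fillByte)
{
    int32_t value;

    memset(&value, fillByte, sizeof value);
    return value;
}

/* Usage: reproduces the two suspicious values seen in the log. */
void DemoCheckFill(void)
{
    assert(DemoFillAsInt32(0x61) == 1633771873);      /* 0x61616161 */
    assert(DemoFillAsInt32(0x61) - 1 == 1633771872);  /* after one refcount-- */
}
```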

     
  • Even better, removing all five special -D flags, I still see frees below 1 (three 0s and one -1). So the bug is present in default builds too!

    Test file error: ** NewEnc: 003E46C8 / 3084 / 1148
    ** NewEnc: 003E5720 / 3084 / 1148
    ** NewEnc: 003E5770 / 3084 / 1148
    ** NewEnc: 003E64F0 / 3084 / 1148
    **FreeEnc: 00000000(0) / 3084 / 1148
    **FreeEnc: 003E5770(2) / 3084 / 1148
    **FreeEnc: 003E46C8(3) / 3084 / 1148
    ** NewEnc: 003E7750 / 3084 / 1148
    **FreeEnc: 003E5720(2) / 3084 / 1148
    **FreeEnc: 003E46C8(2) / 3084 / 1148
    **FreeEnc: 003E7750(2) / 3084 / 1148
    **FreeEnc: 003E7750(2) / 3084 / 1148
    **FreeEnc: 003E7750(2) / 3084 / 1148
    **FreeEnc: 003E7750(3) / 3084 / 1148
    **FreeEnc: 003E7750(4) / 3084 / 1148
    **FreeEnc: 003E7750(4) / 3084 / 1148
    ** NewEnc: 00AA64D8 / 3084 / 1148
    **FreeEnc: 003E5720(3) / 3084 / 1148
    **FreeEnc: 003E7750(4) / 3084 / 1148
    ** NewEnc: 00AA7280 / 3084 / 1148
    **FreeEnc: 003E5720(3) / 3084 / 1148
    **FreeEnc: 003E7750(4) / 3084 / 1148
    ** NewEnc: 00AA73D0 / 3084 / 1148
    **FreeEnc: 003E5720(3) / 3084 / 1148
    **FreeEnc: 003E7750(4) / 3084 / 1148
    ** NewEnc: 00AAEED8 / 3084 / 1148
    **FreeEnc: 003E5720(3) / 3084 / 1148
    **FreeEnc: 003E7750(4) / 3084 / 1148
    ** NewEnc: 00AA7428 / 3084 / 1148
    **FreeEnc: 003E5720(3) / 3084 / 1148
    **FreeEnc: 003E7750(4) / 3084 / 1148
    ** NewEnc: 00AAF5F0 / 3084 / 1148
    **FreeEnc: 003E5720(3) / 3084 / 1148
    ** NewEnc: 00AAF8E0 / 3084 / 1148
    **FreeEnc: 003E5720(2) / 3084 / 1148
    **FreeEnc: 003E7750(3) / 3084 / 1148
    **FreeEnc: 003E5770(3) / 3084 / 1148
    **FreeEnc: 003E7750(2) / 3084 / 1148
    **FreeEnc: 00AA7280(1) / 3084 / 1148
    **FreeEnc: 00AA7428(1) / 3084 / 1148
    **FreeEnc: 003E7750(1) / 3084 / 1148
    **FreeEnc: 00AA73D0(2) / 3084 / 1148
    **FreeEnc: 00AA73D0(1) / 3084 / 1148
    **FreeEnc: 00AAF8E0(1) / 3084 / 1148
    **FreeEnc: 00AA64D8(1) / 3084 / 1148
    **FreeEnc: 00AA7280(0) / 3084 / 1148
    **FreeEnc: 00AA73D0(0) / 3084 / 1148
    **FreeEnc: 00AA73D0(-1) / 3084 / 1148
    **FreeEnc: 00AAEED8(1) / 3084 / 1148
    **FreeEnc: 00AA7428(0) / 3084 / 1148
    **FreeEnc: 00AAF5F0(1) / 3084 / 1148
    **FreeEnc: 003E46C8(1) / 3084 / 1148
    **FreeEnc: 003E5720(1) / 3084 / 1148
    **FreeEnc: 003E5770(2) / 3084 / 1148
    **FreeEnc: 003E5770(1) / 3084 / 1148
    **FreeEnc: 003E64F0(1) / 3084 / 1148
    **FreeEnc: 003EBF00(4) / 3720 / 3812

     
  • Closing in.
    Breaking on entry to FreeEncoding with refcount==0, one sees the culprit is freeing the encoding from a subTablePtr. Now looking at the beginning of its lifecycle, we see in tclEncoding.c+3459:

    subTablePtr->encodingPtr = encodingPtr;

    without any ++ on the refcount !

     
  • Hum, red herring. Tcl_GetEncoding does the refcount++.
    Investigation goes on.

     
  • OK, several important things:
    - it is linked with the "jis0201" encoding appearing twice in the encoding hashtable + once as an escape encoding (subtable), but with total refcount 2 instead of 3.
    - adding an "if (refcount<=0) Tcl_Panic()" in FreeEncoding makes it reliably detectable.
    - it is reproducible even on unix, provided the system encoding is set to cp1252 on test entry
    Bottom line: please wait a bit before releasing 8.5.8, I think this bug is both terribly dangerous and rather easy to fix.

    - though the bug is mostly asymptomatic, it is there all the time
    - under harsher memory conditions it could explode in full po
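
The refcount guard mentioned in this comment can be sketched on a toy object. This is not Tcl's FreeEncoding; DemoRefCounted and DemoCheckedRelease are hypothetical names, and the sketch returns an error code where the comment proposes calling Tcl_Panic(), so that the behaviour can be observed without aborting.

```c
#include <assert.h>
#include <stddef.h>

/* Toy reference-counted object (not Tcl's Encoding struct). */
typedef struct DemoRefCounted {
    int refCount;
} DemoRefCounted;

/* Returns 0 on a valid release, -1 if the count was already <= 0.
 * The comment above suggests calling Tcl_Panic() in the -1 case. */
int DemoCheckedRelease(DemoRefCounted *obj)
{
    assert(obj != NULL);
    if (obj->refCount <= 0) {
        return -1;          /* over-release detected before any damage */
    }
    obj->refCount--;
    return 0;
}
```

With a check like this, an extra release is flagged at the moment it happens instead of only when freed memory happens to be reused.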

     
  • Got it now. Mechanism:
    - when an escape encoding is loaded, as a side effect its sub-encodings get entered into the encoding table with refcount 1.
    - at finalization time, the whole contents of the table are purged.
    - when an escape encoding is purged, the sub-encodings get purged recursively (EscapeFreeProc).
    - *if* the final (refcount==1) purge of a sub-encoding occurs *first* from within this sub scan, it gets properly removed from the toplevel table
    - *if* the (random) table hash order makes it so that the purge occurs *first* at toplevel, then the encoding gets nuked while still being referenced as a sub; its dangling pointer is soon dereferenced in the sub scan --> CRASH.

    Note that incr'ing refcount on sub creation is not an option, because we'd then leak a ref at toplevel, hence in a non-finalization case (freeing an isolated escape encoding), we'd leak the subs.

    This is all a case where something like weak references would help, but we have only strong ones here.

    A simpler fix should be to detect the finalization case and refrain from the sub-scan FreeEnc() loop in EscapeFreeProc.
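
The mechanism and the proposed fix can be sketched with a toy model. Everything here is hypothetical (DemoEncoding, DemoRelease, DemoFinalize); it is not the code in tclEncoding.c, only a two-entry "table" holding one escape encoding and one sub-encoding that the escape encoding points to without owning a counted reference. Instead of touching freed memory, the model counts releases that hit an already-freed entry.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only; not Tcl's real data structures. */
typedef struct DemoEncoding {
    int refCount;                /* only the table's reference is counted */
    int freed;                   /* set once the object is "freed" */
    struct DemoEncoding *sub;    /* uncounted pointer held by an escape encoding */
    struct DemoEncoding **slot;  /* this encoding's entry in the toy table */
} DemoEncoding;

int demoStaleReleases;           /* releases that hit an already-freed object */

void DemoRelease(DemoEncoding *enc, int skipSubScan)
{
    if (enc->freed) {
        demoStaleReleases++;     /* real code would touch freed memory here */
        return;
    }
    if (--enc->refCount > 0) {
        return;
    }
    if (enc->sub != NULL && !skipSubScan) {
        DemoRelease(enc->sub, skipSubScan);  /* the recursive sub scan */
    }
    enc->freed = 1;
    *enc->slot = NULL;           /* remove from the toplevel table */
}

/* Purges the table; subFirst picks the scan order, skipSubScan applies the
 * proposed fix. Returns the number of stale releases observed. */
int DemoFinalize(int subFirst, int skipSubScan)
{
    DemoEncoding *table[2];
    DemoEncoding sub, esc;
    int i;

    sub.refCount = 1; sub.freed = 0; sub.sub = NULL;
    esc.refCount = 1; esc.freed = 0; esc.sub = &sub;
    table[0] = subFirst ? &sub : &esc;
    table[1] = subFirst ? &esc : &sub;
    sub.slot = subFirst ? &table[0] : &table[1];
    esc.slot = subFirst ? &table[1] : &table[0];

    demoStaleReleases = 0;
    for (i = 0; i < 2; i++) {
        if (table[i] != NULL) {
            DemoRelease(table[i], skipSubScan);
        }
    }
    return demoStaleReleases;
}
```

In this model, DemoFinalize(0, 0) (escape encoding purged first) sees no stale release, DemoFinalize(1, 0) (sub-encoding purged first) sees one, and with the sub scan skipped during finalization both orders are clean because the table purge frees every entry on its own.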

     