#4770 Stack fault due to off-by-one in GrowEvaluationStack

obsolete: 8.6b1
closed-fixed
miguel sofer
7
2011-04-12
2010-12-22
brad harder
No

lsort -integer violates memory allocation when Tcl is compiled w/ TCL_MEM_DEBUG. This violation was introduced between 20Aug2009 and 21Aug2009 in CVS repo.

Discussion

1 2 > >> (Page 1 of 2)
  • brad harder
    brad harder
    2010-12-22

    Example code that will trigger exception

     
    Attachments
  • Jeffrey Hobbs
    Jeffrey Hobbs
    2010-12-22

    • priority: 5 --> 8
     
  • Jeffrey Hobbs
    Jeffrey Hobbs
    2010-12-22

    Possibly related to the Tcl_LsortObjCmd change from ckalloc to stackalloc?

     
  • brad harder
    brad harder
    2010-12-23

    see attached patch-aa for a minimal (absolute minimal?) patch against 14 Dec code that appears to solve this specific issue (as hinted by jeff hobbs).

     
  • brad harder
    brad harder
    2010-12-23

    Can confirm patch applies cleanly against CVS co of 23Dec2010 as well, and that Tcl 8.6 build passes test.

     
  • I don't see any problem when building on OSX. Do you have a stack trace?

     
  • BTW, your patch is definitely not minimal. It touches unrelated code (the bits relating to lines 1312 to 1346 are nothing to do with [lsort]).

     
  • Jeffrey Hobbs
    Jeffrey Hobbs
    2010-12-23

    Donal - you have to compile with --enable-symbols=all (IOW -DTCL_MEM_DEBUG) to ensure the crash. Brad used NetBSD amd64 and I used SuSE Linux-64 to confirm.

     
  • brad harder
    brad harder
    2010-12-23

    Hi Donal -- indeed, my first patch was poor... I've reviewed and submitted 'patch-aa.v2' which is much more specific in the code it targets (strictly [lsort]) and does pass the case that was previously faulting. This is _not_ to say that [lindex] etc won't be subject to this same class of bug, but I won't speculate with my patch.

     
  • I usually build with --enable-symbols=mem (so I can run memleak tests) and wasn't seeing it, and even with --enable-symbols=all in a clean build I still don't see it.

     
  • I'd like to emphasize that I don't mean to be difficult. I suspect that there may be some weird problem in the code to handle the temporary memory handling in the lsort command. But I can't reproduce for some reason; I need the stack trace from the crash and can't make it myself. :-(

    There's too many things that are trickier than they look in this code for me to hunt without at least a copy of a picture of the smoking gun available to me. :-/

     
  • Could you retest with the HEAD? I fixed another bug that I found when studying what this could possibly be, so maybe that will resolve this too (as I greatly simplified the memory management logic in the process; the handling of -index was unnecessarily over-clever before).

     
  • brad harder
    brad harder
    2010-12-27

    Tested, still failed. I've updated the patch that solves this particular ticket to work w/ your latest changes. (patch-aa.v3)

     
  • brad harder
    brad harder
    2010-12-27

    applies against CVS head of 27Dec, 2010

     
    Attachments
  • Finally managed to reproduce; it's a 64-bit only bug.

     
  • fails on iteration 1334; stack trace follows:

    #0 0x00007fff83b9ff16 in __kill ()
    #1 0x00007fff83c10f6d in abort ()
    #2 0x00000001000ffc9c in Tcl_PanicVA (format=0x100167bd8 "Memory validation failure", argList=0x7fff5fbfeab0) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclPanic.c:118
    #3 0x00000001000ffd7b in Tcl_Panic (format=0x100167bd8 "Memory validation failure") at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclPanic.c:147
    #4 0x0000000100029d2f in ValidateMemory (memHeaderP=0x100247000, file=0x10016c260 "/Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c", line=910, nukeGuards=1) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclCkalloc.c:270
    #5 0x000000010002a642 in Tcl_DbCkfree (ptr=0x100247038 "8?\"", file=0x10016c260 "/Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c", line=910) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclCkalloc.c:612
    #6 0x000000010009fc79 in DeleteExecStack (esPtr=0x100247038) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c:910
    #7 0x000000010009ff91 in GrowEvaluationStack (eePtr=0x1003067b8, growth=4002, move=0) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c:1092
    #8 0x00000001000a01a6 in StackAllocWords (interp=0x100809e38, numWords=4002) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c:1179
    #9 0x00000001000a03d2 in TclStackAlloc (interp=0x100809e38, numBytes=32016) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c:1263
    #10 0x0000000100039605 in Tcl_LsortObjCmd (clientData=0x0, interp=0x100809e38, objc=3, objv=0x10022d488) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclCmdIL.c:3907
    #11 0x000000010001d5a2 in NRRunObjProc (data=0x100458390, interp=0x100809e38, result=0) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclBasic.c:4385
    #12 0x000000010001d341 in TclNRRunCallbacks (interp=0x100809e38, result=0, rootPtr=0x1003b58a8) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclBasic.c:4332
    #13 0x000000010001fab9 in TclEvalObjEx (interp=0x100809e38, objPtr=0x6, flags=0, invoker=0x0, word=0) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclBasic.c:5907
    #14 0x000000010001fa5d in Tcl_EvalObjEx (interp=0x100809e38, objPtr=0x6, flags=0) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclBasic.c:5888
    #15 0x0000000100041fa1 in Tcl_TimeObjCmd (dummy=0x0, interp=0x100809e38, objc=2, objv=0x10022d1b8) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclCmdMZ.c:4089
    #16 0x000000010001d5a2 in NRRunObjProc (data=0x1003c1d80, interp=0x100809e38, result=0) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclBasic.c:4385
    #17 0x000000010001d341 in TclNRRunCallbacks (interp=0x100809e38, result=0, rootPtr=0x0) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclBasic.c:4332
    #18 0x000000010001fab9 in TclEvalObjEx (interp=0x100809e38, objPtr=0x6, flags=131072, invoker=0x0, word=0) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclBasic.c:5907
    #19 0x000000010001fa5d in Tcl_EvalObjEx (interp=0x100809e38, objPtr=0x6, flags=131072) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclBasic.c:5888
    #20 0x00000001000c6ec9 in Tcl_RecordAndEvalObj (interp=0x100809e38, cmdPtr=0x1003a42c8, flags=131072) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclHistory.c:192
    #21 0x00000001000f35bf in Tcl_MainEx (argc=-1, argv=0x7fff5fbff608, appInitProc=0x1000029f7 <Tcl_AppInit>, interp=0x100809e38) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclMain.c:518
    #22 0x00000001000f3a39 in Tcl_Main (argc=1, argv=0x7fff5fbff600, appInitProc=0x1000029f7 <Tcl_AppInit>) at /Users/dkf/Documents/software/tcl8.6-commits/generic/tclMain.c:667
    #23 0x00000001000029f0 in main (argc=1, argv=0x7fff5fbff600) at /Users/dkf/Documents/software/tcl8.6-commits/unix/tclAppInit.c:80

     
  • I'm guessing the issue is the need to increase the size of Tcl stack space (which appears to have hit a 32k boundary).

     
    • assigned_to: dkf --> msofer
     
  • On the basis of that, I suspect it's not a problem with [lsort] (which now has very conventional memory handling) but rather elsewhere; as such, the suggested patch is just papering over the real problem, not fixing it. Assigning to Miguel for comment.

     
  • (and students of stack traces will see that I have added [time] to it; it appears to be a by-stander in this)

     
  • Actual crash message:

    hi guard byte 0 is 0x0
    hi guard byte 1 is 0x0
    hi guard byte 2 is 0x0
    hi guard byte 3 is 0x0
    hi guard byte 4 is 0x0
    hi guard byte 5 is 0x0
    hi guard byte 6 is 0x0
    hi guard byte 7 is 0x0
    total mallocs 68005
    total frees 55127
    current packets allocated 12878
    current bytes allocated 645210
    maximum packets allocated 12885
    maximum bytes allocated 655885
    high guard failed at 100247038, /Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c 910
    32040 bytes allocated at (/Users/dkf/Documents/software/tcl8.6-commits/generic/tclExecute.c 1110)
    Memory validation failure

     
    • assigned_to: msofer --> dkf
     
    • assigned_to: dkf --> msofer
     
  • miguel sofer
    miguel sofer
    2010-12-30

    No 64bit platform handy to repro. Any chance to run under valgrind a -DPURIFY build? It should have symbols but no mem debug ...

     
  • Not in any config I have; my toolchain is somewhat obsolete now.

     
1 2 > >> (Page 1 of 2)