Menu

#4715 breakage on head Windows triggered by install-tzdata

obsolete: 8.6b1.1
closed-fixed
8
2010-09-21
2010-09-17
No

Previously recorded at 3065455, we have found that there is a general issue newly introduced on head on Windows, I suspect related to the -DUNICODE work.

You can trigger this bug with install-tzdata, but only when you use a prefix path of sufficient length. This one in particular worked for me:

prefix = C:/Tcl-test/we-want-a-long-path/and-this-makes-it-longer/dont-you-know-longer-still/yes-this-will-be-long-enough
exec_prefix = C:/Tcl-test/we-want-a-long-path/and-this-makes-it-longer/dont-you-know-longer-still/yes-this-will-be-long-enough
bindir = ${exec_prefix}/bin
libdir = C:/Tcl-test/we-want-a-long-path/and-this-makes-it-longer/dont-you-know-longer-still/yes-this-will-be-long-enough/lib

aka --prefix=C:/Tcl-test/we-want-a-long-path/and-this-makes-it-longer/dont-you-know-longer-still/yes-this-will-be-long-enough when building.

So what's happening here? I believe something in the switch from regular to -DUNICODE mode is now overflowing some aspect of a dstring (they have static space of 200b, and must grow otherwise). Or perhaps a buffer overrun where 1b was appended where 1*sizeof(WCHAR) is needed now. In any case, the crash symptoms indicate dstring related smashing, which is of course used all over Windows file handling. The -DUNICODE will have distinctly affected this behavior as well.

Discussion

  • Jeffrey Hobbs

    Jeffrey Hobbs - 2010-09-17

    One crash stack, which crashed in a mem debug build with "unable to alloc 306 bytes, z:/cvs/tcl/tcl8.6/win/tclWinFile.c line 3232". Of course I have enough mem, but some mem smashing must be occurring.

    tcl86g.dll!Tcl_DbCkalloc(unsigned int size=306, const char * file=0x1018425c, int line=3232) Line 393 + 0x16 bytes C
    > tcl86g.dll!TclNativeCreateNativeRep(Tcl_Obj * pathPtr=0x023d6310) Line 3232 + 0x13 bytes C
    tcl86g.dll!Tcl_FSGetInternalRep(Tcl_Obj * pathPtr=0x023d6310, const Tcl_Filesystem * fsPtr=0x10132c38) Line 2160 + 0x7 bytes C
    tcl86g.dll!Tcl_FSGetNativePath(Tcl_Obj * pathPtr=0x023d6310) Line 4571 + 0xe bytes C
    tcl86g.dll!TclpObjStat(Tcl_Obj * pathPtr=0x023d6310, _stat64 * statPtr=0x0017ebd4) Line 1985 + 0xf bytes C
    tcl86g.dll!Tcl_FSStat(Tcl_Obj * pathPtr=0x023d6310, _stat64 * buf=0x0017ebd4) Line 2034 + 0x10 bytes C
    tcl86g.dll!FileCopyRename(Tcl_Interp * interp=0x023004d8, int objc=5, Tcl_Obj * const * objv=0x023015e8, int copyFlag=1) Line 147 + 0xd bytes C

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2010-09-17

    BTW, this is heisenbug like, and does need multiple runs to confirm.

     
  • Andreas Kupries

    Andreas Kupries - 2010-09-17

    Trace of DString activity for a particular dstring adress.

     
  • Andreas Kupries

    Andreas Kupries - 2010-09-17

    After rewriting the DString functions to have Debug variants (like we have for Tcl_Bj machinery) I got a good trace of DString actions. The attached log is actually a reduced form which looks only actions on the DString which crashed with high guard error in DStringFree.

    Investigating the locations recorded it seems that the crash is in
    "TclpObjRemoveDirectory" lines 1007-1014, with "TraverseWinTree" (*)
    the main suspect doing the mem smash. The DString is the native
    representation of some path.

    Call sequence
    TclpObjRemoveDirectory
    --> DoRemoveDirectory (1)
    --> TraverseWinTree

    The "DoRemoveJustDirectory" called by (1) seems to only read
    this DString (sees it only as a TCHAR* pointer).

    Regardless, this seems to be the area to look into.

     
  • Jan Nijtmans

    Jan Nijtmans - 2010-09-18
    • status: open --> open-fixed
     
  • Jan Nijtmans

    Jan Nijtmans - 2010-09-18

    I suspect the problematic line is:
    nativeSource[oldSourceLen + 1] = '\0';
    because if nativeSource is a TCHAR, then
    this does something different than expected.

    However, in order to quickly fix the HEAD
    for now just compile this file without -DUNCODE
    (there are more files like that, which still need
    to be investigated, so that doesn' t hurt)

    Therefore, keeping open until it is fully fixed.

     
  • Jan Nijtmans

    Jan Nijtmans - 2010-09-21

    proposed fix

     
  • Jan Nijtmans

    Jan Nijtmans - 2010-09-21

    I worked through the code, trying to find locations where UNICODE
    strings were appended to a buffer, trying to let is end with a double
    null-byte. In most places, this is done by using
    Tcl_DStringSetLength(.... , len+1);
    Tcl_DStringSetLength(.... , len);
    The first call allocates enough space to accomodate for the extra
    null byte, The second call sets the length correctly, and puts
    the principal null-byte there.

    However, there were two places where it was done differently.
    In tclWinFCmd.c, line 1307:
    nativeSource[oldSourceLen + 1] = '\0';
    If oldSourceLen is the string length in bytes, and
    nativeSource is a WCHAR array, this will set two
    null-bytes in the string, but at a double index
    of the intended location. The UNICODE string
    will not be null-terminated, which explains
    exactly Andreas' stack trace.

    Another location was at tclWinPipe.c, line 3136:
    *(WCHAR *)(namePtr + Tcl_DStringLength(&buf) + 1) = '\0';
    Not that bad, but casting a char * to a WCHAR *, while
    not even being sure that the pointer is aligned to
    a two-byte boundry, is clearly not correct.

    Here is the patch, which changed both locations to
    use the double Tcl_DStringSetLength method. This
    should - once and for all - fix this.

    Andreas (or Jeff), could you please confirm that this
    fixes the bug? Thanks!

    Originally, I intended to modify tclUtil.c such that
    on Windows two null-bytes are always guaranteed.
    However, there is a lot of code which assumes
    insertion of a single null-byte, so this lead to
    test failures.

     
  • Jan Nijtmans

    Jan Nijtmans - 2010-09-21
    • assigned_to: nijtmans --> andreas_kupries
     
  • Jan Nijtmans

    Jan Nijtmans - 2010-09-21
    • assigned_to: andreas_kupries --> nijtmans
    • status: open-fixed --> closed-fixed
     
  • Jan Nijtmans

    Jan Nijtmans - 2010-09-21

    Fixed on HEAD. Thanks for the big help, Andreas!