#2818 Threaded Tcl calls non-thread-safe library functions

obsolete: 8.4.4
closed-fixed
7
2007-02-08
2004-07-28
Rob Crittenden
No

This thread-safety issue is well documented in this closed bug:
http://sourceforge.net/tracker/?group_id=10894&atid=110894&func=detail&aid=217833

Tcl calls at least the following non-thread-safe functions:

getgrnam()
getpwnam()
getpwuid()
gethostbyaddr()
gethostbyname()
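
For illustration, here is a minimal sketch of what a reentrant replacement for one of these calls can look like. It assumes the POSIX form of getpwnam_r() (older Solaris releases also shipped a different, non-POSIX signature). The helper name lookup_uid and the 4096-byte buffer size are made up for this example and are not taken from the Tcl sources.

```c
#define _POSIX_C_SOURCE 200809L   /* request the POSIX declarations */
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * lookup_uid is a hypothetical helper, not a Tcl function.
 * Returns 1 and stores the uid if `name` exists, 0 otherwise.
 */
int lookup_uid(const char *name, uid_t *uid_out)
{
    struct passwd pwd;             /* caller-owned result record */
    struct passwd *result = NULL;
    char buf[4096];                /* caller-owned string storage; size is a guess */

    /* The record and its strings live in our own buffers, not in
     * static storage shared with every other thread. */
    int rc = getpwnam_r(name, &pwd, buf, sizeof buf, &result);
    if (rc != 0 || result == NULL) {
        return 0;                  /* lookup error or no such user */
    }
    *uid_out = result->pw_uid;
    return 1;
}
```

A real caller would probably retry with a larger buffer when getpwnam_r() reports ERANGE; that is left out here to keep the sketch short.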

A stack trace from AOLserver 4.0.1 with Tcl 8.4.4
running on Solaris 8 that crashed calling gethostbyname is:

----------------- lwp# 281 / thread# 280 --------------------
fefbb2c8 __inet_address_is_local_af (5f75d18, 2, 0, 2, 2ee4f, c5189ad9) + 108
fefba358 order_haddrlist_af (ff02263c, ffff, 2, 2edd8, 2, 4b9eae88) + 274
fef9b5f8 _get_hostserv_inetnetdir_byname (2edd8, c5189be0, 0, c5189be0, ff01ef28, ffffffff) + 830
fefb54c0 gethostbyname_r (4ba27760, 2edbc, 2edd0, 920, 2edbc, ff01ef28) + a0
ff34c944 GetAddr (0, c518a08c, c5189cd0, c518a890, 3226db80, 0) + 28
ff34c734 DnsGet (0, c5189db8, 3ae558, c518a08c, 3226db80, 1) + 134
----------------- lwp# 280 / thread# 279 --------------------
ff11f20c _lwp_sema_wait (c5d8be60, fef6c000, 0, c5d8bd98, 0, 0) + c
fef490d8 _swtch (c5d8bd98, 0, fef6c000, 5, 1000, 0) + 158
fef482ec _cond_wait (c5d8bd98, 4356, fef6c000, fef6e968, ff022668, 10) + d4
fef4c164 rw_rdlock (fef7785c, 5000, fef6c000, 5257, ff01ef28, ff022668) + d8
fefba1dc order_haddrlist_af (ff02263c, ff022648, 2, 2ede4, 2, 2edbc) + f8
fef9b5f8 _get_hostserv_inetnetdir_byname (2ede4, c5d881d8, 0, c5d881d8, ff01ef28, ffffffff) + 830
fefb54c0 gethostbyname_r (475a17d0, 2edbc, 2edd0, 920, 2edbc, ff01ef28) + a0
ff291a18 CreateSocketAddress (c5d883b8, 13f97230, c5d88264, ff13c000, 33fb4a02, 0) + 58
ff291704 CreateSocket (3be6eec0, 0, ffffffff, 0, 0, 0) + 20
ff291ac8 Tcl_OpenTcpClient (3be6eec0, 50, 13f97230, 0, 0, 0) + 2c

Don't be confused by the reference to gethostbyname_r.
gethostbyname apparently calls gethostbyname_r using
the same shared return buffer.
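
The hazard of a shared return buffer can be shown with a toy example. This is only an illustration of the general pattern, not the Solaris libc implementation; unsafe_name and safe_name are hypothetical stand-ins for a legacy call and its reentrant counterpart.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a legacy resolver call: every caller gets
 * a pointer into the same static storage. */
const char *unsafe_name(int id)
{
    static char shared[32];
    snprintf(shared, sizeof shared, "host-%d", id);
    return shared;
}

/* Reentrant-style variant: the caller supplies the storage, so two
 * callers never write into the same bytes. */
const char *safe_name(int id, char *buf, size_t buflen)
{
    snprintf(buf, buflen, "host-%d", id);
    return buf;
}
```

Two calls to unsafe_name() return the same pointer, so the second call overwrites the first result even in a single thread. With two threads the overwrite can happen while the first caller is still reading the buffer, which matches the kind of crash seen in the trace above.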

Discussion

    • assigned_to: andreas_kupries --> vasiljevic
     
  • Logged In: YES
    user_id=75003

    I am not sure why assigned to me. Giving to Zoran for comments.

     
  • Logged In: YES
    user_id=95086

    Took me quite some time to get rid of those :-(

    Anyways... this is now done in core-8-4-branch
    and head branch. I'm leaving the ticket open for
    some weeks now and unless we find some other
    non-mt-safe call in Tcl core, I will close it.

     
  • Don Porter
    2006-09-07

    Logged In: YES
    user_id=80530

    The latest sources crash frequently
    on both HEAD and the 8.4 branch on
    Solaris 8 in the --disable-threads
    configuration when compiled with
    Forte Developer 7 C 5.4 2002/03/09.

    First example:

    % info patch
    8.4.14
    % file attributes ~
    Bus Error

     
  • Don Porter
    2006-09-07

    • priority: 5 --> 7
     
  • Don Porter
    2006-09-07

    Logged In: YES
    user_id=80530

    The --enable-threads build is
    much better. Just a couple
    failed tests:

    ==== unixFCmd-15.1 SetGroupAttribute - invalid group FAILED
    ==== Contents of test case:

    catch {file delete -force -- foo.test}
    list [catch {file attributes foo.test -group foozzz} msg] $msg [file delete -force -- foo.test]

    ---- Result was:
    1 {could not set group for file "foo.test": no such file or directory} {}
    ---- Result should have been (exact matching):
    1 {could not set group for file "foo.test": group "foozzz" does not exist} {}
    ==== unixFCmd-15.1 FAILED

    ==== unixFCmd-16.3 SetOwnerAttribute - invalid owner FAILED
    ==== Contents of test case:

    catch {file delete -force -- foo.test}
    list [catch {file attributes foo.test -owner foozzz} msg] $msg

    ---- Result was:
    1 {could not set owner for file "foo.test": no such file or directory}
    ---- Result should have been (exact matching):
    1 {could not set owner for file "foo.test": user "foozzz" does not exist}
    ==== unixFCmd-16.3 FAILED

     
  • Logged In: YES
    user_id=95086

    There you go...

    Well, I have only GCC and this revealed no problems:

    ./tclsh
    % info patch
    8.4.14
    % file attributes ~
    -group develop -owner zoran -permissions 040755

    This is most odd.
    Do you have a chance to compile with GCC just to
    see if this makes any difference?

     
  • Don Porter
    2006-09-07

    Logged In: YES
    user_id=80530

    Same failures using gcc 3.2.3.

     
  • Logged In: YES
    user_id=95086

    Even weirder. On this machine

    bash-2.03$ uname -a
    SunOS Develop 5.8 Generic_117350-20 sun4u sparc SUNW,Sun-Blade-2500 Solaris
    bash-2.03$ gcc -v
    Reading specs from /usr/local/lib/gcc-lib/sparc-sun-solaris2.8/3.3.2/specs
    Configured with: ../configure --with-as=/usr/ccs/bin/as --with-ld=/usr/ccs/bin/ld --disable-nls --disable-libgcj --enable-languages=c,c++
    Thread model: posix
    gcc version 3.3.2

    everything works fine for me.

    We will have to see what's up there...

    I will shortly check in a tclUnixCompat.c that falls back to the original
    mt-unsafe calls when TCL_THREADS is not defined. But it would
    help me if you could isolate a single test and run it under a debugger
    so we see at which point it breaks. Can you do that?

     
  • Don Porter
    2006-09-07

    Logged In: YES
    user_id=80530

    gdb is not on this system.

    trying (unfamiliar) dbx, I
    see a suggestion the issue
    is alignment?

    % file attributes ~
    signal BUS (invalid address alignment) in CopyArray Symbol
    *0xc23588

     
  • Logged In: YES
    user_id=95086

    Ah! All right. Yes, this is the alignment problem. Let me see if
    I can fix this one blindly (I think I can).
    Also, I will add a !defined(TCL_THREADS) branch which
    simply falls back to the mt-unsafe call. This will take care
    of the issue more "naturally".
    Nevertheless it is good to know that there is an alignment problem
    there, which I will fix now.
    When this is done, we must look at the other errors you mentioned
    which appear in the threaded build. Hmhm...
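
The fallback described above could be sketched roughly as follows. This is an assumption about the general shape only, not the code that went into tclUnixCompat.c: lookup_gid is a hypothetical name, the buffer size is a guess, and the POSIX form of getgrnam_r() is assumed.

```c
#define _POSIX_C_SOURCE 200809L   /* request the POSIX declarations */
#include <grp.h>
#include <stddef.h>
#include <sys/types.h>

/*
 * lookup_gid is a hypothetical helper, not the tclUnixCompat.c API.
 * Returns 1 and stores the gid if the group exists, 0 otherwise.
 */
int lookup_gid(const char *name, gid_t *gid_out)
{
#if defined(TCL_THREADS)
    /* Threaded build: use the reentrant call with caller-owned storage. */
    struct group grp;
    struct group *result = NULL;
    char buf[4096];                /* size is a guess for this sketch */

    if (getgrnam_r(name, &grp, buf, sizeof buf, &result) != 0
            || result == NULL) {
        return 0;
    }
#else
    /* Non-threaded build: the classic mt-unsafe call is good enough,
     * since no other thread can clobber its static result. */
    struct group *result = getgrnam(name);

    if (result == NULL) {
        return 0;
    }
#endif
    *gid_out = result->gr_gid;
    return 1;
}
```

Keeping both variants behind one function means callers do not need their own #if blocks, and the non-threaded build never touches the buffer-copying code at all.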

     
  • Don Porter
    2006-09-07

    Logged In: YES
    user_id=80530

    The functioning of CopyArray()
    is not immediately clear to me
    (Comments please!)... but it
    looks like an array of (char)
    is getting cast to an array
    of (char *) ? That does seem
    like an operation that's perilous
    when it comes to alignment matters.

     
  • Logged In: YES
    user_id=95086

    Well, it says in the code:

    /*
     *---------------------------------------------------------------------------
     *
     * CopyArray --
     *
     *      Copies array of NULL-terminated or fixed-length strings
     *      to the private buffer, honouring the size of the buffer.
     *
     * Results:
     *      Number of bytes copied on success or -1 on error (errno = ERANGE)
     *
     * Side effects:
     *      None.
     *
     *---------------------------------------------------------------------------
     */

    This is basically copying the

    src[0] .... src[n]

    to a user-passed buffer. The buffer stores the copy of the
    array and following it the strings referenced by the array.

    I have looked and still do not see the problem.
    There is one assignment there

    len = (sizeof(char *)*(i + 1)); /* Leave place for the array */
    new = (char **)buf;

    But this should be innocent really...

     
  • Logged In: YES
    user_id=95086

    Damn! I'm pretty blind!

    Replace (in CopyArray)

    new = (char **)buf;

    with

    new = &buf;

    and recompile.

     
  • Logged In: YES
    user_id=95086

    But this still does not work everywhere...
    Let me study this in more detail... I will take care of that
    within the next half hour.

     
  • Logged In: YES
    user_id=95086

    All right. This should be all fixed now.
    It was indeed an alignment problem of char* arrays.
    I believe both threaded and non-threaded tests should
    pass now in both 8.4 and 8.5 branches.
    If not, you will tell me in time ;-)
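
To make the alignment issue concrete, here is a minimal sketch of copying a NULL-terminated string array into a caller-supplied byte buffer while keeping the pointer slots aligned. It is not the CopyArray code from tclUnixCompat.c: the function name copy_string_array, its signature, and the padding approach are assumptions made for this example, and it handles only NULL-terminated strings.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * copy_string_array is a hypothetical sketch. Layout written into buf:
 *   [padding][char* slots, NULL-terminated][string bytes ...]
 * Returns the number of bytes used (padding included), or -1 with
 * errno = ERANGE if buf is too small.
 */
ptrdiff_t copy_string_array(char *const *src, char *buf, size_t buflen)
{
    size_t n = 0, misalign, pad, used, i;
    char **slots;
    char *strings;

    while (src[n] != NULL) {
        n++;
    }

    /* Skip bytes until the slot area starts on a char* boundary.
     * Storing a char* at an unaligned address raises SIGBUS on
     * strict-alignment CPUs such as SPARC. */
    misalign = (size_t)((uintptr_t)buf % sizeof(char *));
    pad = misalign ? sizeof(char *) - misalign : 0;

    used = pad + (n + 1) * sizeof(char *);   /* slots plus terminator */
    if (used > buflen) {
        errno = ERANGE;
        return -1;
    }

    slots = (char **)(void *)(buf + pad);
    strings = buf + used;

    for (i = 0; i < n; i++) {
        size_t len = strlen(src[i]) + 1;     /* include the '\0' */
        if (len > buflen - used) {
            errno = ERANGE;
            return -1;
        }
        memcpy(strings, src[i], len);
        slots[i] = strings;                  /* slot points at the copy */
        strings += len;
        used += len;
    }
    slots[n] = NULL;
    return (ptrdiff_t)used;
}
```

A plain cast such as (char **)buf is only safe when buf already happens to be pointer-aligned, which a position in the middle of a char buffer does not guarantee. That would explain why the same code ran on some builds and failed with a bus error on others.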

     
  • Don Porter
    2006-09-07

    Logged In: YES
    user_id=80530

    All my issues are now resolved.

     
  • Jeffrey Hobbs
    2007-02-08

    • status: open --> closed-fixed