
#2900 TCL crashes if the application runs for a long time

final: 8.3.5
closed-invalid
5
2004-09-28
2004-09-28
Anonymous
No

OS Platform and Version :
W2K

Problem Behaviour:
We have an application built using Tcl/Tk 8.3.5. The
application is typically used for long periods (say
12 to 60 hours). During such prolonged use the
application crashes and a Dr. Watson dump is generated
(the log is attached). From a first analysis of the log I
could figure out that the crash occurred in the
function TclpStrftime, which is typically invoked by
the Tcl command [clock clicks -milliseconds]. The crash
does not happen when the tool is used only for a short
time.

Expected Behaviour:
The application should not crash.

Contact email id : murali.venkat@siemens.com

Discussion

  • Nobody/Anonymous

    Dr.Watson log

     
  • Kevin B KENNY

    Kevin B KENNY - 2004-09-28
    • labels: 105650 --> 01. Notifier
    • status: open --> closed-invalid
     
  • Kevin B KENNY

    Kevin B KENNY - 2004-09-28

    Logged In: YES
    user_id=99768

    There are multiple sources of confusion in this bug report;
    let me try to untangle a few of them, or else the
    explanation will appear wholly unrelated.

    First, despite the indications in DrWatson.log, the crash
    did *not* occur in or around TclpStrftime. Rather,
    TclpStrftime was the last exported name before the code in
    question. (This fact is not surprising; it's the last
    exported name in the Tcl library.)

    The code that faulted was, in actuality, a bit of
    generated code, in another segment, that handles probing the
    large activation record of TclRegExec (the 'exec' function
    in generic/regexec.c). The stack probes went below the base
    of the stack segment at 0x34000 and faulted. This is the
    usual behaviour of most software confronted with a stack
    overflow.

    Tcl contains logic to make stack overflows more benign, in
    the function TclpCheckStackSpace in win/tclWin32Dll.c.
    Unfortunately, in the release you're using, the stack
    commitment that TclpCheckStackSpace imposes is not enough to
    handle the demands of TclRegExec (whose activation record is
    extremely large). This problem is fixed in release 8.4.7;
    see

    http://sourceforge.net/tracker/index.php?func=detail&aid=947070&group_id=10894&atid=110894
    and

    http://cvs.sourceforge.net/viewcvs.py/tcl/tcl/win/tclWinInt.h?r1=1.20.2.2&r2=1.20.2.3
    for the details.
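
    As a rough illustration of what a stack-space check does, the
    sketch below estimates how much stack is in use by comparing
    the address of a local variable with one recorded near the top
    of the thread, and refuses to go deeper once a budget would be
    exceeded. This is not the code of TclpCheckStackSpace; every
    name in it (record_stack_base, stack_bytes_used,
    have_stack_space, probe_depth) and the budget figures are
    assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; this is NOT Tcl's TclpCheckStackSpace. */

static uintptr_t stack_base;   /* address of a local near the outermost frame */

/* Call once near the top of the thread, for example early in main(). */
void record_stack_base(void)
{
    volatile char marker = 0;
    stack_base = (uintptr_t)&marker;
}

/* Approximate number of stack bytes in use, whichever way the stack grows. */
size_t stack_bytes_used(void)
{
    volatile char marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    return (size_t)(here < stack_base ? stack_base - here : here - stack_base);
}

/* Nonzero if `needed` more bytes still fit inside the assumed `budget`. */
int have_stack_space(size_t budget, size_t needed)
{
    return stack_bytes_used() + needed <= budget;
}

/*
 * Recurse with a frame of at least 4 KiB until the check refuses (or
 * max_depth is hit), then report the depth that was reached. A caller with
 * a very large activation record would ask for a correspondingly large
 * `needed` value before being entered.
 */
int probe_depth(int depth, int max_depth, size_t budget, size_t needed)
{
    volatile char pad[4096];
    int reached;

    assert(depth >= 0);
    pad[0] = 0;
    if (depth >= max_depth || !have_stack_space(budget, needed)) {
        return depth;          /* report an error here instead of crashing */
    }
    reached = probe_depth(depth + 1, max_depth, budget, needed);
    pad[0] = (char)reached;    /* touch the frame after the call: no tail call */
    return reached;
}
```

    With an assumed 64 KiB budget and 8 KiB requested per call,
    probe_depth(0, 1000, 64 * 1024, 8 * 1024) stops after a
    handful of levels rather than running to max_depth. If the
    amount requested is smaller than the callee's real frame, the
    check passes and the callee still overflows, which is the
    shortfall described above.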

    The bad news is that this change will, in the log that I'm
    seeing, apparently just convert a crash to a Tcl error; the
    stack will still have overflowed. Unlike the case with most
    stack overflows, I'm not seeing tremendously deep recursive
    invocations of Tcl code. Rather, I'm observing that there's
    an unusually large amount of stack in use prior to a call to
    Tcl_DoOneEvent in or near a procedure named Q_Init (which is
    not part of Tcl, so I can't comment on it). I can see
    several possibilities here:

    (1) It's possible that Q_Init (or something called from it)
    is leaking memory that is allocated with the 'alloca'
    library call; 'alloca' allocates memory by expanding the
    activation record. Eventually, there isn't enough stack
    space left to run the event handler, and the process
    crashes.

    (2) Another possibility is that Q_Init calls a deeply
    recursive nest of functions, each of which is compiled
    with frame pointers omitted. Since the DLL in question
    has no symbol information, DrWtsn32 can't trace calls
    through it.

    (3) Yet another possibility is that an event handler in C
    (again, compiled with frame pointers omitted) is
    invoking Tcl_DoOneEvent (or invoking Tcl code that calls
    [update] or [vwait]) and Tcl_DoOneEvent finds another
    event pending. The second event in turn also does
    Tcl_DoOneEvent in its event handler, and so on.
    Eventually, there are enough unfinished event handlers
    stacked that the process crashes. If this is the case,
    the most likely cause is that something does [after
    idle] or Tcl_DoWhenIdle from an idle handler - the
    documentation remarks that doing so is not safe.
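
    Possibility (3) can be illustrated with a toy model. The
    sketch below is not the Tcl notifier and does not call
    Tcl_DoOneEvent; EventLoop, do_one_event, handle_event and
    max_nesting are hypothetical names. It only counts how many
    handlers are unfinished at once when each handler re-enters
    the loop before returning, compared with handlers that simply
    return.

```c
#include <assert.h>

/* Toy model only; all names are hypothetical and nothing here is Tcl code. */

typedef struct EventLoop {
    int pending;     /* events waiting to be serviced */
    int depth;       /* handlers currently active, i.e. not yet returned */
    int max_depth;   /* deepest nesting observed so far */
    int reentrant;   /* nonzero: each handler calls back into the loop */
} EventLoop;

static void handle_event(EventLoop *loop);

/* Service one event if any is pending; returns 1 if one was handled. */
int do_one_event(EventLoop *loop)
{
    if (loop->pending == 0) {
        return 0;
    }
    loop->pending--;
    handle_event(loop);
    return 1;
}

static void handle_event(EventLoop *loop)
{
    loop->depth++;
    if (loop->depth > loop->max_depth) {
        loop->max_depth = loop->depth;
    }
    if (loop->reentrant) {
        /* Stands in for a handler that runs [update] or [vwait]: the loop
         * is re-entered while this handler's frame is still on the stack. */
        do_one_event(loop);
    }
    loop->depth--;
}

/* Drain `events` events and return the deepest handler nesting seen. */
int max_nesting(int events, int reentrant)
{
    EventLoop loop = { events, 0, 0, reentrant };

    while (do_one_event(&loop)) {
        /* keep servicing until the queue is empty */
    }
    assert(loop.depth == 0);   /* every handler has returned */
    return loop.max_depth;
}
```

    In this model max_nesting(100, 0) is 1, while
    max_nesting(100, 1) is 100: every pending event adds one more
    unfinished handler. In a real process each of those handlers
    holds a stack frame, and a handler that keeps rescheduling
    itself means the queue never empties, so the nesting grows
    until the stack is exhausted.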

    Since the stack exhaustion appears to be the result of
    Tcl_DoOneEvent being entered with inadequate stack space
    remaining, rather than any inherent fault in the Tcl library
    itself, I'm closing this bug. If you need further help
    tracking things down, I'd suggest visiting
    http://mini.net/cgi-bin/chat.cgi
    and talking to the Tcl developers there.

     