#5118 TclpFinalizeSockets hangs on windows

obsolete: 8.5.12
closed-fixed
Don Porter
9
2012-11-07
2012-10-04
Ralf Fassel
No

This seems related to the change introduced by ID: 1437595.

When running the attached code on Linux, the program opens the server
socket, waits one second and then exits as expected.

When running the same code on Windows-7 against tcl 8.3, the program
behaves as above on Linux.

When running the same code on Windows-7 against tcl 8.5 (Activestate,
or non-threaded makes no difference) the program blocks
in the _exit() call.

The reason is to be seen in TclFinalize() where on
Windows it waits for the socket thread to exit when unloading the TCL
DLL, but the corresponding thread either does no longer exist, or does
not get the Message, or does not respond to it:

win/tclWin32Dll.c:
DLLMain()
case DLL_PROCESS_DETACH:
...
Tcl_Finalize();
=>
generic/TclEvent.c
Tcl_Finalize()
=> Tcl_FinalizeThread()
=> TclFinalizeIOSubsystem();
=> TclpFinalizeSockets();

=>
win/tclWinSock.c
TclpFinalizeSockets()
=> PostMessage(tsdPtr->hwnd, SOCKET_TERMINATE, 0, 0);
// this blocks:
WaitForSingleObject(tsdPtr->readyEvent, INFINITE);

The PostMessage() return code is 0 at that point, indicating an error,
the corresponding error is 1400/invalid window handle.
If I change the code to

if (res) WaitForSingleObject(tsdPtr->readyEvent, INFINITE);

the program terminates as expected. It seems an error to call
WaitForSingleObject() if the PostMessage() call did not succeed, since
the Wait() relies on the other thread to signal the Wait().
If the other thread does not get the call, it cannot signal...

Proposed patch against tcl8.5.12:

--- tcl8.5.12/win/tclWinSock.c 2012/09/18 18:43:47 1.1
+++ tcl8.5.12/win/tclWinSock.c 2012/10/04 15:04:50
@@ -461,14 +461,14 @@
if (tsdPtr != NULL) {
if (tsdPtr->socketThread != NULL) {
if (tsdPtr->hwnd != NULL) {
- PostMessage(tsdPtr->hwnd, SOCKET_TERMINATE, 0, 0);
+ int res = PostMessage(tsdPtr->hwnd, SOCKET_TERMINATE, 0, 0);

/*
* Wait for the thread to exit. This ensures that we are
* completely cleaned up before we leave this function.
*/

- WaitForSingleObject(tsdPtr->readyEvent, INFINITE);
+ if (res) WaitForSingleObject(tsdPtr->readyEvent, INFINITE);
tsdPtr->hwnd = NULL;
}
CloseHandle(tsdPtr->socketThread);

Discussion

  • Ralf Fassel
    Ralf Fassel
    2012-10-04

    C source code

     
    Attachments
    t.c
    • assigned_to: nobody --> davygrvy
     
  • Sounds reasonable to me, but my knowledge of Windows (if any) is waning, so please Dave can you validate ?

     
  • Looks great!

     
    • labels: --> 27. Channel Types
    • priority: 5 --> 8
    • milestone: --> obsolete: 8.5.12
     
  • I'd be tempted to write that as:

    if (PostMessage(...)) {
    /* the comment... */
    WaitForSingleObject(...);
    }
    tsdPtr->hwnd = NULL;

    YMMV. At least the result of PostMessage() is defined to be boolean...

     
  • Ralf Fassel
    Ralf Fassel
    2012-10-05

    Of course the terse form Donal suggested is preferable. I had introduced the intermediate var before looking at the exact signature of PostMessage().
    And of course it would be nice to know *why* the handle is already invalid at that point, but...

     
  • Re "it would be nice to know *why*": agreed.
    Could you please set breakpoints/printfs at strategic places like SocketExitHandler , TclpFinalizeSocket, or after the "while" loop in SocketThread ? I suspect one of the first two is called twice.
    Also, what about 8.6 ?

     
  • unloading the DLL, for which a proper Tcl_Finalize() wasn't done before hand, is my guess. This usage of Tcl here is probably embedding and the app isn't being clean. I've seen this problem before with other areas of Tcl when the finalize logic is manipulated subtly, but has major affect in the abortive teardown.

     
  • Don Porter
    Don Porter
    2012-11-05

    Please report whether action on this is needed
    before an 8.5.13 release.

     
  • Don Porter
    Don Porter
    2012-11-05

    • priority: 8 --> 9
    • assigned_to: davygrvy --> dkf
     
  • Ralf Fassel
    Ralf Fassel
    2012-11-05

    The patch should be applied.

    Unfortunately I currently have no resources to investigate the "why" as proposed by Alexandre :-/

    @David: the usage as in the example program is indeed 'embedded' with no active TCL interpreter, but the error also manifests when a hard exit (C++ exception or signal) tears down the TCL application without properly releasing the TCL resources first.

    Since the logic of the patch seems clear it should do no harm in regular usage of the TCL library.

     
  • Ralf Fassel
    Ralf Fassel
    2012-11-05

    patch sans indentation

     
    Attachments
    • assigned_to: dkf --> dgp
     
  • You mistake me for someone who understands what's going on. I read the patch a month ago and noted that it was not "in style" especially; I don't understand the semantics (beyond what I can grep for) and don't really use Tcl on Windows very much at the moment anyway.

     
  • (The style issue is not a blocker, BTW.)

     
  • Don Porter
    Don Porter
    2012-11-07

    In the demo program, replacing the call _exit(0)
    with a call to Tcl_Exit(0) will fix things, yes?

     
  • Don Porter
    Don Porter
    2012-11-07

    Thanks to apn for help debugging. apn points to
    docs saying synchronization calls are not welcome
    in the routines called by DllMain().

    So either Tcl finalization ought not indulge in any
    synchronization matters, or....

    .... Tcl_Finalize() ought not be called from DllMain().

    The submitted patch is fine, but doesn't really address
    the fundamental matter in either of these ways.

     
  • Don Porter
    Don Porter
    2012-11-07

    apn points out to me that removing the Tcl_Finalize()
    call from DllMain() is the solution on the Tcl trunk (8.6).

    With Tcl 8.6, we apparently more explicitly put the
    burden of finalization on the embedding program,
    instead of trying to make automagic happen as the
    Tcl Dll gets unloaded.

    I'll commit the patch on the core-8-5-branch as an
    additional layer of workaround, but programs like
    the demo here also ought to be revised to take care
    of Tcl finalization.

     
  • Don Porter
    Don Porter
    2012-11-07

    • status: open --> closed-fixed
     
  • Don Porter
    Don Porter
    2012-11-07

    patched for 8.4.20 and 8.5.13.