
#1236 missing file events on server socket under Windows

Milestone: obsolete: 8.2.1
Status: closed-fixed
Owner: nobody
Priority: 5
Updated: 2001-03-31
Created: 2000-10-26
Creator: Anonymous
Private: No

OriginalBugID: 3409 Bug
Version: 8.2.1
SubmitDate: '1999-11-05'
LastModified: '2000-02-10'
Severity: CRIT
Status: Assigned
Submitter: techsupp
ChangedBy: hobbs
OS: Windows NT
OSVersion: 4
FixedDate: '2000-10-25'
ClosedDate: '2000-10-25'

Name:
Ulrich Lauther

CustomShell:
compiled a static lib; Tcl/Tk library scripts are embedded in a C-file and

ObservedBehavior:
I set up an event handler in a server GUI application that should react to
incoming messages on a socket, concurrently with GUI handling. The code looks
similar to this:
    int handle = accept(my_socket,0,0);
    Tcl_Channel channel = Tcl_MakeTcpClientChannel((ClientData) handle);
    Tcl_RegisterChannel(my_interpreter,channel);
    Tcl_CreateChannelHandler(channel, TK_READABLE, file_callback, (void*) this);
Whenever input arrives at the socket, the function file_callback()
should be called.
This works fine under Linux and used to work in the past (< 8.1 ?), but does
not reliably work under Windows. It DOES work if I add printf()'s for each
arriving message, so I suspect a timing problem.

DesiredBehavior:
See above.

http://www.deja.com/=dnc/viewthread.xp?AN=540260481

It's not certain that this is failing quite as described;
it may be due to TCP delay.
-- 11/10/1999 hobbs
Another user reports that this crept into Tcl between 8.0.4 and 8.2:

Michael Kirkham wrote:
...
> occurring in the older version of our software. This leads me to
> believe that the problem was introduced between Tcl 8.0.4 and 8.2.1.
>
> The dropped events problem is a little bit intermittent, so I am not
> 100% certain, but I have not myself been able to observe the dropped
> events problem with any version of our software built with Tcl 8.0.4
> but have quite often with versions built with Tcl 8.2.2 (including a
> version previously built with Tcl 8.0.4).

-- 12/30/1999 hobbs

Discussion

  • Donal K. Fellows

    I can't see quite why this is failing. Is it possible to try using Tcl's own code (Tcl_OpenTcpServer) to handle sockets instead? That runs code that should work on all supported platforms...

     
  • Anonymous

    Anonymous - 2001-03-20

    Logged In: YES
    user_id=178287

    I've recently been investigating this problem again hoping
    to find a fix for the problem and in doing so have found
    the following information that may be useful:

    * It happens most often on Windows 95/98. It seems to be
    much more difficult to reproduce on Windows NT, according
    to my colleague with an NT machine, but I can reproduce it
    with no problem on 95/98.

    * If the socket is closed and reopened, with a new event
    handler created, communication resumes for a time (in my
    case this is UDP so there's no connection loss doing so).

    * Simply using Tcl_DeleteChannelHandler and then calling
    Tcl_CreateChannelHandler again does NOT cause communication
    to resume.

    * After an event is dropped, the socket remains in the
    readable state (as indicated by a select() call directly on
    the socket). It seems that as long as the socket remains in
    this state, no further "readable" events will occur. Once
    the socket is read so that it is no longer in the readable
    state, however, new incoming data will usually trigger the
    event handler again, and events continue to arrive until the
    next time one is dropped.

    * The behavior seems to be affected by whether or not the
    application has focus. In my case, I've got a small DLL
    written specifically to reproduce the problem. Two copies
    of wish are run, loading the DLL in each, and one acts as a
    client and the other a server sending UDP packets back and
    forth between each other as fast as possible. The instance
    acting as the server, which sends a packet back only when a
    packet is received, usually runs normally as long as it has
    the focus. But as soon as the client gets focus then
    events start getting dropped left and right on Win 95/98.

    * Tcl 8.0.5 does not exhibit the problem but all versions
    from 8.2.1 through 8.4a2 do. 8.1.x - 8.2 may also exhibit
    the problem but as of yet I haven't confirmed due to a
    crashing problem with 8.1 and this DLL.
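
The direct select() check mentioned in the fourth point above can be sketched roughly as follows. This is a minimal POSIX illustration, not the reporter's code: the helper names `socket_is_readable` and `demo_readable_state` are hypothetical, and on Windows the same idea would go through Winsock's select() on a SOCKET handle rather than an int descriptor.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: poll a descriptor with a zero timeout and report
 * whether data is waiting. Returns 1 if readable, 0 if not, -1 on error.
 */
static int socket_is_readable(int fd)
{
    fd_set readFds;
    struct timeval timeout = {0, 0};   /* zero timeout: never block */
    int n;

    FD_ZERO(&readFds);
    FD_SET(fd, &readFds);
    n = select(fd + 1, &readFds, NULL, NULL, &timeout);
    if (n < 0) {
        return -1;
    }
    return (n > 0 && FD_ISSET(fd, &readFds)) ? 1 : 0;
}

/*
 * Self-check on a local datagram socket pair: the receiving end is not
 * readable at first, becomes readable once a packet is queued, and is
 * no longer readable after that packet has been read.
 * Returns 1 if all three observations hold, 0 otherwise.
 */
static int demo_readable_state(void)
{
    int sv[2];
    char buf[16];
    int before, pending, after;
    int ok = 1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) != 0) {
        return 0;
    }
    before = socket_is_readable(sv[0]);
    if (write(sv[1], "ping", 4) != 4) {
        ok = 0;
    }
    pending = socket_is_readable(sv[0]);
    if (read(sv[0], buf, sizeof buf) != 4) {
        ok = 0;
    }
    after = socket_is_readable(sv[0]);
    close(sv[0]);
    close(sv[1]);
    return ok && before == 0 && pending == 1 && after == 0;
}
```

Calling `socket_is_readable()` on the stalled socket while the channel handler stays silent is one way to confirm that data is waiting even though no "readable" event was delivered.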

     
  • Anonymous

    Anonymous - 2001-03-23

    Logged In: YES
    user_id=178287

    I believe I finally managed to track down this elusive bug.

    Background: Tcl uses WSAAsyncSelect() (winsock API) to tell
    winsock to call a particular function (SocketProc()) when
    an event such as incoming data occurs. This function
    basically checks what sort of event occurred and sets some
    flags to be checked later in Tcl's idle/event handling loop.
    Eventually this loop calls another function
    (SocketEventProc()) that verifies that the condition for the
    event (well, FD_READ events, at least) is still met before signalling
    back to Tcl (via Tcl_NotifyChannel()) to trigger the
    function we specified (via Tcl_CreateChannelHandler()) as
    our own handler for the event.
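
The two-stage flow described above (record the event first, verify and dispatch later) can be pictured with a small stand-alone model. This is only a sketch of the control flow: it does not use Tcl or Winsock, and every name in it (`ModelSocket`, `model_socket_proc`, `model_event_proc`, `count_calls`) is hypothetical. The `bytesAvailable` field stands in for what the select() verification would report.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_FD_READ 0x1

typedef void (ModelHandler)(void *clientData);

typedef struct {
    int readyEvents;        /* flags recorded by the asynchronous callback */
    int bytesAvailable;     /* stand-in for the result of the select() check */
    ModelHandler *handler;  /* stand-in for the user's channel handler */
    void *clientData;
} ModelSocket;

/* Stage 1, the role the text gives SocketProc(): only record the event. */
static void model_socket_proc(ModelSocket *s, int event)
{
    s->readyEvents |= event;
}

/*
 * Stage 2, the role the text gives SocketEventProc(): check that the
 * condition still holds, then call the user's handler.
 * Returns 1 if the handler was invoked, 0 otherwise.
 */
static int model_event_proc(ModelSocket *s)
{
    if (!(s->readyEvents & MODEL_FD_READ)) {
        return 0;                           /* nothing was recorded */
    }
    if (s->bytesAvailable <= 0) {
        s->readyEvents &= ~MODEL_FD_READ;   /* data already consumed */
        return 0;
    }
    if (s->handler != NULL) {
        s->handler(s->clientData);
    }
    return 1;
}

/* Example handler: counts how often it was called. */
static void count_calls(void *clientData)
{
    ++*(int *) clientData;
}
```

In this model a recorded event whose data has already been consumed is silently discarded at stage 2, which is the purpose of the verification step described above.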

    Now, in various places the Tcl socket drivers disable the
    WSAAsyncSelect() handler, do some work, and then re-enable
    it. (Re-enabling also causes any existing conditions to
    re-generate events, i.e., the handler specified to
    WSAAsyncSelect() is called.) Just before the verification
    step mentioned above (which involves a call to the regular
    non-event-driven select() function) is one such place where
    the WSAAsyncSelect() handler is disabled temporarily.

    However, the code was such that it's only RE-enabled after
    the select() call when the socket no longer has data to be
    read (select() returns 0). If there's data on the socket
    (because, perhaps, multiple packets came in very quickly,
    before the channel handler could read the first and clear
    the event), then the WSAAsyncSelect() handler is apparently
    not re-enabled.

    I suspect, though I haven't verified, that in 8.0.5 and
    earlier the problem didn't occur for one of a few possible
    reasons:

    1. The particular code path that left the event handler
    disabled was basically never taken (i.e., at this point
    there was never data left to be read and select() would
    always return 0).

    2. -Or- the WSAAsyncSelect() handler was being re-enabled
    later on in the event handling loop but isn't in later
    versions.

    3. -Or- 8.0.5 and earlier called WSAAsyncSelect() directly
    at this point, rather than calling SendMessage() to trigger
    a later call to WSAAsyncSelect() to re-enable the handler,
    which may cause a race condition of sorts. (Though in this
    case, the patch below would probably just be lucky that it
    works for me).

    At any rate, here's the change that worked for me to fix
    this problem (in "diff -c" format; will also upload to
    patches section). The little program I wrote to recreate
    and debug this problem screams along happily without any
    apparent dropping of events. The change moves just one
    statement out of an else {} block where it apparently
    shouldn't be: the SendMessage() call, which triggers a later
    call to WSAAsyncSelect(), is moved so that it is made
    regardless of the select() return value:

    <pre>
    *** ./orig/tcl8.2.3/win/tclWinSock.c Sun Aug 1 15:09:29 1999
    --- ./tcl8.2.3/win/tclWinSock.c Thu Mar 22 16:44:48 2001
    ***************
    *** 853,862 ****
              if ((*winSock.select)(0, &readFds, NULL, NULL, &timeout) != 0) {
                  mask |= TCL_READABLE;
              } else {
    -             SendMessage(tsdPtr->hwnd, SOCKET_SELECT,
    -                     (WPARAM) SELECT, (LPARAM) infoPtr);
                  infoPtr->readyEvents &= ~(FD_READ);
              }
          }
          if (events & (FD_WRITE | FD_CONNECT)) {
              mask |= TCL_WRITABLE;
    --- 853,862 ----
              if ((*winSock.select)(0, &readFds, NULL, NULL, &timeout) != 0) {
                  mask |= TCL_READABLE;
              } else {
                  infoPtr->readyEvents &= ~(FD_READ);
              }
    +         SendMessage(tsdPtr->hwnd, SOCKET_SELECT,
    +                 (WPARAM) SELECT, (LPARAM) infoPtr);
          }
          if (events & (FD_WRITE | FD_CONNECT)) {
              mask |= TCL_WRITABLE;
    </pre>
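
The effect of the patch on control flow can be restated as a small stand-alone model. This is a sketch only, not the tclWinSock.c code: the names `ModelInfo`, `check_read_original` and `check_read_patched` are hypothetical, the `selectResult` argument stands in for the return value of select(), and the `selectRequests` counter stands in for the SendMessage(..., SELECT, ...) call that asks for WSAAsyncSelect() to be re-enabled.

```c
#include <assert.h>

#define MODEL_READABLE 0x1

typedef struct {
    int readPending;     /* stand-in for the FD_READ bit of readyEvents */
    int selectRequests;  /* counts stand-ins for SendMessage(.., SELECT, ..) */
} ModelInfo;

/* Before the patch: re-enabling is requested only when nothing is readable. */
static int check_read_original(ModelInfo *info, int selectResult)
{
    int mask = 0;

    if (selectResult != 0) {
        mask |= MODEL_READABLE;
    } else {
        info->selectRequests++;
        info->readPending = 0;
    }
    return mask;
}

/* After the patch: re-enabling is requested on both branches. */
static int check_read_patched(ModelInfo *info, int selectResult)
{
    int mask = 0;

    if (selectResult != 0) {
        mask |= MODEL_READABLE;
    } else {
        info->readPending = 0;
    }
    info->selectRequests++;
    return mask;
}
```

With a non-zero `selectResult`, `check_read_original()` leaves `selectRequests` untouched while `check_read_patched()` increments it. That difference is the case the analysis above identifies as leaving the asynchronous handler disabled.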

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-03-31

    Logged In: YES
    user_id=72656

    See fix in patch 410674.

     
  • Jeffrey Hobbs

    Jeffrey Hobbs - 2001-03-31
    • status: open --> closed-fixed
     
  • Don Porter

    Don Porter - 2001-03-31
    • labels: 104250 --> 27. Channel Types