From: SourceForge.net <no...@so...> - 2009-07-28 19:21:17
|
Bugs item #2827000, was opened at 2009-07-25 06:19 Message generated for change (Comment added) made by andreas_kupries You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=2827000&group_id=10894 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: 25. Channel System Group: development: 8.6b1.1 Status: Open Resolution: None Priority: 7 Private: No Submitted By: Colin McCormack (coldstore) Assigned to: Andreas Kupries (andreas_kupries) Summary: tcl incorrectly reports eof to application Initial Comment: attached is an annotated narrative log of an observed sequence of interactions by the refchan subsystem and this code http://code.google.com/p/wub/source/browse/trunk/Wub/Chan.tcl?r=1798 A connected socket is passed as the 'chan' argument of Chan construction, which is [chan create]d as a refchan. In the narrative, some data is [read] pursuant to a [readable] event on the underlying socket. Less data is returned by [Chan read] than is requested. A buffer-full of data is requested (4096 bytes) but only 1440 bytes are available. The data is processed by the application, which then attempts to [read] (in non-blocking mode?) more data. The application uses [gets] to parse a prefix of the 1440 bytes which are available, but it certainly does not consume the entirety. It attempts to [read] some 20Kb more. Tcl [Chan watch]es the refchan, and soon after calls [Chan read] again to request another buffer-full of data. There has been no readable event triggering, nor has [chan postevent] been called by Chan. Despite this, Tcl attempts to read. When there is no data available, as indicated by the return value of "" from [Chan read] upon this second attempt, Tcl indicates [eof] on the refchan. I would have expected Tcl not to call [Chan read] until after a [chan postevent] had been signalled to it by Chan. Observations: The behaviour is intermittent, insofar as sometimes an identical socket interaction succeeds (though rarely.) I have attempted to reproduce the behaviour in a localhost-connected socket, and failed. I believe this is because the second and subsequent [Chan read]s succeed. I think it's likely that Tcl refchan is calling the refchan's read method inappropriately, and then interpreting the 0-length result as end of file even though it has received no indication from the refchan by means of postevent that it can expect any data to be available to read. ---------------------------------------------------------------------- >Comment By: Andreas Kupries (andreas_kupries) Date: 2009-07-28 12:21 Message: Line 72 of chan.tcl, method 'initialize'. -blocking 0 at this point is wrong. A channel starts out in mode 'blocking'. To then set the underlying channel into non-blocking will likely confuse things when it comes to reading, especially regarding short reads (Not possible in blocking mode except when reaching eof. With the underlying channel set to non-blocking on init this can give short reads before eof). Note that this should not have an effect on the problem we are looking into, because there both channels are set non-blocking. Well, should be, and the save.anno log seems to concur. ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2009-07-28 11:28 Message: A review of the IChan class shows no anomalies. Using watch to (de)activate events on the underlying channel is a nice trick for avoiding flags. It should work I would think. And with that the immediate forwarding of events in the methods readable/writable should work as well. I tried to compare to my refchan examples, but they are all about base channels without a channel underneath, so not really comparable. The nearest is the fifo2, sort of, as it consists of two ref channels talking to each, i.e. internally joined at the hip. But even there the whole buffer mgmt is by the fifo code, there is no regular channel involved. After I got me the Debug package I could run the Chan.tcl by itself, demo mode. That worked. Do you have a demo script using Chan.tcl and showing the problem ? I hope here that I can avoid getting the whole wub sources. Looking over the save.anno, maybe some more clues there > * as a result of an app [gets], tcl watches reads on the rc: > chan @@{rc18 watch sock15 read} Here might be a misunderstanding of the IO core. 'watch' is primarily called in response to 'chan event'. Calls from the guts of gets/read happen for temp. deactivate events, and re-activate them when the Tcl buffers are empty. A call this early is more likely from a 'chan event' than 'gets'. > (App parses the header from some of the received bytes: Httpd @@{::Httpd::rc18_1504 > reader got request: (...)}) This gets events on rc18, as per step 9. > * tcl reads a buffer's worth of bytes from rc18, but there are no bytes buffered in sock15, so an > empty string is returned: > chan @@{rc18 read sock15 4096 -> 0 bytes '...'} Yes. This is like in testRefchan.tcl, step 13. Event through timer, not enough data in the buffers, ask the OS if it has more. Empty read. > * tcl watches read (presumably because the app has a fileevent readable): > chan @@{rc18 watch sock15 read} Like step 15. Interest in the base events is re-activated. This also re-activates sock15 events, should be. > * application calles [eof rc18] which returns true, and terminates its reader: > Httpd @@{::Httpd::rc18_1504 terminate: (entity)} Hm. Here I would have expected fblocked ... Hm ... It should be possible to model that using testRefchan ... Basically, instead of going into state 2 where it always returns the empty string it should return nothing once, and at the next read something ... But the next read will not happen, won't it. Because of the 'eof' getting signaled ... Ok. That means I should concentrate on the happenings in and around steps 16 and 17 in rc.tcl. ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2009-07-28 08:13 Message: Ok. Will have a look through this as well. ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-27 21:03 Message: Clearly the test isn't a good test of what I'm observing. Please see enclosed Chan.tcl for the code which produces the events I observed. Unlike the test, it includes a [watch] method, and it only forwards events to Tcl when tcl has registered interest in that event, and it nonetheless behaves exactly as the original test script behaves. Moreover, when the refchan is substituted for a socket, it does not behave like a socket. Your advice on what is missing to make the enclosed Chan object perform precisely as a socket performs would be appreciated. ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2009-07-27 20:01 Message: Behaviour of testRefChan.tcl ... NOTE: The implementation of 'watch' was missing. This is crucical. You have to look at the watch calls to see if _you_ (The rc channel implementation) is expected to generate events, or not. See steps 3, 7, and 15. NOTE 2: The handling of eof in getter/reader fell through to regular read, I added exit commands there. NOTE 3: The modified test script is attached. (.1) Create channel RC (.2) Set channel RC to blocking --> (.3) Set up a fileevent receiver for RC -> getter. This invokes rc::watch, asking for read events, this posts a readable event. (.4) The readable event is received and 'getter' called. (.5) The 'gets' command in 'getter' finds empty buffers and calls the channel's inputProc -> ReflectInput -> procedure rc::read. (.6) rc::read returns 221 bytes, 10 lines of '111...', an empty line, 10 lines of '222...'. (.7) As part of the inputProc processing interest in read-events from the channel is disabled. We have data in the Tcl buffers, these will get flushed first. No need to ask for events from the OS (=== the reflected channel). (.8) The IO core returns the first '111...' line as result of 'gets'. (.9) The IO core generates its own fileevents, based on timers, to flush the data left over in the Tcl buffers (remaining '111...' etc. lines. (10) Per (.9) 'getter' is called, the cycle repeats. (11) At the 11th call 'gets' returns an empty line, and the event receiver switches to 'reader'. This doesn't change that data in the buffers is flushed via timer-based fileevents, there are still the '222...' lines etc. (12) 'reader' is called, for the first of the events after the switch-over. (13) It tries to [read 10000]. As only less bytes than that are left in the Tcl buffers the channel's inputProc is called to sdee that we fill it up with more data => ReflectInput => rc::read. This is the so-called bogus call, and it is not bogus, but completely correct. (14) This call returns 0 bytes, and the EOF flag in the channel is set. (15) The IO core now __re-enables__ interest in the 'readable' events because all data in the Tcl buffers will be delivered to the script, and any more data will have to come from the channel itself. <=> Any more events have to come from the channel as well, the timer-based events are stopped. (16) The read call returns the remaining bytes from the Tcl buffers. At this point the script stops for me with can't wait for variable "forever": would wait forever because the RC channel implementation did not set things up to produce its own events, despite being asked to in step 15, via rc::watch. The moment we do this the posted event causes another call of 'reader' sa step (17), which then handles the eof condition by closing the channel, which disables events, finalizes, and exits. The last is something I added to the script. Now, what does that mean for the identity refchan ? It has to keep track of when the Tcl IO core wants events from it or not. If it receives a fileevent from the socket it is connected to at a time the core is not interested in identity events, then this event has to be remembered in flags (one per read and write), but not be acted on, i.e no posting forward to the core. Only when events are enabled again for the identity chan the flags have to be checked, and if we got events from the socket we post _one_ event forward. ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-27 14:28 Message: The testRefchan.tcl script is not wild caught, it is crafted to illustrate and test for the problem I observed. My assertion that I am not posting an event pertains to the observed problem, which I have annotated in the other attachment, not to the test script, which was written after I'd reported this bug. It seems to me that either there should never be a read invoked in non-blocking mode without there having been a directly corresponding post event, or at least that under such circumstances a read return value of 0 must not be interpreted as an EOF. ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2009-07-27 09:56 Message: One thing I noted in my quick glance over the testRefchan.tcl test script is that you _are_ posting a 'readable' event from the 'blocking' method, contradicting the "having been no [chan post] indicating" information in your comment here. I.e. when the blocking status of the channel changes you tell the higher levels unconditionally that you have data to read. Without checking that you truly have such. That already could be your problem. At least for the test script. ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-25 07:01 Message: Incidentally: I would have expected the test script to show tcl sending another readable event to the application when it detected/decided the refchan was [eof]. This would be consistent with standard channel behaviour, but doesn't occur. ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-25 06:56 Message: I have written and attached a small script testRefchan.tcl, which I believe replicates and illustraes the problem. It demonstrates that Tcl calls [read] on a non-blocking refchan despite there having been no [chan post] indicating that it should, and then delivering a positive [eof] when that read returned empty string result. ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=2827000&group_id=10894 |