From: SourceForge.net <no...@so...> - 2009-07-24 20:04:16
|
Bugs item #2826430, was opened at 2009-07-24 00:42 Message generated for change (Comment added) made by andreas_kupries You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=2826430&group_id=10894 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: 24. Channel Commands Group: None Status: Open Resolution: None Priority: 7 Private: No Submitted By: Colin McCormack (coldstore) Assigned to: Andreas Kupries (andreas_kupries) Summary: Channel name recycling is dangerous Initial Comment: I would like a [chan rename $oldname $newname] command, which renamed a channel. This would be useful to avoid bugs (admittedly, caused by other bugs) where an open channel name is stored in one data structure, perhaps in a coro, thread or interp, and the channel is closed before the channel name can be removed from those data structures. The current effect of such uncoordinated storage is that either a delay in closing the channel must be suffered while all copies of the name are updated, or that a stale chan name is held somewhere, and causes a new chan (usually a socket) with the same name as the old (and now closed) chan to be manipulated in error. This problem arises, essentially, because tcl reuses chan names frequently, and chooses them from a very compact set. By enabling arbitrarily named chans, at construction time, chan name conflict could be completely avoided by an application. As a stopgap, synthetic channels can be constructed with unique names, but this is an extra level of indirection. ---------------------------------------------------------------------- >Comment By: Andreas Kupries (andreas_kupries) Date: 2009-07-24 13:04 Message: Another aspect of the use-case as Colin presented it is, it seems, that if any part of the system is allowed to kill a channel for some reason or other, then the coros using that channel have to have lots of catch'ing to do to detect when a channel was pulled out from under them, do I understand that correctly ? Ditto for the timer ... When it goes off it has to catch the possible error. IMVHO I would prefer an application architecture/schema where a specific package would be the manager of stuff like this, with each coro calling the necessary procedures, and 'close' would be routed through that as well, to perform the necessary actions, i.e. kill the timer. All the necessary information would be in a namespace, indexed by channel id, and not inaccessible inside of coro variables/state. Now the only issue would be to tell the coro using the channel that it is gone (because events would not get enabled on it again in scheme, with a central manager). ---------------------------------------------------------------------- Comment By: Andreas Kupries (andreas_kupries) Date: 2009-07-24 12:56 Message: Note our ability to share/transfer channels among slave interps and also to transfer between threads. The Tcl_GetUnique() result is only unique per-interp, and now we can get problems with such transfers when two channels get the same id in different interps, and then we want to transfer/share one over. The existing scheme avoided that because the names were unique per the whole process. Any new scheme should preserve this. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2009-07-24 08:42 Message: That's a question to ask on tcl-core ---------------------------------------------------------------------- Comment By: Alexandre Ferrieux (ferrieux) Date: 2009-07-24 07:47 Message: Looking into the implementation, it seems a much simpler method than compact+gencount could work: just increment a global (per-interp) integer, and use it as source of uniqueness: int Tcl_GetUnique(Tcl_Interp *i); /* returns a different number every time. Wraps in no less than MAXUINT calls. */ The idea of making it a global stub entry is that any extension can use it for its own purposes, and keep the guarantee of not colliding. Is a TIP mandatory for that or shall I proceed ? ---------------------------------------------------------------------- Comment By: Alexandre Ferrieux (ferrieux) Date: 2009-07-24 06:29 Message: Switched to Bug status and re-titled according to previous exchanges. Now I have a patch to submit it seems ;-) ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2009-07-24 06:03 Message: They use the fd because it's an easy source of local uniqueness. ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-24 05:46 Message: Well, it certainly can be, so sure. I had a look, and it seems that every style of chan, serial, socket, file, pipe, except refchan uses the fd as the basis of its name. ---------------------------------------------------------------------- Comment By: Alexandre Ferrieux (ferrieux) Date: 2009-07-24 04:27 Message: If you accept to rename it to "Channel name recycling is dangerous", I can switch it to Bug status (without recreating and duplicating comments). Justtell me. ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-24 03:35 Message: BTW, the existing implementation is not strictly a bug, it's just a really inconvenient side-effect of an implementation choice which can easily cause bugs in applications unless a whole lot of work is done to avoid it. I'm not sure what you call those. I've not come across many of these in Tcl, so don't know the appropriate name. Should I close this and make it a bug report instead? ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-24 03:28 Message: These are all valid points, and it is certainly a judgement call. I don't mind which is chosen, but I would very much like either to be implemented, as the problems caused by chan name reuse for sockets are significant to me. The only real advantage of the approach I suggested is, as previously stated, some additional expressive power. It does come at a cost, as does its use. Either way, I'm happy. The approach suggested by Alexandre is also quite simple to implement - reflected channels already do it (tclIORChan.c:2041 NextHandle function) and requires no additional documentation unless one is feeling kindly toward people who may have relied upon undocumented side-effects of socket naming in the past. ---------------------------------------------------------------------- Comment By: Alexandre Ferrieux (ferrieux) Date: 2009-07-24 03:10 Message: OK so both (1) and (2) are the same in fact: all of this vanishes if channel names stop being reused. Now comparing the two approches (yours and mine), I 'd say I perfer mine (surprise, eh ?) because: - mine doesn't need extra primitives - mine doesn't demand script contribution to avoid bugs - yours might break existing scripts: if you rename (not close) a channel after some package has secretly stored its name, then its subsequent use of it will fail while it shouldn't. ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-24 03:00 Message: I see that I didn't clarify how renaming the channel would fix the problem of distributing the fact of closure to all interested parties. It would fix the problem by allowing the app to guarantee that a once-valid channel name, once closed, would never again be valid. So the fact-of-closure would be distributed as a side effect of that guarantee. If several interested parties might close the chan, the first one to do so indicates it by that act alone. An additional benefit is that the channel name-space is currently coded, but undocumented. [chan rename] would document the name-space as initially undefined, subsequently defined. ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-24 02:55 Message: To clarify, per Alexandre's request: if you have a coro which is in the process of handling fileevents on a socket (socket10, let's say) and you decide you want to not handle events on that socket for a few seconds, so you schedule an [after] event to trigger to enable you to re-establish the [fileevent $sock] ... you must store the socket name (socket10) in the variable sock, which may be a coro variable, hence inaccessible to other coros. If a different processing stream, or coro, decides it wants to close socket10, it can't safely do that until and unless it has notified the first coro of its intention, so that it can forget about socket10 - cancel any [after] events it may have scheduled. If this sounds like a contrived example, it's not. I have almost exactly this problem in adding input throttling to Wub right now, and anyone writing a producer/consumer pair with buffering where the two parts make use of modern Tcl facilities such as coros or [apply] or even [interp] will have similar problems. The amount of data which needs to be shared between the distinct processing elements is too great, and is imposed entirely by the rapid cycling of chan names. It is true that one could fix the cycling, but the idea of encoding names with significant information also appeals to me, and comes for free with this fix. ---------------------------------------------------------------------- Comment By: Colin McCormack (coldstore) Date: 2009-07-24 02:53 Message: I've thought of a new good reason to do this: by renaming, one could introduce some application-specific encoding into the names, for example "connected_$ipaddress_$count" and get connection counting for free. I also thought of a new distinct area of a program which may contain chan names: the event system (fileevents and such) ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2009-07-24 02:39 Message: I'm just making sure that we don't forget to make them special. ---------------------------------------------------------------------- Comment By: Alexandre Ferrieux (ferrieux) Date: 2009-07-24 02:11 Message: No problem with hard-coding the lookup from std* to compactArrayOfFds[0,1,2] and disabling gencount checks on them. ---------------------------------------------------------------------- Comment By: Donal K. Fellows (dkf) Date: 2009-07-24 01:58 Message: Just a quick note in passing. It's important to keep the 'std*' namespace special. ---------------------------------------------------------------------- Comment By: Alexandre Ferrieux (ferrieux) Date: 2009-07-24 01:08 Message: Hmmm, there are two separate issues here: (1) removing the carpet under a coro/thread feet and (2) too fast reuse. I fail to see how this would solve (1), but for (2) it is clear. Can you clarify on (1) ? Regarding (2), the "compact set" arises from Tcl's direct use of OS handles/fd, and on unix fds are a compact set. On Windows AFAIK handles are not reused (correct m if I'm wrong). So, an alternative way of solving this issue, without any intervention from the script writer, would be to insert an indirection in Tcl, with an intentionally compact set managed by Tcl, but extended with generation counts, so that no reuse occurs: typedef struct { HANDLE_OR_INT fd; int gencount; } UniqueHandle; UniqueHandle *compactArrayOfFds; The idea, then, is to have automatic channel names of the form sock3g1 which can be looked up from string in two steps: 3 is the index in compactArrayOfFds 1 is the gencount for verification This way, even if slot 3 of the array is reused, it will have a gencount of at least 2, hence any stale reference to sock3g1 will generate an error instead of wreaking havoc by unwanted IO on another channel. Note that this works well with the Channel intrep too, assuming we add the gencount to the structure. Then the gencount verification is just an extra integer comparison. Reactions ? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=110894&aid=2826430&group_id=10894 |