From: Twylite <tw...@cr...> - 2008-11-22 13:06:05

Hi,

> From: Magentus <mag...@gm...>
>
> The [finally script] usage is trivial to implement using unset traces
> (although not quite as clean, mostly since it uses a magic variable
> name).

This works for [proc] and [apply], but is not completely reliable.
There is no guarantee that the magic finally variable will be the last
to be unset, so a script like 'finally [list close $f]' is safe but
'finally { close $f }' may not behave as expected.

Also [try] is not a separate scope for variables, so it would have to
have a special interaction with the magic finally variable such that
[finally] scripts added inside the context of [try] are executed at the
end of the [try].

Example:

  proc dostuff {} {
      set f [open {c:/boot.ini} r]
      trace add variable --finally--trap-- unset [list apply [list args { close $f ; puts done }]]
  }
  dostuff
  chan names ;# -> stdout stderr filed27ae8 stdin

  proc dostuff {} {
      set f [open {c:/boot.ini} r]
      trace add variable --finally--trap-- unset [list apply [list args [list close $f]]]
  }
  dostuff
  chan names ;# -> stdout stderr stdin

> The [try] command for matching on something other than the return code
> is excellent. Especially if it can match on return values as well as
> errorcodes. How about this for a twist on the idea...
>
>   try {
>       script
>   } catch {
>       var ?opts?
>   } then {
>       script
>   } handler .....and so on.....

This fits with extending [catch], e.g.

  catch { ... } em opts then { ... } handler {...}

The feedback I've had so far on this approach has not been favorable.
It seems that developers would prefer to keep the args/vars in the
context of the handler body.

> Regardless, why not have the handler clause evaluate an expression in
> the context of a [dict with $opts]? Then you can use whatever matching
> function you wish, the only minor pain is that you have to use some
> ugly bracketing of the option names { ${-code} == 2 }. But maybe
> there's a way around that, too, especially if the [dict with] is
> doable read-only and non-destructively somehow.

In a word, performance. I have been having conversations with other Tcl
developers off-list, and proposed exactly this. It is unquestionably
the most flexible option, but it forces a sequential consideration of
each handler's expression, preventing any sort of heuristic to improve
the performance of the construct. Since one of the uses of this [try]
will be to build other language constructs, performance is something
that deserves reasonable consideration.

The tradeoff may be to have "pluggable handler matching" where some
handlers can use exact matching (O(1) time), some can use glob, some
can use expr, etc. Doing this in a manner that maintains a simple
syntax is quite difficult, however.

> And finally for over-all syntax, what'd be wrong with tagging the
> try clauses onto the end of the present [catch] command. Make the
> options variable mandatory in this usage, and bring it into scope for
> the evaluations as above.

See above. I'm not necessarily against it, but it doesn't seem to be a
popular option.

>> >  handle {code ?resultVar ?optionsVar??} { script }
>>
> Is there any actual practical use to putting code in the braces?

Not that I'm aware of, no. My current thinking is that it will be
outside the brackets, e.g.

  handle code/expr {?resultvar? ?optionsvar?} { body }

> Something like a:
>   withvars {resultVar ?optionsVar?}
> following the main try script indicating where to stash the variables.

One advantage of having the vars with the handler script is that it
allows you to reuse handlers, e.g.

  set GENERAL_IO_HANDLER {{em opts} { log "Problem: $em" }}
  ...
  try {
      # some IO routine
  } handle error * {*}$GENERAL_IO_HANDLER

And in this case it's no coincidence that the GENERAL_IO_HANDLER looks
like an anonymous function that can be used with [apply].

> For the blending with [if] option, there was chatter a while back about
> fast [expr]-local variables intended mostly to hold partial results
> during an expression; the main terms of the options dict could quite
> readily be pre-loaded as [expr]-local variables.

I'm very interested in the idea of extending [expr] in various ways,
especially to make pattern matching easier and somehow bind the error
options as variables into the expr. It's just not going to happen by
10 December, so we can't use any approach that relies on it.

Regards,
Twylite
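
Editorial sketch (not part of the original message): the reusable-handler idea above can be tried out today with plain [catch] and [apply], since the proposed [try] ... handle syntax does not exist yet. The proc name readConfig, the file path and the handler body are made up for illustration; the handler returns its report instead of calling the undefined [log].

```tcl
# A reusable handler in [apply] form: parameter list, then body.
# It receives the error message and the return options dict.
set GENERAL_IO_HANDLER {{em opts} {
    return "Problem: $em (errorCode: [dict get $opts -errorcode])"
}}

# Hypothetical IO routine used only to provoke an error.
proc readConfig {path} {
    set f [open $path r]
    set data [read $f]
    close $f
    return $data
}

# Emulates "try { ... } handle error * {*}$GENERAL_IO_HANDLER":
# [catch] returns 1 for TCL_ERROR, and the handler is applied to the
# captured message and options.
if {[catch {readConfig /no/such/dir/no-such-file} em opts] == 1} {
    set report [apply $GENERAL_IO_HANDLER $em $opts]
} else {
    set report "ok"
}
puts $report
```

Because the handler is just a two-element list, the same value can be shared by any number of call sites, which is the reuse argument made above.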
From: Twylite <tw...@cr...> - 2008-11-22 23:42:54

Hi,

> try ?-matchcommand cmd? script ?handlers ...? ?finally script?
>
> Where -matchcommand is the command to use to do errorCode matching
> and defaults to {switch -glob --}. (May need a -- marker to eliminate
> ambiguity). The syntax of the handlers part would be:
>
> on exception-types ?vars? ?errorPattern? body ?errorPattern
> body ...?

I'm very uncomfortable with this syntax. It is potentially ambiguous
and feels very DWIMy (in a bad way).

  try {
      # do stuff
  } on error {
      puts hello
  } finally {
      puts goodbye
  }

... is ambiguous. It could be interpreted the way you think, or with
"puts hello" as the vars and "finally" as the error pattern.

Consider also:

  try { ... } on error {em opts} "POSIX *" { body } on break

Is "on break" an errorPattern and body, or the start of a new exception
handler?

One can construct other ambiguities that exploit the inability to
distinguish between a pattern for a pluggable matcher (i.e. you can't
make assumptions about what is and isn't a valid input) and the
keywords of the [try] itself.

Regards,
Twylite
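
Editorial sketch (not part of the original message): the first ambiguity can be made concrete by listing two readings of the words that follow "on error". The helper name readings and the dict keys are invented for this illustration; it is not a parser for any real [try].

```tcl
# Enumerate two readings of the words following "on <exception-type>"
# under the grammar
#   on exception-types ?vars? ?errorPattern? body ?errorPattern body ...?
proc readings {words} {
    set out {}
    # Reading A: vars and errorPattern are omitted, so word 0 is the body
    # and the remaining words belong to later clauses ("finally script").
    lappend out [dict create vars {} pattern {} \
        body [lindex $words 0] rest [lrange $words 1 end]]
    # Reading B: word 0 is the variable list, word 1 an errorPattern and
    # word 2 its body. Only possible when at least three words follow.
    if {[llength $words] >= 3} {
        lappend out [dict create vars [lindex $words 0] \
            pattern [lindex $words 1] body [lindex $words 2] \
            rest [lrange $words 3 end]]
    }
    return $out
}

# The words after "on error" in the example above.
set words [list { puts hello } finally { puts goodbye }]
foreach r [readings $words] {
    puts [list vars [string trim [dict get $r vars]] \
               pattern [dict get $r pattern] \
               body [string trim [dict get $r body]]]
}
```

Both readings consume exactly the same three words, so nothing in the argument list itself tells the command which one was intended.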
From: Neil M. <ne...@Cs...> - 2008-11-23 01:12:31

On 22 Nov 2008, at 23:42, Twylite wrote:

> Hi,
>> try ?-matchcommand cmd? script ?handlers ...? ?finally script?
>>
>> Where -matchcommand is the command to use to do errorCode matching
>> and defaults to {switch -glob --}. (May need a -- marker to eliminate
>> ambiguity). The syntax of the handlers part would be:
>>
>> on exception-types ?vars? ?errorPattern? body ?errorPattern
>> body ...?
> I'm very uncomfortable with this syntax. It is potentially ambiguous
> and feels very DWIMy (in a bad way).
>
>   try {
>       # do stuff
>   } on error {
>       puts hello
>   } finally {
>       puts goodbye
>   }
>
> ... is ambiguous. It could be interpreted the way you think, or with
> "puts hello" as the vars and "finally" as the error pattern.
>
> Consider also:
>   try { ... } on error {em opts} "POSIX *" { body } on break
>
> Is "on break" an errorPattern and body, or the start of a new
> exception handler?

I don't think it's possible to avoid ambiguity while still preserving
ease of use. The alternative is to make fewer things optional, which
just becomes a pain. My preference is to make "on" a keyword in this
context. It's highly unlikely that a script or a variable would just be
named "on", and if we document this as a keyword in this context then
there should be no problem.

-- Neil

This message has been checked for viruses but the contents of an
attachment may still contain software viruses, which could damage your
computer system: you are advised to perform your own checks. Email
communications with the University of Nottingham may be monitored as
permitted by UK legislation.
From: Twylite <tw...@cr...> - 2008-11-23 12:45:39

I think it's time for a summary of where we are on the
try/catch/finally.

The overall intent of the TIP can be summed up as "make a control
structure that makes dealing with exceptions, errors and resource
cleanup simpler - both logically and visually".

1. Functionality

We want to
(a) Handle return codes, so that we can build control structures and
handle exceptions that use return codes. In most cases an exact match
against a single integer (or magic name) is sufficient.
(b) Handle matching against -errorCode in the case of the return code
TCL_ERROR (1), so that we can have something similar to other languages
with typed exception handling. In most cases a prefix match, or glob
match, or element-wise glob match on a list is sufficient.
(aside) Any argument about the ugliness of handling return codes with
catch+if/then or catch+switch applies equally to handling errors &
-errorCode, and vice versa. As a result this TIP must provide for both
(a) and (b), although it is not necessarily a requirement that they are
provided for in the same command.
(c) Handle only those exceptions/errors that are of interest (can be
handled at this point) and let others propagate normally.
(d) Handle success continuation, i.e. branch when there is no
error/exception. This is not generally supported by procedural
languages but the requirement has been expressed by several developers
and TCT members.
(e) Handle cleanup at the end of a block of code by means of a "finally"
handler (regardless of errors/exceptions).
(f) Have reasonable performance, at least for the common cases.
(g) Discourage the use of the result for determining the nature of the
error (and in doing so encourage the use of -errorCode). At the very
least this means not having default support for matching on the result.
(h) For exceptions thrown from handlers and finally blocks, maintain the
details of the original exception (i.e. chain exceptions in the options
dict).

2. Look & Feel

(a) It's going to be called [try]
(b) Handlers are identified by keywords. The keyword "catch" has been
argued against (confusion with existing language feature/keyword), as
has "except" (ambiguous - "with exception" or "except for"). Likely
candidates are "on" and "handle".
(c) A [try] statement is going to look more like an "if {} then {} else
{}" than a "switch { case {body} case {body} }". The former seems to be
preferred by everyone involved in the discussions.
(d) Capture of variables (return code, result and options dict) needs to
happen at the front of the [try] for the statement as a whole, rather
than per handler. This avoids confusion over which vars will be defined
after the [try] returns, and also avoids variable churn if the
errorPattern to be matched can access the variables. Although some
amount of locality is lost, this also makes the syntax cleaner (less
repeated "noise").
(e) A [switch]-like "fall through to next statement" would be a
nice-to-have.

3. Matching

(a) In general the matching of exceptions (return code) and errors
(errorCode where the return code is TCL_ERROR) are separate concerns.
It makes sense to exploit this by using a fast/exact match against the
return code first (meeting the performance requirement) followed by a
slower match against the errorCode.
(b) When matching against errorCode:
  (i) There is (largely) consensus that basic pattern matching is "good
enough" "for now". Basic pattern matching may be defined as prefix
matching, glob matching against errorCode (as a string), or an
element-wise list-glob match against errorCode (as a list). In short
there is no agreement on the right way to do this.
  (ii) There is also no guarantee that a match against errorCode will be
adequate in the future. For example an OO-style error object may be
developed.
  (iii) An [expr]-type match is the most flexible but the lowest
performance (and potentially ugliest syntax); as such it is not suitable
(at least not as a default).
  (iv) Delegating to [switch] for matching is a nice compromise of
performance and flexibility (and reuses existing functionality), but
brings with it the baggage of the [switch] command's interface.
(c) The only thing we _can_ be sure of is that whatever we choose now
will be inadequate in some way, implying that the syntax of [try] must
contain provision for future extension.
(d) Taking (c) to its logical conclusion, [try] must be specified and
implemented to support user-selectable pattern matching. It is possible
to have the matcher selected for the [try] as a whole, or per handler,
and there are pros and cons to each approach.

In terms of (d) my personal preference is to specify the matcher per
handler. It is difficult to predict how different packages/libraries
may approach error handling, both now and in the future (e.g. a future
move from -errorCode to an OO-style error object). If the matcher is
selected for the [try] as a whole it may only be possible to support
disparate error handling styles by using the most flexible and complex
matcher (say [expr]-based), which could be an unnecessary complication.
The same holds now for integrating with legacy code that only produces
meaningful error information in the result.

4. General

These are weakly-expressed requirements or requirements of my own.

(a) There is a general desire for consistency / symmetry in the syntax.
This would obviously improve the readability & understandability of the
source code.
(b) The behaviour of the [try] should be predictable and conform to the
principle of least surprise. One particular consequence of this is that
matchers must consider handlers/errorPatterns in left-to-right order,
and all handlers should be executed in the same fashion (implying that
the [try] rather than the matcher should execute the handler body). On
the issue of ordering, left-to-right is the only order that makes sense
for [expr]-based matching, and is the norm in other languages.

5. Proposal

Based on this summary of the discussions so far, this is my current
proposal:

  try tryscript ?as {vars}? ?handler ...? ?finally finalscript?

where handler is

  on code ?-matcher pattern? handlerscript

and handlerscript may be "-" to fall through to the next handlerscript.

The tryscript is executed and the outcome (return code, result, options
dict) is captured into vars. A fast match (possibly a dict lookup) is
performed to find the handler(s) for that code. If there is an
unqualified handler (one with no matcher) or a single handler for the
code, then it is executed; otherwise each handler is considered in turn
(left-to-right order) by calling the associated matcher, and the first
matching handler is executed. If no matching handler is found then the
exception is propagated.

The implementation will probably provide the following matchers by
default (users can implement their own):
- -like for glob matching against errorCode as a string (perhaps -glob?)
- -llike for element-wise list-glob matching against errorCode as a list
- -expr for expr-based matching (with access to return code, result &
  options dict)

Example of use:

  try {
      # do stuff
  } as {code em opts} on ok {
      # do more
  } on break - on continue {
      # special handling for break & continue
  } on error -like "POSIX *" {
      # handle POSIX errors
  } on error -expr { $em in {BAD FOO BAR} } {
      # support legacy errors
  } finally {
      # cleanup
  }

Concerns:
- How to handle "all other errors" (-like * would work, is that good
  enough?)
- No handlerscript may begin with a "-".
- No feedback yet on "as {vars}" and the order of the vars
- If there are multiple handlers and one is unqualified, should it be
  executed first or last?

Alternatives:
- The body "-" is reserved to indicate fallthrough to the next body.
  The body "+" could be reserved to indicate that a matcher and pattern
  follow, e.g.

    on error + like "POSIX *" { ... }

I feel that this proposal meets the requirements with the greatest
flexibility and the least ambiguity. But of course that's my opinion.

Regards,
Twylite
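
Editorial sketch (not part of the original message): the two-stage selection described in the proposal (exact lookup on the return code, then matchers consulted left to right) might look roughly like this in plain Tcl. The proc names llike and selectHandler, the record layout, and the prefix semantics chosen for -llike are all assumptions of this sketch; -expr is left out, and an unqualified handler is simply taken in source order, which is one possible answer to the open concern above.

```tcl
# Element-wise glob match: each pattern must match the errorCode element
# in the same position (assumed prefix semantics for the proposed -llike).
proc llike {patterns errorCode} {
    if {[llength $errorCode] < [llength $patterns]} { return 0 }
    set i 0
    foreach p $patterns {
        if {![string match $p [lindex $errorCode $i]]} { return 0 }
        incr i
    }
    return 1
}

# handlers: list of {code matcher pattern body} records in source order.
# Returns the body of the selected handler, or "" if none applies.
proc selectHandler {handlers rc errorCode} {
    # Stage 1: group by return code; the dict lookup is an exact match.
    set byCode [dict create]
    foreach h $handlers { dict lappend byCode [lindex $h 0] $h }
    if {![dict exists $byCode $rc]} { return "" }

    # Stage 2: consult matchers left to right; the first hit wins.
    foreach h [dict get $byCode $rc] {
        lassign $h code matcher pattern body
        switch -- $matcher {
            {}      { return $body }
            -like   { if {[string match $pattern $errorCode]} { return $body } }
            -llike  { if {[llike $pattern $errorCode]} { return $body } }
            default { error "unknown matcher \"$matcher\"" }
        }
    }
    return ""
}

# Bodies are plain labels here so the selection is easy to see.
set handlers {
    {1 -like  "POSIX *"  posix-handler}
    {1 -llike {HTTP 4??} http-client-handler}
    {1 {}     {}         any-error-handler}
    {3 {}     {}         break-handler}
}
puts [selectHandler $handlers 1 {POSIX ENOENT {no such file or directory}}]
puts [selectHandler $handlers 1 {HTTP 404 /missing.html}]
puts [selectHandler $handlers 1 {HTTP 302 /redirected.html}]
```

In a real implementation the grouping by return code would be done once when the [try] is set up rather than on every dispatch; it is rebuilt here only to keep the sketch self-contained.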
From: Neil M. <ne...@Cs...> - 2008-11-23 15:43:07

On 23 Nov 2008, at 12:45, Twylite wrote:

> I think it's time for a summary of where we are on the
> try/catch/finally.
>
> The overall intent of the TIP can be summed up as "make a control
> structure that makes dealing with exceptions, errors and resource
> cleanup simpler - both logically and visually".
>
> 1. Functionality
>
> We want to
> (a) Handle return codes, so that we can build control structures and
> handle exceptions that use return codes. In most cases an exact match
> against a single integer (or magic name) is sufficient.
> (b) Handle matching against -errorCode in the case of the return code
> TCL_ERROR (1), so that we can have something similar to other
> languages with typed exception handling. In most cases a prefix match,
> or glob match, or element-wise glob match on a list is sufficient.
> (aside) Any argument about the ugliness of handling return codes with
> catch+if/then or catch+switch applies equally to handling errors &
> -errorCode, and vice versa. As a result this TIP must provide for both
> (a) and (b), although it is not necessarily a requirement that they
> are provided for in the same command.
> (c) Handle only those exceptions/errors that are of interest (can be
> handled at this point) and let others propagate normally.
> (d) Handle success continuation, i.e. branch when there is no
> error/exception. This is not generally supported by procedural
> languages but the requirement has been expressed by several developers
> and TCT members.
> (e) Handle cleanup at the end of a block of code by means of a
> "finally" handler (regardless of errors/exceptions).
> (f) Have reasonable performance, at least for the common cases.
> (g) Discourage the use of the result for determining the nature of the
> error (and in doing so encourage the use of -errorCode). At the very
> least this means not having default support for matching on the
> result.
> (h) For exceptions thrown from handlers and finally blocks, maintain
> the details of the original exception (i.e. chain exceptions in the
> options dict).

Agree with all of these.

> 2. Look & Feel
>
> (a) It's going to be called [try]
> (b) Handlers are identified by keywords. The keyword "catch" has been
> argued against (confusion with existing language feature/keyword), as
> has "except" (ambiguous - "with exception" or "except for"). Likely
> candidates are "on" and "handle".
> (c) A [try] statement is going to look more like an "if {} then {}
> else {}" than a "switch { case {body} case {body} }". The former seems
> to be preferred by everyone involved in the discussions.

Agreed.

> (d) Capture of variables (return code, result and options dict) needs
> to happen at the front of the [try] for the statement as a whole,
> rather than per handler. This avoids confusion over which vars will be
> defined after the [try] returns, and also avoids variable churn if the
> errorPattern to be matched can access the variables. Although some
> amount of locality is lost, this also makes the syntax cleaner (less
> repeated "noise").

Don't entirely agree with this. I don't believe we need to care about
inconsistent sets of vars being defined after the try -- it's not a
problem for [if], [switch], and every other control structure, so I
don't believe we need to give it special consideration here. Agree
though that it is generally more useful for whatever pattern matching
mechanism is used to be called with a set of pattern/script pairs and
the variables already set up in the caller's scope. Whether that means
binding the vars for the entire try statement or once per exception
code is a matter of choice. Either seems acceptable.

> (e) A [switch]-like "fall through to next statement" would be a
> nice-to-have.

Clarifying this -- we want the ability to specify the same script for
multiple patterns (and possibly multiple exception codes). The switch
approach is one way.

> 3. Matching
>
> (a) In general the matching of exceptions (return code) and errors
> (errorCode where the return code is TCL_ERROR) are separate concerns.
> It makes sense to exploit this by using a fast/exact match against the
> return code first (meeting the performance requirement) followed by a
> slower match against the errorCode.
> (b) When matching against errorCode:
>   (i) There is (largely) consensus that basic pattern matching is
> "good enough" "for now". Basic pattern matching may be defined as
> prefix matching, glob matching against errorCode (as a string), or an
> element-wise list-glob match against errorCode (as a list). In short
> there is no agreement on the right way to do this.

If adopting some novel pattern mechanism, then there is the further
question of whether to special case that in [try] or to extract it out
into a separate command (and separate TIP).

>   (ii) There is also no guarantee that a match against errorCode will
> be adequate in the future. For example an OO-style error object may be
> developed.
>   (iii) An [expr]-type match is the most flexible but the lowest
> performance (and potentially ugliest syntax); as such it is not
> suitable (at least not as a default).
>   (iv) Delegating to [switch] for matching is a nice compromise of
> performance and flexibility (and reuses existing functionality), but
> brings with it the baggage of the [switch] command's interface.

This depends how it is done, and how it is documented. You can delegate
to [switch] either implicitly or explicitly (as a -matchcommand) and
still avoid acquiring [switch]'s interface. You just document that
pattern-matching is handled by [switch] and that as far as [try] is
concerned the patterns are just opaque data that it passes on.
Introducing an explicit option for this enhances this rationale, as
then the pattern matcher is just another callback. What we definitely
don't want to do is introduce [switch]'s various options, like -nocase,
-regexp etc. as options of [try]. That would constrain the
implementation and be a mess. A callback solution avoids this as the
options can be specified as part of the callback command, rather than
as part of the try command.

> (c) The only thing we _can_ be sure of is that whatever we choose now
> will be inadequate in some way, implying that the syntax of [try] must
> contain provision for future extension.
> (d) Taking (c) to its logical conclusion, [try] must be specified and
> implemented to support user-selectable pattern matching. It is
> possible to have the matcher selected for the [try] as a whole, or per
> handler, and there are pros and cons to each approach.
>
> In terms of (d) my personal preference is to specify the matcher per
> handler. It is difficult to predict how different packages/libraries
> may approach error handling, both now and in the future (e.g. a future
> move from -errorCode to an OO-style error object). If the matcher is
> selected for the [try] as a whole it may only be possible to support
> disparate error handling styles by using the most flexible and complex
> matcher (say [expr]-based), which could be an unnecessary
> complication. The same holds now for integrating with legacy code that
> only produces meaningful error information in the result.

I'd be interested to see the interface proposed for this. Clearly the
most flexible approach is to allow an arbitrary script to do the
matching, but then we end up right back at the beginning of this
discussion where [try] just does exception-code dispatch and leaves
everything else up to a script. I believe we've ruled that option out,
as it violates requirements 1.b and 1.c.

> 4. General
>
> These are weakly-expressed requirements or requirements of my own.
>
> (a) There is a general desire for consistency / symmetry in the
> syntax. This would obviously improve the readability &
> understandability of the source code.
> (b) The behaviour of the [try] should be predictable and conform to
> the principle of least surprise. One particular consequence of this is
> that matchers must consider handlers/errorPatterns in left-to-right
> order, and all handlers should be executed in the same fashion
> (implying that the [try] rather than the matcher should execute the
> handler body). On the issue of ordering, left-to-right is the only
> order that makes sense for [expr]-based matching, and is the norm in
> other languages.

I don't believe [try] has to execute the bodies. All it has to do is
ensure that any option/result variables are defined in the calling
scope when that script runs. For example:

  upvar 1 $msgVar msg $optsVar opts
  set rc [catch { $script } msg opts]
  invoke 1 $matchcmd [dict get $opts -errorcode] [dict get $handlers $rc]

should do the right thing for most matching constructs. I also believe
the order in which to consider patterns should be left to the match
command.

[... snip proposal: I'll post a separate message for that ...]

-- Neil
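
Editorial sketch (not part of the original message): one runnable reading of the three-line outline above, with [uplevel] standing in for the unspecified "invoke" and {switch -glob --} as the match command. The proc name trySketch, its argument order and the shape of the handlers dict (return code mapped to a pattern/script list) are assumptions of this sketch, not a proposed interface.

```tcl
# script   - code to run in the caller's scope
# msgVar   - caller variable receiving the result / error message
# optsVar  - caller variable receiving the return options dict
# handlers - dict: return code -> pattern/script list for the matcher
# matchcmd - command prefix called as: matchcmd errorCode pattern/script-list
proc trySketch {script msgVar optsVar handlers {matchcmd {switch -glob --}}} {
    upvar 1 $msgVar msg $optsVar opts
    set rc [catch {uplevel 1 $script} msg opts]
    if {![dict exists $handlers $rc]} {
        # No handlers for this return code: pass the outcome on unchanged.
        set o $opts
        dict incr o -level
        return -options $o $msg
    }
    set ec ""
    if {[dict exists $opts -errorcode]} {
        set ec [dict get $opts -errorcode]
    }
    # The match command both selects and runs the body, in the caller's
    # scope. With [switch], an errorCode matching no pattern is silently
    # swallowed unless the list contains a "default" entry.
    uplevel 1 [list {*}$matchcmd $ec [dict get $handlers $rc]]
}

set handlers [dict create 1 {
    "POSIX *" { set outcome "posix: $em" }
    default   { set outcome "other: $em" }
}]
trySketch { open /no/such/dir/no-such-file r } em opts $handlers
puts $outcome
```

Here [try] never looks inside the patterns; they are opaque data handed to the match command, which is the property argued for above. The sketch deliberately omits the finally clause and error chaining.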
From: Twylite <tw...@cr...> - 2008-11-23 17:06:09

Hi,

> Don't entirely agree with this. I don't believe we need to care about
> inconsistent sets of vars being defined after the try -- it's not a
> problem for [if], [switch], and every other control structure, so I
> don't believe we need to give it special consideration here. Agree
> though that it is generally more useful for whatever pattern matching
> mechanism is used to be called with a set of pattern/script pairs and
> the variables already set up in the caller's scope. Whether that means
> binding the vars for the entire try statement or once per exception
> code is a matter of choice. Either seems acceptable.

The difference being that in the case of [if] or [switch] the executed
body defines the vars; in this case the [try] itself defines the vars.
Semantics.

The stronger argument is the performance impact of bringing per-handler
vars into scope and back out of scope each time.

>> (e) A [switch]-like "fall through to next statement" would be a
>> nice-to-have.
> Clarifying this -- we want the ability to specify the same script for
> multiple patterns (and possibly multiple exception codes). The switch
> approach is one way.

Cool.

>> 3. Matching
>> (b) When matching against errorCode:
>>   (i) There is (largely) consensus that basic pattern matching is
>> "good enough" "for now". Basic pattern matching may be defined as
>> prefix matching, glob matching against errorCode (as a string), or an
>> element-wise list-glob match against errorCode (as a list). In short
>> there is no agreement on the right way to do this.
> If adopting some novel pattern mechanism, then there is the further
> question of whether to special case that in [try] or to extract it out
> into a separate command (and separate TIP).

So that we have [catch], [try], [try2], ... as we discover new and
different needs for exception handling? No thanks. We should either
get [try] sufficiently right now (which is closer to 99% than 80%) or
make it extensible. Preferably the latter since we don't know what is
99% right.

>>   (iv) Delegating to [switch] for matching is a nice compromise of
>> performance and flexibility (and reuses existing functionality), but
>> brings with it the baggage of the [switch] command's interface.
> This depends how it is done, and how it is documented. You can
> delegate to [switch] either implicitly or explicitly (as a
> -matchcommand) and still avoid acquiring [switch]'s interface. You
> just document that pattern-matching is handled by [switch] and that as
> far as [try] is concerned the patterns are just opaque data that it
> passes on. Introducing an explicit option for this enhances this
> rationale, as then the pattern matcher is just another callback. What
> we definitely don't want to do is introduce [switch]'s various
> options, like -nocase, -regexp etc. as options of [try]. That would
> constrain the implementation and be a mess. A callback solution avoids
> this as the options can be specified as part of the callback command,
> rather than as part of the try command.

I was meaning a [try] that uses [switch] implicitly. You would need
something in the interface of [try] that would configure the [switch],
otherwise you are limited to some predetermined configuration (like
-glob --). Agreed that a callback gets around this, and discussed in
(c) and (d) below.

>> (c) The only thing we _can_ be sure of is that whatever we choose now
>> (d) Taking (c) to its logical conclusion, [try] must be specified and
>> implemented to support user-selectable pattern matching. It is
>> possible to have the matcher selected for the [try] as a whole, or
>> per handler, and there are pros and cons to each approach.
>>
>> In terms of (d) my personal preference is to specify the matcher per
>> handler. It is difficult to predict how different packages/libraries
>> may approach error handling, both now and in the future (e.g. a future
>> move from -errorCode to an OO-style error object). If the matcher is
> I'd be interested to see the interface proposed for this. Clearly the
> most flexible approach is to allow an arbitrary script to do the
> matching, but then we end up right back at the beginning of this
> discussion where [try] just does exception-code dispatch and leaves
> everything else up to a script. I believe we've ruled that option out,
> as it violates requirements 1.b and 1.c.

I believe the interface I proposed does not violate (1.b), and provides
an acceptable compromise on (1.c).

>> (b) The behaviour of the [try] should be predictable and conform to
>> the principle of least surprise. One particular consequence of this
>> is that matchers must consider handlers/errorPatterns in left-to-right
>> order, and all handlers should be executed in the same fashion
>> (implying that the [try] rather than the matcher should execute the
>> handler body). On the issue of ordering, left-to-right is the only
>> order that makes sense for [expr]-based matching, and is the norm in
>> other languages.
> I don't believe [try] has to execute the bodies. All it has to do is
> ensure that any option/result variables are defined in the calling
> scope when that script runs. For example:

There are a bunch of other things [try] has to do, including catching
errors off the handlerscript (and match command, for that matter) in
order to chain the errors, execute the finally script, etc. Having the
match command execute the body means that it's not just a match command
but a fully fledged control structure, it must behave in a way that is
predictable to the [try] command (i.e. [try] needs to make certain
assumptions about what it will do), and the [try] cannot distinguish
between a failure in the match command and a failure in the
handlerscript.

It also has the potential to make the errorInfo very ugly -- you will
see an exception in a handler in a matchcommand in a try. If you try to
use a [return -level] to avoid this you will end up with unsafe nesting
and/or making assumptions about the internals of [try].

In your proposal you also talk about the match command adding a default
-- this would not work if [try] is expected to chain errors, as [try]
would catch the default (assumedly rethrown) error and chain it to
itself (i.e. the error that [try] knows about).

Any way I look at it, having the match command execute the body joins
together separate concerns (matching, and execution), and there are
only two arguments for this:

(1) Performance.

The largest number of exception handlers I've ever seen attached to a
single try is 5 or 6. Is there ever going to be a large enough number
that the performance difference will be significant?

(2) Specifically allowing the order of matching to be determined by the
match command.

> I also believe the order in which to consider patterns should be left
> to the match command.

I think non-determinism in the syntax of a language is a very bad
thing. Notice that even [switch] is documented as: "The switch command
matches its string argument against each of the pattern arguments in
order", so the behaviour is deterministic and unsurprising from a user
perspective, and a linear trawl would be no slower than a matcher that
uses [switch].

In order to ensure the performance of the "common case" the most common
matcher (probably "-like") could be hard-coded into the [try]
implementation.

Regards,
Twylite
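
Editorial sketch (not part of the original message): the chaining argument above assumes that [try] itself runs the handler body, so that a failure inside the handler can be tagged with the details of the original exception. A minimal illustration with [catch] follows; the proc name runHandler and the options key -during are choices made for this sketch and are not specified anywhere in the thread.

```tcl
# Run a handler body in the caller's scope. If the handler itself raises
# an error, attach the options dict of the original exception to the new
# error's options under the (hypothetical) key -during.
proc runHandler {body origOpts} {
    set rc [catch {uplevel 1 $body} msg opts]
    if {$rc == 1} {
        dict set opts -during $origOpts
    }
    # Pass the handler's own outcome (ok, error, break, ...) to the caller.
    dict incr opts -level
    return -options $opts $msg
}

# Original failure, as the tryscript might have produced it.
catch {error "disk full" "" {POSIX ENOSPC}} em1 opts1

# The handler fails too; the new error carries the original in -during.
set rc [catch {runHandler {error "logger down" "" {APP LOG}} $opts1} em2 opts2]
puts "rc=$rc message=$em2"
puts "new errorCode:      [dict get $opts2 -errorcode]"
puts "original errorCode: [dict get [dict get $opts2 -during] -errorcode]"
```

If the match command ran the body instead, [try] would only see one combined outcome and could not tell whether the matcher or the handler had failed, which is the distinction drawn above.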
From: Neil M. <ne...@Cs...> - 2008-11-23 18:04:19

On 23 Nov 2008, at 17:06, Twylite wrote:

> Hi,
>> Don't entirely agree with this. I don't believe we need to care
>> about inconsistent sets of vars being defined after the try --
>> it's not a problem for [if], [switch], and every other control
>> structure, so I don't believe we need to give it special
>> consideration here. Agree though that it is generally more useful
>> for whatever pattern matching mechanism is used to be called with
>> a set of pattern/script pairs and the variables already set up in
>> the caller's scope. Whether that means binding the vars for the
>> entire try statement or once per exception code is a matter of
>> choice. Either seems acceptable.
> The difference being that in the case of [if] or [switch] the
> executed body defines the vars; in this case the [try] itself
> defines the vars. Semantics.

I don't think that actually matters. In what circumstance do you
envision this causing a real problem?

> The stronger argument is the performance impact of bringing
> per-handler vars into scope and back out of scope each time.

I don't see this point, could you elaborate? The vars only need to be
defined once.

>>> (e) A [switch]-like "fall through to next statement" would be a
>>> nice-to-have.
>> Clarifying this -- we want the ability to specify the same script
>> for multiple patterns (and possibly multiple exception codes). The
>> switch approach is one way.
> Cool.
>>> 3. Matching
>>> (b) When matching against errorCode:
>>>   (i) There is (largely) consensus that basic pattern matching is
>>> "good enough" "for now". Basic pattern matching may be defined as
>>> prefix matching, glob matching against errorCode (as a string), or
>>> an element-wise list-glob match against errorCode (as a list). In
>>> short there is no agreement on the right way to do this.
>> If adopting some novel pattern mechanism, then there is the
>> further question of whether to special case that in [try] or to
>> extract it out into a separate command (and separate TIP).
> So that we have [catch], [try], [try2], ... as we discover new and
> different needs for exception handling? No thanks. We should
> either get [try] sufficiently right now (which is closer to 99%
> than 80%) or make it extensible. Preferably the latter since we
> don't know what is 99% right.

No -- I mean you would have [try] and some [lmatch] command.

>>> [...]
>> I don't believe [try] has to execute the bodies. All it has to do
>> is ensure that any option/result variables are defined in the
>> calling scope when that script runs. For example:
> There are a bunch of other things [try] has to do, including
> catching errors off the handlerscript (and match command, for that
> matter) in order to chain the errors, execute the finally script,
> etc. Having the match command execute the body means that it's not
> just a match command but a fully fledged control structure, it must
> behave in a way that is predictable to the [try] command (i.e.
> [try] needs to make certain assumptions about what it will do), and
> the [try] cannot distinguish between a failure in the match command
> and a failure in the handlerscript.

I don't see why [try] has to know anything at all about it. It is just
passed a callback that takes the errorcode and a list of
pattern->script pairs, and simply calls it, returning whatever it
returns (including exceptions). All it needs to do is ensure any
"finally" script runs.

> It also has the potential to make the errorInfo very ugly -- you
> will see an exception in a handler in a matchcommand in a try. If
> you try to use a [return -level] to avoid this you will end up with
> unsafe nesting and/or making assumptions about the internals of [try].

I don't see this as a problem. If [try] is documented as delegating to
a match command then it makes sense for that command to appear in the
stack trace. [try] can always pretty up the errorinfo if it helps.

> In your proposal you also talk about the match command adding a
> default -- this would not work if [try] is expected to chain
> errors, as [try] would catch the default (assumedly rethrown) error
> and chain it to itself (i.e. the error that [try] knows about).

A simple equality check would avoid this (a Tcl_Obj pointer
comparison). Alternatively, the default script can be manufactured to
signal this special condition. It's a problem of implementation not
interface.

> Any way I look at it, having the match command execute the body
> joins together separate concerns (matching, and execution), and
> there are only two arguments for this:

Yes, you can separate these concerns, of course. But neither of them
needs to be handled by [try].

> (1) Performance.
>
> The largest number of exception handlers I've ever seen attached to
> a single try is 5 or 6. Is there ever going to be a large enough
> number that the performance difference will be significant?

Possibly, in generated code. E.g. there are quite a large number of
possible HTTP return codes. If these got put into an errorCode {HTTP
302 /redirected.html} then I can quite imagine HTTP client libraries
wanting large try statements and wanting fast lookup.

> (2) Specifically allowing the order of matching to be determined by
> the match command.
>> I also believe the order in which to consider patterns should be
>> left to the match command.
> I think non-determinism in the syntax of a language is a very bad
> thing. Notice that even [switch] is documented as: "The switch
> command matches its string argument against each of the pattern
> arguments in order", so the behaviour is deterministic and
> unsurprising from a user perspective, and a linear trawl would be
> no slower than a matcher that uses [switch].
In order to ensure the > performance of the "common case" the most common matcher (probably > "-like") could be hard-coded into the [try] implementation. This is the point -- the behaviour isn't non-deterministic as it is explicit what command is being used for matching, and the docs for that command specify the ordering used. Non-determinism doesn't require that [try] specify every last detail of execution -- it can happily delegate those responsibilities. -- Neil This message has been checked for viruses but the contents of an attachment may still contain software viruses, which could damage your computer system: you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation. |
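The HTTP case raised above can be sketched as a hash lookup on one element of errorCode. This is only an illustration under stated assumptions: the errorCode layout {HTTP status location} follows the example in the message, while `fetch`, `handlers` and `dispatch` are hypothetical names, and `fetch` is a stand-in that always reports a redirect.

```tcl
# Hypothetical stand-in for an HTTP client call: always raises an error
# whose errorCode has the shape {HTTP status location}.
proc fetch {url} {
    return -code error -errorcode [list HTTP 302 /redirected.html] "redirected"
}

# Hypothetical handler table keyed on the status element of errorCode.
set handlers [dict create \
    301 {set outcome "moved permanently to $location"} \
    302 {set outcome "found at $location"} \
    404 {set outcome "not found"}]

proc dispatch {url handlers} {
    if {[catch {fetch $url} msg opts]} {
        lassign [dict get $opts -errorcode] class status location
        # A dict lookup is a hash lookup, so its cost does not grow with
        # the number of handlers, unlike a sequential scan of patterns.
        if {$class eq "HTTP" && [dict exists $handlers $status]} {
            eval [dict get $handlers $status]
            return $outcome
        }
        # No handler for this error: rethrow it with its original options.
        return -options $opts $msg
    }
    return $msg
}

puts [dispatch http://example.com/ $handlers]
```

The trade-off is the one argued over in the thread: the lookup is fast, but it only supports exact matching on the chosen element.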
From: Twylite <tw...@cr...> - 2008-11-23 20:36:56
|
Hi,

> I don't think that actually matters. In what circumstance do you
> envision this causing a real problem?

I don't -- it's an understandability & readability thing. Everyone's going to have their own take.

>> The stronger argument is the performance impact of bringing
>> per-handler vars into scope and back out of scope each time.
>
> I don't see this point, could you elaborate? The vars only need to be
> defined once.

If the vars are defined per handler, and can be different per handler, then multiple different variables must be brought into scope (and possibly out again if you don't want them hanging around if the handler didn't match).

I think the confusion here is what constitutes a "handler" - according to all my proposals a "handler" is (return code + optional more specific pattern), but your recent proposal is (return code + pattern1 + pattern2 + ...). Clearly there would be issues of performance and dangling variables in the case I am understanding, but not in the case of your proposal.

> No -- I mean you would have [try] and some [lmatch] command.

Sorry - misunderstanding.

> I don't see why [try] has to know anything at all about it. It is
> just passed a callback that takes the errorcode and a list of
> pattern->script pairs, and simply calls it, returning whatever it returns
> (including exceptions). All it needs to do is ensure any "finally"
> script runs.

Because [try] is a (core) command that is promising a particular interface & behaviour, but the implementation cannot guarantee the behaviour as it delegates too much to a matcher that _may_ be implemented outside the core.

And because, as you highlight below, you have to identify and compensate for the corner cases in the matcher.

> I don't see this as a problem. If [try] is documented as delegating
> to a match command then it makes sense for that command to appear in
> the stack trace. [try] can always pretty up the errorinfo if it helps.
>
> A simple equality check would avoid this (a Tcl_Obj pointer
> comparison). Alternatively, the default script can be manufactured to
> signal this special condition. It's a problem of implementation not
> interface.

All I'm saying is that rather than have the matcher actually execute the script, it should return it (or the index of the script in whatever list/dict was provided to the matcher) and allow the [try] to execute the script directly.

>> (1) Performance.
>>
>> The largest number of exception handlers I've ever seen attached to
>> a single try is 5 or 6. Is there ever going to be a large enough
>> number that the performance difference will be significant?
>
> Possibly, in generated code. E.g. there are quite a large number of
> possible HTTP return codes. If these got put into an errorCode {HTTP
> 302 /redirected.html} then I can quite imagine HTTP client libraries
> wanting large try statements and wanting fast lookup.

Fair case. But how will they do it now? I would imagine most developers would happily use a [switch], not realising that they are not getting O(1) performance out of it.

>> I think non-determinism in the syntax of a language is a very bad
>> thing. Notice that even [switch] is documented as: "The switch
>> command matches its string argument against each of the pattern
>> arguments in order", so the behaviour is deterministic and
>
> This is the point -- the behaviour isn't non-deterministic as it is
> explicit what command is being used for matching, and the docs for
> that command specify the ordering used. Non-determinism doesn't
> require that [try] specify every last detail of execution -- it can
> happily delegate those responsibilities.

The point is that irrespective of whether you are using -glob, -regexp or -exact, you as a developer can scan the [switch] cases in left-to-right order and know that the first match will be the one that will be used. If [try] delegates its ordering then you cannot do this. You need to know the behaviour of "try -command mymatcher".

Given "try -command oo_matcher { .... } on error SomeException { ... } on error OtherException { ... }" it is reasonable to assume that you're matching on the class of the exception object, but if OtherException is a child of SomeException, which one will match? Language syntax should enable you to determine that. Leaving it to a pluggable handler means that a novice developer or maintenance coder needs to understand every nuance of [try] and every matcher you use to understand the behaviour of a rather elementary control structure.

Regards,
Twylite |
From: Neil M. <ne...@Cs...> - 2008-11-23 21:35:18
|
On 23 Nov 2008, at 20:36, Twylite wrote: > [...] >>> The stronger argument is the performance impact of bringing per- >>> handler vars into scope and back out of scope each time. >>> >> I don't see this point, could you elaborate? The vars only need to be >> defined once. >> > If the vars are defined per handler, and can be different per handler, > then multiple different variables must be brought into scope (and > possibly out again if you don't want them handing around if the > handler > didn't match). > I think the confusion here is what constitutes a "handler" - according > to all my proposals a "handler" is (return code + optional more > specific > pattern), but your recent proposal is (return code + pattern1 + > pattern2 > + ...). Clearly there would issues of performance and dangling > variables in the case I am understanding, but not in the case of your > proposal. In my scheme only a single handler ever gets as far as defining its variables. So there is no need to bring multiple sets of vars into and out of scope. >> No -- I mean you would have [try] and some [lmatch] command. >> > Sorry - misunderstanding. > >> I don't see why [try] has to know anything at all about it. It is >> just passed a callback that takes the errorcode and a list of >> pattern- >>> script pairs, and simply calls it, returning whatever it returns >> (including exceptions). All it needs to do is ensure any "finally" >> script runs. >> > Because [try] is a (core) command that is promising a particular > interface & behaviour, but the implementation cannot guarantee the > behaviour as it delegates too much to a matcher that _may_ be > implemented outside the core. Then don't guarantee that behaviour. > And because, as you highlight below, you have to identify and > compensate > for the corner cases in the matcher. >> I don't see this as a problem. If [try] is documented as delegating >> to a match command then it makes sense for that command to appear in >> the stack trace. 
[try] can always pretty up the errorinfo if it >> helps. >> >> A simple equality check would avoid this (a Tcl_Obj pointer >> comparison). Alternatively, the default script can be manufactured to >> signal this special condition. It's a problem of implementation not >> interface. >> > All I'm saying is that rather than have the matcher actually > execute the > script, it should return it (or the index of the script in whatever > list/dict was provided to the matcher) and allow the [try] to execute > the script directly. Sure, you *could* do that, but that excludes using [switch] or most other control structures, which expect to directly execute the chosen branch rather than just returning it. I really don't see what is gained from having [try] execute the script: it's the difference between doing [catch {$matchcmd ...}] vs set script [$matchcmd ...]; catch {uplevel 1 $script}. >>> (1) Performance. >>> >>> The largest number of exception handlers I've ever seen attached to >>> a single try is 5 or 6. It there ever going to be a large enough >>> number that the performance difference will be significant? >>> >> Possibly, in generated code. E.g. there are quite a large number of >> possible HTTP return codes. If these got put into an errorCode {HTTP >> 302 /redirected.html} then I can quite imagine HTTP client libraries >> wanting large try statements and wanting fast lookup. >> > Fair case. > But how will they do it now? I would imagine most developers would > happily use a [switch], not realising that they are not getting O(1) > performance out of it. switch -exact is O(1), or should be. >>> I think non-determinism in the syntax of a language is a very bad >>> thing. 
Notice that even [switch] is documented as: "The switch >>> command matches its string argument against each of the pattern >>> arguments in order", so the behaviour is deterministic and >> This is the point -- the behaviour isn't non-deterministic as it is >> explicit what command is being used for matching, and the docs for >> that command specify the ordering used. Non-determinism doesn't >> require that [try] specify every last detail of execution -- it can >> happily delegate those responsibilities. >> > The point is that irrespective of whether you are using -glob, - > regex or > -exact, you as a developer can scan the [switch] cases in left-to- > right > order and know that the first match will be the one that will be used. > If [try] delegates its ordering then you cannot do this. You need to > know the behaviour of "try -command mymatcher". What's wrong with that? If I know it's try -matchcommand {switch - glob --} then I know to expect left-to-right behaviour. If I know it's something based on a hash lookup, then I know to expect only exact matching. > > Given "try -command oo_matcher { .... } on error SomeException { ... } > on error OtherException { ... }" it is reasonable to assume that > you're > matching on the class of the exception object, but if > OtherException is > a child of SomeException, which one will match? That's up to oo_matcher. That's a good thing. > Language syntax should > enable you to determine that. Leaving it to a pluggable handler means > that a novice developer or maintenance coder needs to understand every > nuance of [try] and every matcher you use to understand the > behaviour of > a rather elementary control structure. That argument applies to any command that takes a callback. You could equally say that no-one can know the behaviour of [lsort -command] without knowing every possible comparison function. 
Of course, it's not a problem because the language syntax *does* make it obvious which command is being used: You look at the -command/-matchcommand option and examine the docs of the corresponding command. -- Neil |
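The two delegation styles debated in this exchange can be put side by side. The following is a minimal sketch, not a proposed API: `run_matcher_executes`, `select_branch` and `run_matcher_returns` are hypothetical names, and the branch list format (pattern/script pairs with an optional `default`) is an assumption borrowed from [switch].

```tcl
# Style 1: the matcher runs the chosen branch itself, so [switch] can be
# used directly as the matcher.
proc run_matcher_executes {errorcode branches} {
    # uplevel so that the branch body runs in the caller's scope
    catch {uplevel 1 [list switch -glob -- $errorcode $branches]} result
    return $result
}

# Style 2: the matcher only selects a script, scanning the patterns in
# order; the caller then executes whatever was selected.
proc select_branch {errorcode branches} {
    foreach {pattern script} $branches {
        if {$pattern eq "default" || [string match $pattern $errorcode]} {
            return $script
        }
    }
    return {}
}
proc run_matcher_returns {errorcode branches} {
    set script [select_branch $errorcode $branches]
    catch {uplevel 1 $script} result
    return $result
}

set branches {
    {POSIX ENOENT *} {set outcome "no such file"}
    {POSIX *}        {set outcome "other POSIX error"}
    default          {set outcome "unhandled"}
}
set ec {POSIX ENOENT {no such file or directory}}
puts [run_matcher_executes $ec $branches]
puts [run_matcher_returns $ec $branches]
```

Both calls pick the first branch here. The difference is in who owns execution: in the second style the caller can tell a failure in the matcher apart from a failure in the selected script, at the cost of not being able to reuse [switch] as is.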
From: Donal K. F. <don...@ma...> - 2008-11-23 16:37:15
|
Twylite wrote: > I think it's time for a summary of where we are on the try/catch/finally. [...] > (f) Have reasonable performance, at least for the common cases. Some notes on this: I plan to make [try] bytecode compiled as it has script bodies. But it is highly likely that I won't have time to do this before 8.6b1! To facilitate this, only requiring equality- or glob-matching (and having it be compile-time decidable) will be good. Please do not require lots of nested parsing of lists and things like that either (especially lists of scripts) since working with those things in the compiler is really unpleasant. For the b1 release, a pure Tcl scripted version will be good enough. > (c) A [try] statement is going to look more like an "if {} then {} else > {}" than a "switch { case {body} case {body} }". The former seems to be > preferred by everyone involved in the discussions. In particular, it's much easier to compile. > (a) In general the matching of exceptions (return code) and errors > (errorCode where the return code is TCL_ERROR) are separate concerns. > It makes sense to exploit this by using a fast/exact match against the > return code first (meeting the performance requirement) followed by a > slower match against the errorCode. We'll probably build a jump table. > (b) When matching against errorCode: > (i) There is (largely) consensus that basic pattern matching is "good > enough" "for now". Basic pattern matching may be defined as prefix > matching, glob matching against errorCode (as a string), or an > element-wise list-glob match against errorCode (as a list). In short > there is no agreement on the right way to do this. I think we'll find that glob-matching is good enough. Almost everyone will use it for prefix matching; Tcl's glob-matcher is good at that. > (ii) There is also no guarantee that a match against errorCode will be > adequate in the future. For example an OO-style error object may be > developed. Out Of Scope! 
If someone needs something that complicated, they'll have to write their own code. > (iii) An [expr]-type match is the most flexible but the lowest > performance (and potentially ugliest syntax); as such it is not suitable > (at least not as a default). Need I repeat myself? OOS! > (iv) Delegating to [switch] for matching is a nice compromise of > performance and flexibility (and reuses existing functionality), but > brings with it the baggage of the [switch] command's interface. If we require [switch] for the normal case, we might as well not use [try] at all. The purpose of [try] is to reduce baggage for important classes of error handling. > (c) The only thing we _can_ be sure of is that whatever we choose now > will be inadequate in some what, implying that the syntax of [try] must > contain provision for future extension. > (d) Taking (c) to its logical conclusion, [try] must be specified and > implemented to support user-selectable pattern matching. It is possible > to have the matcher selected for the [try] as a whole, or per handler, > and there are pros and cons to each approach. > > In terms of (d) my personal preference is to specify the matcher per > handler. It is difficult to predict how different packages/libraries > may approach error handling, both now and in the future (e.g. a future > move from -errorCode to an OO-style error object). If the matcher is > selected for the [try] as a whole it may only be possible to support > disparate error handling styles by using the most flexible and complex > matcher (say [expr]-based), which could be an unnecessary complication. > The same holds now for integrating with legacy code that only produces > meaningful error information in the result. I write quite a lot of Java code, and I don't think there's anything really worth it to be gained from OO exceptions. A list that folks can match against is good enough, and they can add their own complexity if they really want. 
We're building a bikeshed, not an aircraft carrier!

> On
> the issue of ordering, left-to-right is the only order that makes sense
> for [expr]-based matching, and is the norm in other languages.

Not all. C is unspecified. But Tcl is very much left-to-right.

> The implementation will probably provide the following handlers by
> default (users can implement their own):
> - -like for glob matching against errorCode as a string (perhaps -glob?)
> - -llike for element-wise list-glob matching against errorCode as a list
> - -expr for expr-based matching (with access to return code, result &
> options dict)

Too complicated by far. Glob is enough. If people want to match by ouija board, they can write their own command. (To be clear, that's an example of a carrier deck, undoubtedly useful to some but not part of any sensible bikeshed...)

> I feel that this proposal meets the requirements with the greatest
> flexibility and the least ambiguity. But of course that's my opinion.

I feel that you're chasing off in the wrong direction. Try this:

    try script ?as {msgvar optvar}? ?handler...? ?finally script?

Each handler is one of these:

    on code script
    trap glob script

Where 'code' is any numeric code or named alias or '*' (to mean any) and 'glob' is a pattern according to [string match] to be checked against the errorcode with an implied code of 'error'.

Only errors in the initial script are trapped; errors in any handler replace the original. The finally script is run after all else, and in all cases (except for interpreter deletion, execution cancellation or resource exhaustion) and errors in *it* will replace all others.

All handlers except the last one may be the string literal '-', which means use the one following; the last one must not be that, and the finally clause is not a handler.

I'll not argue over the names 'on' or 'trap'. Expect a fight on anything else as this is probably as complicated as it is sensible to go. :-)

Note that there's no need for an explicit rethrowing command (can do that with [return] and the options dict) and there's no need for an explicit variable for the code; it's in the options dict.

Donal. |
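For comparison, the [try] command that later shipped in Tcl 8.6 is close to this outline, and the sketch below assumes Tcl 8.6 or later. It is not the syntax as proposed in the message: in the shipped command each handler names its own variables instead of a leading `as` clause, and `trap` compares a list prefix of errorCode rather than a glob pattern. `read_first_line` is a hypothetical helper.

```tcl
# Requires Tcl 8.6 or later for the built-in [try].
proc read_first_line {path} {
    set f ""
    try {
        set f [open $path r]
        return [gets $f]
    } trap {POSIX ENOENT} {msg opts} {
        # errorCode begins with "POSIX ENOENT": the file does not exist
        return "missing: $path"
    } on error {msg opts} {
        # any other error raised by the body
        return "failed: $msg"
    } finally {
        # runs in all cases, after the body or the chosen handler
        if {$f ne ""} { close $f }
    }
}

puts [read_first_line /no/such/file/hopefully]
```

The handlers are considered in the order written, so the more specific `trap` has to come before the general `on error`.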
From: Twylite <tw...@cr...> - 2008-11-23 17:45:00
|
Hi,

> Please do not require lots of nested parsing of lists and things like
> that either (especially lists of scripts) since working with those
> things in the compiler is really unpleasant.

I take it from your proposal for "as {msgvar optsvar}" that this isn't considered "nested parsing of lists"?

> Out Of Scope! If someone needs something that complicated, they'll have
> to write their own code.

Having to write your own control structure just because the existing one doesn't do what you need (or at least doesn't do it in a pretty way) is what this exercise is all about, and what we're trying to avoid happening again.

> I write quite a lot of Java code, and I don't think there's anything
> really worth it to be gained from OO exceptions. A list that folks can
> match against is good enough, and they can add their own complexity if
> they really want. We're building a bikeshed, not an aircraft carrier!

Oh dear ... I was building a Jeep.

> Too complicated by far. Glob is enough. If people want to match by ouija
> board, they can write their own command. (To be clear, that's an example
> of a carrier deck, undoubtedly useful to some but not part of any
> sensible bikeshed...)

It occurs to me that once upon a time there was a need for a simpler, prettier alternative to if/then/elseif/elseif/elseif/elseif/elseif/else. And so [switch] was born.

It also occurs to me that in C a switch is over a set of integer values. In Tcl it was obvious to make [switch] operate on strings, but not just that - it would be able to match against wildcard patterns _and regular expressions_. And to do so it would add interface complexity and sacrifice performance (in particular it was necessary to specify the order of evaluation).

Are you _sure_ glob is enough? I'm not. So I want a syntax that doesn't preclude extension (in a pretty way) to handle other options in the future. And I'd like a syntax that allows developers to create these extensions outside the core, so that these options can evolve in future rather than end up in a lengthy discussion that really has few facts and figures to back up things like "most developers" and "common case".

> I feel that you're chasing off in the wrong direction. Try this:
>
> try script ?as {msgvar optvar}? ?handler...? ?finally script?
>
> Each handler is one of these:
>
> on code script
> trap glob script

Versus:

    on code ?-howtomatch whattomatch? script

I cannot comment on the implications of byte-coding that, but I do feel that it is more consistent (on error vs trap), more flexible, etc.

Your proposal is of course extensible by adding new handler keywords in future (assuming the TIP proposers at the time can agree on the keyword), but this would have to be done in the core.

Regards,
Twylite |
From: Twylite <tw...@cr...> - 2008-11-23 17:50:09
|
Forgot:

> I'll not argue over the names 'on' or 'trap'. Expect a fight on anything
> else as this is probably as complicated as it is sensible to go. :-)
> Note that there's no need for an explicit rethrowing command (can do
> that with [return] and the options dict) and there's no need for an
> explicit variable for the code; it's in the options dict.

    catch { return -code 5 FAIL } em opts
    2
    dict get $opts -code
    5

? |
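The mismatch in this transcript can be reproduced with plain [catch]. The sketch below assumes Tcl 8.5 or later, where [catch] accepts an options variable; `fail5` is a hypothetical helper added only to show the contrasting case.

```tcl
# Inside [catch], [return -code 5] has not yet crossed a procedure
# boundary, so catch reports TCL_RETURN (2) while the options dict
# carries the requested code together with -level 1.
set rc [catch { return -code 5 FAIL } em opts]
puts "catch returned $rc, -code [dict get $opts -code], -level [dict get $opts -level]"

# Once the return has passed through a proc, the two values agree.
proc fail5 {} { return -code 5 FAIL }
set rc2 [catch { fail5 } em2 opts2]
puts "catch returned $rc2, -code [dict get $opts2 -code], -level [dict get $opts2 -level]"
```

So a handler that wants "the code" has to decide whether it means the value returned by [catch] or the `-code` entry of the options dict, and the `-level` entry is what distinguishes the two situations.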
From: Donal K. F. <don...@ma...> - 2008-11-23 18:04:48
|
Twylite wrote: > Are you _sure_ glob is enough? I'm not. So I want a syntax that > doesn't preclude extension (in a pretty) to handle other options in the > future. And I'd like a syntax that allows developers to create these > extensions outside the core, so that these options can evolve in future > rather than end up in a length discussion that really has few facts and > figures to back up things like "most developers" and "common case". I don't want any of that high-falutin' baggage. I do not think the practical use-cases justify it. > Versus: > on code ?-howtomatch whattomatch? script > > I cannot comment on the implications of byte-coding that, but I do feel > that it is more consistent (on error vs trap), more flexible, etc. It's too complicated. (Or not complicated enough since it doesn't permit arbitrary matching of arbitrary subsets of options. After all it's *totally vital* that I be able to use soundex matching on the error message when it's on the 13th-22nd line of the body while dealing with some custom extra parameters!!! </sarcasm>) I'll go with dealing with the 90% use-case. > Your proposal is of course extensible by adding new handler keywords in > future (assuming the TIP proposers at the time can agree on the > keyword), but this would have to be done in the core. Your proposal goes so far towards being flexible that it ceases to be practical. Cut out the complexity; it'll be good enough. Donal. |
From: Twylite <tw...@cr...> - 2008-11-23 20:16:57
|
Not to nitpick but ...

> It's too complicated. (Or not complicated enough since it doesn't permit
> arbitrary matching of arbitrary subsets of options. After all it's
> *totally vital* that I be able to use soundex matching on the error
> message when it's on the 13th-22nd line of the body while dealing with
> some custom extra parameters!!! </sarcasm>)

    as {em opts} on error -expr {
        [dict get $opts -errorline] >= 13 &&
        [dict get $opts -errorline] <= 22 &&
        [soundex match $PATTERN $em]
    } { ... }

:)

Twylite |
From: Magentus <mag...@gm...> - 2008-11-23 16:53:56
|
On Sun, 23 Nov 2008 14:45:17 +0200, Twylite <tw...@cr...> wrote: > (g) Discourage the use of the result for determining the nature of > the error (an in doing so encourage the use of -errorCode). At the > very least this means not having default support for matching on the > result. Keeping in mind that the return result might be the NORMAL place to match for SOME return codes. Namely OK and custom codes >4. > (b) Handlers are identified by keywords. The keyword "catch" has > been argued against (confusion with existing language > feature/keyword), as has "except" (ambiguous - "with exception" or > "except for"). Likely candidates are "on" and "handle". I like "on" for the generic catch-a-return-code case, and "handle" as in handle-the-error. > (d) Taking (c) to its logical conclusion, [try] must be specified and > implemented to support user-selectable pattern matching. It is > possible to have the matcher selected for the [try] as a whole, or > per handler, and there are pros and cons to each approach. Definitely per-handler. But source and test are different things; message source will mostly be dependant on the return code, where any of the basic types of string match test can be applied to every possible message source. So unless you want every combination of source and test explicitly spelt out in its own matcher, they need to be separate. If a whole new error passing paradigm evolves, a new [try] token can be built to work with it, which could be as simple as a new term which takes a sub-set of the existing terms as its first argument, and emulates those terms (I doubt the existing code will be particularly reusable in that case anyhow). > (b) The behaviour of the [try] should be predictable and conform to > the principle of least surprise. 
On particular consequence of this > is that matchers must consider handlers/errorPatterns in > left-to-right order, and all handlers should be executed in the same > fashion (implying that the [try] rather than the matcher should > execute the handler body). Would it make sense to "accumulate" finally bodies as you go through, until you reach an active handler. This would mean there's a subtle twist and a bit of surprise, in that the finally block should be right behind the main [try] block, before any error handlers. Conversely, run finally blocks only from a matching handler down. Does that make any practical sense? Also, would it make any sense to do it in the style of a C switch... The handler body has to [break], otherwise it continues to try and match. For example, on a write error, you might want to send an error message and flush. Then if it's not a file closed error, send a "connection closed" message, and flush. Finally you'll close the connection if it's still open, if there was any kind of error at all. This would set [try] apart from any other TCL control structure, giving it a unique niche that even catch+switch can't readily support. Or conversely, [continue] would cause it to keep looking, [break] would prevent it from executing the finally block, and [return -code return] could conceivably be used to alter the matching from here on down. > Concerns: > - How to handle "all other errors" (-like * would work, is that good > enough?) Personally I'd like an "else" clause... But "on *" would be good in the presence of [continue] from above. And with basic glob matching, "handle *" could short-circuit and not even bother doing the match. Your present concept of the matcher is broken. It should not dictate the source of the string being matched against, that's already done for all normal usage cases, and the rest can be handled by [expr]-based matching or almost certainly are better handled by an entirely different structure. 
> - No handlerscript may begin with a "-". Sucky. My plan puts options only before the match string. If there's no match string, then there's no options, either. > - No feedback yet on "as {vars}" and the order of the vars Sure you have. I suggested something like that earlier, except called "catch". This is better. > - If there are multiple handlers and one is unqualified, should it be > executed first or last? In the name of least surprise, execute it where it stands. That might mask more specific ones further down, but it avoids magic. Neil: > Don't entirely agree with this. I don't believe we need to care > about inconsistent sets of vars being defined after the try -- it's > not a problem for [if], [switch], and every other control structure, > so I don't believe we need to give it special consideration here. It doesn't apply to [if] and most other control structures, and it only applies to the [regexp] part of [switch], which is a very different beast, as every single pattern has a different source of values for the variables. > Clarifying this -- we want the ability to specify the same script > for multiple patterns (and possibly multiple exception codes). The > switch approach is one way. That can be done regardless. > If adopting some novel pattern mechanism, then there is the further > question of whether to special case that in [try] or to extract it > out into a separate command (and separate TIP). It's been needed for a long time. Basic glob match for now, expand on it once a decent mechanism is in place. > I'd drop -llike -- too similar to -like, and the details of list > matching are complex to do right with nested sub-lists. But that > assumes all list elements are just strings (rather than e.g. being > sub-lists themselves), so only partially addresses the issue. In a general context, that would be a problem. But is it expected to be a problem for errorcode? 
Sounds like the purpose of errorcode needs revisiting, to figure out just what it is and isn't supposed to be, because there seems to be some disagreement as to its complexity. >> - No handlerscript may begin with a "-". > Which conflicts with the specification that "-" as a handlerscript > means fall-through to next branch. Only if your handlerscript is a command called "-" which takes no arguments. > Overall, I think the proposal ends up with [try] doing too much. In > particular, it seems doomed to a linear trawl through various match > conditions. Specifying an overall match command and then passing it > all the patterns and handler scripts at once gives much more freedom > for efficient implementation. I had the strange impression that a linear trawl through various match conditions is exactly what we're doing. It can be made a little more intelligent by grouping based on code, and adjacent matches using the same match type can be grouped by a [try] bytecode compiler (as DKF pointed out just as I was about to click Send). DKF: > I feel that you're chasing off in the wrong direction. Try this: > try script ?as {msgvar optvar}? ?handler...? ?finally script? > Each handler is one of these: > on code script > trap glob script I hate to say it, but that's the basis of what I've been arguing for, except that I added one extra handler type for OK (match on return value for non-fatal failures), and the ability for code to have a second word (again matched against the return value) for the regular case of return codes >4. -- Fredderic Debian/unstable (LC#384816) on i686 2.6.23-z2 2007 (up 46 days, 10:07) |
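A minimal sketch of the "linear trawl" discussed above, assuming the two handler forms DKF lists (on code script, trap glob script). The proc name first_match and the flat {kind pattern script} handler list are hypothetical, chosen only for illustration; this is not the proposed [try] implementation, and it accepts numeric return codes only.

```tcl
# Hypothetical helper: walk the handlers in the order written and
# return the script of the first one that matches, or "" if none does.
#   handlers  - flat list of {kind pattern script} triples
#   code      - numeric return code, as reported by [catch]
#   errorcode - the -errorcode value (only consulted for "trap")
proc first_match {handlers code errorcode} {
    foreach {kind pattern script} $handlers {
        switch -- $kind {
            on {
                # "on" matches the return code exactly
                if {$pattern == $code} { return $script }
            }
            trap {
                # "trap" applies to errors only and glob-matches
                # the string form of the error code
                if {$code == 1 && [string match $pattern $errorcode]} {
                    return $script
                }
            }
        }
    }
    return ""
}

set handlers {
    trap {POSIX *} posix-handler
    on   1         generic-error-handler
    on   0         ok-handler
}
```

Because the handlers are considered where they stand, putting the unqualified "on 1" first would mask the more specific trap below it, which is the least-surprise behaviour argued for earlier in this message.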
From: Joe E. <jen...@fl...> - 2008-11-23 17:02:28
|
Twylite wrote: [ extensive summary - thanks, that was helpful! ] Just one comment for now: > 2. Look & Feel > (d) Capture of variables (return code, result and options dict) needs to > happen at the front of the [try] for the statement as a whole, rather > than per handler. This avoids confusion over which vars will be defined > after the [try] returns, and also avoids variable churn if the > errorPattern to be matched can access the variables. Although some > amount of locality is lost, this also makes the syntax cleaner (less > repeated "noise"). I'm not sure about this. try { open $filename w } as {code var} on ok { # (1) } on error { # (2) } In block (1), $var holds an open file handle; in block (2) it holds an error message. Personally I'd prefer having different variable names in the different branches: try { open $filename w } on {ok fp} { ... } on {error msg} { ... } It also unconditionally binds the variable 'code', which is unused and unneeded. (In blocks (1) and (2) you already know what $code is; outside of the [try] command you no longer care.) (I'm also somewhat alarmed at the growing complexity of the error matching facilities -- especially because I am not yet convinced that they will be useful in practice. I'd rather start with something braindead simple and add to it later. Features are easy to add, but incompletely baked features are very very hard to get rid of.) --Joe English jen...@fl... |
From: <lm...@bi...> - 2008-11-23 17:12:49
|
On Sun, Nov 23, 2008 at 09:02:05AM -0800, Joe English wrote: > (I'm also somewhat alarmed at the growing complexity > of the error matching facilities -- especially because > I am not yet convinced that they will be useful in practice. > I'd rather start with something braindead simple and > add to it later. Features are easy to add, but > incompletely baked features are very very hard to > get rid of.) Amen. -- --- Larry McVoy lm at bitmover.com http://www.bitkeeper.com |
From: Donal K. F. <don...@ma...> - 2008-11-23 17:20:39
|
Joe English wrote: > (I'm also somewhat alarmed at the growing complexity > of the error matching facilities -- especially because > I am not yet convinced that they will be useful in practice. > I'd rather start with something braindead simple and > add to it later. Features are easy to add, but > incompletely baked features are very very hard to > get rid of.) I agree. At least some of the things floating round are enough over what seems sensible and practical that they'll attract a NO vote from me if formally proposed. Donal. |
From: Twylite <tw...@cr...> - 2008-11-23 17:55:56
|
Hi, > I'm not sure about this. > > try { > open $filename w > } as {code var} on ok { > # (1) > } on error { > # (2) > } > > In block (1), $var holds an open file handle; > in block (2) it holds an error message. > This is of course what you get from [catch] at present. > Personally I'd prefer having different variable names > in the different branches: > Your objection has been forwarded to Fredderic and NEM. This one has gone back and forth on Tcl-core more than once, and in private discussions. Having the vars at the front of [try] reduces repetition which keeps the look cleaner. It may also make implementation simpler & more efficient depending on how we match exceptions & errors. Having the vars with the handler improves locality, and some developers find it more readable. Sounds like this is a matter of personal preference that is going to be hard to resolve. > It also unconditionally binds the variable 'code', > which is unused and unneeded. (In blocks (1) and (2) > you already know what $code is; outside of the [try] > command you no longer care.) > As someone who uses "code" all over the place as a local variable, I'd be really unhappy with a new control structure automagically defining variables in my stack frame that I didn't tell it to (and I can't think of any other Tcl command that does this). > (I'm also somewhat alarmed at the growing complexity > of the error matching facilities -- especially because > I am not yet convinced that they will be useful in practice. > I'd rather start with something braindead simple and > add to it later. Features are easy to add, but > incompletely baked features are very very hard to > get rid of.) > I want a syntax that handles adding the features later without making the [try] an ugly construct. I would _like_ a syntax that allows those features to be added outside the core. At this point the core only needs to provide glob-style matching for error codes. Regards, Twylite |
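To make the remark about [catch] concrete, here is a small sketch of how one result variable already serves both roles today. The proc name describe_open is hypothetical, and the file is opened for reading rather than writing so the example creates nothing on disk.

```tcl
# With plain [catch], the single result variable holds a channel
# handle on success and an error message on failure.
proc describe_open {filename} {
    set code [catch {open $filename r} var]
    if {$code == 0} {
        # Success branch: $var is an open channel handle
        close $var
        return "opened"
    } else {
        # Failure branch: the very same $var is now an error message
        return "failed: $var"
    }
}
```

Whether the two branches should instead get separately named variables is exactly the locality-versus-repetition trade-off described in the message above.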
From: Magentus <mag...@gm...> - 2008-11-25 04:31:21
|
On Sun, 23 Nov 2008 19:55:51 +0200, Twylite <tw...@cr...> wrote: >> In block (1), $var holds an open file handle; >> in block (2) it holds an error message. > This is of course what you get from [catch] at present. >> Personally I'd prefer having different variable names >> in the different branches: > Your objection has been forwarded to Fredderic and NEM. If the vars can be shoe-horned into each branch optionally, or at least not too ugly, I'm all for it, personally. My main concern was the complexity of trying to fit everything into the one statement. If you try to match a decent-length error message, add two decently descriptive variable names, a level or two of indenting, and the handler command, you can all too easily end up having to split it over two lines, which is going to totally frag any hope of visual clarity. A little effort on picking purpose-neutral variable names (as you have to do with [catch] already) allows you to define it once up front, and keep the individual handler lines short and sweet. How's this as an idea to combine the "no matching is needed" thoughts... try script as {vars} handler ?-- errorcode-pattern? body return ?-- returnvalue-pattern? body on {code ?-- returnvalue-pattern?} body finally body The -- "option" here indicates there is a pattern to match, but will later be the place holder for the match type if there's consensus that "beyond-glob" matching is needed. This allows you to omit the pattern string, and divide the inevitable embedded [switch] statement into per-code blocks with whatever matcher you wish: try { script } as { response options } return { switch -regexp ... { ... different kinds of OK response ... } } handler -- "POSIX *" { switch -exact -- [lindex $response 1] { ... match the gauntlet of POSIX errors ... } } on error { switch -- $response { ... traditional (bad) error returning ... } } expr {some freaky conditional stuff} { ... do the jolly jumbuck ... 
} finally { cleanup } Clean, minimal if you want it to be, hugely flexible when you need it, you can add other types of pattern matching later (the sentinel is there for when it's needed, AND it's actually being used so it'll already be there avoiding future compatibility issues), and you can still allow for plugging in other types of [try] keyword (in place of as, handler, on, expr, and finally). Of course, it could be -pattern in place of just --, or something like that. I'd rather avoid -match here, because as another post suggests, I'd like to see that used for a generic pattern matching framework that would be consistent across all TCL commands, and remove the constantly expanding mass of match type options. (If I remember correctly we've already had a clash between a match type option and a non-matching option, in which it was necessary to choose a new name for the new option.) You could even bring back per-handler vars, if you do use -pattern or something in place of the simple --. And if the -- sentinel is still allowed as purely optional, then you don't even have the barely relevant constraint that the handler body can't be -vars or -pattern, or a prefix thereof, AND not taking any arguments. As far as I can tell, that pattern fits EVERYONE's present requirements, and shouldn't be too much trouble to compile. As an aside, extending it (user-wise) could be tricky (although I suspect expanding it efficiently is anyhow); my personal preference would be for an expansion to return a list with the number of arguments consumed, and the outcome (whether the body has been evaluated already, should be evaluated by [try], or didn't match). Failing that, though, the IMHO ugly [if] syntax will fix that reducing all cases (except "as" :( ) to merely {handler options body} by grouping everything between keyword and body in a mandatory list (not so good for compiling, though, as I understand it). 
The extensions, I'd suggest, should either go in a ::tcl::try namespace, or have their own handler command "with" or something. I like the ::tcl::try namespace better purely because it's more built-in-friendly. But this whole paragraph is an issue for another day anyhow..... -- Fredderic Debian/unstable (LC#384816) on i686 2.6.23-z2 2007 (up 47 days, 22:12) |
From: Donal K. F. <don...@ma...> - 2008-11-25 09:16:16
|
Magentus wrote: > If the vars can be shoe-horned into each branch optionally, or at least > not too ugly, I'm all for it, personally. I'm against it. > My main concern was the complexity of trying to fit everything into the > one statement. That's a good joke. Or are you serious? You're already far too complex. > The -- "option" here indicates there is a pattern to match, but will > later be the place holder for the match type if there's consensus that > "beyond-glob" matching is needed. There is no such consensus outside the design astronautics practised on this list over the past few days. Note that in this period "the astronauts" have wholly failed to persuade any TCT member that such complexity is a good idea. Additionally, the use of "--" in this way is wholly unacceptable. The only acceptable use of that option is to mean "end of options" to the overall command. > This allows you omit the pattern string, and divide the inevitable > embedded [switch] statement It's not inevitable at all, no matter how much you've convinced yourself otherwise. Time to come back to earth, Mr. Astronaut. > into per-code blocks with whatever matcher you wish: But tackling the more complex cases with code inside the handler blocks is exactly the right thing. Glob matching on the errorCode is the exception to this, and only because it makes an existing feature of Tcl far more usable than before. > > try { [...] > } expr {some freaky conditional stuff} { > ... do the jolly jumbuck ... Include an 'expr' clause only if you wish to guarantee that I'll vote against it. > Clean, minimal if you want it to be, hugely flexible when you need it, > you can add other types of pattern matching later (the sentinel is > there for when it's needed, AND it's actually being used so it'll > already be there avoiding future compatibility issues), and you can > still allow for plugging in other types of [try] keyword (in place of > as, handler, on, expr, and finally). 
It's also a wholly impractical amount of effort to implement, especially in compiled form. > As far as I can tell, that pattern fits EVERYONE's present > requirements, No it doesn't. It fails my requirement for simplicity. > and shouldn't be too much trouble to compile. So says the person who has never written a bytecode compiler. They're really much more awkward to do than a normal C-implemented Tcl command, and the more complexity there is, the worse it gets (non-linearly). In particular, there is no chance for flexibility of matchers as we're not exposing the bytecode compilation interface. Since it is required that the script arguments to [try] be compiled efficiently (there's a real chance that people will put loops inside without "declaring" the variable outside) we will just drop the less-important flexibility requirement. Which means that all your complexity can be thrown away anyway since we won't support that sort of thing. Indeed, of this entire discussion the only thing that was at all persuasive was the idea of putting the variables to store the matched stuff only once (the 'as' clause). Perhaps I should summarize the non-crap bits and call a vote; after all, waiting for consensus among the astronauts is a) going to take too long, and b) unlikely to produce anything practical anyway. Donal. |
From: Colin M. <co...@ch...> - 2008-11-25 16:20:27
|
Magentus wrote: > try script > as {vars} > handler ?-- errorcode-pattern? body > return ?-- returnvalue-pattern? body > on {code ?-- returnvalue-pattern?} body > finally body > With all due respect, that's a hairy nightmare and I wouldn't use it on principle (the principle is that anything with *that* many arguments is too hard to keep in my head while coding.) Is there really no way you can simplify it to do one thing, that's hard to do without it, and do it well? Just, perhaps, [try {} finally {}] ... that's simple, even I can remember that one, and have a reasonable guess at what it might do, and anything more complex occurs (say) in the 'finally' block. (Oh, and before anyone asks, I don't think try/as/handler/return/on/finally would benefit from NULLs) Colin. |
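For comparison, the bare try/finally behaviour suggested above can be approximated in script form with [catch] and [return -options]. This is only a sketch under the assumption of Tcl 8.5 or later; the proc name with_cleanup is hypothetical and is not the command under discussion.

```tcl
# Hypothetical helper: run body, always run cleanup afterwards, then
# pass on body's outcome (result, error, break, etc.) to the caller.
proc with_cleanup {body cleanup} {
    # Capture the result and the return options of the body
    set code [catch {uplevel 1 $body} result opts]
    # The cleanup script runs whatever the body did
    uplevel 1 $cleanup
    # Re-deliver the captured outcome one level up
    dict incr opts -level
    return -options $opts $result
}

# Usage sketch: the channel is closed whether or not the read fails
# (data.txt is a placeholder file name).
# set f [open data.txt r]
# with_cleanup { set data [read $f] } { close $f }
```

A built-in [try] would also have to decide what happens when the cleanup script itself raises an error; this sketch simply lets that error replace the body's outcome.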
From: Andreas L. <av...@lo...> - 2008-11-23 01:01:13
|
Neil Madden <ne...@Cs...> wrote: > try ?-matchcommand cmd? script ?handlers ...? ?finally script? I don't like that particular option, and I think that glob-like matching will be enough for some time, but I would see for "--" as an options delimiter before the body (even though no options are yet defined) just in case we later notice that we do need any. > ... {POSIX *} ... While I find this most practicable, it somehow does strike me as odd, that in this particular case, we are *supposed* to use a string operation (pattern-matching) on a list ($errorCode). Perhaps this pattern should be itself taken as a list, and then glob-matched element-wise (to the length of the pattern). That way {POSIX *} would exhibit the same behaviour as is expected, but it would be easier to safely match the third element of the list, without being trapped by list-string meta-characters. |
From: Neil M. <ne...@Cs...> - 2008-11-23 01:19:47
|
On 23 Nov 2008, at 01:01, Andreas Leitgeb wrote: > Neil Madden <ne...@Cs...> wrote: >> try ?-matchcommand cmd? script ?handlers ...? ?finally script? > > I don't like that particular option, and I think that glob-like > matching will be enough for some time, but I would see for "--" > as an options delimiter before the body (even though no options > are yet defined) just in case we later notice that we do need any. I believe I proposed a "--" didn't I? > >> ... {POSIX *} ... > > While I find this most practicable, it somehow does strike me as > odd, that in this particular case, we are *supposed* to use a > string operation (pattern-matching) on a list ($errorCode). > > Perhaps this pattern should be itself taken as a list, and then > glob-matched element-wise (to the length of the pattern). This is exactly the purpose of -matchcommand. I'd rather not have to come up with an entirely new pattern syntax for lists (matching nested sub-lists etc). KISS -- glob as default (as [switch] already provides it), and leave freedom to plug in your own scheme. > That way {POSIX *} would exhibit the same behaviour as is expected, > but it would be easier to safely match the third element of the > list, without being trapped by list-string meta-characters. Lists are strings, so there should be no problem using glob. The only problem is if you want to do something more sophisticated, like {FOO {BAR *} JIM *} -- Neil |
From: Andreas L. <av...@lo...> - 2008-11-23 10:28:28
|
On Sun, Nov 23, 2008 at 01:16:15AM +0000, Neil Madden wrote: > I believe I proposed a "--" didn't I? I just meant that the "--" should be kept even if all other options were deferred for later (if at all). >> [ {POSIX *} ] somehow does strike me as odd, [...as...], we >> are *supposed* to use a string operation (pattern-matching) >> on a list ($errorCode). >> Perhaps this pattern should be itself taken as a list, and then >> glob-matched element-wise (to the length of the pattern). > > This is exactly the purpose of -matchcommand. But my point was, that a list-aware matching should happen by default, such that in most cases it works correctly, even without implementing and installing a custom matcher. If usage of errorCode catches on (as a hoped-for result of the new try-command), then sooner or later someone will define sub-types like "ARITH MATRIX" and wonder, why ARITH* doesn't match both ARITH and "ARITH MATRIX". It of course doesn't match the latter, because that actually looks like "{ARITH MATRIX} ..." and thus would need an optional open brace to be matched as well (How to do that with globs?) And then it may even look like "ARITH\ MATRIX ..." sometimes, namely if some later element of the errorCode happens to contain an unpaired brace. > glob as default (as [switch] already provides it), But [switch] is not designed for list-matching. > and leave freedom to plug in your own scheme. My point is, that for correct programs, everyone would not only have to specify, but even implement his own list-matcher. Introducing a list-string mixup directly in the core is a very bad move, imho. |
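The difference between the two matching styles can be sketched as follows. The helper lglobmatch is hypothetical (it is neither an existing Tcl command nor part of any proposal); it glob-matches each element of the pattern list against the corresponding element of the value, up to the length of the pattern, as suggested above.

```tcl
# Hypothetical helper: element-wise glob match of a pattern list
# against the leading elements of a list value.
proc lglobmatch {patterns value} {
    set n [llength $patterns]
    # The value must have at least as many elements as the pattern
    if {[llength $value] < $n} { return 0 }
    foreach p $patterns v [lrange $value 0 [expr {$n - 1}]] {
        if {![string match $p $v]} { return 0 }
    }
    return 1
}

# An error code whose first element contains a space; its string
# form is "{ARITH MATRIX} SINGULAR".
set code [list {ARITH MATRIX} SINGULAR]

puts [string match "ARITH*" $code]   ;# prints 0: the string starts with "{"
puts [lglobmatch {ARITH*} $code]     ;# prints 1: matched against the element

# For a flat code such as a POSIX one, both styles agree.
set posix [list POSIX ENOENT {no such file or directory}]
puts [string match "POSIX *" $posix] ;# prints 1
puts [lglobmatch {POSIX *} $posix]   ;# prints 1
```

The sketch does not attempt nested patterns such as {FOO {BAR *} JIM *}; that is the harder case raised earlier in the thread.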