|
From: Twylite <tw...@cr...> - 2008-11-22 13:06:05
|
Hi,
> From: Magentus <mag...@gm...>
>
> The [finally script] usage is trivial to implement using unset traces
> (although not quite as clean, mostly since it uses a magic variable
> name).
>
This works for [proc] and [apply], but is not completely reliable.
There is no guarantee that the magic finally variable will be the last
to be unset, so a script like 'finally [list close $f]' is safe but
'finally { close $f }' may not behave as expected.
Also [try] is not a separate scope for variables, so it would have to
have a special interaction with the magic finally variable such that
[finally] scripts added inside the context of [try] are executed at the
end of the [try].
Example:
proc dostuff {} {
    set f [open {c:/boot.ini} r]
    trace add variable --finally--trap-- unset [list apply [list args {
        close $f ; puts done }]]
}
dostuff
chan names ;# -> stdout stderr filed27ae8 stdin
proc dostuff {} {
    set f [open {c:/boot.ini} r]
    trace add variable --finally--trap-- unset \
        [list apply [list args [list close $f]]]
}
dostuff
chan names ;# -> stdout stderr stdin
> The [try] command for matching on something other than the return code
> is excellent. Especially if it can match on return values as well as
> errorcodes. How about this for a twist on the idea...
>
> try {
> script
> } catch {
> var ?opts?
> } then {
> script
> } handler .....and so on.....
>
This fits with extending [catch], e.g.
catch { ... } em opts then { ... } handler {...}
The feedback I've had so far on this approach has not been favorable.
It seems that developers would prefer to keep the args/vars in the
context of the handler body.
> Regardless, why not have the handler clause evaluate an expression in
> the context of a [dict with $opts]? Then you can use whatever matching
> function you wish, the only minor pain is that you have to use some
> ugly bracketing of the option names { ${-code} == 2 }. But maybe
> there's a way around that, too, especially if the [dict with] is
> doable read-only and non-destructively somehow.
>
In a word, performance. I have been having conversations with other Tcl
developers off-list, and proposed exactly this. It is unquestionably
the most flexible option, but it forces a sequential consideration of
each handler's expression, preventing any sort of heuristic to improve
the performance of the construct. Since one of the uses of this [try]
will be to build other language constructs, performance is something
that deserves reasonable consideration.
The tradeoff may be to have "pluggable handler matching" where some
handlers can use exact matching ( O(1) time), some can use glob, some
can use expr, etc. Doing this in a manner that maintains a simple
syntax is quite difficult however.
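To make the cost of the expression-based option concrete, here is a minimal sketch of handler selection by evaluating each handler's expression against the options dict. The proc name firstMatchingHandler and its calling convention are hypothetical and not part of any proposal in this thread. It assumes Tcl 8.5 or later and uses the [dict with] idiom with an empty body to expose the option names as local variables.

```tcl
# Hypothetical helper: return the index of the first handler expression that
# is true when evaluated against the options dict, or -1 if none matches.
proc firstMatchingHandler {opts exprs} {
    # Expose each option (-code, -errorcode, ...) as a local variable.
    dict with opts {}
    set i 0
    foreach e $exprs {
        # Each expression has to be tried in turn; there is no shortcut.
        if {[expr $e]} { return $i }
        incr i
    }
    return -1
}

# Produce an options dict from a real error with a POSIX-style errorCode.
catch { error "disk full" "" {POSIX ENOSPC {no space left}} } msg opts

puts [firstMatchingHandler $opts {
    {${-code} == 2}
    {${-code} == 1 && [lindex ${-errorcode} 0] eq "POSIX"}
}]
```

The loop is the point: every expression before the matching one is evaluated, which is the sequential consideration described above.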
> And finally for over-all syntax, what'd be wrong with tagging the
> try clauses onto the end of the present [catch] command. Make the
> options variable mandatory in this usage, and bring it into scope for
> the evaluations as above.
>
See above. I'm not necessarily against it, but it doesn't seem to be a
popular option.
>> > handle {code ?resultVar ?optionsVar??} { script }
>>
> Is there any actual practical use to putting code in the braces?
Not that I'm aware of, no. My current thinking is that it will be
outside the brackets, e.g.
handle code/expr {?resultvar? ?optionsvar?} { body }
> Something like a:
> withvars {resultVar ?optionsVar?}
> following the main try script indicating where to stash the variables.
>
One advantage of having the vars with the handler script is that it
allows you to reuse handlers. e.g.
set GENERAL_IO_HANDLER {{em opts} { log "Problem: $em" }}
...
try {
# some IO routine
} handle error * {*}$GENERAL_IO_HANDLER
And in this case it's no coincidence that the GENERAL_IO_HANDLER looks
like an anonymous function that can be used with [apply].
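A minimal sketch of that correspondence, using only [catch] and [apply] as they exist in Tcl 8.5. The [log] command here is a hypothetical stand-in, and the [catch]/[if] pair only emulates what "handle error *" might do; it is not the proposed [try].

```tcl
# Hypothetical logging command, defined only so the sketch is self-contained.
proc log {msg} { puts "LOG: $msg" }

# The reusable handler: a {varlist body} pair, i.e. an [apply] lambda.
set GENERAL_IO_HANDLER {{em opts} { log "Problem: $em" }}

# Emulate "handle error * {*}$GENERAL_IO_HANDLER" with plain [catch].
if {[catch { open /no/such/dir/file r } em opts] == 1} {
    apply $GENERAL_IO_HANDLER $em $opts
}
```

Because the handler takes its variables as arguments, nothing leaks into the calling scope, which is one argument for keeping the variable list next to the handler body.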
> For the blending with [if] option, there was chatter a while back about
> fast [expr]-local variables intended mostly to hold partial results
> during an expression; the main terms of the options dict could quite
> readily be pre-loaded as [expr]-local variables.
I'm very interested in the idea of extending [expr] in various ways,
especially to make pattern matching easier and somehow bind the error
options as variables into the expr. It's just not going to happen by 10
December, so we can't use any approach that relies on it.
Regards,
Twylite
|
|
From: Twylite <tw...@cr...> - 2008-11-22 23:42:54
|
Hi,
> try ?-matchcommand cmd? script ?handlers ...? ?finally script?
>
> Where -matchcommand is the command to use to do errorCode matching
> and defaults to {switch -glob --}. (May need a -- marker to eliminate
> ambiguity). The syntax of the handlers part would be:
>
> on exception-types ?vars? ?errorPattern? body ?errorPattern
> body ...?
I'm very uncomfortable with this syntax. It is potentially ambiguous
and feels very DWIMy (in a bad way).
try {
    # do stuff
} on error {
    puts hello
} finally {
    puts goodbye
}
... is ambiguous. It could be interpreted the way you think, or with
"puts hello" as the vars and "finally" as the error pattern.
Consider also:
try { ... } on error {em opts} "POSIX *" { body } on break
Is "on break" an errorPattern and body, or the start of a new exception
handler?
One can construct other ambiguities that exploit the inability to
distinguish between a pattern for a pluggable matcher (i.e. you can't
make assumptions about what is and isn't a valid input) and the keywords
of the [try] itself.
Regards,
Twylite
|
|
From: Neil M. <ne...@Cs...> - 2008-11-23 01:12:31
|
On 22 Nov 2008, at 23:42, Twylite wrote:
> Hi,
>> try ?-matchcommand cmd? script ?handlers ...? ?finally script?
>>
>> Where -matchcommand is the command to use to do errorCode matching
>> and defaults to {switch -glob --}. (May need a -- marker to eliminate
>> ambiguity). The syntax of the handlers part would be:
>>
>> on exception-types ?vars? ?errorPattern? body ?errorPattern
>> body ...?
> I'm very uncomfortable with this syntax. It is potentially ambiguous
> and feels very DWIMy (in a bad way).
>
> try {
> # do stuff
> } on error {
> puts hello
> } finally {
> puts goodbye
> }
>
> ... is ambiguous. It could be interpreted the way you think, or with
> "puts hello" as the vars and "finally" as the error pattern.
>
> Consider also:
> try { ... } on error {em opts} "POSIX *" { body } on break
>
> Is "on break" an errorPattern and body, or the start of a new
> exception
> handler?
I don't think it's possible to avoid ambiguity while still preserving
ease of use. The alternative is to make fewer things optional, which
just becomes a pain. My preference is to make "on" a keyword in this
context. It's highly unlikely that a script or a variable would just
be named "on", and if we document this as a keyword in this context
then there should be no problem.
-- Neil
This message has been checked for viruses but the contents of an attachment
may still contain software viruses, which could damage your computer system:
you are advised to perform your own checks. Email communications with the
University of Nottingham may be monitored as permitted by UK legislation.
|
|
From: Twylite <tw...@cr...> - 2008-11-23 12:45:39
|
I think it's time for a summary of where we are on the try/catch/finally.
The overall intent of the TIP can be summed up as "make a control
structure that makes dealing with exceptions, errors and resource
cleanup simpler - both logically and visually".
1. Functionality
We want to
(a) Handle return codes, so that we can build control structures and
handle exceptions that use return codes. In most cases an exact match
against a single integer (or magic name) is sufficient.
(b) Handle matching against -errorCode in the case of the return code
TCL_ERROR (1), so that we can have something similar to other languages
with typed exception handling. In most cases a prefix match, or glob
match, or element-wise glob match on a list is sufficient.
(aside) Any argument about the ugliness of handling return codes with
catch+if/then or catch+switch applies equally to handling errors &
-errorCode, and vice versa. As a result this TIP must provide for both
(a) and (b), although it is not necessarily a requirement that they are
provided for in the same command.
(c) Handle only those exceptions/errors that are of interest (can be
handled at this point) and let others propagate normally.
(d) Handle success continuation, i.e. branch when there is no
error/exception. This is not generally supported by procedural
languages but the requirement has been expressed by several developers
and TCT members.
(e) Handle cleanup at the end of a block of code by means of a "finally"
handler (regardless of errors/exceptions).
(f) Have reasonable performance, at least for the common cases.
(g) Discourage the use of the result for determining the nature of the
error (and in doing so encourage the use of -errorCode). At the very
least this means not having default support for matching on the result.
(h) For exceptions thrown from handlers and finally blocks, maintain the
details of the original exception (i.e. chain exceptions in the options
dict).
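For reference, the catch+switch idiom mentioned in the aside looks roughly like this in Tcl 8.5. This is a sketch under assumptions: readConfig is a hypothetical proc, and treating a missing file as an empty configuration is just an example policy.

```tcl
# Hypothetical example: read a file, treating "file not found" as empty.
proc readConfig {path} {
    set rc [catch { open $path r } result opts]
    switch -exact -- $rc {
        0 {
            # Success continuation, compare requirement (d).
            set f $result
            set rc2 [catch { read $f } data opts2]
            close $f    ;# manual cleanup, compare requirement (e)
            if {$rc2} { return -options $opts2 $data }
            return $data
        }
        1 {
            # Match on -errorCode, compare requirement (b).
            switch -glob -- [dict get $opts -errorcode] {
                "POSIX ENOENT *" { return "" }
                default          { return -options $opts $result }
            }
        }
        default { return -options $opts $result }
    }
}

puts "config: '[readConfig /no/such/dir/file]'"
```

The return-code dispatch, the errorCode match, the rethrow and the cleanup are all written by hand here, which is the repetition the requirements above aim to remove.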
2. Look & Feel
(a) It's going to be called [try]
(b) Handlers are identified by keywords. The keyword "catch" has been
argued against (confusion with existing language feature/keyword), as
has "except" (ambiguous - "with exception" or "except for"). Likely
candidates are "on" and "handle".
(c) A [try] statement is going to look more like an "if {} then {} else
{}" than a "switch { case {body} case {body} }". The former seems to be
preferred by everyone involved in the discussions.
(d) Capture of variables (return code, result and options dict) needs to
happen at the front of the [try] for the statement as a whole, rather
than per handler. This avoids confusion over which vars will be defined
after the [try] returns, and also avoids variable churn if the
errorPattern to be matched can access the variables. Although some
amount of locality is lost, this also makes the syntax cleaner (less
repeated "noise").
(e) A [switch]-like "fall through to next statement" would be a
nice-to-have.
3. Matching
(a) In general the matching of exceptions (return code) and errors
(errorCode where the return code is TCL_ERROR) are separate concerns.
It makes sense to exploit this by using a fast/exact match against the
return code first (meeting the performance requirement) followed by a
slower match against the errorCode.
(b) When matching against errorCode:
(i) There is (largely) consensus that basic pattern matching is "good
enough" "for now". Basic pattern matching may be defined as prefix
matching, glob matching against errorCode (as a string), or an
element-wise list-glob match against errorCode (as a list). In short
there is no agreement on the right way to do this.
(ii) There is also no guarantee that a match against errorCode will be
adequate in the future. For example an OO-style error object may be
developed.
(iii) An [expr]-type match is the most flexible but the lowest
performance (and potentially ugliest syntax); as such it is not suitable
(at least not as a default).
(iv) Delegating to [switch] for matching is a nice compromise of
performance and flexibility (and reuses existing functionality), but
brings with it the baggage of the [switch] command's interface.
(c) The only thing we _can_ be sure of is that whatever we choose now
will be inadequate in some way, implying that the syntax of [try] must
contain provision for future extension.
(d) Taking (c) to its logical conclusion, [try] must be specified and
implemented to support user-selectable pattern matching. It is possible
to have the matcher selected for the [try] as a whole, or per handler,
and there are pros and cons to each approach.
In terms of (d) my personal preference is to specify the matcher per
handler. It is difficult to predict how different packages/libraries
may approach error handling, both now and in the future (e.g. a future
move from -errorCode to an OO-style error object). If the matcher is
selected for the [try] as a whole it may only be possible to support
disparate error handling styles by using the most flexible and complex
matcher (say [expr]-based), which could be an unnecessary complication.
The same holds now for integrating with legacy code that only produces
meaningful error information in the result.
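The three candidate definitions of "basic pattern matching" from 3(b)(i) can be sketched as ordinary procs. The names and the {pattern errorCode} argument order are hypothetical, and the rule that an element-wise match requires equal list lengths is an assumption of this sketch, not something agreed in the discussion.

```tcl
# Prefix match: the pattern's elements must equal the leading elements
# of errorCode (both treated as lists).
proc matchPrefix {pattern errorCode} {
    set n [llength $pattern]
    if {$n > [llength $errorCode]} { return 0 }
    foreach p $pattern e [lrange $errorCode 0 [expr {$n - 1}]] {
        if {$p ne $e} { return 0 }
    }
    return 1
}

# String glob: errorCode is treated as a single string.
proc matchGlob {pattern errorCode} {
    string match $pattern $errorCode
}

# Element-wise glob: one glob pattern per list element.
# Assumption: pattern and errorCode must have the same length.
proc matchListGlob {pattern errorCode} {
    if {[llength $pattern] != [llength $errorCode]} { return 0 }
    foreach p $pattern e $errorCode {
        if {![string match $p $e]} { return 0 }
    }
    return 1
}

set ec {POSIX ENOENT {no such file or directory}}
puts [matchPrefix   {POSIX ENOENT} $ec]
puts [matchGlob     "POSIX *"      $ec]
puts [matchListGlob {POSIX E* *}   $ec]
```

All three accept this errorCode, but they diverge on edge cases (for example a glob character inside an element), which is part of why a per-handler choice of matcher is attractive.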
4. General
These are weakly-expressed requirements or requirements of my own.
(a) There is a general desire for consistency / symmetry in the syntax.
This would obviously improve the readability & understandability of the
source code.
(b) The behaviour of the [try] should be predictable and conform to the
principle of least surprise. One particular consequence of this is that
matchers must consider handlers/errorPatterns in left-to-right order,
and all handlers should be executed in the same fashion (implying that
the [try] rather than the matcher should execute the handler body). On
the issue of ordering, left-to-right is the only order that makes sense
for [expr]-based matching, and is the norm in other languages.
5. Proposal
Based on this summary of the discussions so far, this is my current
proposal:
try tryscript ?as {vars}? ?handler ...? ?finally finalscript?
where handler is
on code ?-matcher pattern? handlerscript
and
handlerscript may be "-" to fall through to the next handlerscript
The tryscript is executed and the outcome (return code, result, options
dict) is captured into vars. A fast match (possibly a dict lookup) is
performed to find the handler(s) for that code. If there is an
unqualified handler (one with no matcher) or a single handler for the
code, then it is executed; otherwise each handler is considered in turn
(left-to-right order) by calling the associated matcher, and the first
matching handler is executed. If no matching handler is found then the
exception is propagated.
The implementation will probably provide the following matchers by
default (users can implement their own):
- -like for glob matching against errorCode as a string (perhaps -glob?)
- -llike for element-wise list-glob matching against errorCode as a list
- -expr for expr-based matching (with access to return code, result &
options dict)
Example of use:
try {
    # do stuff
} as {code em opts} on ok {
    # do more
} on break - on continue {
    # special handling for break & continue
} on error -like "POSIX *" {
    # handle POSIX errors
} on error -expr { $em in {BAD FOO BAR} } {
    # support legacy errors
} finally {
    # cleanup
}
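The dispatch rule described above (fast match on the return code, then matchers in left-to-right order, then the finally script) can be approximated in Tcl 8.5 as follows. This is a rough sketch, not the proposed implementation: trySketch and like are hypothetical names, handlers are passed as a flat list of {code matcher pattern body} with numeric codes, and "as {vars}", the "-" fallthrough and error chaining are all left out.

```tcl
# Hypothetical sketch. A matcher is "" for an unqualified handler, otherwise
# a command prefix called as: matcher pattern code result opts -> boolean.
proc trySketch {script handlers {finalscript ""}} {
    set code [catch { uplevel 1 $script } em opts]
    foreach {hcode matcher pattern body} $handlers {
        # Fast test on the return code first.
        if {$hcode != $code} continue
        # Qualified handlers must also satisfy their matcher.
        if {$matcher ne "" && ![{*}$matcher $pattern $code $em $opts]} continue
        # First matching handler wins; its outcome replaces the original.
        set code [catch { uplevel 1 $body } em opts]
        break
    }
    if {$finalscript ne ""} {
        set fcode [catch { uplevel 1 $finalscript } fem fopts]
        # An exception in the finally script overrides the earlier outcome.
        if {$fcode} { set code $fcode; set em $fem; set opts $fopts }
    }
    # Propagate whatever outcome remains to the caller.
    dict incr opts -level
    return -options $opts $em
}

# Hypothetical stand-in for the "-like" matcher: glob on errorCode as a string.
proc like {pattern code em opts} {
    string match $pattern [dict get $opts -errorcode]
}

set r [trySketch {
    open /no/such/dir/file r
} {
    1 like "POSIX *" { set outcome posix }
    1 ""   ""        { set outcome other }
} {
    set cleaned 1
}]
puts "$r $outcome $cleaned"
```

Note that [try] itself runs the handler body here, so a failure in the matcher and a failure in the body remain distinguishable.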
Concerns:
- How to handle "all other errors" (-like * would work, is that good
enough?)
- No handlerscript may begin with a "-".
- No feedback yet on "as {vars}" and the order of the vars
- If there are multiple handlers and one is unqualified, should it be
executed first or last?
Alternatives:
- The body "-" is reserved to indicate fallthrough to the next body.
The body "+" could be reserved to indicate that a matcher and pattern
follow.
e.g.
on error + like "POSIX *" { ... }
I feel that this proposal meets the requirements with the greatest
flexibility and the least ambiguity. But of course that's my opinion.
Regards,
Twylite
|
|
From: Neil M. <ne...@Cs...> - 2008-11-23 15:43:07
|
On 23 Nov 2008, at 12:45, Twylite wrote:
> I think it's time for a summary of where we are on the try/catch/
> finally.
>
> The overall intent of the TIP can be summed up as "make a control
> structure that makes dealing with exceptions, errors and resource
> cleanup simpler - both logically and visually".
>
> 1. Functionality
>
> We want to
> (a) Handle return codes, so that we can build control structures and
> handle exceptions that use return codes. In most cases an exact match
> against a single integer (or magic name) is sufficient.
> (b) Handle matching against -errorCode in the case of the return code
> TCL_ERROR (1), so that we can have something similar to other
> languages
> with typed exception handling. In most cases a prefix match, or glob
> match, or element-wise glob match on a list is sufficient.
> (aside) Any argument about the ugliness of handling return codes with
> catch+if/then or catch+switch applies equally to handling errors &
> -errorCode, and vice versa. As a result this TIP must provide for
> both
> (a) and (b), although it is not necessarily a requirement that they
> are
> provided for in the same command.
> (c) Handle only those exceptions/errors that are of interest (can be
> handled at this point) and let others propagate normally.
> (d) Handle success continuation, i.e. branch when there is no
> error/exception. This is not generally supported by procedural
> languages but the requirement has been expressed by several developers
> and TCT members.
> (e) Handle cleanup at the end of a block of code by means of a
> "finally"
> handler (regardless of errors/exceptions).
> (f) Have reasonable performance, at least for the common cases.
> (g) Discourage the use of the result for determining the nature of the
> error (and in doing so encourage the use of -errorCode). At the very
> least this means not having default support for matching on the
> result.
> (h) For exceptions thrown from handlers and finally blocks,
> maintain the
> details of the original exception (i.e. chain exceptions in the
> options
> dict).
Agree with all of these.
>
> 2. Look & Feel
>
> (a) It's going to be called [try]
> (b) Handlers are identified by keywords. The keyword "catch" has been
> argued against (confusion with existing language feature/keyword), as
> has "except" (ambiguous - "with exception" or "except for"). Likely
> candidates are "on" and "handle".
> (c) A [try] statement is going to look more like an "if {} then {}
> else
> {}" than a "switch { case {body} case {body} }". The former seems
> to be
> preferred by everyone involved in the discussions.
Agreed.
> (d) Capture of variables (return code, result and options dict)
> needs to
> happen at the front of the [try] for the statement as a whole, rather
> than per handler. This avoids confusion over which vars will be
> defined
> after the [try] returns, and also avoids variable churn if the
> errorPattern to be matched can access the variables. Although some
> amount of locality is lost, this also makes the syntax cleaner (less
> repeated "noise").
Don't entirely agree with this. I don't believe we need to care about
inconsistent sets of vars being defined after the try -- it's not a
problem for [if], [switch], and every other control structure, so I
don't believe we need to give it special consideration here. Agree
though that it is generally more useful for whatever pattern matching
mechanism is used to be called with a set of pattern/script pairs and
the variables already set-up in the callers scope. Whether that means
binding the vars for the entire try statement or once per exception
code is a matter of choice. Either seems acceptable.
> (e) A [switch]-like "fall through to next statement" would be a
> nice-to-have.
Clarifying this -- we want the ability to specify the same script for
multiple patterns (and possibly multiple exception codes). The switch
approach is one way.
>
> 3. Matching
>
> (a) In general the matching of exceptions (return code) and errors
> (errorCode where the return code is TCL_ERROR) are separate concerns.
> It makes sense to exploit this by using a fast/exact match against the
> return code first (meeting the performance requirement) followed by a
> slower match against the errorCode.
> (b) When matching against errorCode:
> (i) There is (largely) consensus that basic pattern matching is
> "good
> enough" "for now". Basic pattern matching may be defined as prefix
> matching, glob matching against errorCode (as a string), or an
> element-wise list-glob match against errorCode (as a list). In short
> there is no agreement on the right way to do this.
If adopting some novel pattern mechanism, then there is the further
question of whether to special case that in [try] or to extract it
out into a separate command (and separate TIP).
> (ii) There is also no guarantee that a match against errorCode
> will be
> adequate in the future. For example an OO-style error object may be
> developed.
> (iii) An [expr]-type match is the most flexible but the lowest
> performance (and potentially ugliest syntax); as such it is not
> suitable
> (at least not as a default).
> (iv) Delegating to [switch] for matching is a nice compromise of
> performance and flexibility (and reuses existing functionality), but
> brings with it the baggage of the [switch] command's interface.
This depends how it is done, and how it is documented. You can
delegate to [switch] either implicitly or explicitly (as a -
matchcommand) and still avoid acquiring [switch]'s interface. You
just document that pattern-matching is handled by [switch] and that
as far as [try] is concerned the patterns are just opaque data that
it passes on. Introducing an explicit option for this enhances this
rationale, as then the pattern matcher is just another callback. What
we definitely don't want to do is introduce [switch]'s various
options, like -nocase, -regexp etc as options of [try]. That would
constrain the implementation and be a mess. A callback solution
avoids this as the options can be specified as part of the callback
command, rather than as part of the try command.
> (c) The only thing we _can_ be sure of is that whatever we choose now
> will be inadequate in some way, implying that the syntax of [try]
> must
> contain provision for future extension.
> (d) Taking (c) to its logical conclusion, [try] must be specified and
> implemented to support user-selectable pattern matching. It is
> possible
> to have the matcher selected for the [try] as a whole, or per handler,
> and there are pros and cons to each approach.
>
> In terms of (d) my personal preference is to specify the matcher per
> handler. It is difficult to predict how different packages/libraries
> may approach error handling, both now and in the future (e.g. a future
> move from -errorCode to an OO-style error object). If the matcher is
> selected for the [try] as a whole it may only be possible to support
> disparate error handling styles by using the most flexible and complex
> matcher (say [expr]-based), which could be an unnecessary
> complication.
> The same holds now for integrating with legacy code that only produces
> meaningful error information in the result.
I'd be interested to see the interface proposed for this. Clearly the
most flexible approach is to allow an arbitrary script to do the
matching, but then we end up right back at the beginning of this
discussion where [try] just does exception-code dispatch and leaves
everything else up to a script. I believe we've ruled that option
out, as it violates requirements 1.b and 1.c.
>
> 4. General
>
> These are weakly-expressed requirements or requirements of my own.
>
> (a) There is a general desire for consistency / symmetry in the
> syntax.
> This would obviously improve the readability & understandability of
> the
> source code.
> (b) The behaviour of the [try] should be predictable and conform to
> the
> principle of least surprise. One particular consequence of this is
> that
> matchers must consider handlers/errorPatterns in left-to-right order,
> and all handlers should be executed in the same fashion (implying
> that
> the [try] rather than the matcher should execute the handler
> body). On
> the issue of ordering, left-to-right is the only order that makes
> sense
> for [expr]-based matching, and is the norm in other languages.
I don't believe [try] has to execute the bodies. All it has to do is
ensure that any option/result variables are defined in the calling
scope when that script runs. For example:
upvar 1 $msgVar msg $optsVar opts
set rc [catch { $script } msg opts]
invoke 1 $matchcmd [dict get $opts -errorcode] [dict get $handlers $rc]
should do the right thing for most matching constructs.
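A runnable approximation of that outline, under stated assumptions: tryDelegate is a hypothetical name, [uplevel] stands in for the unspecified "invoke", the handlers argument is a dict from return code to a pattern/script list, and the default match command is {switch -glob --} as suggested earlier in the thread.

```tcl
# Hypothetical sketch: [try]-like dispatch that delegates both matching and
# body execution to a match command run in the caller's scope.
proc tryDelegate {script msgVar optsVar handlers {matchcmd {switch -glob --}}} {
    upvar 1 $msgVar msg $optsVar opts
    set rc [catch { uplevel 1 $script } msg opts]
    if {![dict exists $handlers $rc]} {
        # No handlers registered for this return code: propagate it.
        set o $opts
        dict incr o -level
        return -options $o $msg
    }
    # Only errors carry -errorcode; fall back to an empty string otherwise.
    set ec ""
    if {[dict exists $opts -errorcode]} { set ec [dict get $opts -errorcode] }
    uplevel 1 [list {*}$matchcmd $ec [dict get $handlers $rc]]
}

set result [tryDelegate {
    open /no/such/dir/file r
} em opts {
    1 {
        "POSIX ENOENT *" { set kind missing }
        default          { set kind other }
    }
}]
puts $result
```

Here the patterns are opaque to tryDelegate; only the match command interprets them, and it also decides the order in which they are considered.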
I also believe the order in which to consider patterns should be left
to the match command.
[... snip proposal: I'll post a separate message for that ...]
-- Neil
|
|
From: Twylite <tw...@cr...> - 2008-11-23 17:06:09
|
Hi,
> Don't entirely agree with this. I don't believe we need to care about
> inconsistent sets of vars being defined after the try -- it's not a
> problem for [if], [switch], and every other control structure, so I
> don't believe we need to give it special consideration here. Agree
> though that it is generally more useful for whatever pattern matching
> mechanism is used to be called with a set of pattern/script pairs and
> the variables already set-up in the callers scope. Whether that means
> binding the vars for the entire try statement or once per exception
> code is a matter of choice. Either seems acceptable.
The difference being that in the case of [if] or [switch] the executed
body defines the vars; in this case the [try] itself defines the vars.
Semantics.
The stronger argument is the performance impact of bringing per-handler
vars into scope and back out of scope each time.
>> (e) A [switch]-like "fall through to next statement" would be a
>> nice-to-have.
> Clarifying this -- we want the ability to specify the same script for
> multiple patterns (and possibly multiple exception codes). The switch
> approach is one way.
Cool.
>> 3. Matching
>> (b) When matching against errorCode:
>> (i) There is (largely) consensus that basic pattern matching is "good
>> enough" "for now". Basic pattern matching may be defined as prefix
>> matching, glob matching against errorCode (as a string), or an
>> element-wise list-glob match against errorCode (as a list). In short
>> there is no agreement on the right way to do this.
> If adopting some novel pattern mechanism, then there is the further
> question of whether to special case that in [try] or to extract it out
> into a separate command (and separate TIP).
So that we have [catch], [try], [try2], ... as we discover new and
different needs for exception handling? No thanks. We should either get
[try] sufficiently right now (which is closer to 99% than 80%) or make
it extensible. Preferably the latter since we don't know what is 99%
right.
>> (iv) Delegating to [switch] for matching is a nice compromise of
>> performance and flexibility (and reuses existing functionality), but
>> brings with it the baggage of the [switch] command's interface.
> This depends how it is done, and how it is documented. You can
> delegate to [switch] either implicitly or explicitly (as a
> -matchcommand) and still avoid acquiring [switch]'s interface. You
> just document that pattern-matching is handled by [switch] and that as
> far as [try] is concerned the patterns are just opaque data that it
> passes on. Introducing an explicit option for this enhances this
> rationale, as then the pattern matcher is just another callback. What
> we definitely don't want to do is introduce [switch]'s various
> options, like -nocase, -regexp etc as options of [try]. That would
> constrain the implementation and be a mess. A callback solution avoids
> this as the options can be specified as part of the callback command,
> rather than as part of the try command.
I was meaning a [try] that uses [switch] implicitly. You would need
something in the interface of [try] that would configure the [switch],
otherwise you are limited to some predetermined configuration (like
-glob --). Agreed that a callback gets around this, and discussed in
(c) and (d) below.
>> (c) The only thing we _can_ be sure of is that whatever we choose now
>> (d) Taking (c) to its logical conclusion, [try] must be specified and
>> implemented to support user-selectable pattern matching. It is possible
>> to have the matcher selected for the [try] as a whole, or per handler,
>> and there are pros and cons to each approach.
>>
>> In terms of (d) my personal preference is to specify the matcher per
>> handler. It is difficult to predict how different packages/libraries
>> may approach error handling, both now and in the future (e.g. a future
>> move from -errorCode to an OO-style error object). If the matcher is
> I'd be interested to see the interface proposed for this. Clearly the
> most flexible approach is to allow an arbitrary script to do the
> matching, but then we end up right back at the beginning of this
> discussion where [try] just does exception-code dispatch and leaves
> everything else up to a script. I believe we've ruled that option out,
> as it violates requirements 1.b and 1.c.
I believe the interface I proposed does not violate (1.b), and provides
an acceptable compromise on (1.c).
>> (b) The behaviour of the [try] should be predictable and conform to the
>> principle of least surprise. One particular consequence of this is that
>> matchers must consider handlers/errorPatterns in left-to-right order,
>> and all handlers should be executed in the same fashion (implying that
>> the [try] rather than the matcher should execute the handler body). On
>> the issue of ordering, left-to-right is the only order that makes sense
>> for [expr]-based matching, and is the norm in other languages.
> I don't believe [try] has to execute the bodies. All it has to do is
> ensure that any option/result variables are defined in the calling
> scope when that script runs. For example:
There are a bunch of other things [try] has to do, including catching
errors off the handlerscript (and match command, for that matter) in
order to chain the errors, execute the finally script, etc.
Having the match command execute the body means that it's not just a
match command but a fully fledged control structure, it must behave in
a way that is predictable to the [try] command (i.e. [try] needs to make
certain assumptions about what it will do), and the [try] cannot
distinguish between a failure in the match command and a failure in the
handlerscript.
It also has the potential to make the errorInfo very ugly -- you will
see an exception in a handler in a matchcommand in a try. If you try to
use a [return -level] to avoid this you will end up with unsafe nesting
and/or making assumptions about the internals of [try].
In your proposal you also talk about the match command adding a default
-- this would not work if [try] is expected to chain errors, as [try]
would catch the default (assumedly rethrown) error and chain it to
itself (i.e. the error that [try] knows about).
Any way I look at it, having the match command execute the body joins
together separate concerns (matching, and execution), and there are
only two arguments for this:
(1) Performance. The largest number of exception handlers I've ever
seen attached to a single try is 5 or 6. Is there ever going to be a
large enough number that the performance difference will be significant?
(2) Specifically allowing the order of matching to be determined by the
match command.
> I also believe the order in which to consider patterns should be left
> to the match command.
I think non-determinism in the syntax of a language is a very bad thing.
Notice that even [switch] is documented as: "The switch command matches
its string argument against each of the pattern arguments in order", so
the behaviour is deterministic and unsurprising from a user perspective,
and a linear trawl would be no slower than a matcher that uses [switch].
In order to ensure the performance of the "common case" the most common
matcher (probably "-like") could be hard-coded into the [try]
implementation.
Regards,
Twylite
|
|
From: Neil M. <ne...@Cs...> - 2008-11-23 18:04:19
|
On 23 Nov 2008, at 17:06, Twylite wrote:
> Hi,
>> Don't entirely agree with this. I don't believe we need to care
>> about inconsistent sets of vars being defined after the try --
>> it's not a problem for [if], [switch], and every other control
>> structure, so I don't believe we need to give it special
>> consideration here. Agree though that it is generally more useful
>> for whatever pattern matching mechanism is used to be called with
>> a set of pattern/script pairs and the variables already set-up in
>> the callers scope. Whether that means binding the vars for the
>> entire try statement or once per exception code is a matter of
>> choice. Either seems acceptable.
> The difference being that in the case of [if] or [switch] the
> executed body defines the vars; in this case the [try] itself
> defines the vars. Semantics.
I don't think that actually matters. In what circumstance do you
envision this causing a real problem?
> The stronger argument is the performance impact of bringing per-
> handler vars into scope and back out of scope each time.
I don't see this point, could you elaborate? The vars only need to be
defined once.
>>> (e) A [switch]-like "fall through to next statement" would be a
>>> nice-to-have.
>> Clarifying this -- we want the ability to specify the same script
>> for multiple patterns (and possibly multiple exception codes). The
>> switch approach is one way.
> Cool.
>>> 3. Matching
>>> (b) When matching against errorCode:
>>> (i) There is (largely) consensus that basic pattern matching is
>>> "good
>>> enough" "for now". Basic pattern matching may be defined as prefix
>>> matching, glob matching against errorCode (as a string), or an
>>> element-wise list-glob match against errorCode (as a list). In
>>> short
>>> there is no agreement on the right way to do this.
>> If adopting some novel pattern mechanism, then there is the
>> further question of whether to special case that in [try] or to
>> extract it out into a separate command (and separate TIP).
> So that we have [catch], [try], [try2], ... as we discover new and
> different needs for exception handling? No thanks. We should
> either get [try] sufficiently right now (which is closer to 99%
> than 80%) or make it extensible. Preferably the latter since we
> don't know what is 99% right.
No -- I mean you would have [try] and some [lmatch] command.
>>> [...]
>> I don't believe [try] has to execute the bodies. All it has to do
>> is ensure that any option/result variables are defined in the
>> calling scope when that script runs. For example:
> There are a bunch of other things [try] has to do, including
> catching errors off the handlerscript (and match command, for that
> matter) in order to chain the errors, execute the finally script,
> etc. Having the match command execute the body means that it's not
> just a match command but a fully fledged control structure, it must
> behave in a way that is predictable to the [try] command (i.e.
> [try] needs to make certain assumptions about what it will do), and
> the [try] cannot distinguish between a failure in the match command
> and a failure in the handlerscript.
I don't see why [try] has to know anything at all about it. It is
just passed a callback that takes the errorcode and a list of
pattern->script pairs, and simply calls it, returning whatever it
returns (including exceptions). All it needs to do is ensure any
"finally" script runs.
> It also has the potential to make the errorInfo very ugly -- you
> will see an exception in a handler in a matchcommand in a try. If
> you try to use a [return -level] to avoid this you will end up with
> unsafe nesting and/or making assumptions about the internals of [try].
I don't see this as a problem. If [try] is documented as delegating
to a match command then it makes sense for that command to appear in
the stack trace. [try] can always pretty up the errorinfo if it helps.
> In your proposal you also talk about the match command adding a
> default -- this would not work if [try] is expected to chain
> errors, as [try] would catch the default (assumedly rethrown) error
> and chain it to itself (i.e. the error that [try] knows about).
A simple equality check would avoid this (a Tcl_Obj pointer
comparison). Alternatively, the default script can be manufactured to
signal this special condition. It's a problem of implementation not
interface.
> Any way I look at it, having the match command execute the body
> joins together separate concerns (matching, and execution), and
> there are only two arguments for this:
Yes, you can separate these concerns, of course. But neither of them
need to be handled by [try].
> (1) Performance.
>
> The largest number of exception handlers I've ever seen attached to
> a single try is 5 or 6. Is there ever going to be a large enough
> number that the performance difference will be significant?
Possibly, in generated code. E.g. there are quite a large number of
possible HTTP return codes. If these got put into an errorCode {HTTP
302 /redirected.html} then I can quite imagine HTTP client libraries
wanting large try statements and wanting fast lookup.
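To make the "fast lookup" idea concrete, here is a minimal sketch (Tcl
8.5 or later) that dispatches on the status element of an errorCode
such as {HTTP 302 /redirected.html} through a dict, so the cost does
not grow with the number of handlers. The proc name httpDispatch, the
handler table and the result strings are invented for illustration;
nothing here is part of any proposal in this thread.

```tcl
# Hypothetical handler table: HTTP status -> script to evaluate.
set handlers [dict create \
    301 {format "moved permanently to %s" $location} \
    302 {format "redirected to %s" $location} \
    404 {format "not found"}]

proc httpDispatch {handlers script} {
    # Run the script in the caller's scope and capture the outcome.
    set code [catch {uplevel 1 $script} result opts]
    if {$code == 1} {
        # errorCode is a list by convention: {HTTP status location}.
        set ec [dict get $opts -errorcode]
        set status [lindex $ec 1]
        set location [lindex $ec 2]
        # Hash lookup on the status instead of a linear pattern scan.
        if {[lindex $ec 0] eq "HTTP" && [dict exists $handlers $status]} {
            # The handler script sees $status and $location.
            return [eval [dict get $handlers $status]]
        }
    }
    # Success, a non-error exception, or no handler: pass it on unchanged.
    return -options $opts $result
}

set r [httpDispatch $handlers {
    error "redirect" "" {HTTP 302 /redirected.html}
}]
puts $r
```

The trade-off is the one discussed here: a hash lookup only supports
exact matching on the key, so ordering questions never arise, but glob
or prefix patterns are not available.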
> (2) Specifically allowing the order of matching to be determined by
> the match command.
>> I also believe the order in which to consider patterns should be
>> left to the match command.
> I think non-determinism in the syntax of a language is a very bad
> thing. Notice that even [switch] is documented as: "The switch
> command matches its string argument against each of the pattern
> arguments in order", so the behaviour is deterministic and
> unsurprising from a user perspective, and a linear trawl would be
> no slower than a matcher that uses [switch]. In order to ensure the
> performance of the "common case" the most common matcher (probably
> "-like") could be hard-coded into the [try] implementation.
This is the point -- the behaviour isn't non-deterministic as it is
explicit what command is being used for matching, and the docs for
that command specify the ordering used. Avoiding non-determinism doesn't
require that [try] specify every last detail of execution -- it can
happily delegate those responsibilities.
-- Neil
This message has been checked for viruses but the contents of an attachment
may still contain software viruses, which could damage your computer system:
you are advised to perform your own checks. Email communications with the
University of Nottingham may be monitored as permitted by UK legislation.
|
|
From: Twylite <tw...@cr...> - 2008-11-23 20:36:56
|
Hi,
> I don't think that actually matters. In what circumstance do you
> envision this causing a real problem?
>
I don't -- it's an understandability & readability thing. Everyone's
going to have their own take.
>> The stronger argument is the performance impact of bringing per-
>> handler vars into scope and back out of scope each time.
>>
> I don't see this point, could you elaborate? The vars only need to be
> defined once.
>
If the vars are defined per handler, and can be different per handler,
then multiple different variables must be brought into scope (and
possibly out again if you don't want them hanging around if the handler
didn't match).
I think the confusion here is what constitutes a "handler" - according
to all my proposals a "handler" is (return code + optional more specific
pattern), but your recent proposal is (return code + pattern1 + pattern2
+ ...). Clearly there would be issues of performance and dangling
variables in the case I am understanding, but not in the case of your
proposal.
> No -- I mean you would have [try] and some [lmatch] command.
>
Sorry - misunderstanding.
> I don't see why [try] has to know anything at all about it. It is
> just passed a callback that takes the errorcode and a list of
> pattern->script pairs, and simply calls it, returning whatever it
> returns (including exceptions). All it needs to do is ensure any
> "finally" script runs.
>
Because [try] is a (core) command that is promising a particular
interface & behaviour, but the implementation cannot guarantee the
behaviour as it delegates too much to a matcher that _may_ be
implemented outside the core.
And because, as you highlight below, you have to identify and compensate
for the corner cases in the matcher.
> I don't see this as a problem. If [try] is documented as delegating
> to a match command then it makes sense for that command to appear in
> the stack trace. [try] can always pretty up the errorinfo if it helps.
>
> A simple equality check would avoid this (a Tcl_Obj pointer
> comparison). Alternatively, the default script can be manufactured to
> signal this special condition. It's a problem of implementation not
> interface.
>
All I'm saying is that rather than have the matcher actually execute the
script, it should return it (or the index of the script in whatever
list/dict was provided to the matcher) and allow the [try] to execute
the script directly.
>> (1) Performance.
>>
>> The largest number of exception handlers I've ever seen attached to
>> a single try is 5 or 6. Is there ever going to be a large enough
>> number that the performance difference will be significant?
>>
> Possibly, in generated code. E.g. there are quite a large number of
> possible HTTP return codes. If these got put into an errorCode {HTTP
> 302 /redirected.html} then I can quite imagine HTTP client libraries
> wanting large try statements and wanting fast lookup.
>
Fair case.
But how will they do it now? I would imagine most developers would
happily use a [switch], not realising that they are not getting O(1)
performance out of it.
>> I think non-determinism in the syntax of a language is a very bad
>> thing. Notice that even [switch] is documented as: "The switch
>> command matches its string argument against each of the pattern
>> arguments in order", so the behaviour is deterministic and
> This is the point -- the behaviour isn't non-deterministic as it is
> explicit what command is being used for matching, and the docs for
> that command specify the ordering used. Non-determinism doesn't
> require that [try] specify every last detail of execution -- it can
> happily delegate those responsibilities.
>
The point is that irrespective of whether you are using -glob, -regex or
-exact, you as a developer can scan the [switch] cases in left-to-right
order and know that the first match will be the one that will be used.
If [try] delegates its ordering then you cannot do this. You need to
know the behaviour of "try -command mymatcher".
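A small sketch of the ordering point, using only [switch -glob]: with
overlapping patterns the first match wins, so swapping two arms changes
the result. The proc names and the result strings are invented for
illustration.

```tcl
# Patterns are tried in order; the first one that matches is used.
proc classify {errorCode} {
    switch -glob -- $errorCode {
        "POSIX ENOENT *" { return "missing file" }
        "POSIX *"        { return "other POSIX error" }
        default          { return "unknown" }
    }
}

# Same arms with the general pattern first: the specific arm can no
# longer be reached.
proc classifyReordered {errorCode} {
    switch -glob -- $errorCode {
        "POSIX *"        { return "other POSIX error" }
        "POSIX ENOENT *" { return "missing file" }
        default          { return "unknown" }
    }
}

set ec {POSIX ENOENT {no such file or directory}}
puts [classify $ec]            ;# missing file
puts [classifyReordered $ec]   ;# other POSIX error
```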
Given "try -command oo_matcher { .... } on error SomeException { ... }
on error OtherException { ... }" it is reasonable to assume that you're
matching on the class of the exception object, but if OtherException is
a child of SomeException, which one will match? Language syntax should
enable you to determine that. Leaving it to a pluggable handler means
that a novice developer or maintenance coder needs to understand every
nuance of [try] and every matcher you use to understand the behaviour of
a rather elementary control structure.
Regards,
Twylite
|
|
From: Neil M. <ne...@Cs...> - 2008-11-23 21:35:18
|
On 23 Nov 2008, at 20:36, Twylite wrote:
> [...]
>>> The stronger argument is the performance impact of bringing per-
>>> handler vars into scope and back out of scope each time.
>>>
>> I don't see this point, could you elaborate? The vars only need to be
>> defined once.
>>
> If the vars are defined per handler, and can be different per handler,
> then multiple different variables must be brought into scope (and
> possibly out again if you don't want them hanging around if the handler
> didn't match).
> I think the confusion here is what constitutes a "handler" - according
> to all my proposals a "handler" is (return code + optional more specific
> pattern), but your recent proposal is (return code + pattern1 + pattern2
> + ...). Clearly there would be issues of performance and dangling
> variables in the case I am understanding, but not in the case of your
> proposal.
In my scheme only a single handler ever gets as far as defining its
variables. So there is no need to bring multiple sets of vars into
and out of scope.
>> No -- I mean you would have [try] and some [lmatch] command.
>>
> Sorry - misunderstanding.
>
>> I don't see why [try] has to know anything at all about it. It is
>> just passed a callback that takes the errorcode and a list of
>> pattern->script pairs, and simply calls it, returning whatever it
>> returns (including exceptions). All it needs to do is ensure any
>> "finally" script runs.
>>
> Because [try] is a (core) command that is promising a particular
> interface & behaviour, but the implementation cannot guarantee the
> behaviour as it delegates too much to a matcher that _may_ be
> implemented outside the core.
Then don't guarantee that behaviour.
> And because, as you highlight below, you have to identify and compensate
> for the corner cases in the matcher.
>> I don't see this as a problem. If [try] is documented as delegating
>> to a match command then it makes sense for that command to appear in
>> the stack trace. [try] can always pretty up the errorinfo if it helps.
>>
>> A simple equality check would avoid this (a Tcl_Obj pointer
>> comparison). Alternatively, the default script can be manufactured to
>> signal this special condition. It's a problem of implementation not
>> interface.
>>
> All I'm saying is that rather than have the matcher actually execute the
> script, it should return it (or the index of the script in whatever
> list/dict was provided to the matcher) and allow the [try] to execute
> the script directly.
Sure, you *could* do that, but that excludes using [switch] or most
other control structures, which expect to directly execute the chosen
branch rather than just returning it. I really don't see what is
gained from having [try] execute the script: it's the difference
between doing [catch {$matchcmd ...}] vs set script [$matchcmd ...];
catch {uplevel 1 $script}.
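The two shapes being compared can be sketched as follows. This is only
an illustration under assumed names: execMatcher and pickMatcher are
hypothetical, and the handlers are a flat list of pattern/script pairs.

```tcl
# Style 1: the matcher executes the chosen branch itself, here by
# handing the pattern/script pairs straight to [switch].
proc execMatcher {errorCode handlers} {
    uplevel 1 [list switch -glob -- $errorCode $handlers]
}

# Style 2: the matcher only picks a script and returns it; the caller
# (standing in for [try]) evaluates it.
proc pickMatcher {errorCode handlers} {
    foreach {pattern script} $handlers {
        if {[string match $pattern $errorCode]} {
            return $script
        }
    }
    return ""
}

set handlers {
    {ARITH DIVZERO*} {format "divide by zero"}
    *                {format "something else"}
}

# Produce a real errorCode to match against.
catch {expr {1 / 0}} msg opts
set ec [dict get $opts -errorcode]

# Style 1: the matcher runs the body.
set viaExec [execMatcher $ec $handlers]

# Style 2: the matcher returns the body, the caller runs it.
# (#0 because this sketch runs at global level.)
set script [pickMatcher $ec $handlers]
set viaPick [uplevel #0 $script]

puts [list $viaExec $viaPick]
```

Both styles reach the same handler here. They differ in who owns the
evaluation, and so in who can tell a failure in the matcher apart from
a failure in the handler script.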
>>> (1) Performance.
>>>
>>> The largest number of exception handlers I've ever seen attached to
>>> a single try is 5 or 6. Is there ever going to be a large enough
>>> number that the performance difference will be significant?
>>>
>> Possibly, in generated code. E.g. there are quite a large number of
>> possible HTTP return codes. If these got put into an errorCode {HTTP
>> 302 /redirected.html} then I can quite imagine HTTP client libraries
>> wanting large try statements and wanting fast lookup.
>>
> Fair case.
> But how will they do it now? I would imagine most developers would
> happily use a [switch], not realising that they are not getting O(1)
> performance out of it.
switch -exact is O(1), or should be.
>>> I think non-determinism in the syntax of a language is a very bad
>>> thing. Notice that even [switch] is documented as: "The switch
>>> command matches its string argument against each of the pattern
>>> arguments in order", so the behaviour is deterministic and
>> This is the point -- the behaviour isn't non-deterministic as it is
>> explicit what command is being used for matching, and the docs for
>> that command specify the ordering used. Non-determinism doesn't
>> require that [try] specify every last detail of execution -- it can
>> happily delegate those responsibilities.
>>
> The point is that irrespective of whether you are using -glob, -regex or
> -exact, you as a developer can scan the [switch] cases in left-to-right
> order and know that the first match will be the one that will be used.
> If [try] delegates its ordering then you cannot do this. You need to
> know the behaviour of "try -command mymatcher".
What's wrong with that? If I know it's
try -matchcommand {switch -glob --} then I know to expect left-to-right
behaviour. If I know it's something based on a hash lookup, then I know
to expect only exact matching.
>
> Given "try -command oo_matcher { .... } on error SomeException { ... }
> on error OtherException { ... }" it is reasonable to assume that you're
> matching on the class of the exception object, but if OtherException is
> a child of SomeException, which one will match?
That's up to oo_matcher. That's a good thing.
> Language syntax should enable you to determine that. Leaving it to a
> pluggable handler means that a novice developer or maintenance coder
> needs to understand every nuance of [try] and every matcher you use to
> understand the behaviour of a rather elementary control structure.
That argument applies to any command that takes a callback. You could
equally say that no-one can know the behaviour of [lsort -command]
without knowing every possible comparison function. Of course, it's
not a problem because the language syntax *does* make it obvious
which command is being used: You look at the -command/-matchcommand
option and examine the docs of the corresponding command.
-- Neil
|
|
From: Donal K. F. <don...@ma...> - 2008-11-23 16:37:15
|
Twylite wrote:
> I think it's time for a summary of where we are on the try/catch/finally.
[...]
> (f) Have reasonable performance, at least for the common cases.
Some notes on this: I plan to make [try] bytecode compiled as it has
script bodies. But it is highly likely that I won't have time to do this
before 8.6b1! To facilitate this, only requiring equality- or
glob-matching (and having it be compile-time decidable) will be good.
Please do not require lots of nested parsing of lists and things like
that either (especially lists of scripts) since working with those
things in the compiler is really unpleasant.
For the b1 release, a pure Tcl scripted version will be good enough.
> (c) A [try] statement is going to look more like an "if {} then {} else
> {}" than a "switch { case {body} case {body} }". The former seems to be
> preferred by everyone involved in the discussions.
In particular, it's much easier to compile.
> (a) In general the matching of exceptions (return code) and errors
> (errorCode where the return code is TCL_ERROR) are separate concerns.
> It makes sense to exploit this by using a fast/exact match against the
> return code first (meeting the performance requirement) followed by a
> slower match against the errorCode.
We'll probably build a jump table.
> (b) When matching against errorCode:
> (i) There is (largely) consensus that basic pattern matching is "good
> enough" "for now". Basic pattern matching may be defined as prefix
> matching, glob matching against errorCode (as a string), or an
> element-wise list-glob match against errorCode (as a list). In short
> there is no agreement on the right way to do this.
I think we'll find that glob-matching is good enough. Almost everyone
will use it for prefix matching; Tcl's glob-matcher is good at that.
> (ii) There is also no guarantee that a match against errorCode will be
> adequate in the future. For example an OO-style error object may be
> developed.
Out Of Scope! If someone needs something that complicated, they'll have
to write their own code.
> (iii) An [expr]-type match is the most flexible but the lowest
> performance (and potentially ugliest syntax); as such it is not suitable
> (at least not as a default).
Need I repeat myself? OOS!
> (iv) Delegating to [switch] for matching is a nice compromise of
> performance and flexibility (and reuses existing functionality), but
> brings with it the baggage of the [switch] command's interface.
If we require [switch] for the normal case, we might as well not use
[try] at all. The purpose of [try] is to reduce baggage for important
classes of error handling.
> (c) The only thing we _can_ be sure of is that whatever we choose now
> will be inadequate in some way, implying that the syntax of [try] must
> contain provision for future extension.
> (d) Taking (c) to its logical conclusion, [try] must be specified and
> implemented to support user-selectable pattern matching. It is possible
> to have the matcher selected for the [try] as a whole, or per handler,
> and there are pros and cons to each approach.
>
> In terms of (d) my personal preference is to specify the matcher per
> handler. It is difficult to predict how different packages/libraries
> may approach error handling, both now and in the future (e.g. a future
> move from -errorCode to an OO-style error object). If the matcher is
> selected for the [try] as a whole it may only be possible to support
> disparate error handling styles by using the most flexible and complex
> matcher (say [expr]-based), which could be an unnecessary complication.
> The same holds now for integrating with legacy code that only produces
> meaningful error information in the result.
I write quite a lot of Java code, and I don't think there's anything
really worth it to be gained from OO exceptions. A list that folks can
match against is good enough, and they can add their own complexity if
they really want. We're building a bikeshed, not an aircraft carrier!
> On
> the issue of ordering, left-to-right is the only order that makes sense
> for [expr]-based matching, and is the norm in other languages.
Not all. C is unspecified. But Tcl is very much left-to-right.
> The implementation will probably provide the following handlers by
> default (users can implement their own):
> - -like for glob matching against errorCode as a string (perhaps -glob?)
> - -llike for element-wise list-glob matching against errorCode as a list
> - -expr for expr-based matching (with access to return code, result &
> options dict)
Too complicated by far. Glob is enough. If people want to match by ouija
board, they can write their own command. (To be clear, that's an example
of a carrier deck, undoubtedly useful to some but not part of any
sensible bikeshed...)
> I feel that this proposal meets the requirements with the greatest
> flexibility and the least ambiguity. But of course that's my opinion.
I feel that you're chasing off in the wrong direction. Try this:
try script ?as {msgvar optvar}? ?handler...? ?finally script?
Each handler is one of these:
on code script
trap glob script
Where 'code' is any numeric code or named alias or '*' (to mean any) and
'glob' is a pattern according to [string match] to be checked against
the errorcode with an implied code of 'error'. Only errors in the
initial script are trapped; errors in any handler replace the original.
The finally script is run after all else, and in all cases (except for
interpreter deletion, execution cancellation or resource exhaustion) and
errors in *it* will replace all others. All handlers except the last one
may be the string literal '-', which means use the one following; the
last one must not be that, and the finally clause is not a handler.
I'll not argue over the names 'on' or 'trap'. Expect a fight on anything
else as this is probably as complicated as it is sensible to go. :-)
Note that there's no need for an explicit rethrowing command (can do
that with [return] and the options dict) and there's no need for an
explicit variable for the code; it's in the options dict.
Donal.
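For comparison, the semantics described above (capture message and
options, dispatch on code or on a glob against the errorcode, always
run the finally part, let a handler error replace the original) can be
approximated today with [catch] and the options dict. This is a rough
sketch for Tcl 8.5, not the proposed implementation; readConfig,
loadConfig and the CONFIG MISSING errorcode are made-up names.

```tcl
# Hypothetical operation that fails with a structured errorCode.
proc readConfig {path} {
    return -code error -errorcode [list CONFIG MISSING $path] \
        "no config file \"$path\""
}

proc loadConfig {path} {
    set log {}
    # "try script as {msg opts}": run the body, capture result and options.
    set code [catch {readConfig $path} msg opts]
    set ec ""
    if {$code == 1} { set ec [dict get $opts -errorcode] }
    # Handlers run inside their own catch so that the cleanup below
    # happens even if a handler itself fails.
    set hcode [catch {
        if {$code == 1 && [string match "CONFIG MISSING *" $ec]} {
            # trap {CONFIG MISSING *} script
            lappend log "trapped: $msg"
        } elseif {$code == 0} {
            # on ok script
            lappend log "ok: $msg"
        } else {
            # No handler matched: rethrow via the options dict.
            return -options $opts $msg
        }
    } hmsg hopts]
    # finally script
    lappend log "cleanup"
    if {$hcode != 0} {
        # Propagate the unhandled exception or the handler's own error.
        return -options $hopts $hmsg
    }
    return $log
}

set result [loadConfig /no/such/file]
puts $result
```

The amount of bookkeeping in loadConfig is the baggage a built-in [try]
is meant to remove.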
|
|
From: Twylite <tw...@cr...> - 2008-11-23 17:45:00
|
Hi,
> Please do not require lots of nested parsing of lists and things like
> that either (especially lists of scripts) since working with those
> things in the compiler is really unpleasant.
>
I take it from your proposal for "as {msgvar optsvar}" that this isn't
considered "nested parsing of lists"?
> Out Of Scope! If someone needs something that complicated, they'll have
> to write their own code.
>
Having to write your own control structure just because the existing one
doesn't do what you need (or at least doesn't do it in a pretty way) is
what this exercise is all about, and what we're trying to avoid
happening again.
> I write quite a lot of Java code, and I don't think there's anything
> really worth it to be gained from OO exceptions. A list that folks can
> match against is good enough, and they can add their own complexity if
> they really want. We're building a bikeshed, not an aircraft carrier!
>
Oh dear ... I was building a Jeep.
> Too complicated by far. Glob is enough. If people want to match by ouija
> board, they can write their own command. (To be clear, that's an example
> of a carrier deck, undoubtedly useful to some but not part of any
> sensible bikeshed...)
>
It occurs to one that once upon a time there was a need for a simpler,
prettier alternative to
if/then/elseif/elseif/elseif/elseif/elseif/else. And so [switch] was
born. It also occurs to me that in C a switch is over a set of integer
values. In Tcl it was obvious to make [switch] operate on strings, but
not just that - it would be able to match against wildcard patterns _and
regular expressions_. And to do so it would add interface complexity
and sacrifice performance (in particular it was necessary to specify the
order of evaluation).
Are you _sure_ glob is enough? I'm not. So I want a syntax that
doesn't preclude extension (in a pretty way) to handle other options in
the future. And I'd like a syntax that allows developers to create these
extensions outside the core, so that these options can evolve in future
rather than end up in a lengthy discussion that really has few facts and
figures to back up things like "most developers" and "common case".
> I feel that you're chasing off in the wrong direction. Try this:
>
> try script ?as {msgvar optvar}? ?handler...? ?finally script?
>
> Each handler is one of these:
>
> on code script
> trap glob script
>
Versus:
on code ?-howtomatch whattomatch? script
I cannot comment on the implications of byte-coding that, but I do feel
that it is more consistent (on error vs trap), more flexible, etc.
Your proposal is of course extensible by adding new handler keywords in
future (assuming the TIP proposers at the time can agree on the
keyword), but this would have to be done in the core.
Regards,
Twylite
|
|
From: Twylite <tw...@cr...> - 2008-11-23 17:50:09
|
Forgot:
> I'll not argue over the names 'on' or 'trap'. Expect a fight on anything
> else as this is probably as complicated as it is sensible to go. :-)
> Note that there's no need for an explicit rethrowing command (can do
> that with [return] and the options dict) and there's no need for an
> explicit variable for the code; it's in the options dict.
>
catch { return -code 5 FAIL } em opts
2
dict get $opts -code
5
?
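The session above can be spelled out a little further. When [return] is
evaluated directly in the caught script, [catch] reports TCL_RETURN (2)
and the requested code is only visible in the options dict, together
with a -level of 1. When the same [return] sits inside a proc, the proc
boundary consumes the level and the caller's [catch] sees the code
itself. A short sketch (the proc name fail5 is made up):

```tcl
# [return] evaluated directly inside the caught script.
set direct [catch { return -code 5 FAIL } em opts]
set directCode  [dict get $opts -code]
set directLevel [dict get $opts -level]

# The same [return] inside a proc.
proc fail5 {} { return -code 5 FAIL }
set viaProc [catch { fail5 } em2 opts2]
set procCode  [dict get $opts2 -code]
set procLevel [dict get $opts2 -level]

puts [list $direct $directCode $directLevel]   ;# 2 5 1
puts [list $viaProc $procCode $procLevel]      ;# 5 5 0
```

So the -code entry of the options dict matches the return code that
[catch] reports in every case except TCL_RETURN.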
|
|
From: Donal K. F. <don...@ma...> - 2008-11-23 18:04:48
|
Twylite wrote:
> Are you _sure_ glob is enough? I'm not. So I want a syntax that
> doesn't preclude extension (in a pretty way) to handle other options in
> the future. And I'd like a syntax that allows developers to create these
> extensions outside the core, so that these options can evolve in future
> rather than end up in a lengthy discussion that really has few facts and
> figures to back up things like "most developers" and "common case".
I don't want any of that high-falutin' baggage. I do not think the
practical use-cases justify it.
> Versus:
> on code ?-howtomatch whattomatch? script
>
> I cannot comment on the implications of byte-coding that, but I do feel
> that it is more consistent (on error vs trap), more flexible, etc.
It's too complicated. (Or not complicated enough since it doesn't permit
arbitrary matching of arbitrary subsets of options. After all it's
*totally vital* that I be able to use soundex matching on the error
message when it's on the 13th-22nd line of the body while dealing with
some custom extra parameters!!! </sarcasm>) I'll go with dealing with
the 90% use-case.
> Your proposal is of course extensible by adding new handler keywords in
> future (assuming the TIP proposers at the time can agree on the
> keyword), but this would have to be done in the core.
Your proposal goes so far towards being flexible that it ceases to be
practical. Cut out the complexity; it'll be good enough.
Donal.
|
|
From: Twylite <tw...@cr...> - 2008-11-23 20:16:57
|
Not to nitpick but ...
> It's too complicated. (Or not complicated enough since it doesn't permit
> arbitrary matching of arbitrary subsets of options. After all it's
> *totally vital* that I be able to use soundex matching on the error
> message when it's on the 13th-22nd line of the body while dealing with
> some custom extra parameters!!! </sarcasm>)
as {em opts} on error -expr { [dict get $opts -errorline] >= 13 && [dict
get $opts -errorline] <= 22 && [soundex match $PATTERN $em] } { ... }
:)
Twylite
|
|
From: Magentus <mag...@gm...> - 2008-11-23 16:53:56
Attachments:
signature.asc
|
On Sun, 23 Nov 2008 14:45:17 +0200,
Twylite <tw...@cr...> wrote:
> (g) Discourage the use of the result for determining the nature of
> the error (and in doing so encourage the use of -errorCode). At the
> very least this means not having default support for matching on the
> result.
Keeping in mind that the return result might be the NORMAL place to
match for SOME return codes. Namely OK and custom codes >4.
> (b) Handlers are identified by keywords. The keyword "catch" has
> been argued against (confusion with existing language
> feature/keyword), as has "except" (ambiguous - "with exception" or
> "except for"). Likely candidates are "on" and "handle".
I like "on" for the generic catch-a-return-code case, and "handle" as
in handle-the-error.
> (d) Taking (c) to its logical conclusion, [try] must be specified and
> implemented to support user-selectable pattern matching. It is
> possible to have the matcher selected for the [try] as a whole, or
> per handler, and there are pros and cons to each approach.
Definitely per-handler. But source and test are different things;
message source will mostly be dependent on the return code, where any
of the basic types of string match test can be applied to every possible
message source. So unless you want every combination of source and
test explicitly spelt out in its own matcher, they need to be separate.
If a whole new error passing paradigm evolves, a new [try] token can be
built to work with it, which could be as simple as a new term which
takes a sub-set of the existing terms as its first argument, and
emulates those terms (I doubt the existing code will be particularly
reusable in that case anyhow).
> (b) The behaviour of the [try] should be predictable and conform to
> the principle of least surprise. One particular consequence of this
> is that matchers must consider handlers/errorPatterns in
> left-to-right order, and all handlers should be executed in the same
> fashion (implying that the [try] rather than the matcher should
> execute the handler body).
Would it make sense to "accumulate" finally bodies as you go through,
until you reach an active handler? This would mean there's a subtle
twist and a bit of surprise, in that the finally block should be right
behind the main [try] block, before any error handlers. Conversely,
run finally blocks only from a matching handler down. Does that make
any practical sense?
Also, would it make any sense to do it in the style of a C switch...
The handler body has to [break], otherwise it continues to try and
match. For example, on a write error, you might want to send an error
message and flush. Then if it's not a file closed error, send a
"connection closed" message, and flush. Finally you'll close the
connection if it's still open, if there was any kind of error at all.
This would set [try] apart from any other TCL control structure, giving
it a unique niche that even catch+switch can't readily support. Or
conversely, [continue] would cause it to keep looking, [break] would
prevent it from executing the finally block, and [return -code return]
could conceivably be used to alter the matching from here on down.
> Concerns:
> - How to handle "all other errors" (-like * would work, is that good
> enough?)
Personally I'd like an "else" clause... But "on *" would be good in
the presence of [continue] from above. And with basic glob matching,
"handle *" could short-circuit and not even bother doing the match.
Your present concept of the matcher is broken. It should not dictate
the source of the string being matched against, that's already done for
all normal usage cases, and the rest can be handled by [expr]-based
matching or almost certainly are better handled by an entirely
different structure.
> - No handlerscript may begin with a "-".
Sucky. My plan puts options only before the match string. If there's
no match string, then there's no options, either.
> - No feedback yet on "as {vars}" and the order of the vars
Sure you have. I suggested something like that earlier, except called
"catch". This is better.
> - If there are multiple handlers and one is unqualified, should it be
> executed first or last?
In the name of least surprise, execute it where it stands. That might
mask more specific ones further down, but it avoids magic.
Neil:
> Don't entirely agree with this. I don't believe we need to care
> about inconsistent sets of vars being defined after the try -- it's
> not a problem for [if], [switch], and every other control structure,
> so I don't believe we need to give it special consideration here.
It doesn't apply to [if] and most other control structures, and it only
applies to the [regexp] part of [switch], which is a very different
beast, as every single pattern has a different source of values for the
variables.
> Clarifying this -- we want the ability to specify the same script
> for multiple patterns (and possibly multiple exception codes). The
> switch approach is one way.
That can be done regardless.
> If adopting some novel pattern mechanism, then there is the further
> question of whether to special case that in [try] or to extract it
> out into a separate command (and separate TIP).
It's been needed for a long time. Basic glob match for now, expand on
it once a decent mechanism is in place.
> I'd drop -llike -- too similar to -like, and the details of list
> matching are complex to do right with nested sub-lists. But that
> assumes all list elements are just strings (rather than e.g. being
> sub-lists themselves), so only partially addresses the issue.
In a general context, that would be a problem. But is it expected to
be a problem for errorcode? Sounds like the purpose of errorcode needs
revisiting, to figure out just what it is and isn't supposed to be,
because there seems to be some disagreement as to its complexity.
>> - No handlerscript may begin with a "-".
> Which conflicts with the specification that "-" as a handlerscript
> means fall-through to next branch.
Only if your handlerscript is a command called "-" which takes no
arguments.
> Overall, I think the proposal ends up with [try] doing too much. In
> particular, it seems doomed to a linear trawl through various match
> conditions. Specifying an overall match command and then passing it
> all the patterns and handler scripts at once gives much more freedom
> for efficient implementation.
I had the strange impression that a linear trawl through various match
conditions is exactly what we're doing. It can be made a little more
intelligent by grouping based on code, and adjacent matches using the
same match type can be grouped by a [try] bytecode compiler (as DKF
pointed out just as I was about to click Send).
DKF:
> I feel that you're chasing off in the wrong direction. Try this:
> try script ?as {msgvar optvar}? ?handler...? ?finally script?
> Each handler is one of these:
> on code script
> trap glob script
I hate to say it, but that's the basis of what I've been arguing for,
except that I added one extra handler type for OK (match on return
value for non-fatal failures), and the ability for code to have a
second word (again matched against the return value) for the regular
case of return codes >4.
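
A minimal pure-Tcl sketch of the outline DKF quotes above, for
experimenting with the proposed shape. This is an assumption-laden
illustration, not the eventual core command: the name "simpletry" is
hypothetical, "trap" glob-matches the string form of the -errorcode
option, and a failing "finally" script simply replaces the earlier
outcome. It needs Tcl 8.5 or later for [dict], [lassign] and the
options argument of [catch].

```tcl
# Hypothetical helper illustrating:
#   simpletry script ?as {msgvar optvar}? ?handler...? ?finally script?
# where each handler is "on code script" or "trap glob script".
proc simpletry {script args} {
    # Optional leading "as {msgvar optvar}" clause.
    set msgvar {}
    set optvar {}
    if {[lindex $args 0] eq "as"} {
        lassign [lindex $args 1] msgvar optvar
        set args [lrange $args 2 end]
    }
    # Optional trailing "finally script" clause. Handlers come in
    # triples, so a finally clause leaves a remainder of two words.
    set finally {}
    if {[llength $args] % 3 == 2 && [lindex $args end-1] eq "finally"} {
        set finally [lindex $args end]
        set args [lrange $args 0 end-2]
    }
    if {[llength $args] % 3 != 0} {
        return -code error "malformed handler list"
    }
    set codes {ok 0 error 1 return 2 break 3 continue 4}

    # Run the body in the caller's frame and capture the outcome.
    set code [catch {uplevel 1 $script} msg opts]
    if {$msgvar ne ""} { uplevel 1 [list set $msgvar $msg] }
    if {$optvar ne ""} { uplevel 1 [list set $optvar $opts] }

    # Consider handlers strictly left to right; the first match wins.
    foreach {kind pattern body} $args {
        switch -- $kind {
            on {
                if {[dict exists $codes $pattern]} {
                    set pattern [dict get $codes $pattern]
                }
                set hit [expr {$pattern == $code}]
            }
            trap {
                set hit [expr {$code == 1 &&
                    [string match $pattern [dict get $opts -errorcode]]}]
            }
            default {
                return -code error "unknown handler type \"$kind\""
            }
        }
        if {$hit} {
            # The handler's outcome replaces the body's outcome.
            set code [catch {uplevel 1 $body} msg opts]
            break
        }
    }

    # The finally script always runs; if it fails, its failure wins.
    if {$finally ne ""} {
        if {[catch {uplevel 1 $finally} fmsg fopts]} {
            set msg $fmsg
            set opts $fopts
        }
    }
    # Pass the final outcome (result, error, break, ...) to the caller.
    dict incr opts -level
    return -options $opts $msg
}

# Usage: the error code below is made up for the example.
set r [simpletry {
    error "no file" {} {POSIX ENOENT {no such file}}
} as {m o} trap {POSIX *} {
    set got "posix: $m"
} on error {
    set got other
} finally {
    set done 1
}]
puts $r
```

The handlers are walked linearly, which matches the "linear trawl"
described earlier in this message; a compiled implementation could
group them differently without changing the visible behaviour.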
--
Fredderic
Debian/unstable (LC#384816) on i686 2.6.23-z2 2007 (up 46 days, 10:07)
|
|
From: Joe E. <jen...@fl...> - 2008-11-23 17:02:28
|
Twylite wrote:
[ extensive summary - thanks, that was helpful! ]
Just one comment for now:
> 2. Look & Feel
> (d) Capture of variables (return code, result and options dict) needs to
> happen at the front of the [try] for the statement as a whole, rather
> than per handler. This avoids confusion over which vars will be defined
> after the [try] returns, and also avoids variable churn if the
> errorPattern to be matched can access the variables. Although some
> amount of locality is lost, this also makes the syntax cleaner (less
> repeated "noise").
I'm not sure about this.
try {
open $filename w
} as {code var} on ok {
# (1)
} on error {
# (2)
}
In block (1), $var holds an open file handle;
in block (2) it holds an error message.
Personally I'd prefer having different variable names
in the different branches:
try {
open $filename w
} on {ok fp} {
...
} on {error msg} {
...
}
It also unconditionally binds the variable 'code',
which is unused and unneeded. (In blocks (1) and (2)
you already know what $code is; outside of the [try]
command you no longer care.)
(I'm also somewhat alarmed at the growing complexity
of the error matching facilities -- especially because
I am not yet convinced that they will be useful in practice.
I'd rather start with something braindead simple and
add to it later. Features are easy to add, but
incompletely baked features are very very hard to
get rid of.)
--Joe English
jen...@fl...
|
|
From: <lm...@bi...> - 2008-11-23 17:12:49
|
On Sun, Nov 23, 2008 at 09:02:05AM -0800, Joe English wrote:
> (I'm also somewhat alarmed at the growing complexity
> of the error matching facilities -- especially because
> I am not yet convinced that they will be useful in practice.
> I'd rather start with something braindead simple and
> add to it later. Features are easy to add, but
> incompletely baked features are very very hard to
> get rid of.)
Amen.
--
---
Larry McVoy lm at bitmover.com http://www.bitkeeper.com
|
|
From: Donal K. F. <don...@ma...> - 2008-11-23 17:20:39
|
Joe English wrote:
> (I'm also somewhat alarmed at the growing complexity
> of the error matching facilities -- especially because
> I am not yet convinced that they will be useful in practice.
> I'd rather start with something braindead simple and
> add to it later. Features are easy to add, but
> incompletely baked features are very very hard to
> get rid of.)
I agree. At least some of the things floating round are enough over
what seems sensible and practical that they'll attract a NO vote from
me if formally proposed.
Donal.
|
|
From: Twylite <tw...@cr...> - 2008-11-23 17:55:56
|
Hi,
> I'm not sure about this.
>
> try {
> open $filename w
> } as {code var} on ok {
> # (1)
> } on error {
> # (2)
> }
>
> In block (1), $var holds an open file handle;
> in block (2) it holds an error message.
>
This is of course what you get from [catch] at present.
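
For illustration, a small sketch of that existing [catch] behaviour.
The helper name "run" is made up for this example; the point is only
that one variable receives either the result or the error message,
depending on the return code.

```tcl
# Hypothetical helper: evaluate a script in the caller's frame and
# report both the return code and whatever [catch] stored in the
# result variable.
proc run {script} {
    set code [catch {uplevel 1 $script} var]
    return [list $code $var]
}

puts [run {expr {6 * 7}}]       ;# code 0: var holds the result
puts [run {error "disk full"}]  ;# code 1: var holds the error message
```

With vars declared once at the front of [try], each handler would see
the same kind of dual-purpose variable.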
> Personally I'd prefer having different variable names
> in the different branches:
>
Your objection has been forwarded to Fredderic and NEM.
This one has gone back and forth on Tcl-core more than once, and in
private discussions. Having the vars at the front of [try] reduces
repetition, which keeps the look cleaner. It may also make
implementation simpler & more efficient depending on how we match
exceptions & errors. Having the vars with the handler improves
locality, and some developers find it more readable. Sounds like this
is a matter of personal preference that is going to be hard to resolve.
> It also unconditionally binds the variable 'code',
> which is unused and unneeded. (In blocks (1) and (2)
> you already know what $code is; outside of the [try]
> command you no longer care.)
>
As someone who uses "code" all over the place as a local variable, I'd
be really unhappy with a new control structure automagically defining
variables in my stack frame that I didn't tell it to (and I can't think
of any other Tcl command that does this).
> (I'm also somewhat alarmed at the growing complexity
> of the error matching facilities -- especially because
> I am not yet convinced that they will be useful in practice.
> I'd rather start with something braindead simple and
> add to it later. Features are easy to add, but
> incompletely baked features are very very hard to
> get rid of.)
>
I want a syntax that handles adding the features later without making
the [try] an ugly construct. I would _like_ a syntax that allows those
features to be added outside the core. At this point the core only
needs to provide glob-style matching for error codes.
Regards,
Twylite
|
|
From: Magentus <mag...@gm...> - 2008-11-25 04:31:21
|
On Sun, 23 Nov 2008 19:55:51 +0200,
Twylite <tw...@cr...> wrote:
>> In block (1), $var holds an open file handle;
>> in block (2) it holds an error message.
> This is of course what you get from [catch] at present.
>> Personally I'd prefer having different variable names
>> in the different branches:
> Your objection has been forwarded to Fredderic and NEM.
If the vars can be shoe-horned into each branch optionally, or at least
not too ugly, I'm all for it, personally.
My main concern was the complexity of trying to fit everything into the
one statement. If you try to match a decent-length error message, add
two decently descriptive variable names, a level or two of indenting,
and the handler command, you can all too easily end up having to split
it over two lines, which is going to totally frag any hope of visual clarity.
A little effort on picking purpose-neutral variable names (as you have
to do with [catch] already) allows you to define it once up front, and
keep the individual handler lines short and sweet. How's this as an
idea to combine the "no matching is needed" thoughts...
try script
as {vars}
handler ?-- errorcode-pattern? body
return ?-- returnvalue-pattern? body
on {code ?-- returnvalue-pattern?} body
finally body
The -- "option" here indicates there is a pattern to match, but will
later be the place holder for the match type if there's consensus that
"beyond-glob" matching is needed.
This allows you to omit the pattern string, and divide the inevitable
embedded [switch] statement into per-code blocks with whatever matcher
you wish:
try {
script
} as {
response options
} return {
switch -regexp ... {
... different kinds of OK response ...
}
} handler -- "POSIX *" {
switch -exact -- [lindex $response 1] {
... match the gauntlet of POSIX errors ...
}
} on error {
switch -- $response {
... traditional (bad) error returning ...
}
} expr {some freaky conditional stuff} {
... do the jolly jumbuck ...
} finally {
cleanup
}
Clean, minimal if you want it to be, hugely flexible when you need it,
you can add other types of pattern matching later (the sentinel is
there for when it's needed, AND it's actually being used so it'll
already be there avoiding future compatibility issues), and you can
still allow for plugging in other types of [try] keyword (in place of
as, handler, on, expr, and finally).
Of course, it could be -pattern in place of just --, or something like
that. I'd rather avoid -match here, because as another post suggests,
I'd like to see that used for a generic pattern matching framework that
would be consistent across all TCL commands, and remove the constantly
expanding mass of match type options. (If I remember correctly we've
already had a clash between a match type option and a non-matching
option, in which it was necessary to choose a new name for the
new option.)
You could even bring back per-handler vars, if you do use -pattern or
something in place of the simple --. And if the -- sentinel is still
allowed as purely optional, then you don't even have the barely
relevant constraint that the handler body can't be -vars or -pattern,
or a prefix thereof, AND not taking any arguments.
As far as I can tell, that pattern fits EVERYONE's present
requirements, and shouldn't be too much trouble to compile.
As an aside: extending it (user-wise) could be tricky (although I
suspect expanding it efficiently is anyhow). My personal preference
would be for an expansion to return a list with the number of arguments
consumed, and the outcome (whether the body has been evaluated already,
should be evaluated by [try], or didn't match). Failing that, though,
the IMHO ugly [if] syntax will fix that reducing all cases (except
"as" :( ) to merely {handler options body} by grouping everything
between keyword and body in a mandatory list (not so good for
compiling, though, as I understand it). The extensions, I'd suggest,
should either go in a ::tcl::try namespace, or have their own handler
command "with" or something. I like the ::tcl::try namespace better
purely because it's more built-in-friendly. But this whole paragraph
is an issue for another day anyhow.....
--
Fredderic
Debian/unstable (LC#384816) on i686 2.6.23-z2 2007 (up 47 days, 22:12)
|
|
From: Donal K. F. <don...@ma...> - 2008-11-25 09:16:16
|
Magentus wrote:
> If the vars can be shoe-horned into each branch optionally, or at least
> not too ugly, I'm all for it, personally.
I'm against it.
> My main concern was the complexity of trying to fit everything into the
> one statement.
That's a good joke. Or are you serious? You're already far too complex.
> The -- "option" here indicates there is a pattern to match, but will
> later be the place holder for the match type if there's consensus that
> "beyond-glob" matching is needed.
There is no such consensus outside the design astronautics practised on
this list over the past few days. Note that in this period "the
astronauts" have wholly failed to persuade any TCT member that such
complexity is a good idea.
Additionally, the use of "--" in this way is wholly unacceptable. The
only acceptable use of that option is to mean "end of options" to the
overall command.
> This allows you to omit the pattern string, and divide the inevitable
> embedded [switch] statement
It's not inevitable at all, no matter how much you've convinced yourself
otherwise. Time to come back to earth, Mr. Astronaut.
> into per-code blocks with whatever matcher you wish:
But tackling the more complex cases with code inside the handler blocks
is exactly the right thing. Glob matching on the errorCode is the
exception to this, and only because it makes an existing feature of Tcl
far more usable than before.
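
A rough sketch of what "code inside the handler blocks" can look like
with today's tools, using [catch] plus [switch -glob] on the error
code. The proc name "classify" and the returned strings are invented
for the example; only [catch], [dict] and [switch] are real commands.

```tcl
# Hypothetical example: run a script and classify any failure by
# glob-matching the string form of its -errorcode.
proc classify {script} {
    if {[catch {uplevel 1 $script} msg opts]} {
        set ec [dict get $opts -errorcode]
        switch -glob -- $ec {
            {POSIX ENOENT *} { return "missing: $msg" }
            {POSIX *}        { return "os error: $msg" }
            {ARITH *}        { return "arithmetic: $msg" }
            default          { return "other: $msg" }
        }
    }
    return "ok: $msg"
}

puts [classify {expr {1 / 0}}]
puts [classify {error boom}]
puts [classify {expr {1 + 1}}]
```

The patterns are tried top to bottom, so the more specific POSIX
pattern has to come before the general one.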
>
> try {
[...]
> } expr {some freaky conditional stuff} {
> ... do the jolly jumbuck ...
Include an 'expr' clause only if you wish to guarantee that I'll vote
against it.
> Clean, minimal if you want it to be, hugely flexible when you need it,
> you can add other types of pattern matching later (the sentinel is
> there for when it's needed, AND it's actually being used so it'll
> already be there avoiding future compatibility issues), and you can
> still allow for plugging in other types of [try] keyword (in place of
> as, handler, on, expr, and finally).
It's also a wholly impractical amount of effort to implement, especially
in compiled form.
> As far as I can tell, that pattern fits EVERYONE's present
> requirements,
No it doesn't. It fails my requirement for simplicity.
> and shouldn't be too much trouble to compile.
So says the person who has never written a bytecode compiler. They're
really much more awkward to do than a normal C-implemented Tcl command,
and the more complexity there is, the worse it gets (non-linearly). In
particular, there is no chance for flexibility of matchers as we're not
exposing the bytecode compilation interface. Since it is required that
the script arguments to [try] be compiled efficiently (there's a real
chance that people will put loops inside without "declaring" the
variable outside) we will just drop the less-important flexibility
requirement. Which means that all your complexity can be thrown away
anyway since we won't support that sort of thing.
Indeed, of this entire discussion the only thing that was at all
persuasive was the idea of putting the variables to store the matched
stuff only once (the 'as' clause). Perhaps I should summarize the
non-crap bits and call a vote; after all, waiting for consensus among
the astronauts is a) going to take too long, and b) unlikely to produce
anything practical anyway.
Donal.
|
|
From: Colin M. <co...@ch...> - 2008-11-25 16:20:27
|
Magentus wrote:
> try script
> as {vars}
> handler ?-- errorcode-pattern? body
> return ?-- returnvalue-pattern? body
> on {code ?-- returnvalue-pattern?} body
> finally body
>
With all due respect, that's a hairy nightmare and I wouldn't use it on
principle (the principle is that anything with *that* many arguments is
too hard to keep in my head while coding.)
Is there really no way you can simplify it to do one thing, that's hard
to do without it, and do it well?
Just, perhaps, [try {} finally {}] ... that's simple, even I can
remember that one, and have a reasonable guess at what it might do, and
anything more complex occurs (say) in the 'finally' block.
(Oh, and before anyone asks, I don't think
try/as/handler/return/on/finally would benefit from NULLs)
Colin.
|
|
From: Andreas L. <av...@lo...> - 2008-11-23 01:01:13
|
Neil Madden <ne...@Cs...> wrote:
> try ?-matchcommand cmd? script ?handlers ...? ?finally script?
I don't like that particular option, and I think that glob-like
matching will be enough for some time, but I would like to see "--"
as an options delimiter before the body (even though no options
are yet defined) just in case we later notice that we do need any.
> ... {POSIX *} ...
While I find this most practicable, it somehow does strike me as
odd, that in this particular case, we are *supposed* to use a
string operation (pattern-matching) on a list ($errorCode).
Perhaps this pattern should be itself taken as a list, and then
glob-matched element-wise (to the length of the pattern).
That way {POSIX *} would exhibit the same behaviour as is expected,
but it would be easier to safely match the third element of the
list, without being trapped by list-string meta-characters.
|
|
From: Neil M. <ne...@Cs...> - 2008-11-23 01:19:47
|
On 23 Nov 2008, at 01:01, Andreas Leitgeb wrote:
> Neil Madden <ne...@Cs...> wrote:
>> try ?-matchcommand cmd? script ?handlers ...? ?finally script?
>
> I don't like that particular option, and I think that glob-like
> matching will be enough for some time, but I would like to see "--"
> as an options delimiter before the body (even though no options
> are yet defined) just in case we later notice that we do need any.
I believe I proposed a "--" didn't I?
>
>> ... {POSIX *} ...
>
> While I find this most practicable, it somehow does strike me as
> odd, that in this particular case, we are *supposed* to use a
> string operation (pattern-matching) on a list ($errorCode).
>
> Perhaps this pattern should be itself taken as a list, and then
> glob-matched element-wise (to the length of the pattern).
This is exactly the purpose of -matchcommand. I'd rather not have to
come up with an entirely new pattern syntax for lists (matching
nested sub-lists etc). KISS -- glob as default (as [switch] already
provides it), and leave freedom to plug in your own scheme.
> That way {POSIX *} would exhibit the same behaviour as is expected,
> but it would be easier to safely match the third element of the
> list, without being trapped by list-string meta-characters.
Lists are strings, so there should be no problem using glob. The only
problem is if you want to do something more sophisticated, like {FOO
{BAR *} JIM *}
-- Neil
|
|
From: Andreas L. <av...@lo...> - 2008-11-23 10:28:28
|
On Sun, Nov 23, 2008 at 01:16:15AM +0000, Neil Madden wrote:
> I believe I proposed a "--" didn't I?
I just meant that the "--" should even then be kept, if all other
options were deferred for later (if at all).
>> [ {POSIX *} ] somehow does strike me as odd, [...as...], we
>> are *supposed* to use a string operation (pattern-matching)
>> on a list ($errorCode).
>> Perhaps this pattern should be itself taken as a list, and then
>> glob-matched element-wise (to the length of the pattern).
>
> This is exactly the purpose of -matchcommand.
But my point was, that a list-aware matching should happen by
default, so that in most cases it works correctly, even
without implementing and installing a custom matcher.
If usage of errorCode catches on (as a hoped-for result of the
new try-command), then sooner or later someone will define
sub-types like "ARITH MATRIX" and wonder, why ARITH* doesn't
match both ARITH and "ARITH MATRIX".
It of course doesn't match the latter, because that actually
looks like "{ARITH MATRIX} ..." and thus would need an optional
open brace to be matched as well (How to do that with globs?)
And then it may even look like "ARITH\ MATRIX ..." sometimes,
namely if some later element of the errorCode happens to
contain an unpaired brace.
> glob as default (as [switch] already provides it),
But [switch] is not designed for list-matching.
> and leave freedom to plug in your own scheme.
My point is, that for correct programs, everyone
would not only have to specify, but even implement
his own list-matcher.
Introducing a list-string mixup directly in the core
is a very bad move, imho.
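
To make the concern concrete, here is a minimal sketch of the
element-wise matching described above, next to plain string glob
matching. The proc name "listglob" and the "ARITH MATRIX" error code
are hypothetical; this is one possible reading of "glob-matched
element-wise (to the length of the pattern)", not a proposed core API.

```tcl
# Hypothetical helper: treat the pattern as a list and glob-match it
# element by element against the leading elements of the error code.
proc listglob {patterns code} {
    set n [llength $patterns]
    if {[llength $code] < $n} {
        return 0
    }
    set prefix [lrange $code 0 [expr {$n - 1}]]
    foreach pat $patterns elem $prefix {
        if {![string match $pat $elem]} {
            return 0
        }
    }
    return 1
}

# An error code whose first element contains a space.
set ec [list {ARITH MATRIX} singular]

puts [string match ARITH* $ec]   ;# string glob sees the leading brace
puts [listglob {ARITH*} $ec]     ;# element-wise match sees "ARITH MATRIX"
puts [listglob {POSIX *} {POSIX ENOENT {no such file or directory}}]
```

For the common {POSIX *} case both approaches agree; they only
diverge once an element needs list quoting.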
|