From: Neil M. <ne...@Cs...> - 2008-11-20 13:40:54
Twylite wrote:
> And there I was thinking this had been discussed on tcl-core ;)
>
> First up, I want to be clear about the intent behind TIP #329:
> a) The TIP is about error handling, not exception handling. Tcl has
> excellent exception handling which can be readily used in a mostly
> non-ugly way (e.g. catch + if/then/else or catch + switch), but its
> error handling is poor.
I don't agree that exception handling is excellent in Tcl currently.
> Developers use the result (Tcl_SetResult, not
> return code) as an error message, representing both the class/category
> and exact nature of the error (in a human-readable form), but don't
> provide a way to programmatically identify the class of error making
> control flow based on the type of error a hit-and-miss match on the
> error code.
I believe this is already addressed by the presence of errorCode.
Pattern-matching against this code is already quite simple, with
[switch] and so on. It just hasn't caught on. It seems like wishful
thinking to expect that it will suddenly catch on just because its usage
is made slightly more convenient (we're talking about a reduction of
about 1 line of code).
> b) The innovation of a new 'finally' that simplifies the linguistic
> expression of the programmer's desires has been a large part of my
> motivation to develop this TIP.
I agree with the motivation for this entirely.
> c) The overall intent of the TIP can be summed up as "make a control
> structure that makes dealing with errors and resource cleanup simpler -
> both logically and visually".
I think it is worth explicitly pulling out the separate concerns
embodied in this TIP and tying them to use-cases. As I see it, there are
a number of more-or-less separable concerns here:
1. Support for ensuring resource cleanup in the presence of
exceptions/errors (i.e. "finally").
2. Better support for case-analysis based on exception return code.
3. Better support for case-analysis based on errorcode.
4. Ability to trap only those exceptions/errors that you are interested
in/prepared to deal with, letting others propagate normally.
Now, if we look at some use-cases and how they are handled currently,
versus proposed methods for handling them:
1. Guaranteed resource cleanup
------------------------------
A typical example of this is ensuring that a channel is closed once we
have finished processing it. Currently, you would do this via:
    set chan [open $myfile r]; # or [socket] etc
    catch { process $chan } msg opts
    close $chan
    return -options $opts $msg
Some things to notice here are: (a) the catch-all behaviour of [catch]
is exactly what is required here: we don't want any exceptions to
escape, (b) we don't require any case-analysis on the exception type or
error code. The proposed alternative is:
    set chan [open $myfile r]
    try { process $chan } finally { close $chan }
This is a saving of only two lines, but I think the improvement in
readability is worth it. A possible improvement would be for any errors
in the finally clause to not completely mask any errors in the main
script -- perhaps by adding a -original field to the options dict (which
itself contains the message and options dict of the original error).
I'd be happy to see this part of the TIP dropped or separated into a
different command however.
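For what it is worth, the catch/cleanup/re-raise pattern above can be wrapped once in a script-level helper. The following is only a sketch, not the TIP's implementation: the name try_finally is made up for illustration, and it assumes Tcl 8.5 ([catch] with an options dict). Because the helper is a proc of its own, the outcome has to surface one level further up, in the caller, which is why it bumps -level before returning.

```tcl
# Hypothetical helper (not part of TIP #329): run $body in the caller's
# scope, always run $cleanup afterwards, then pass on whatever $body did
# (normal result, error, break, continue or return).
proc try_finally {body cleanup} {
    catch { uplevel 1 $body } msg opts
    uplevel 1 $cleanup
    # The options dict describes the outcome relative to this proc;
    # one more level makes it take effect in the caller instead.
    dict incr opts -level
    return -options $opts $msg
}
```

Usage would look like "try_finally { process $chan } { close $chan }". Note that in this sketch an error raised by the cleanup script still masks any error from the main script.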
2. Case-analysis based on return code
-------------------------------------
An example here is that of implementing custom control structures. Of
particular interest here is the use of standard exceptions like [break]
and [continue]. For instance, we may want to write a version of
[foreach] that operates asynchronously using the event loop. Currently,
we might write this as:
    proc async-foreach {varName list body} {
        if {[llength $list] == 0} { return }
        set list [lassign $list item]
        set code [catch { apply [list $varName $body] $item } msg opts]
        switch $code {
            3 { return }
            0 - 4 { # ok/continue }
            default { return -options $opts $msg }
        }
        after 0 [list async-foreach $varName $list $body]
    }
(I can never remember whether you need to [dict incr opts -level] in
these situations).
The try/handle (Twylite/JE) alternative would be:
    proc async-foreach {varName list body} {
        if {[llength $list] == 0} { return }
        set list [lassign $list item]
        try {
            apply [list $varName $body] $item
        } handle {code msg opts} {
            switch $code {
                3 { return }
                0 - 4 { # ok/continue }
                default { return -options $opts $msg }
            }
        }
        after 0 [list async-foreach $varName $list $body]
    }
This is slightly longer and introduces more nesting than the original.
In general, it seems to hamper rather than improve readability. By the
way, does "handle" catch all exception types, or just non-errors? In
general, try/handle seems just a more verbose version of the existing
[catch].
The alternative using try/catch would be:
    proc async-foreach {varName list body} {
        if {[llength $list] == 0} { return }
        set list [lassign $list item]
        try {
            apply [list $varName $body] $item
        } catch break {} { return } catch continue {} {}
        after 0 [list async-foreach $varName $list $body]
    }
This scheme is shorter. It is more readable as we have symbolic names
"break" and "continue" rather than magic numbers, and it avoids catching
anything it doesn't know how to deal with.
Note that dispatch based on exception code is a simple branch. There is
no need for complex pattern matching, sub-match capture, or
case-insensitive matching.
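For reference, the "magic numbers" that [catch] returns map onto symbolic names in a fixed way, so the dispatch really is a plain lookup. A small sketch (the helper name code_name is made up for illustration):

```tcl
# Hypothetical helper: translate a [catch] return code into its
# conventional name. Codes above 4 are application-defined and are
# passed through unchanged.
proc code_name {code} {
    switch -exact -- $code {
        0 { return ok }
        1 { return error }
        2 { return return }
        3 { return break }
        4 { return continue }
        default { return $code }
    }
}

code_name [catch { set x 1 }]   ;# ok
code_name [catch { error oops }] ;# error
code_name [catch { break }]     ;# break
code_name [catch { continue }]  ;# continue
```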
3. Case-analysis based on errorcode
-----------------------------------
For this, I'll use Twylite's example of trying different authentication
schemes. We will assume that the API throws an error with code BADAUTH
when the wrong scheme is used, and throws other errors such as NOCONN to
indicate that a connection to the host failed. (Note: this example
doesn't require glob-matching, but I don't think the changes in such a
case are that great). The existing way to handle this would be:
    proc connect {schemes host user pass} {
        foreach scheme $schemes {
            if {[catch { $scheme connect $host $user $pass } res opts]} {
                switch [dict get $opts -errorcode] {
                    BADAUTH { continue }
                    default { return -options $opts $res }
                }
            } else { return $res }
        }
        error "unable to authenticate"
    }
Using the proposed try/onerror approach, this would be:
    proc connect {schemes host user pass} {
        foreach scheme $schemes {
            try {
                return [$scheme connect $host $user $pass]
            } onerror BADAUTH { continue }
        }
        error "unable to authenticate"
    }
Using try/catch:
    proc connect {schemes host user pass} {
        foreach scheme $schemes {
            try {
                return [$scheme connect $host $user $pass]
            } catch error {msg opts} {
                switch [dict get $opts -errorcode] {
                    BADAUTH { continue }
                    default { return -options $opts $msg }
                }
            }
        }
        error "unable to authenticate"
    }
OK, in this case the try/onerror approach certainly is clearer, and the
try/catch approach is little better than the existing way with [catch]
alone. I think it still wins slightly over [catch] in readability. While
it is more verbose, the control flow is easier to read -- the [return]
is in an obvious place, and not tucked away in an inconspicuous "else"
clause. There's also less punctuation. But the onerror approach is
clearly better for this case.
The questions then, are whether "onerror" is sufficient for all/most
such cases, and whether these cases actually arise often/ever in
practice (or would arise given appropriate promotion). Regarding the
first part, dispatching on errorcode is more complex than dispatching on
return code: the latter is a simple integer, whereas the former can be
an arbitrarily complex data structure. In particular, the following
requirements may have to be considered:
a. Different forms of pattern-matching (e.g. exact, glob, regexp,
"algebraic" type matching etc). If we stick to one type only, will that
be appropriate? Will it cause problems? (e.g. if glob-only matching,
then we have problems specifying glob-special characters such as * or []).
b. Case sensitivity -- is "arith" the same error as "ARITH" or "ARiTh"?
c. Sub-match capture: an errorcode may contain detail fields which we
want to extract. It seems pointless to match once and then perform a
separate extraction when I could have just used [regexp] or some other
matching facility and performed both operations in one go.
d. Disjunctive matching: perform this action if the errorcode matches
either *this* or *that*.
I'm sure there are others. To me, the range of choices here suggests
that pattern matching is best kept separate from error-handling.
Otherwise there is a risk of duplicating [switch]. Perhaps I am wrong
here, and glob-matching meets all requirements. Personally, if errorcode
matching was to take off, I would use some form of algebraic types
(tagged lists) both for constructing and pattern-matching errorcodes, as
that seems to me to be the most appropriate tool for the job. Perhaps
there is a way to keep the behaviour but to parameterise the matching
command (with switch -glob/string match being the default).
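To make the "parameterise the matching command" idea concrete, here is a rough sketch. The helper name errorcode_matches and its argument order are my own assumptions, not anything in the TIP; the only requirement on the matching command is that it accepts "pattern string" as its last two arguments, which holds for both [string match] and [regexp].

```tcl
# Hypothetical helper: test an errorcode against a pattern using a
# caller-supplied matching command prefix, invoked as
#   {*}$matchCmd $pattern $errorcode
proc errorcode_matches {pattern errorcode {matchCmd {string match}}} {
    return [{*}$matchCmd $pattern $errorcode]
}

catch { expr {1/0} } msg opts
set ec [dict get $opts -errorcode]   ;# ARITH DIVZERO {divide by zero}

errorcode_matches {ARITH *} $ec                          ;# 1, glob (default)
errorcode_matches {arith *} $ec {string match -nocase}   ;# 1, requirement (b)
errorcode_matches {^ARITH (DIVZERO|DOMAIN) } $ec regexp  ;# 1, requirement (d)

# Requirement (a): glob-special characters must be escaped by the caller.
errorcode_matches {FOO [x]}   {FOO [x]}   ;# 0, [x] is a character class
errorcode_matches {FOO \[x\]} {FOO [x]}   ;# 1
```

Sub-match capture (c) is not covered by this sketch; it would need the matching command to report variables back to the handler as well as a boolean.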
The other question is whether these cases arise in practice. I can't
think of a single existing API that requires this kind of errorcode
pattern matching. Is such a design even appropriate? Clearly, if you
controlled the authentication scheme interface then you could just
return continue for BADAUTH and an error for anything else:
    proc connect {schemes host user pass} {
        foreach scheme $schemes {
            return [$scheme connect $host $user $pass]
        }
        error "unable to authenticate"
    }
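A self-contained version of this, with two stand-in scheme commands, might look as follows. The names bad_scheme and good_scheme are invented purely for the demonstration; the point is only that [return -code continue] inside the scheme makes the caller's [foreach] move on to the next scheme.

```tcl
# Hypothetical scheme that always rejects the credentials: it signals
# "try the next scheme" with a continue exception instead of an error.
proc bad_scheme {op host user pass} {
    return -code continue
}

# Hypothetical scheme that always succeeds.
proc good_scheme {op host user pass} {
    return "connected to $host as $user"
}

proc connect {schemes host user pass} {
    foreach scheme $schemes {
        # A continue raised by the scheme aborts this [return] and
        # resumes the loop with the next scheme.
        return [$scheme connect $host $user $pass]
    }
    error "unable to authenticate"
}

connect {bad_scheme good_scheme} example.org neil secret
```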
4. Trapping only those errors/exceptions you are interested in
--------------------------------------------------------------
It's clear from the above that the Twylite/JE approach achieves this for
errors, but not for other exceptions. My approach achieves it for
exceptions but not for more specific error cases.
> NEM:
>> Firstly, "onerror" and "except" seem like bad names to me. "except" in
>> particular would imply that the following error case *isn't* handled (as
>> in "catch everything *except* for these..."), which is just confusing.
>> I also have some problems with the usage. I'd prefer to see something
>> like:
> I accept your argument about "except", having had the same concern
> myself. I drew this from C's try...except. Others have argued against
> the use of 'catch' as it could be confused with the existing catch
> command. I'll consider 'handle' instead of catch - it sounds reasonable
> for the domain; other suggestions are welcome. I feel that "onerror" is
> correctly named though.
I'd really prefer things to be unified in name at least: "on error", "on
break", etc. One possible unification that might please all would be to
adopt the syntax "on exception-type ?pattern? ?vars? body". The pattern
is an optional glob-style pattern that is matched against the -errorcode
of the exception. (If the pattern is specified then so must the vars).
Clearly this is mostly useful in the case of errors, but I believe it is
possible for non-error exceptions to also set -errorcode, so it might be
useful elsewhere. That would result in the following use-cases:
    proc connect {schemes host user pass} {
        foreach scheme $schemes {
            try {
                return [$scheme connect $host $user $pass]
            } on error BADAUTH {} { continue }
        }
        error "unable to authenticate"
    }

    proc async-foreach {varName list body} {
        if {[llength $list] == 0} { return }
        set list [lassign $list item]
        try {
            apply [list $varName $body] $item
        } on break { return } on continue {}
        after 0 [list async-foreach $varName $list $body]
    }
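As a rough check that clauses of this shape can be dispatched in script, here is a cut-down prototype. It is a sketch under stated assumptions only: the command name try_on is made up, and it supports just the short "on exception-type body" form, with no pattern and no vars. Anything that no clause names is passed on to the caller untouched.

```tcl
# Hypothetical prototype of "on <type> <script>" dispatch (no ?pattern?
# or ?vars? support). Unmatched outcomes propagate to the caller.
proc try_on {body args} {
    set names {ok 0 error 1 return 2 break 3 continue 4}
    set code [catch { uplevel 1 $body } msg opts]
    foreach {kw type script} $args {
        if {$kw ne "on"} { error "expected \"on\" but got \"$kw\"" }
        if {[dict get $names $type] == $code} {
            # The handler's own outcome replaces that of the body.
            catch { uplevel 1 $script } msg opts
            break
        }
    }
    dict incr opts -level
    return -options $opts $msg
}

set seen {}
foreach x {1 2 3 4} {
    try_on {
        if {$x == 2} { continue }
        if {$x == 4} { break }
        lappend seen $x
    } on continue { lappend seen skipped }
}
# seen is now: 1 skipped 3   (the unhandled break ended the loop)
```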
To me this seems like a good compromise, and people who want more
complex pattern matching can still do so. I'd like to be able to support
lists of exception types. I still don't believe glob-style errorCode
pattern matching is useful or particularly satisfactory, but I'm willing
to concede it for the sake of compromise. As before, define "then" as
"on ok", and possibly define "else/otherwise" as a catch-all default clause.
<aside>
I also see this meshing nicely with a hypothetical future
continuation-based exception mechanism that allows
resumable/non-destructive exception/warning notifications.
</aside>
[...]
> On supporting widely-used idioms, it would be much more important IMO to
> branch based on the result than on the return code. Branching based on
> return code (i.e. exception handling) is useful for creating new control
> structures, but if you want to handle errors then you must branch based
> on some information that is available from a return code = TCL_ERROR
> (1). That means either errorCode or result, and right now most APIs are
> using result.
> You make a good point about leaving pattern matching to existing
> commands -- I have been thinking along those lines for how best to
> exploit that (more in another mail).
Branching based on the result just seems like a fragile nightmare.
Localised error messages for instance could totally change the behaviour
of code.
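By contrast, the leading words of -errorcode are meant for programs and stay put even if the message text changes. A minimal sketch using an error that the core itself raises (the proc name safe_div and the choice to return "inf" are invented for the example):

```tcl
# Hypothetical example: recover from division by zero by inspecting the
# first two words of -errorcode rather than the human-readable message.
proc safe_div {a b} {
    if {[catch { expr {$a / $b} } res opts]} {
        set ec [dict get $opts -errorcode]
        if {[lrange $ec 0 1] eq {ARITH DIVZERO}} {
            return inf
        }
        # Not a case we know how to handle: re-raise unchanged.
        return -options $opts $res
    }
    return $res
}

safe_div 6 3   ;# 2
safe_div 1 0   ;# inf
```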
[...]
>> People don't do it because it just isn't very useful. Errors in Tcl
>> tend to be real errors -- other than logging them, there is often not
>> much to do. Tcl's introspection, use of general control exceptions
>> like [continue]/[break], and custom control structures/HOFs etc make
>> this kind of exception-based case analysis much less necessary. I may
>> well be wrong about this, but I'd prefer to see some concrete use-cases.
> The nature of errors in Tcl is a side-effect of the weak support for
> distinguishing between types of errors. This functionality is useful
> any time that you are calling into an opaque API and can take different
> recovery actions based on the cause of the error. e.g. you want to wrap
> load balancing and/or fault tolerance (simplest case: auto-reconnect)
> around an RPC or DB interface; you want to try alternative
> authentication schemes when the password fails (but not when the
> connection fails, or the protocol is mismatched); you want to tell the
> user whether to 'try again later' or 'call the administrator'.
Thanks for these use-cases.
> Let's drop to Java-speak for a moment: the current practice of Tcl
> developers is to "catch (Exception e) { // handle }". In C++ it is
> "catch (...) { // handle }".
> Return code is not a mechanism for exception/error typing, it is a
> mechanism for implementing control flows. We need a typing mechanism.
Hmmm... error handling needs case-analysis not typing per se. Clearly,
any mechanism used for constructing error cases is strongly related to
the mechanism that is used for distinguishing them, which is why I'd
rather see these aspects separated/parameterised from the exception
handling, at least until it is clear what the best mechanism for this is.
[...]
-- Neil