From: Twylite <tw...@cr...> - 2008-11-20 16:40:13
Awesome stuff Neil - thanks for the detailed analysis :)
>> Developers use the result (Tcl_SetResult, not return code) as an
>> error message, representing both the class/category and exact nature
>> of the error (in a human-readable form), but don't provide a way to
>> programmatically identify the class of error, making control flow based
>> on the type of error a hit-and-miss match on the error code.
> I believe this is already addressed by the presence of errorCode.
> Pattern-matching against this code is already quite simple, with
> [switch] and so on. It just hasn't caught on. It seems like wishful
> thinking to expect that it will suddenly catch on just because its
> usage is made slightly more convenient (we're talking about a
> reduction of about 1 line of code).
As you note, it hasn't caught on. I believe that is because (i) the
language feature to throw errors discourages (or at least does not
encourage) the use of errorCode, and (ii) there is no obvious feature to
branch based on errorCode. I intend to resolve that through this TIP by
making [try] support branching on errorCode, and by having [throw] reorder
the arguments of [error] to encourage the use of an errorCode.
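To make point (i) concrete: with [error], the errorCode is the third argument, after the message and a rarely used errorInfo argument, so it is easy to leave out. The following is only a sketch of the reordering idea in plain Tcl 8.5. The helper name `my_throw` and the `MYAPP BADAUTH` code are assumptions made for illustration, not the TIP's implementation.

```tcl
# Today: message first, errorCode last (and usually omitted).
proc fail_with_error {} {
    error "authentication failed" "" {MYAPP BADAUTH}
}

# Hypothetical helper (an assumption, not taken from the TIP):
# the errorCode comes first, so it cannot be skipped by accident.
proc my_throw {errcode message} {
    return -code error -errorcode $errcode $message
}

proc fail_with_throw {} {
    my_throw {MYAPP BADAUTH} "authentication failed"
}

catch {fail_with_error} msg1 opts1
catch {fail_with_throw} msg2 opts2
puts [dict get $opts1 -errorcode]
puts [dict get $opts2 -errorcode]
```

Both calls should leave the same -errorcode in the return options; only the argument order at the call site differs.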
> I think it is worth explicitly pulling out the separate concerns
> embodied in this TIP and tying them to use-cases. As I see it, there
> are a number of more-or-less separable concerns here:
>
> 1. Support for ensuring resource cleanup in the presence of
> exceptions/errors (i.e. "finally").
> 2. Better support for case-analysis based on exception return code.
> 3. Better support for case-analysis based on errorcode.
> 4. Ability to trap only those exceptions/errors that you are
> interested in/prepared to deal with, letting others propagate normally.
Most wicked - I had just made this list myself (well, the first 3 at
least). We have 3 orthogonal concerns here that we are trying to
shoehorn into one command.
> try { process $chan } finally { close $chan }
>
> This is a saving of two lines, but I think the improvement in
> readability is worth it. A possible improvement would be for any
> errors in the finally clause to not completely mask any errors in the
> main script -- perhaps by adding a -original field to the options dict
> (which itself contains the message and options dict of the original
> error).
>
> I'd be happy to see this part of the TIP dropped or separated into a
> different command however.
I wouldn't be particularly happy to drop/separate this functionality.
Doing so will result in more deeply nested code, because the need for
cleanup regularly interacts with the need to handle (some)
errors/exceptions (rather than let them all propagate).
Also, there are three orthogonal concerns, all wanting to be called
"try", which will just get messy.
My proposal (or at least implementation, I forget) chains errors to the
original in all cases (i.e. if you throw an error from a handler or from
finally).
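For comparison, here is roughly what the "finally" concern costs with [catch] alone in Tcl 8.5. This is a minimal sketch; `with_cleanup` is a hypothetical helper name and the `MYAPP IO` errorCode is made up for the example.

```tcl
# Hypothetical helper: run $body, always run $cleanup afterwards,
# then re-raise whatever $body produced (ok, error, break, ...).
proc with_cleanup {body cleanup} {
    catch {uplevel 1 $body} result opts
    uplevel 1 $cleanup
    # Bump -level so the outcome is delivered to our caller.
    dict incr opts -level
    return -options $opts $result
}

set closed 0
set code [catch {
    with_cleanup {
        error "read failed" "" {MYAPP IO}
    } {
        set closed 1
    }
} msg opts]
puts "closed=$closed code=$code errorcode=[dict get $opts -errorcode]"
```

Note that in this sketch an error raised by the cleanup script simply replaces the original one, which is exactly the masking problem raised above; chaining the errors would need extra bookkeeping in the options dict.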
> 2. Case-analysis based on return code
> An example here is that of implementing custom control structures. Of
> particular interest here is the use of standard exceptions like
> [break] and [continue]. For instance, we may want to write a version
> of [foreach] that operates asynchronously using the event loop.
> Currently, we might write this as:
> The try/handle (Twylite/JE) alternative would be:
>
> } handle {code msg opts} {
> switch $code {
> 3 { return }
> 0 - 4 { # ok/continue }
> default { return -options $opts $msg }
> }
> }
Umm ... no. JE said:
> } handle {code ?resultVar ?optionsVar??} {
> #
> # return code was $code.
> #
> } finally {
>
> where "code" is one of ok/error/return/break/continue
> or an integer literal, a la [return -code].
>
So the use would be
} handle {2 msg opts} {
# handle the case where the return code is 2 (TCL_RETURN)
} handle {1 msg opts} {
# handle the case where the return code is 1 (TCL_ERROR)
}
etc. And you can use the keywords ok/error/return/break/continue
instead of the integer literals.
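As a reminder of what those integer literals and keywords refer to, the small sketch below maps the value returned by [catch] onto the keyword names accepted by [return -code]. The proc name `return_code_name` is made up for the example.

```tcl
# Hypothetical helper: report which return code a script produces.
proc return_code_name {script} {
    set code [catch {uplevel 1 $script}]
    switch -- $code {
        0 { return ok }
        1 { return error }
        2 { return return }
        3 { return break }
        4 { return continue }
        default { return $code }
    }
}

foreach script {
    {set x 1}
    {error oops}
    {return}
    {break}
    {continue}
    {return -level 0 -code 7}
} {
    puts "[list $script] -> [return_code_name $script]"
}
```

Custom codes (anything other than 0 to 4) come back as plain integers, which is why a handler clause would have to accept integer literals as well as the five keywords.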
On http://wiki.tcl.tk/21608 I proposed using the keyword 'except' and
matching on 'spec' which I hadn't defined, but was tentatively assuming
to be the return code.
In other words, our proposal is the same as the try/catch example you
present.
> proc async-foreach {varName list body} {
> if {[llength $list] == 0} { return }
> set list [lassign $list item]
> try {
> apply [list $varName $body] $item
> } catch break {} { return } catch continue {} {}
> after 0 [list async-foreach $varName $list $body]
> }
The difference is that you use
catch on_what_code {?emvar? ?optsvar?} ?body?
and we use
handle {on_what_code ?emvar? ?optsvar?} ?body?
> Note that dispatch based on exception code is a simple branch. There
> is no need for complex pattern matching, sub-match capture, or
> case-insensitive matching.
Mmm ... except you cheat in your example ;)
In the examples using [switch] you match two codes with one body (0 - 4
is switch fall-through for codes 0 and 4), which try/catch doesn't
handle. And you put all the cases on one line ;p
IF a catch/handle can only match on exactly one return code, the
branching is simple. But you may want to match on a range.
Or even against a discrete list:
> The syntax of the catch part would be: catch "?exception-type? vars
> script", where exception-type is a non-empty *list* of "error",
> "continue", "break", "return", "ok", or numeric return codes. It
> defaults to "error".
> 3. Case-analysis based on errorcode
>
> For this, I'll use Twylite's example of trying different
> authentication schemes. We will assume that the API throws an error
> with code BADAUTH when the wrong scheme is used, and throws other
> errors such as NOCONN to indicate that a connection to the host
> failed. (Note: this example doesn't require glob-matching, but I don't
> think the changes in such a case are that great). The existing way to
> handle this would be:
man tclVars: "This list value represents additional information about
the error in a form that is easy to process with programs. The first
element of the list identifies a general class of errors, and determines
the format of the rest of the list."
You need to do pattern matching on errorCode, or select based on [lindex
$::errorCode 0] assuming that only the most general class information is
relevant.
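A minimal sketch of the second option, selecting on [lindex $errorCode 0] only, using today's [catch] and [switch]. The proc name `describe_failure` is hypothetical; ARITH, POSIX and NONE are standard errorCode classes.

```tcl
# Hypothetical helper: branch on the general error class only,
# i.e. the first element of the errorCode list.
proc describe_failure {script} {
    set code [catch {uplevel 1 $script} result opts]
    if {$code != 1} { return "no error" }
    set errcode [dict get $opts -errorcode]
    switch -- [lindex $errcode 0] {
        ARITH   { return "arithmetic error: [lindex $errcode 1]" }
        POSIX   { return "OS error: [lindex $errcode 1]" }
        NONE    { return "unclassified error: $result" }
        default { return "other error class: [lindex $errcode 0]" }
    }
}

puts [describe_failure {expr {1/0}}]
puts [describe_failure {error "something broke"}]
```

This works as long as the class alone is enough; as soon as the second or third element matters, the code either nests another [switch] or falls back to pattern matching on the whole list.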
> OK, in this case the try/onerror approach certainly is clearer, and
> the try/catch approach is little better than the existing way with
> [catch] alone. I think it still wins slightly over [catch] in
> readability. While it is more verbose, the control flow is easier to
> read -- the [return] is in an obvious place, and not tucked away in an
> inconspicuous "else" clause. There's also less punctuation. But the
> onerror approach is clearly better for this case.
Forgive me for leaving out the preceding 40 lines of example, and
focusing on this paragraph ;)
> The questions then, are whether "onerror" is sufficient for all/most
> such cases, and whether these cases actually arise often/ever in
> practice (or would arise given appropriate promotion). Regarding the
> first part, dispatching based on errorcode is more complex than based
> on return code as while the latter is a simple integer, the former can
> be an arbitrarily complex data structure. In particular, the following
> requirements may have to be considered:
>
> a. Different forms of pattern-matching (e.g. exact, glob, regexp,
> "algebraic" type matching etc). If we stick to one type only, will
> that be appropriate? Will it cause problems? (e.g. if glob-only
> matching, then we have problems specifying glob-special characters
> such as * or []).
> b. Case sensitivity -- is "arith" the same error as "ARITH" or "ARiTh"?
> c. Sub-match capture: an errorcode may contain detail fields which we
> want to extract. It seems pointless to match once and then perform a
> separate extraction when I could have just used [regexp] or some other
> matching facility and performed both operations in one go.
> d. Disjunctive matching: perform this action if the errorcode matches
> either *this* or *that*.
>
> I'm sure there are others. To me, the range of choices here suggests
> that pattern matching is best kept separate from error-handling.
> Otherwise there is a risk of duplicating [switch].
It seems to me that the [try] must not dictate/limit the matching, but
must facilitate it.
The problem with catch+switch and similar approaches is that they are
_ugly_. Your try/catch suggestion is no more powerful than
catch+switch, but it is more readable. When you scan the code you see
"try ... catch", and the intention of the code is clear. When you see
"catch ... switch" the intention is not clear -- you have no idea that
these two statements are related, and if they are then on what basis are
you switching, etc.
If one accepts that Tcl return codes should be used for flow control,
and that the Tcl return code 1 (TCL_ERROR) covers all errors (which are
called structured exceptions in languages like C++ and Java), and that
there is a need to distinguish between different structured exceptions
(Tcl errors) based on their cause/type/class/nature (whatever you want
to call it), then the inescapable conclusion is that there needs to be a
not-ugly control structure to handle branching based on return code, and
a not-ugly control structure to handle branching based on error cause.
They may or may not be the same control structure, but they are both
equally necessary.
> Perhaps I am wrong here, and glob-matching meets all requirements.
Maybe, maybe not. I'm honestly not sure either way. Assuming that
errorCode is constructed in a hierarchical manner (most general class
first, becoming progressively more specific) then I can't think of a
situation that a glob match can't handle (where an equivalent situation
exists in say Java or C++ that can be handled by their syntax).
Of course it's entirely possible that some bright spark declares that if
the general class of errorCode is "OBJECT" then [lindex $errorCode 1] is
an oo::object that is a child of tcl::errorobj, and you want to do
class-based matching on said object. This is probably quite a strong
argument against glob matching (as the only option).
> Personally, if errorcode matching was to take off, I would use some
> form of algebraic types (tagged lists) both for constructing and
> pattern-matching errorcodes, as that seems to me to be the most
> appropriate tool for the job. Perhaps there is a way to keep the
> behaviour but to parameterise the matching command (with switch
> -glob/string match being the default).
Yes, there is, maybe. More below.
> The other question is whether these cases arise in practice. I can't
> think of a single existing API that requires this kind of errorcode
> pattern matching. Is such a design even appropriate? Clearly, if you
> controlled the authentication scheme interface then you could just
> return continue for BADAUTH and an error for anything else:
Depending on your understanding of "pattern matching" ... Java. Like
the whole Java Runtime Library.
Java obviously doesn't use globs to match exceptions, but it does throw
different exceptions and you can catch a group of exceptions based on a
pattern (specifically: the exception object is of a particular class).
errorCode is defined as being a list where the first element identifies
the general class of errors; assuming that this format is maintained
(for legacy compatibility) the most obvious approach to asking "is this
an IO error" is "IO *".
So I have been suggesting "string match" on a structured list as an
approximation of the 'isa' / 'typeof' operator.
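The "string match as isa" idea can be sketched in a few lines. The proc name `errorcode_isa` and the `IO ...` errorCode values are hypothetical, chosen only to mirror the "IO *" example.

```tcl
# Hypothetical helper: glob match on the errorCode list as a rough
# stand-in for an 'isa' test on an exception class.
proc errorcode_isa {pattern errcode} {
    return [string match $pattern $errcode]
}

set ec {IO TIMEOUT {read timed out}}
puts [errorcode_isa "IO *" $ec]          ;# any IO error
puts [errorcode_isa "IO TIMEOUT *" $ec]  ;# a more specific subclass
puts [errorcode_isa "IO *" {IOCTL FOO}]  ;# the space keeps IOCTL out
puts [errorcode_isa "IO *" IO]           ;# bare class with no detail
```

The last case is a wrinkle of the approximation: a pattern of the form "IO *" does not match an errorCode that consists of the class word alone, so either such codes must always carry a detail element or the pattern needs a second alternative.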
My point here applies most specifically to cases where you don't control
the API (which is rather common), but also to cases where you don't want
to modify the API - because it will affect other working code that you
don't want to refactor, or because you believe that adding a flow
control statement like 'break' or 'continue' outside of a flow control
construct is an inherently dangerous code practice because developers
using APIs don't expect stuff like that.
> 4. Trapping only those errors/exceptions you are interested in
> It's clear from the above that the Twylite/JE approach achieves this
> for errors, but not for other exceptions. My approach achieves it for
> exceptions but not for more specific error cases.
I disagree -- is this statement based on a misunderstanding of what
'except' / 'handle' was doing?
> I'd really prefer things to be unified in name at least: "on error",
> "on break", etc. One possible unification that might please all would
> be to adopt the syntax "on exception-type ?pattern? ?vars? body". The
> pattern is an optional glob-style pattern that is matched against the
> -errorcode of the exception. (If the pattern is specified then so must
> the vars). Clearly this is mostly useful in the case of errors, but I
> believe it is possible for non-error exceptions to also set
> -errorcode, so it might be useful elsewhere. That would result in the
> following use-cases:
Anything can set return options (including -errorcode) by using
[return], but only a code 1 return ( [error] ) automatically adds the
-errorcode to the dict.
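This is easy to check with the options dict that [catch] fills in (Tcl 8.5). A small sketch; the helper `options_of` and the `MYAPP BADAUTH` code are made up for the example.

```tcl
# Hypothetical helper: return the options dict produced by a script.
proc options_of {script} {
    catch {uplevel 1 $script} result opts
    return $opts
}

# A plain [error] gets -errorcode added automatically (as NONE).
puts [dict get [options_of {error "plain error"}] -errorcode]

# A non-error exception such as [break] carries no -errorcode by default.
puts [dict exists [options_of {break}] -errorcode]

# [return] can set -errorcode explicitly on an error return.
puts [dict get [options_of {
    return -level 0 -code error -errorcode {MYAPP BADAUTH} "bad credentials"
}] -errorcode]
```

So a clause that matches on -errorcode for non-error codes has to cope with the key being absent altogether.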
> proc connect {schemes host user pass} {
> foreach scheme $schemes {
> try {
> return [$scheme connect $host $user $pass]
> } on error BADAUTH {} { continue }
> }
> error "unable to authenticate"
> }
> To me this seems like a good compromise, and people who want more
> complex pattern matching can still do so. I'd like to be able to
> support lists of exception types. I still don't believe glob-style
> errorCode pattern matching is useful or particularly satisfactory, but
> I'm willing to concede it for the sake of compromise. As before,
> define "then" as "on ok", and possibly define "else/otherwise" as a
> catch-all default clause.
It looks promising, but I doubt that errorCode is ever really
meaningful outside the context of TCL_ERROR, and I'm leaning towards
glob being insufficient as the only supported matching style. It's all
your fault for being right ;p
> <aside>
> I also see this meshing nicely with a hypothetical future
> continuation-based exception mechanism that allows
> resumable/non-destructive exception/warning notifications.
> </aside>
I've been glimpsing some opportunities here, but I don't work with CPS
enough to have a good idea of how to work it in.
>> On supporting widely-used idioms, it would be much more important IMO
>> to branch based on the result than on the return code. Branching
>> based on return code (i.e. exception handling) is useful for creating
>> new control structures, but if you want to handle errors then you
>> must branch based on some information that is available from a return
>> code = TCL_ERROR (1). That means either errorCode or result, and
>> right now most APIs are using result.
>> You make a good point about leaving pattern matching to existing
>> commands -- I have been thinking along those lines for how best to
>> exploit that (more in another mail).
> Branching based on the result just seems like a fragile nightmare.
> Localised error messages for instance could totally change the
> behaviour of code.
Oh, it is ;( But it's also the only option available in some existing
libraries. I've encountered code that puts an error identifier in the
result and the error message in errorInfo, for example.
The question then is: how widespread is such (bad) code? Would
developers regularly resort to another branching mechanism (try+switch,
catch+switch, etc.) or would they be able to use the new [try] for the
vast majority of their code?
>> Let's drop to Java-speak for a moment: the current practice of Tcl
>> developers is to "catch (Exception e) { // handle }". In C++ it is
>> "catch (...) { // handle }".
>> Return code is not a mechanism for exception/error typing, it is a
>> mechanism for implementing control flows. We need a typing mechanism.
> Hmmm... error handling needs case-analysis not typing per se. Clearly,
> any mechanism used for constructing error cases is strongly related to
> the mechanism that is used for distinguishing them, which is why I'd
> rather see these aspects separated/parameterised from the exception
> handling, at least until it is clear what the best mechanism for this is.
I'm going to take a stab at identifying what the best mechanism is.
Let's move back for a moment to what we're trying to achieve here.
Functionality:
a. Handle errors by class (errorCode)
b. Handle success continuation (i.e. don't limit the control structure
to only exceptional situations)
c. Handle other (non-error) exceptions
d. It is clear that (a), (b) and (c) are all specialisations of
matching the Tcl return code (and possibly some other return
information, like errorCode and result)
e. Clean up at the end of an operation, regardless of errors/exceptions
Look & Feel:
a. Looks like if/then/else (traditional structured try/catch as in
C++, Java, etc), OR
b. Looks like switch (I don't actually know of another language that
does this), OR
c. Hybrid of (a) and (b): choose between success/error with an
if/then/else like construct, then switch on the specific result/error
(e.g. Erlang)
Matching:
a. Match a single return code, a range of return codes, or a discrete
list of return codes
b. Match a specific error code, and an errorcode pattern (glob-like)
c. Possibly provide for more complex matches involving other fields
(result), regular expressions, disjunctive matching, case sensitivity,
relational calculus/algebra, whatever.
catch+something_else can provide for arbitrary complexity. Should [try]
also do so, or should its complexity be limited? If limited, what is a
reasonable limit? Exact match? Range match? Glob match?
It strikes me that there are two possible solutions:
(1) Make [try] an unholy union of [catch] and [switch] such that the
user can specify the information being switched on (e.g. "%C,%E,%R"
where %C is the return code, %E is the errorcode and %R is the result).
You delegate pattern matching to [switch] and allow the user to match
based on any combination of return code, errorcode and result.
try {
# ...
} thenwith -glob -- "%C,%E" {
"1,POSIX *" { handle posix errors }
"3,*" -
"4,*" { handle break/continue }
}
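To get a feel for solution (1), it can be emulated today in a few lines of Tcl 8.5. This is only a sketch under assumptions: `try_switch` is a hypothetical name, it takes the body, the format string and the clause list as three plain arguments instead of the "thenwith" keyword syntax, and it supports glob matching only.

```tcl
# Hypothetical emulation of the "thenwith" idea with catch + glob matching.
# %C = return code, %E = errorCode, %R = result.
proc try_switch {body spec cases} {
    set code [catch {uplevel 1 $body} result opts]
    set errcode ""
    if {[dict exists $opts -errorcode]} {
        set errcode [dict get $opts -errorcode]
    }
    set subject [string map [list %C $code %E $errcode %R $result] $spec]

    set matched 0
    foreach {pattern script} $cases {
        if {!$matched && ![string match $pattern $subject]} { continue }
        set matched 1
        # "-" means "share the next body", as in [switch].
        if {$script ne "-"} {
            return [uplevel 1 $script]
        }
    }
    # No clause matched: let the original outcome propagate to the caller.
    dict incr opts -level
    return -options $opts $result
}

set cases {
    "1,ARITH *" { set handled "arithmetic error" }
    "3,*"       -
    "4,*"       { set handled "break/continue" }
}
puts [try_switch { expr {1/0} } "%C,%E" $cases]
puts [try_switch { break } "%C,%E" $cases]
```

Anything that no clause matches is re-raised with its original options, which covers the "trap only what you are prepared to deal with" concern.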
(2) Make [try] an unholy union of [catch] and [if]/then/else, and
provide helper functions/operations to match exception/error cases with
expr.
try {
# ...
} handle { [string match "POSIX *" $::errorCode] } {
handle posix errors
} handle { code() in {3 4} } {
handle break/continue
}
Assuming some minor changes to expr, that could look more like:
try {
# ...
} handle { errorcode() like "POSIX *" } {
} handle { code() in {3 4} } {
}
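Solution (2) can also be approximated today, because Tcl 8.5 lets a script add expr functions by defining commands in ::tcl::mathfunc. The sketch below is hypothetical throughout: `try_handle`, the `::trydemo` namespace and the `errorlike()` function (standing in for the imagined "like" operator) are names invented for the example, and the global state means nested uses would clobber each other.

```tcl
# State for the most recent try_handle body (sketch only, not re-entrant).
namespace eval ::trydemo {
    variable code 0
    variable errcode ""
}

# Helper functions usable inside expr: code() and errorlike(pattern).
proc ::tcl::mathfunc::code {} { return $::trydemo::code }
proc ::tcl::mathfunc::errorlike {pattern} {
    return [string match $pattern $::trydemo::errcode]
}

# Hypothetical command: try_handle body ?handle condition script ...?
proc try_handle {body args} {
    set ::trydemo::code [catch {uplevel 1 $body} result opts]
    set ::trydemo::errcode ""
    if {[dict exists $opts -errorcode]} {
        set ::trydemo::errcode [dict get $opts -errorcode]
    }
    foreach {keyword condition script} $args {
        if {[uplevel 1 [list expr $condition]]} {
            return [uplevel 1 $script]
        }
    }
    # No condition held: propagate the original outcome.
    dict incr opts -level
    return -options $opts $result
}

puts [try_handle { expr {1/0} } \
    handle { errorlike("ARITH *") } { set handled "arithmetic error" } \
    handle { code() in {3 4} }      { set handled "break/continue" }]
```

Because each condition is an ordinary expr, any matching style can be used without [try] itself having to know about it, at the cost of more verbose clauses than the switch-like form.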
I found the question on Reddit at
http://www.reddit.com/r/programming/comments/7dgwy/ask_proggit_where_clauses_in_catch_declarations/?sort=old
quite interesting in this regard.
Ick ... time to get home.
Twylite