Re: [TCLCORE] TIP#239 Try/Catch/Finally syntax [WAS Re: Tcl-Core Digest, Vol 30, Issue 10]


Twylite wrote:
> And there I was thinking this had been discussed on tcl-core ;)
> 
> First up, I want to be clear about the intent behind TIP #329:
> a) The TIP is about error handling, not exception handling.  Tcl has 
> excellent exception handling which can be readily used in a mostly 
> non-ugly way (e.g. catch + if/then/else or catch + switch), but its 
> error handling is poor.  

I don't agree that exception handling is excellent in Tcl currently.

> Developers use the result (Tcl_SetResult, not 
> return code) as an error message, representing both the class/category 
> and exact nature of the error (in a human-readable form), but don't 
> provide a way to programmatically identify the class of error, making 
> control flow based on the type of error a hit-and-miss match on the 
> error message.

I believe this is already addressed by the presence of errorCode. 
Pattern-matching against this code is already quite simple, with 
[switch] and so on. It just hasn't caught on. It seems like wishful 
thinking to expect that it will suddenly catch on just because its usage 
is made slightly more convenient (we're talking about a reduction of 
about 1 line of code).
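
For illustration, a minimal sketch of that existing idiom, assuming Tcl 
8.5's options-dict form of [catch]. The proc name safe-div is made up 
for the example; ARITH DIVZERO is the errorcode that [expr] sets for 
integer division by zero:

```tcl
# Sketch: classify a failure by its errorcode instead of its message.
# "safe-div" is a hypothetical helper used only for illustration.
proc safe-div {a b} {
    if {[catch { expr {$a / $b} } res opts]} {
        # -errorcode is a list; its leading words name the error class.
        switch -glob -- [dict get $opts -errorcode] {
            "ARITH DIVZERO *" { return "division by zero" }
            default           { return -options $opts $res }
        }
    }
    return $res
}
```

Only the DIVZERO case is handled; any other error is re-raised with its 
original options dict.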

> b) The innovation of a new 'finally' that simplifies the linguistic 
> expression of the programmer's desires  has been a large part of my 
> motivation to develop this TIP.

I agree with the motivation for this entirely.

> c) The overall intent of the TIP can be summed up as "make a control 
> structure that makes dealing with errors and resource cleanup simpler - 
> both logically and visually".

I think it is worth explicitly pulling out the separate concerns 
embodied in this TIP and tying them to use-cases. As I see it, there are 
a number of more-or-less separable concerns here:

1. Support for ensuring resource cleanup in the presence of 
exceptions/errors (i.e. "finally").
2. Better support for case-analysis based on exception return code.
3. Better support for case-analysis based on errorcode.
4. Ability to trap only those exceptions/errors that you are interested 
in/prepared to deal with, letting others propagate normally.

Now, let us look at some use-cases and compare how they are handled 
currently with the proposed methods for handling them:

1. Guaranteed resource cleanup
------------------------------

A typical example of this is ensuring that a channel is closed once we 
have finished processing it. Currently, you would do this via:

  set chan [open $myfile r]; # or [socket] etc
  catch { process $chan } msg opts
  close $chan
  return -options $opts $msg

Some things to notice here are: (a) the catch-all behaviour of [catch] 
is exactly what is required here: we don't want any exceptions to 
escape, (b) we don't require any case-analysis on the exception type or 
error code. The proposed alternative is:

  set chan [open $myfile r]
  try { process $chan } finally { close $chan }

This is a saving of only two lines, but I think the improvement in 
readability is worth it. A possible improvement would be for any errors 
in the finally clause to not completely mask any errors in the main 
script -- perhaps by adding a -original field to the options dict (which 
itself contains the message and options dict of the original error).

I'd be happy to see this part of the TIP dropped or separated into a 
different command however.

2. Case-analysis based on return code
-------------------------------------

An example here is that of implementing custom control structures. Of 
particular interest here is the use of standard exceptions like [break] 
and [continue]. For instance, we may want to write a version of 
[foreach] that operates asynchronously using the event loop. Currently, 
we might write this as:

  proc async-foreach {varName list body} {
      if {[llength $list] == 0} { return }
      set list [lassign $list item]
      set code [catch { apply [list $varName $body] $item } msg opts]
      switch $code {
          3       { return }
          0 - 4   { # ok/continue }
          default { return -options $opts $msg }
      }
      after 0 [list async-foreach $varName $list $body]
  }

(I can never remember whether you need to [dict incr opts -level] in 
these situations).

The try/handle (Twylite/JE) alternative would be:

  proc async-foreach {varName list body} {
      if {[llength $list] == 0} { return }
      set list [lassign $list item]
      try {
          apply [list $varName $body] $item
      } handle {code msg opts} {
          switch $code {
              3       { return }
              0 - 4   { # ok/continue }
              default { return -options $opts $msg }
          }
      }
      after 0 [list async-foreach $varName $list $body]
  }

This is slightly longer and introduces more nesting than the original. 
In general, it seems to hamper rather than improve readability. By the 
way, does "handle" catch all exception types, or just non-errors? In 
general, try/handle seems just a more verbose version of the existing 
[catch].

The alternative using try/catch would be:

  proc async-foreach {varName list body} {
      if {[llength $list] == 0} { return }
      set list [lassign $list item]
      try {
          apply [list $varName $body] $item
      } catch break {} { return } catch continue {} {}
      after 0 [list async-foreach $varName $list $body]
  }

This scheme is shorter. It is more readable as we have symbolic names 
"break" and "continue" rather than magic numbers, and it avoids catching 
anything it doesn't know how to deal with.

Note that dispatch based on exception code is a simple branch. There is 
no need for complex pattern matching, sub-match capture, or 
case-insensitive matching.

3. Case-analysis based on errorcode
-----------------------------------

For this, I'll use Twylite's example of trying different authentication 
schemes. We will assume that the API throws an error with code BADAUTH 
when the wrong scheme is used, and throws other errors such as NOCONN to 
indicate that a connection to the host failed. (Note: this example 
doesn't require glob-matching, but I don't think the changes in such a 
case are that great). The existing way to handle this would be:

  proc connect {schemes host user pass} {
      foreach scheme $schemes {
          if {[catch { $scheme connect $host $user $pass } res opts]} {
              switch [dict get $opts -errorcode] {
                  BADAUTH  { continue }
                  default  { return -options $opts $res }
              }
          } else { return $res }
      }
      error "unable to authenticate"
  }

Using the proposed try/onerror approach, this would be:

  proc connect {schemes host user pass} {
      foreach scheme $schemes {
          try {
              return [$scheme connect $host $user $pass]
          } onerror BADAUTH { continue }
      }
      error "unable to authenticate"
  }

Using try/catch:

  proc connect {schemes host user pass} {
      foreach scheme $schemes {
          try {
              return [$scheme connect $host $user $pass]
          } catch error {msg opts} {
              switch [dict get $opts -errorcode] {
                  BADAUTH  { continue }
                  default  { return -options $opts $msg }
              }
          }
      }
      error "unable to authenticate"
  }

OK, in this case the try/onerror approach certainly is clearer, and the 
try/catch approach is little better than the existing way with [catch] 
alone. I think it still wins slightly over [catch] in readability. While 
it is more verbose, the control flow is easier to read -- the [return] 
is in an obvious place, and not tucked away in an inconspicuous "else" 
clause. There's also less punctuation. But the onerror approach is 
clearly better for this case.

The questions then, are whether "onerror" is sufficient for all/most 
such cases, and whether these cases actually arise often/ever in 
practice (or would arise given appropriate promotion). Regarding the 
first part, dispatching based on errorcode is more complex than 
dispatching based on return code: the latter is a simple integer, 
whereas the former can be an arbitrarily complex data structure. In 
particular, the following 
requirements may have to be considered:

  a. Different forms of pattern-matching (e.g. exact, glob, regexp, 
"algebraic" type matching etc). If we stick to one type only, will that 
be appropriate? Will it cause problems? (e.g. if glob-only matching, 
then we have problems specifying glob-special characters such as * or []).
  b. Case sensitivity -- is "arith" the same error as "ARITH" or "ARiTh"?
  c. Sub-match capture: an errorcode may contain detail fields which we 
want to extract. It seems pointless to match once and then perform a 
separate extraction when I could have just used [regexp] or some other 
matching facility and performed both operations in one go.
  d. Disjunctive matching: perform this action if the errorcode matches 
either *this* or *that*.

I'm sure there are others. To me, the range of choices here suggests 
that pattern matching is best kept separate from error-handling. 
Otherwise there is a risk of duplicating [switch]. Perhaps I am wrong 
here, and glob-matching meets all requirements. Personally, if errorcode 
matching was to take off, I would use some form of algebraic types 
(tagged lists) both for constructing and pattern-matching errorcodes, as 
that seems to me to be the most appropriate tool for the job. Perhaps 
there is a way to keep the behaviour but to parameterise the matching 
command (with switch -glob/string match being the default).
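
A rough sketch of the tagged-list idea, assuming Tcl 8.5 and an 
errorcode laid out as a tag followed by detail fields. The procs 
connect-stub and describe-failure, the NOCONN field layout and the errno 
value are all hypothetical, chosen only to show matching and extraction 
in one step:

```tcl
# Hypothetical failing operation: the errorcode is a tagged list
# {NOCONN host errno}, i.e. a tag followed by detail fields.
proc connect-stub {host} {
    return -code error -errorcode [list NOCONN $host 110] "cannot reach $host"
}

# Dispatch on the tag and extract the detail fields with [lassign].
proc describe-failure {host} {
    if {![catch { connect-stub $host } msg opts]} { return "connected" }
    # [lassign] stores the first element in tag and returns the rest.
    set details [lassign [dict get $opts -errorcode] tag]
    switch -- $tag {
        NOCONN {
            lassign $details badHost errno
            return "no route to $badHost (errno $errno)"
        }
        BADAUTH { return "wrong authentication scheme" }
        default { return -options $opts $msg }
    }
}
```

Here the dispatch is an exact match on the tag, so no glob-special 
characters need escaping, and the detail fields come out of the same 
list without a second parse.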

The other question is whether these cases arise in practice. I can't 
think of a single existing API that requires this kind of errorcode 
pattern matching. Is such a design even appropriate? Clearly, if you 
controlled the authentication scheme interface then you could just 
return continue for BADAUTH and an error for anything else:

  proc connect {schemes host user pass} {
      foreach scheme $schemes {
          return [$scheme connect $host $user $pass]
      }
      error "unable to authenticate"
  }
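
To make that concrete, here is a sketch of what such scheme commands 
might look like. The names basic-auth and digest-auth and their 
behaviour are invented for the example; the only real mechanism used is 
[return -code continue]:

```tcl
# Hypothetical scheme commands. A rejected scheme raises the "continue"
# exception instead of an error; a real failure would still use [error].
proc basic-auth {subcmd host user pass} {
    return -code continue
}
proc digest-auth {subcmd host user pass} {
    return "digest session for $user@$host"
}

proc connect {schemes host user pass} {
    foreach scheme $schemes {
        # A "continue" raised by the scheme skips to the next iteration.
        return [$scheme connect $host $user $pass]
    }
    error "unable to authenticate"
}
```

With these definitions, [connect {basic-auth digest-auth} ...] skips the 
first scheme and returns the result of the second, while a list 
containing only basic-auth falls through to the final [error].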

4. Trapping only those errors/exceptions you are interested in
--------------------------------------------------------------

It's clear from the above that the Twylite/JE approach achieves this for 
errors, but not for other exceptions. My approach achieves it for 
exceptions but not for more specific error cases.

> NEM:
>> Firstly, "onerror" and "except" seem like bad names to me. "except" in
>> particular would imply that the following error case *isn't* handled (as
>> in "catch everything *except* for these..."), which is just confusing. 
>> I also have some problems with the usage. I'd prefer to see something 
>> like:
> I accept your argument about "except", having had the same concern 
> myself.  I drew this from Python's try...except.  Others have argued against 
> the use of 'catch' as it could be confused with the existing catch 
> command.  I'll consider 'handle' instead of catch - it sounds reasonable 
> for the domain; other suggestions are welcome.  I feel that "onerror" is 
> correctly named though.

I'd really prefer things to be unified in name at least: "on error", "on 
break", etc. One possible unification that might please all would be to 
adopt the syntax "on exception-type ?pattern? ?vars? body". The pattern 
is an optional glob-style pattern that is matched against the -errorcode 
of the exception. (If the pattern is specified then so must the vars). 
Clearly this is mostly useful in the case of errors, but I believe it is 
possible for non-error exceptions to also set -errorcode, so it might be 
useful elsewhere. That would result in the following use-cases:

  proc connect {schemes host user pass} {
      foreach scheme $schemes {
          try {
              return [$scheme connect $host $user $pass]
          } on error BADAUTH {} { continue }
      }
      error "unable to authenticate"
  }
  proc async-foreach {varName list body} {
      if {[llength $list] == 0} { return }
      set list [lassign $list item]
      try {
          apply [list $varName $body] $item
      } on break { return } on continue {}
      after 0 [list async-foreach $varName $list $body]
  }

To me this seems like a good compromise, and people who want more 
complex pattern matching can still do so. I'd like to be able to support 
lists of exception types. I still don't believe glob-style errorCode 
pattern matching is useful or particularly satisfactory, but I'm willing 
to concede it for the sake of compromise. As before, define "then" as 
"on ok", and possibly define "else/otherwise" as a catch-all default clause.

<aside>
I also see this meshing nicely with a hypothetical future 
continuation-based exception mechanism that allows 
resumable/non-destructive exception/warning notifications.
</aside>

[...]
> On supporting widely-used idioms, it would be much more important IMO to 
> branch based on the result than on the return code.  Branching based on 
> return code (i.e. exception handling) is useful for creating new control 
> structures, but if you want to handle errors then you must branch based 
> on some information that is available from a return code = TCL_ERROR 
> (1).  That means either errorCode or result, and right now most APIs are 
> using result.
> You make a good point about leaving pattern matching to existing 
> commands -- I have been thinking along those lines for how best to 
> exploit that (more in another mail).

Branching based on the result just seems like a fragile nightmare. 
Localised error messages for instance could totally change the behaviour 
of code.

[...]
>> People don't do it because it just isn't very useful. Errors in Tcl 
>> tend to be real errors -- other than logging them, there is often not 
>> much to do. Tcl's introspection, use of general control exceptions 
>> like [continue]/[break], and custom control structures/HOFs etc make 
>> this kind of exception-based case analysis much less necessary. I may 
>> well be wrong about this, but I'd prefer to see some concrete use-cases.
> The nature of errors in Tcl is a side-effect of the weak support for 
> distinguishing between types of errors.  This functionality is useful 
> any time that you are calling into an opaque API and can take different 
> recovery actions based on the cause of the error.  e.g. you want to wrap 
> load balancing and/or fault tolerance (simplest case: auto-reconnect) 
> around an RPC or DB interface; you want to try alternative 
> authentication schemes when the password fails (but not when the 
> connection fails, or the protocol is mismatched); you want to tell the 
> user whether to 'try again later' or 'call the administrator'.

Thanks for these use-cases.

> Let's drop to Java-speak for a moment: the current practice of Tcl 
> developers is to "catch (Exception e) { // handle }".  In C++ it is 
> "catch (...) { // handle }".
> Return code is not a mechanism for exception/error typing, it is a 
> mechanism for implementing control flows.  We need a typing mechanism.

Hmmm... error handling needs case-analysis not typing per se. Clearly, 
any mechanism used for constructing error cases is strongly related to 
the mechanism that is used for distinguishing them, which is why I'd 
rather see these aspects separated/parameterised from the exception 
handling, at least until it is clear what the best mechanism for this is.

[...]

-- Neil
