From: Twylite <tw...@cr...> - 2008-12-01 11:50:26
|
Hi everyone, I've updated TIP #329 (http://tip.tcl.tk/329) and provided a reference implementation in Tcl. If you have any last-minute show-stoppers please yell, otherwise I think the TIP is ready for a vote, if someone is willing to sponsor it. Regards, Twylite |
From: Andreas L. <av...@lo...> - 2008-12-02 07:56:54
|
Twylite <tw...@cr...> wrote: > Case in point: we have an API for accessing a hardware coprocessor > (let's call the API "ABC"). The coprocessor returns numeric error > results in the range 1 to 99. So we defined the -errorcode to be [list > ABC $errnum], e.g. "ABC 4". > A little while later we realised that error 4 is quite special -- it is > allowed to return an extended error information field (the coprocessor > had to maintain the use of error 4 for backwards compatibility, but > there were times when knowing the exact cause was important). So we > extended the -errorcode in this case to [list ABC $errnum $extra], e.g. > "ABC 4 F". Using list-patterns of course addresses this usecase just perfectly. {ABC 4} would match "ABC 4 F" as well as "ABC 4 whatever" and also just "ABC 4", but still would not match "ABC 42". {ABC 4 [A-C]} could furthermore be used to match "ABC 4 B" but not "ABC 4 F". I guess it will take a few years from now till Donal runs into such a practical usecase himself, and at that point we will perhaps add a new ltrap clause with just that type of matching. Adding it now would unfortunately put the TIP at risk. Joe English <jen...@fl...> wrote: > Just about every other language with a try/catch/finally > statement binds variables as part of the handler clause; tcl is different :-) In this particular case, I even think that's a good thing. The resultvar's typical name may be confusing for the "on ok" block, but one can easily assign it to a better variable inside the block, if the information is later needed with an apt name. The "on ok" block is likely to be rather a rare case. Other languages have only one exception-variable, and its length is generally much smaller than the length of the exception name, so two more chars don't hurt. Also in the other languages the names of the variables aren't visible anymore after the block.
In tcl's try, having to specify a variable for each handler would make it much more bulky: try {...} trap {{MYERR FOO} vMsg vDict} {...} And since each variable would outlive the handler, we'd have a bulk of possibly but not always defined variables afterwards. In a nutshell: just because in tcl the error-variables are broader scoped, it makes sense to define them globally to the try-command. Twylite <tw...@cr...> wrote: > So far the opinion seems to be that a list prefix is too limited, > and a list pattern match (per element glob) is too difficult That's relative to its perceived usefulness. > and has no existing reference. Many things in tcl haven't. > The intended manner for extending [try] is by adding new handler > keywords (if the existing ones are not handling the required use cases). ... ltrap {ABC 4} ... |
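The two matching styles compared in this message can be sketched in plain Tcl. This is only an illustration: the proc names `errorcodePrefixMatch` and `errorcodeGlobMatch` are hypothetical and are not part of TIP #329 or of any proposed `ltrap` clause.

```tcl
# Hypothetical helpers for illustration only; neither is part of TIP #329.

# Strict list-prefix match: every element of $prefix must equal the
# corresponding leading element of $errorcode.
proc errorcodePrefixMatch {prefix errorcode} {
    set n [llength $prefix]
    if {$n > [llength $errorcode]} {
        return 0
    }
    foreach want $prefix got [lrange $errorcode 0 [expr {$n - 1}]] {
        if {$want ne $got} {
            return 0
        }
    }
    return 1
}

# Per-element glob match: each pattern element is a glob that is applied
# to exactly one element of $errorcode; extra trailing elements are allowed.
proc errorcodeGlobMatch {patterns errorcode} {
    set n [llength $patterns]
    if {$n > [llength $errorcode]} {
        return 0
    }
    foreach pat $patterns got [lrange $errorcode 0 [expr {$n - 1}]] {
        if {![string match $pat $got]} {
            return 0
        }
    }
    return 1
}

puts [errorcodePrefixMatch {ABC 4} {ABC 4 F}]       ;# 1
puts [errorcodePrefixMatch {ABC 4} {ABC 42}]        ;# 0
puts [errorcodeGlobMatch {ABC 4 [A-C]} {ABC 4 B}]   ;# 1
puts [errorcodeGlobMatch {ABC 4 [A-C]} {ABC 4 F}]   ;# 0
```

Because both helpers compare element by element, "ABC 4" can never be confused with "ABC 42", which is the property argued for above.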
From: Magentus <mag...@gm...> - 2008-12-07 06:24:52
|
On Tue, 2 Dec 2008 08:56:40 +0100, Andreas Leitgeb <av...@lo...> wrote: > I guess it will take a few years from now till Donal runs into such > a practical usecase himself, and at that point we will perhaps add > a new ltrap clause with just that type of matching. Adding it now > would unfortunately put the TIP at risk. Then put it at risk. From what I'm reading, there's still enough dissent; one camp wants ultra-KISS, the other wants a useful flexible tool (no prizes for guessing my preference ;) ). Postpone the TIP, implement both proposals as tcl::unsupported for people to try out, and then it'll be a simple question a little further down the road of "which one do we keep". The chosen implementation gets moved to ::try, and the other one gets relegated to a script in tcllib for anyone who did actually use it in a real project. This isn't NEW functionality, it's a refactoring of OLD functionality. High time, I'll grant, but it's not a MUST HAVE for this release. Having something to bang on for the next release, and a firm commitment to make errorcodes useful (represented by the trial going on in tcl::unsupported), will probably do more good than pushing through a rushed TIP just so it'll make a deadline. -- Fredderic Junk is something you've kept for years and then throw away three weeks before you need it. Debian/unstable (LC#384816) on i686 2.6.23-z2 2007 (up 3 days, 17:26) |
From: Andreas L. <av...@lo...> - 2008-12-07 10:59:17
|
On Sun, Dec 07, 2008 at 04:24:37PM +1000, Magentus wrote: > > Adding it now would unfortunately put the TIP at risk. > Then put it at risk. Your wish is heard, but not followed. > From what I'm reading, there's still enough dissent; This dissent has been resolved in the meantime. The only remaining discussion is about implementation details. The current version is indeed KISS but has the potential to be enhanced later (with new handler pseudo-keywords) if that turns out to be desirable and worth it. The current solution falls somewhat short, but it is at least on the right track (unlike globbing). It is good. |
From: Neil M. <ne...@Cs...> - 2008-12-01 17:00:35
|
Hi, On 1 Dec 2008, at 11:49, Twylite wrote: > Hi everyone, > > I've updated TIP #329 (http://tip.tcl.tk/329) and provided a reference > implementation in Tcl. If you have any last-minute show-stoppers > please > yell, otherwise I think the TIP is ready for a vote, if someone is > willing to sponsor it. Thanks for this. I think it looks good -- covers the basic cases and isn't too complicated. I still have some reservations about glob- matching as the only option, but it'll do. -- Neil This message has been checked for viruses but the contents of an attachment may still contain software viruses, which could damage your computer system: you are advised to perform your own checks. Email communications with the University of Nottingham may be monitored as permitted by UK legislation. |
From: Twylite <tw...@cr...> - 2008-12-01 19:40:02
|
Neil Madden wrote: > Thanks for this. I think it looks good -- covers the basic cases and > isn't too complicated. I still have some reservations about > glob-matching as the only option, but it'll do. You and me both ;) In all seriousness, I've already encountered some situations that lead me to believe there will be teething problems, and some careful thinking about what conventions to adopt for -errorcode. Case in point: we have an API for accessing a hardware coprocessor (let's call the API "ABC"). The coprocessor returns numeric error results in the range 1 to 99. So we defined the -errorcode to be [list ABC $errnum], e.g. "ABC 4". A little while later we realised that error 4 is quite special -- it is allowed to return an extended error information field (the coprocessor had to maintain the use of error 4 for backwards compatibility, but there were times when knowing the exact cause was important). So we extended the -errorcode in this case to [list ABC $errnum $extra], e.g. "ABC 4 F". So two hours in this is the position: Any "trap {ABC 4}" no longer works. Changing this to "trap {ABC 4*}" won't work either because it will also trap errors in the range 40-49. Using "trap {ABC 4 *}" will work, but would be incompatible with the older version of the API - that's not a problem right now because we can do a simple find & replace, but if you're intending to maintain the API over a long period of time it would be an issue. A glob against errorcode doesn't work like an OO is-a relationship. You can't arbitrarily subclass the error and maintain compatibility unless you plan for it. That means that you must always have a "*" in the glob; most likely you'll be doing a prefix match. It also means that your errorcode must end with a delimiter (like a space) so that you can distinguish between classes and subclasses of error (so that "ABC 4 *" can only be a subclass of "ABC 4", and not confused with "ABC 42"). My 2c. Twylite |
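The three glob patterns discussed in this message can be checked directly with `string match`. The helper name `globMatches` below is hypothetical and only serves the illustration; the errorcodes are the "ABC" examples from the message.

```tcl
# Return the subset of $codes whose string form matches the glob $pattern.
# (Hypothetical helper, used only to illustrate the message above.)
proc globMatches {pattern codes} {
    set hits {}
    foreach code $codes {
        if {[string match $pattern $code]} {
            lappend hits $code
        }
    }
    return $hits
}

set codes [list {ABC 4} {ABC 4 F} {ABC 42}]

# {ABC 4}   matches only the old two-element errorcode.
# {ABC 4*}  matches all three, including the unrelated error 42.
# {ABC 4 *} matches only the extended form, so old-style codes are missed.
foreach pattern [list {ABC 4} {ABC 4*} {ABC 4 *}] {
    puts "[list $pattern] matches: [globMatches $pattern $codes]"
}
```

None of the three patterns selects exactly "ABC 4" plus "ABC 4 F", which is the incompatibility the message describes.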
From: Joe E. <jen...@fl...> - 2008-12-01 21:54:52
|
Also also: re: "trap" clauses doing a glob match or a list prefix match: either way sounds OK to me. I have a slight preference for list prefix match, but glob matching is also OK. The critical thing is to pick *one* kind of matching and stick with it: we don't ever want to see [try { ... } trap -match regexp "..." { ... }] ... and lastly, on the "what color is the bikeshed" question: The names "on" and "trap" are the ones I like best of all the ones proposed. Please keep. --JE |
From: Neil M. <ne...@Cs...> - 2008-12-02 12:31:39
|
On 1 Dec 2008, at 21:32, Joe English wrote: > Also also: re: "trap" clauses doing a glob match or > a list prefix match: either way sounds OK to me. > I have a slight preference for list prefix match, > but glob matching is also OK. The critical thing > is to pick *one* kind of matching and stick with it: > we don't ever want to see > > [try { ... } trap -match regexp "..." { ... }] Why not? -- Neil |
From: Twylite <tw...@cr...> - 2008-12-01 23:05:49
|
Thanks Joe, > Also also: re: "trap" clauses doing a glob match or > a list prefix match: either way sounds OK to me. > I have a slight preference for list prefix match, > but glob matching is also OK. So far the opinion seems to be that a list prefix is too limited, and a list pattern match (per element glob) is too difficult and has no existing reference. List prefixes are limited because they don't allow an errorcode to represent multiple dimensions. e.g. A media streaming error could be both an AudioException and an IOException, and different users of your media library may want to treat the error in different ways (a stream ripper is more concerned with IOExceptions, whereas a music player is probably more concerned with all types of AudioException (io, encoding, etc) ). A list prefix matcher could cope with Java's single inheritance model, but not with the exception support available to (say) C++. > we don't ever want to see > > [try { ... } trap -match regexp "..." { ... }] > No, no we don't ;) The intended manner for extending [try] is by adding new handler keywords (if the existing ones are not handling the required use cases). For example if there was a move in future to a -errorobj in the options dict, then [try] could be extended to do ancestor matching on the -errorobj by introducing an "isa" keyword, e.g. try { ... } isa IoException { ... }. Regards, Twylite |
From: Donal K. F. <don...@ma...> - 2008-12-02 06:38:19
|
Twylite wrote: > In all seriousness, I've already encountered some situations that lead > me to believe there will be teething problems, and some careful thinking > about what conventions to adopt for -errorcode. > > Case in point: [...] What none of this does is make good error design easy, and people mess it up in other languages too. On the other hand, coming at things with a mind to the Art of the Possible :-) it's definitely the case that we want the [try] command to be bytecoded, it's definitely the case that exact matching of errorcodes is not enough, and it's definitely the case that we already have glob matching machinery in the bytecode engine but not anything more complex. Going beyond requires *much* more work. (Myself, I'd prefer to translate the major value from the hardware coprocessor in your example into some kind of name token as well as putting the number in there afterwards. But that's because I prefer to not expose magic numbers to Tcl code, and it's getting into stuff that's wildly off-topic...) > A glob against errorcode doesn't work like an OO is-a relationship. True, but is it "good enough"? We don't need perfection immediately. Donal. |
From: Twylite <tw...@cr...> - 2008-12-02 07:35:52
|
Hi, >> In all seriousness, I've already encountered some situations that lead >> me to believe there will be teething problems, and some careful thinking >> about what conventions to adopt for -errorcode. >> > What none of this does is make good error design easy, and people mess > it up in other languages too. Of course :) I was just pointing out that this approach is going to have some limitations and surprises. In general it is ridiculously easy to extend code written in Tcl while maintaining backwards compatibility - we regularly refactor the guts of packages to improve functionality while providing backwards-compatible behaviour with safe defaults. Even return values can be made backwards compatible using a facade that maps the original interface onto a new more powerful function. Errorcodes based on glob matching cannot always be extended in this manner (while maintaining backwards compatibility); you need to get some aspect of the design right up front, both in how the exception is thrown and how it is caught. So, as I said, teething problems. And I imagine an ongoing cause of questions to c.l.t as new developers encounter this sort of thing and grapple with it. > want the [try] command to be bytecoded, it's definitely the case that > exact matching of errorcodes is not enough, and it's definitely the case > that we already have glob matching machinery in the bytecode engine but > not anything more complex. Going beyond requires *much* more work. > List prefix matching? > (Myself, I'd prefer to translate the major value from the hardware > coprocessor in your example into some kind of name token as well as > putting the number in there afterwards. But that's because I prefer to > not expose magic numbers to Tcl code, and it's getting into stuff that's > wildly off-topic...) > While this addresses the confusion between two similar errorcodes ("ABC 4" vs "ABC 42") ...
(1) The potential for confusion is still there and not necessarily obvious to the person naming the errors. A quick analysis of the WIN32 platform SDK shows 42 error names that are also prefixes of other error names (e.g. ERROR_INVALID_HANDLE is a prefix of ERROR_INVALID_HANDLE_STATE, ERROR_INVALID_DATA is a prefix of ERROR_INVALID_DATATYPE). Trapping "WIN32 ERROR_INVALID_HANDLE*" may trap errors you don't want, and "WIN32 ERROR_INVALID_HANDLE *" won't work unless the code throwing the error ensures that there is a trailing space or subclass (like "NONE"). (2) If you trap using an exact match rather than a prefix match, your trap will no longer work if the errorcode is subclassed or otherwise extended. So it is not safe to trap using exact matching. e.g. If I extend my Win32 binding to include the numeric code ("WIN32 ERROR_INVALID_HANDLE 6" instead of "WIN32 ERROR_INVALID_HANDLE") any traps that use exact matching will no longer function as intended. >> A glob against errorcode doesn't work like an OO is-a relationship. >> > True, but is it "good enough"? We don't need perfection immediately. > Dunno :) The examples/problems given above tell me that an exact match is almost never desirable, and a (list) prefix match is likely to be the most common case. A true list prefix match (not a glob match on a string) gives functionality equivalent to catching exceptions by class in a single-inheritance OO model (e.g. Java). A glob match can potentially do more, but at the risk of handing yourself enough rope to get it wrong a lot of the time. I'll see what else experience brings out today... Regards, Twylite |
From: Magentus <mag...@gm...> - 2008-12-06 09:23:27
|
On Tue, 02 Dec 2008 01:05:45 +0200, Twylite <tw...@cr...> wrote: > Thanks Joe, >> Also also: re: "trap" clauses doing a glob match or >> a list prefix match: either way sounds OK to me. >> I have a slight preference for list prefix match, >> but glob matching is also OK. > So far the opinion seems to be that a list prefix is too limited, and > a list pattern match (per element glob) is too difficult and has no > existing reference. That's been a shortcoming of glob matching all along, and something I tried to address in a pluggable manner as a side-issue of my proposal. Just because it has no existing reference is no reason to brush it under the rug and pretend it doesn't exist. > List prefixes are limited because they don't allow an errorcode to > represent multiple dimensions. e.g. A media streaming error could be > both an AudioException and an IOException, and different users of > your media library may want to treat the error in different ways (a > stream ripper is more concerned with IOExceptions, whereas a music > player is probably more concerned with all types of AudioException > (io, encoding, etc) ). A list prefix matcher could cope with Java's > single inheritance model, but not with the exception support > available to (say) C++. I dealt with that in my proposal, too. You're not going to shoe-horn C++ exceptions into any kind of string match. That's why my proposal split the handler into three keywords, two being common use cases, and the third being a catch-all for everything else. The idea was also that entirely new error paradigms could be added down the road, such as OO exception handling. Further, not all exceptions are errors. A piece of information may or may not be available yet. If it is, then a result can be produced. If it isn't, then some information can be returned and presented to the user. That can readily be reflected by an exception, but is not an error.
It is in that same line of reasoning that I pushed for matching on OK return values also. >> we don't ever want to see >> [try { ... } trap -match regexp "..." { ... }] > No, no we don't ;) Why the heck not? (I believe the very next message also says basically the same) > The intended manner for extending [try] is by adding new handler > keywords (if the existing ones are not handling the required use > cases). As I proposed very early on in the piece. That doesn't mean you want to swamp [try] with 1001 handler keywords for every possible error source and string match combination. [try] keywords should deal exclusively with the form that the error takes, and a string match option should take care of matching against the actual content of the error, if doing so even makes any sense for that error form. And all I'm reading is people complaining that the current proposal won't do this, and won't do that, that my proposal answers cleanly. Sounds like a case of "this is my party" to me. -- Fredderic You would if you could, but you can't so you won't. Debian/unstable (LC#384816) on i686 2.6.23-z2 2007 (up 2 days, 20:29) |
From: Donal K. F. <don...@ma...> - 2008-12-06 10:23:03
|
Magentus wrote: > And all I'm reading is people complaining that the current proposal > won't do this, and won't do that, that my proposal answers cleanly. > Sounds like a case of "this is my party" to me. We're in a vote now, on version 1.7 of the proposal. That covers exactly one scheme for error matching: exact list prefixes. While there are use cases which are not dealt with elegantly by it, they tend to rely on badly constructed or badly designed errors. The first case is (probably) a bug elsewhere (going by existing documentation) and the second is really not our fault. If you need something more complex, you can either use a smaller prefix or handle all errors directly (through the exception code handler) and use a Tcl script to do the rest. That is Good Enough. Donal. |
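The fallback described here (a list-prefix `trap` for the common case, and a general `on error` handler with script-level inspection for everything else) might look like the sketch below. It assumes the `try` syntax as it later shipped in Tcl 8.6, where `trap` takes a list prefix and a variable list; `callCoprocessor` and `classify` are hypothetical stand-ins for the "ABC" API used as an example earlier in the thread.

```tcl
package require Tcl 8.6

# Hypothetical stand-in for the "ABC" coprocessor API: always fails with
# -errorcode {ABC errnum ?extra?}.
proc callCoprocessor {errnum {extra ""}} {
    set code [list ABC $errnum]
    if {$extra ne ""} {
        lappend code $extra
    }
    return -code error -errorcode $code "coprocessor error $errnum"
}

proc classify {errnum {extra ""}} {
    try {
        callCoprocessor $errnum $extra
    } trap {ABC 4} {msg opts} {
        # List prefix: catches {ABC 4} and {ABC 4 F}, but not {ABC 42}.
        set code [dict get $opts -errorcode]
        return [list error4 [lrange $code 2 end]]
    } on error {msg opts} {
        # Anything a prefix cannot express is decided by ordinary script.
        set code [dict get $opts -errorcode]
        if {[lindex $code 0] eq "ABC" && [string match {4?} [lindex $code 1]]} {
            return "range40to49"
        }
        # Not something we handle: re-raise with the original options.
        return -options $opts $msg
    }
}

puts [classify 4]      ;# error4 {}
puts [classify 4 F]    ;# error4 F
puts [classify 42]     ;# range40to49
```

The design point is that the matching rule inside `try` stays simple, while arbitrary conditions remain available through normal Tcl code in the handler body.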
From: Joe E. <jen...@fl...> - 2008-12-01 18:07:57
|
Twylite wrote:
>
> I've updated TIP #329 (http://tip.tcl.tk/329) and provided a reference
> implementation in Tcl. If you have any last-minute show-stoppers please
> yell, otherwise I think the TIP is ready for a vote, if someone is
> willing to sponsor it.

Thank you; this looks just about right.

Please reconsider the "as" clause. It is better to specify the variables to be bound at each handler as was done in the previous version of the TIP.

The original rationale for introducing 'as' is uncompelling. The meaning of the return value (and whether it's even needed) depends on which handler clause triggered. Compare:

    try {
        open $filename
    } as {XXX opts} on ok {
        # (1)
    } trap {POSIX ENOENT *} {
        # (2a)
    } on error {
        # (2b)
    }
    # (3)

If control reaches (1), $XXX holds an open file channel. If control reaches (2a) or (2b), it holds an error message. At (3), all you know is that XXX is set -- but not what it means.

This is better written as:

    try {
        open $filename
    } on {ok fp} {
        # ... write to $fp
    } trap {{POSIX ENOENT *}} {
        # ... handle "file not found" condition.
    } on {error msg} {
        # ... report uncaught error $msg
    }

Also: in section "Handlers", subsection "Notes & clarifications", bullet point 4:

| If any exception is replaced (by an exception in a handler body or in
| the finally body) then the new exception shall introduce a field into
| its options dict that contains all details of the original exception.

This needs clarification. (Is it even necessary?)

--Joe English jen...@fl... |
From: Twylite <tw...@cr...> - 2008-12-01 19:12:45
|
Hi, > Please reconsider the "as" clause. It is better to > specify the variables to be bound at each handler as was > done in the previous version of the TIP. > I am starting to use this version of [try] in a product that will have a beta release & customer demo in about two days. This is giving me a better feel for how [try] will behave in real work ;) I should have time tomorrow to experiment with both approaches (as vs per-handler) and see what feels best (I can also run it past some junior developers and see how they respond to the syntax - it's usually a good indicator of complexity). Anyone else have strong feelings about this (either way)? > Also: in section "Handlers", subsection "Notes & clarifications", > bullet point 4: > > | If any exception is replaced (by an exception in a handler body or in > | the finally body) then the new exception shall introduce a field into > | its options dict that contains all details of the original exception. > > > This needs clarification. (Is it even necessary?) > In short: exception chaining. If an exception is thrown in an exception handler or finally clause, the original exception (the "root cause") should not be lost. This is critical for debugging problems in live systems, and very useful in development as well. I'll point out Java's chain exception facility (http://java.sun.com/j2se/1.4.2/docs/guide/lang/chained-exceptions.html) that was introduced in Java 1.4 after experiences with previous versions. Many developers saw the need to chain exceptions and developed their own schemes to do so, leading to a situation where high level code could not adequately introspect exceptions, thus complicating logging, debugging, etc. The facility Java provides is still inadequate though -- many developers use boilerplate code or AOP to put their catch handler itself into a try/catch that automates exception chaining so that unintended exceptions in the handler don't obscure the original cause of the problem. 
Note that Java has supported fillInStackTrace() since 1.1 (IIRC), which is roughly equivalent to Tcl's return/error with errorInfo, and which was not adequate for the needs of Java developers. Tcl is not Java, but it would be rather brash to dismiss the experiences of that community. Regards, Twylite |
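For readers who want to see the chaining idea concretely: the thread has not yet named the options-dict field at this point, but `try` as it later shipped in Tcl 8.6 stores the original exception's options under the key `-during`. The sketch below relies on that released behaviour, not on anything decided in this message; the `DEMO` errorcodes are made up for the illustration.

```tcl
package require Tcl 8.6

# An error is raised in the body, and then a second, unrelated error is
# raised by the handler itself. The outer [catch] sees the second error.
set status [catch {
    try {
        error "disk unreadable" "" {DEMO IO}
    } on error {origMsg origOpts} {
        error "logging failed" "" {DEMO LOG}
    }
} msg opts]

puts $msg                                   ;# the replacement error
puts [dict get $opts -errorcode]            ;# DEMO LOG

# The root cause survives inside the new exception's options dict.
set rootCause [dict get $opts -during]
puts [dict get $rootCause -errorcode]       ;# DEMO IO
```

High-level logging code can therefore walk `-during` to find the root cause without each handler needing its own chaining boilerplate.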
From: Joe E. <jen...@fl...> - 2008-12-01 21:20:01
|
Twylite wrote: > I am starting to use this version of [try] in a product that will have a > beta release & customer demo in about two days. This is giving me a > better feel for how [try] will behave in real work ;) I should have > time tomorrow to experiment with both approaches (as vs per-handler) and > see what feels best (I can also run it past some junior developers and > see how they respond to the syntax - it's usually a good indicator of > complexity). Good plan. > Anyone else have strong feelings about this (either way)? Two more points: (1) Just about every other language with a try/catch/finally statement [*] binds variables as part of the handler clause; and (2) personal experience: with the "as" form, I can never figure out a good place to put braces and linebreaks :-) > > Also: in section "Handlers", subsection "Notes & clarifications", > > bullet point 4: > > | If any exception is replaced (by an exception in a handler body or in > > | the finally body) then the new exception shall introduce a field into > > | its options dict that contains all details of the original exception. > > This needs clarification. (Is it even necessary?) > > > In short: exception chaining. If an exception is thrown in an exception > handler or finally clause, the original exception (the "root cause") > should not be lost. This is critical for debugging problems in live > systems, and very useful in development as well. That's fine, but it still needs clarification: what is the name of the entry that gets added to the options dictionary, and what does it contain? --JE [*] Examples: Java, C#, C++, Ada, Erlang, ML, ... |
From: Twylite <tw...@cr...> - 2008-12-02 09:10:53
|
Some feedback on JE's request to remove the "as" clause: Approved ;) I have spoken with two colleagues of different experience levels and they agree that associating the variables with each handler is a more readable and logical approach. So the TIP will be updated to: try body ?handler ...? ?finally body? where handler is: on code {?em ?opts??} body trap pattern {?em ?opts??} body This is slightly different from my original suggestion in that the code/pattern is separate to the args. These are separate concerns so they should not be combined in one list, and it also means that the bytecoding needs to do less sublist parsing. Regards, Twylite |
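A small sketch of the per-handler variable form announced above, written against `try` as it later shipped in Tcl 8.6 (where the variable list follows the code or pattern, as in this message). The proc name `describeOpen` is hypothetical, and the `trap` pattern is written as a list prefix, which is an assumption about the matching rule that the thread was still debating at this point.

```tcl
package require Tcl 8.6

# Report what happened when trying to open a file for reading.
# Each handler names its own variables, so their meaning is unambiguous.
proc describeOpen {filename} {
    try {
        open $filename r
    } on ok {fp} {
        # Here the result is known to be a channel.
        close $fp
        return "opened"
    } trap {POSIX ENOENT} {msg opts} {
        # Here the result is known to be an error message.
        return "not found"
    } on error {msg opts} {
        return "other error: $msg"
    }
}

puts [describeOpen [file join [pwd] no-such-file-tip329.txt]]
```

Compared with a single "as" clause, no variable has a meaning that depends on which handler ran.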
From: Twylite <tw...@cr...> - 2008-12-02 09:21:20
|
Further feedback on error matching: It looks like glob matching is not going to cut it. List prefix matching will be a similarly-powerful and generally safer approach. Looking at the functionality of other exception matching systems, we can see that Java (with single inheritance) is the equivalent of matching against a tree, and C++ (with multiple inheritance) is the equivalent of matching against a Directed Acyclic Graph. Neither string matching nor list prefix matching nor element-wise list matching gives the functionality of a tree match. Example: In Java you may define a MyIOException extends IOException (extends Exception extends Throwable). Code using your API will catch (MyIOException e) { // handle it } You can safely subclass MyIOException and throw these subclasses without breaking code that catches MyIOException (as in the example above). You can also safely introduce a new superclass between IOException and MyIOException (MyCompanysGenericIOException) without breaking code that catches MyIOException. In C++ you could introduce a completely new base class to MyIOException via multiple inheritance, without breaking existing code. With glob/list/prefix matching you can support either subclassing or superclassing the error, not both, and not MI. You cannot use glob/list matching safely to allow for superclassing (don't go there -- think of the pattern "Throwable * FormatException *" and how many third party APIs are going to have something called "FormatException" and you'll understand the folly of this approach). So while element-wise list matching may SEEM really powerful, it's just a knapsack with a big gun and rope. Similarly a glob match may seem flexible, but in most cases it's going to be used as a prefix match, and often used incorrectly such that it prevents subclassing of errors in future. So, in the absence of an approach to tree/DAG matching, a list prefix seems like the most sensible option. More to come. Regards, Twylite |
From: Donal K. F. <don...@ma...> - 2008-12-02 11:58:32
|
Twylite wrote: > More to come. No. Please, no. The primary constraint is the deadline for new features, being Wednesday next week. That which is not approved by then won't be making 8.6; we're strict about that (especially for something as big as a major new command). It takes a week to run a vote. A solution that answers today's problems is what is possible to do in the time available; a solution that is "perfect" is a solution that won't happen. We won't hold the deadline while waiting for [try] to become "perfect". You have (realistically) until end of business *today* to produce a final specification that balances functionality and practicality of implementation. That means you have to ignore input from people with theoretically good ideas and instead focus on "good enough for now". I had to do the same with TclOO; if I'd incorporated everyone's ideas and dealt with every issue they had, it would never have actually happened. One good thing is that if it becomes deemed necessary to do something as complex as exception type tree handling, we can add it in the future with another keyword. But not now. Donal. |
From: Twylite <tw...@cr...> - 2008-12-02 12:10:46
|
>> More to come. > > No. Please, no. More to come ... on whether glob matching or list prefix matching actually works best in practice, which will determine which one goes in the TIP. Regards, Trevor |
From: Neil M. <ne...@Cs...> - 2008-12-02 12:30:37
|
On 2 Dec 2008, at 09:21, Twylite wrote: > Further feedback on error matching: > > It looks like glob matching is not going to cut it. List prefix > matching will be a similarly-powerful and generally safer approach. > > Looking at the functionality of other exception matching systems, > we can > see that Java (with single inheritance) is the equivalent of matching > against a tree, and C++ (with multiple inheritance) is the > equivalent of > matching against a Directed Acyclic Graph. > > Neither string matching nor list prefix matching nor element-wise list > matching give the functionality of a tree match. > > Example: > > In Java you may define a MyIOException extends IOException (extends > Exception extends Throwable). Code using your API will > catch (MyIOException e) { // handle it } > > You can safely subclass MyIOException and throw these subclasses > without > breaking code that catches MyIOException (as in the example above). > > You can also safely introduce a new superclass between IOException and > MyIOException (MyCompanysGenericIOException) without breaking code > that > catches MyIOException. This is because inheritance matching is encapsulated, whereas pattern matching on a concrete data structure isn't. If you really want to support these use-cases then the way to go is to use TclOO. > In C++ you could introduce a completely new base class to > MyIOException > via multiple inheritance, without breaking existing code. Likewise in Java you can happily add new interfaces (implements Blah...). > With glob/list/prefix matching you can support either subclassing or > superclassing the error, not both, and not MI. You cannot use glob/ > list > matching safely to allow for superclassing (don't go there -- > think of > the pattern "Throwable * FormatException *" and how many third party > APIs are going to have something called "FormatException" and you'll > understand the folly of this approach).
> > So while element-wise list matching may SEEM really powerful, it's > just a > knapsack with a big gun and rope. Similarly a glob match may seem > flexible, but in most cases it's going to be used as a prefix match, and > often used incorrectly such that it prevents subclassing of errors in > future. > > So, in the absence of an approach to tree/DAG matching, a list prefix > seems like the most sensible option. Well, the most flexible form of (concrete) pattern matching is the algebraic sort found in functional programming languages: that can match anything expressible as a sum-of-products, which includes lists, trees, and so on. It still matches against a concrete data type, however, so the use-cases you imagine would still break. The only real way to get around that is encapsulation of the matching, which can be achieved either by making the new data structure look like the old ("views"), or by defining something like an abstract "is-a" relation. If you want full flexibility and the ability to match arbitrary DAGs then there's always Prolog. Most of those suggestions aren't practical, at least not with current Tcl. Inheritance based matching using TclOO is practical, but would represent a radical departure from the way -errorcode is currently used. Of the other options, it does look like strict list-prefix matching is the least surprising, but also the least flexible. -- Neil |
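To make the "encapsulated is-a" point concrete, here is a rough sketch of inheritance-based matching with TclOO. It is not something the TIP proposes: the class names and the `errorIsA` helper are hypothetical, and how an error object would actually be carried alongside -errorcode is left open, as it is in the thread.

```tcl
package require Tcl 8.6

# Hypothetical error classes; the hierarchy mirrors the Java example.
oo::class create IOError {
    variable detail
    constructor {text} { set detail $text }
    method detail {} { return $detail }
}
oo::class create MyIOError { superclass IOError }

# Hypothetical "isa" test: true when $obj is an object whose class is
# $cls or inherits from it.
proc errorIsA {obj cls} {
    expr {[info object isa object $obj] && [info object isa typeof $obj $cls]}
}

set err [MyIOError new "short read"]
puts [errorIsA $err IOError]      ;# 1
puts [errorIsA $err MyIOError]    ;# 1

# Insert a new superclass between IOError and MyIOError. Code that tests
# for MyIOError or IOError keeps working because the match is by relation,
# not by the concrete shape of a list.
oo::class create CompanyIOError { superclass IOError }
oo::define MyIOError superclass CompanyIOError
puts [errorIsA $err MyIOError]       ;# 1
puts [errorIsA $err CompanyIOError]  ;# 1
puts [errorIsA $err IOError]         ;# 1
```

This is the kind of change (adding an intermediate superclass) that would break any list-prefix or glob pattern written against the old errorcode layout.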
From: Twylite <tw...@cr...> - 2008-12-02 13:56:43
|
Since I'm actually far shorter on time than this mail would suggest, this is as much to document my thoughts & findings for later reference as it is to explain my reasoning.

> Further feedback on error matching:
>
> It looks like glob matching is not going to cut it. List prefix matching will be a similarly-powerful and generally safer approach.
>

Okay, we've done a quick analysis on our source repository (C, C++, Java and Tcl, more than 1m lines over hundreds of applications & utils developed over 10+ years).

In most cases we avoid branching based on errors/exceptions (a number of Best Practice authors advise against doing so), so we can categorise our exception handling into "log and ignore", "log and rethrow", "log and abort", "recover/retry" (just try it again, it may work) and "intelligent recover/retry" (attempt to overcome the specific problem then try it again). Exceptions that demand user interaction are included in the "log and X" categories. Very few exceptions fall outside these categories.

Since this was a _quick_ analysis I can only talk in impressions, but our impression is that "log and X" is far and away the most common use case, and the vast majority (way above 80%) of these cases are "catch all, log and X".

The next category down is intelligent retry (we have applications with some really domain specific retry logic), which needs to catch error classes (IO errors) and specific errors (ApiException with cause 1125). In Java and C++ we catch on classes near the top of the hierarchy and then switch or if/then for more specific errors. In most cases the IO errors are coming from a subsystem and we catch all SystemIoExceptions rather than (say) java.lang.IOException.

What we learned from this is that if we represent an error as a unique prefix word followed by a unique error name or code, then an exact prefix match is going to be good enough for us 80% of the time.
If we represent an error as a list of increasingly specific elements (API SUBSYSTEM ERRNAME ...) then an exact prefix match is going to be good enough upwards of 90% of the time _assuming we separate code into subsystems that have high cohesion and low coupling_ (which is generally a good idea), and is capable of greater specificity in error handling than Java or C++ (but not of greater generality).

We identified only one place in our entire code base that cannot be adequately handled by a prefix match against an errorcode list. A base exception class has two integer fields indicating the cause of the error; each function in the API has its own associated exception class that inherits from the base class. Yes, it sounds very weird (it is very weird), but it allows very high level code to determine which _function_ failed, which is the essential bit of information needed to determine how to recover.

Without getting into more details about the hierarchy, let me assure you that there is no list representation that can be matched with a prefix that covers all types of catch we need to do (i.e. catch on one of the error fields or on the subclass type). If we constructed the errorcode as "XAPI code1 code2 FUNCNAME" then a string glob match _could_ work (e.g. "XAPI * FUNCNAME").

But that solution isn't good enough -- there is a special case subclass of one of the function exceptions, and it was added after the first drop of the product. If we made the errorcode "XAPI code1 code2 FUNCNAME sub1" it would break existing trap patterns (that don't have a trailing *). Using "XAPI code1 code2 sub1 FUNCNAME" may or may not be backwards compatible (e.g. code could be trapping code1 == 5 and then logging FUNCNAME ... but did it use lindex end or lindex 3?).

So the general rules one must follow with -errorcode to avoid shooting yourself in the foot are:

(1) When trapping errors exact matching against the full errorcode is always a bad idea.
It prevents any future extension of the errorcode to distinguish between different errors that currently share the same errorcode (or new functionality that must conform to an existing error model and thus share an existing code). A match (prefix, glob, etc.) is pretty much required if you want maintainable code.

(2) If you are building errorcodes with [list] and matching them with glob it becomes impossible to distinguish between error subclasses and adjacent errorcodes that share a common prefix, e.g. "ABC 4" vs "ABC 42", or "WIN32 INVALID_DATA" vs "WIN32 INVALID_DATATYPE". To use glob you must build errorcodes as a string and add a trailing space or other appropriate delimiter, so that you can match "ABC 4 *" instead of "ABC 4*".

(3) If your errorcode information is represented as a list then you should assume that the user trapping the error is parsing the list to extract useful information, and you should further assume that such parsing involves positional arguments (e.g. lindex $errorcode 2). It is therefore only safe to extend errorcodes at one end - you cannot safely add more fields in the middle of the errorcode.

(4) The only thing that a glob match can do - that a prefix/suffix match cannot - is match stuff in the middle of an errorcode. Since you can't safely extend errorcodes in the middle this is of limited use unless you have two different dimensions on which to trap. Most other languages don't support this sort of thing directly in their try/catch syntax.

So I'm calling it at this: using glob is going to lead to mistakes and design inflexibility that are hard to overcome unless you notice them early, and there is little practical benefit associated with this cost. A list prefix match (for each element in the pattern there must exist a corresponding element with the identical value in the list under consideration) is Good Enough, and far safer. Anything else can be handled with an extension when we know more about the problem.
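Rule (2) is easy to check at a Tcl prompt. The sketch below compares two glob patterns against a list prefix match on the "ABC 4" family of errorcodes; the helper prefixMatch is a hypothetical name introduced for this example, not something defined by the TIP.

```tcl
# Hypothetical helper: element-wise list prefix match.
proc prefixMatch {pattern errorcode} {
    if {[llength $pattern] > [llength $errorcode]} { return 0 }
    set i 0
    foreach want $pattern {
        if {$want ne [lindex $errorcode $i]} { return 0 }
        incr i
    }
    return 1
}

# Errorcodes built with [list], as in the coprocessor example.
set codes [list [list ABC 4] [list ABC 4 F] [list ABC 42]]

set globLoose {}    ;# pattern "ABC 4*"  (no delimiter)
set globStrict {}   ;# pattern "ABC 4 *" (trailing delimiter)
set prefix {}       ;# list prefix {ABC 4}
foreach code $codes {
    lappend globLoose  [string match "ABC 4*"  $code]
    lappend globStrict [string match "ABC 4 *" $code]
    lappend prefix     [prefixMatch {ABC 4} $code]
}

puts "glob 'ABC 4*'  : $globLoose"    ;# also matches the adjacent code ABC 42
puts "glob 'ABC 4 *' : $globStrict"   ;# misses the bare code ABC 4
puts "prefix {ABC 4} : $prefix"       ;# matches ABC 4 and ABC 4 F only
```

The undelimited glob accepts "ABC 42", and the delimited glob rejects the bare "ABC 4" because [list] adds no trailing space; only the list prefix match separates the three cases as intended.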
I'll update the TIP accordingly. Regards, Twylite |
From: Twylite <tw...@cr...> - 2008-12-02 14:45:31
|
TIP #329 updated. Changes:

(1) Variables are now assigned per handler
  Was: try { ... } as {em opts} on error { ... } finally { ... }
  Now: try { ... } on error {em opts} { ... } finally { ... }

(2) Trap matching uses list prefix instead of glob
  Was: try { ... } trap {POSIX *} {em opts} { ... }
  Now: try { ... } trap {POSIX} {em opts} { ... }

(3) Clarified exception chaining
  If an exception is thrown from a handler or the finally body, the options dict of the exception it replaces is added into the new exception's options dict under the key "-during".
  _If you have a suggestion for a better name than -during speak really really fast_

I will take a final read over the TIP when I get home (+/- 1.5 hours) and it should then, finally, hopefully, trapping new and unexpected objections, be ready for a vote.

Regards, Twylite |
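For readers who want to try the revised syntax, here is a minimal sketch. It assumes an interpreter whose [try] and [throw] follow the forms listed above (Tcl 8.6 ships such commands); the proc name classify and the APP errorcodes are invented for the example.

```tcl
# Assumes Tcl 8.6 or later, which provides [try] and [throw].
set cleanups 0

# Hypothetical proc: raises an error with the given errorcode and reports
# which handler dealt with it.
proc classify {code} {
    try {
        throw $code "simulated failure"
    } trap {POSIX ENOENT} {em opts} {
        return "missing file"
    } trap {POSIX} {em opts} {
        return "other POSIX error"
    } on error {em opts} {
        return "unexpected: [dict get $opts -errorcode]"
    } finally {
        incr ::cleanups   ;# runs whichever handler fired
    }
}

puts [classify {POSIX ENOENT /tmp/x}]
puts [classify {POSIX EACCES}]
puts [classify {POSIXLY}]   ;# not trapped by {POSIX}: matching is per element

# Exception chaining: the replaced error's options land under -during.
catch {
    try {
        throw {APP FIRST} "first failure"
    } on error {em opts} {
        throw {APP SECOND} "second failure"
    }
} msg outer
set inner [dict get $outer -during]
puts "[dict get $outer -errorcode] raised during [dict get $inner -errorcode]"
```

Handlers are tried in order, so the more specific trap has to come before the general one.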
From: Twylite <tw...@cr...> - 2008-12-02 17:22:20
|
Final edit on TIP #329 completed - a couple of inconsistencies have been fixed. There will be no further changes to the specification. Everyone has had their say, you now either like it or you don't. I haven't had a chance to update the reference implementation - it has been moved out of the TIP and onto my site where I will update it in X (for "shortly" < X < "in due course"). Donal - if you're willing to sponsor this TIP will you please call a vote. Regards, Twylite |
From: Twylite <tw...@cr...> - 2008-12-02 21:58:22
|
Hi,

> I've a couple of clarification questions...
>
> 1) Is [throw] necessary? (Maybe yes...)

I believe it is, to make developers think of errors as having a type first and foremost, and a description as a secondary concern (side benefit: we're likely to see more use of msgcat for error messages). If one does not think of errors in this way then there will be nothing on which to trap, and little value to the trap handler.

Is it necessary for [throw] to be in the core? It's easily implemented in Tcl and not performance critical, so I'd be happy to let it mature in a tcllib extension first.

> 2) Is the matching algorithm this:
> [lrange $pattern 0 end] eq [lrange $errcode 0 [llength $pattern]-1]
> If so, that's acceptable as that's practical to implement.

Yes (*).

(*) I take it that [lrange] is forcing a canonical representation of the lists and that my current level of awakeness is not fooling me.

As a cross-check, this would be an iterative implementation of what I am intending:

  for {set i 0} {$i < [llength $pattern]} {incr i} {
    if { [lindex $pattern $i] ne [lindex $errorcode $i] } { return 0 }
  }
  return 1

Regards, Twylite |
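The loop in the message above can be wrapped in a proc and exercised directly. One corner case is worth noting: [lindex] returns an empty string for an out-of-range index, so a pattern that is longer than the errorcode and ends in an empty element still matches. The sketch below shows the loop as posted next to a variant with a length guard; the proc names and the guard itself are assumptions for illustration, not part of the TIP text.

```tcl
# The loop from the message, wrapped in a proc (hypothetical name).
proc loopMatch {pattern errorcode} {
    for {set i 0} {$i < [llength $pattern]} {incr i} {
        if { [lindex $pattern $i] ne [lindex $errorcode $i] } { return 0 }
    }
    return 1
}

# Variant with a length guard (an assumption about the intent): a pattern
# longer than the errorcode can never be a prefix of it.
proc guardedMatch {pattern errorcode} {
    if {[llength $pattern] > [llength $errorcode]} { return 0 }
    return [loopMatch $pattern $errorcode]
}

puts [loopMatch {ABC 4} {ABC 4 F}]                ;# prefix matches
puts [loopMatch {ABC 4} {ABC 42}]                 ;# adjacent code does not
puts [loopMatch    [list ABC {}] [list ABC]]      ;# corner case: matches
puts [guardedMatch [list ABC {}] [list ABC]]      ;# guard rejects it
```

With the guard in place the result agrees with comparing the pattern against the first [llength $pattern] elements of the errorcode.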