|
From: Twylite <tw...@cr...> - 2008-11-25 15:57:20
|
Hi,

> And also:
> http://www.acmqueue.org/modules.php?pa=showpage&pid=505&name=Content
>
> I hold architecture astronautics in contempt. Reminds me too much of
> work and of how projects can go wrong for failing to focus on critical
> details.

Very well. Here is the proposal:

--
try tscript ?as {emvar ?optsvar?}? ?handler ...? finally fscript

where handler is one of:
  on code hscript
  trap pattern hscript

code may be any integer or the magic values ok, return, error, break,
continue.

trap handles only return code error, where pattern is a glob match
against the -errorcode.

Only one handler's hscript is executed, and it will be the first
matching handler in left-to-right order. If the hscript is the literal
"-" then the hscript for the next handler (in left-to-right order)
shall be used.

The result of executing the hscript is the result of the [try].
Exceptions in the hscript replace the original exception and propagate
(no further handlers are searched). Any unhandled exception is
propagated.

fscript is executed regardless of success or exceptions, and is done
after the handler (if any) is executed. Exceptions in the fscript
replace the original (or hscript) exception and propagate, otherwise
the result of fscript is discarded.

If an exception is replaced (by one in hscript and/or fscript) then the
new exception shall introduce a field to its options dict that contains
all details of the original exception (forming a chain of exceptions).

emvar and optsvar, if specified, will always be populated with details
of the outcome of tscript.

To complement [try] there shall also be a new command [throw] to
encourage the use of the trap facility.

throw type message

where type is a list that will become the -errorcode. throw is
equivalent to [error $message {} $type]
--

Opportunity for future extension:
- If the handlers presented are shown to be insufficient for common use
  cases, it will be possible to add more handler types (using keywords
  other than "on", "trap").
- If the variables that capture exception information are insufficient,
  additional vars can be added to the "as" list without affecting
  compatibility.
- If all else fails, options can be added between try and tscript (as
  with switch & friends).

Sample implementation to follow.

Sadly it will be some time before I can gain value from this function
myself. Not until the third-party libraries I am tied to - which
require me to parse the (non-localised) result - are updated to produce
an errorcode. Not until I find enough time on or between projects to
refactor legacy code (core libraries that I depend on and use in pretty
much all new utilities and applications) to use errorcode rather than
(or, for backwards compatibility, in addition to) a list structure in
the result. Not in places where I need to maintain code that allocates
& frees more than 2 resources, which can be done cleanly using finally
callbacks, but otherwise sends you to the depths of nested try/finally
blocks and you can't see the wood for the error-and-cleanup-handling
trees.

But hey, I'm an astronaut, what would I know about this real-world
stuff?

Twylite
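As a rough illustration of the trap semantics proposed above, the sketch below emulates first-match-wins glob matching on the -errorcode using only [catch] and [dict], both available in Tcl 8.5. The helper name trapDemo and the MYAPP error codes are invented for this example and are not part of the proposal; [throw] is defined from the stated equivalence only when the interpreter does not already provide a command of that name.

```tcl
# Define [throw] per the equivalence in the proposal, unless the
# interpreter already has a command with that name.
if {[llength [info commands throw]] == 0} {
    proc throw {type message} {
        error $message {} $type
    }
}

# Hypothetical helper: run a script and emulate two "trap" handlers.
proc trapDemo {script} {
    set code [catch {uplevel 1 $script} em opts]
    if {$code == 1} {
        set ec [dict get $opts -errorcode]
        # Handlers are tried left to right; the first glob match wins.
        if {[string match "MYAPP NOTFOUND *" $ec]} {
            return "not found: $em"
        }
        if {[string match "MYAPP *" $ec]} {
            return "application error: $em"
        }
    }
    # Anything unhandled propagates with its original options.
    return -options $opts $em
}

puts [trapDemo {throw {MYAPP NOTFOUND key1} "no such key"}]
puts [trapDemo {throw {MYAPP BUSY retry} "try again later"}]
```

The more specific pattern has to come first; with the order reversed, "MYAPP *" would also capture the NOTFOUND case.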
|
From: Donal K. F. <don...@ma...> - 2008-11-25 20:23:44
|
Andreas Leitgeb wrote:
> Even {CHILDKILLED * SIGSEGV *} may have its use. The errorCodes
> thrown from core appear to have been carefully crafted to be
> reasonably glob-able. At this point in time, glob appears like
> the perfect hammer for the nail, but nails are likely going to
> change when programmers get more into the habit of creating new
> errorCodes for their applications, and introducing ambiguities,
> that could have been avoided by list patterns in the first place.
So... you're rejecting an admittedly perfect solution for a much more
complex one because of a possible theoretical problem in the future?
Words fail me.
Donal.
|
|
From: Andreas L. <av...@lo...> - 2008-11-25 22:22:59
|
On Tue, Nov 25, 2008 at 08:23:35PM +0000, Donal K. Fellows wrote:
> Andreas Leitgeb wrote:
> >Even {CHILDKILLED * SIGSEGV *} may have its use. The errorCodes
> >thrown from core appear to have been carefully crafted to be
> >reasonably glob-able.
> So... you're rejecting an admittedly perfect solution for a much more
> complex one because of a possible theoretical problem in the future?
- I don't see it as that *much* more complex.
- I seem to estimate the probability of that future demand
differently than you. That's just gut-feeling on both sides.
- That it happens to work ok with those errorcodes currently thrown
from the core does not yet make it perfect.
Anyway, all my arguments have been said. Seems like I just failed to
prevent what I consider a bad thing from happening to Tcl. Alas. <EOD>
|
|
From: Magentus <mag...@gm...> - 2008-11-26 12:56:17
|
On Tue, 25 Nov 2008 20:23:35 +0000,
"Donal K. Fellows" <don...@ma...> wrote:
> Andreas Leitgeb wrote:
>> Even {CHILDKILLED * SIGSEGV *} may have its use. The errorCodes
>> thrown from core appear to have been carefully crafted to be
>> reasonably glob-able. At this point in time, glob appears like
>> the perfect hammer for the nail, but nails are likely going to
>> change when programmers get more into the habit of creating new
>> errorCodes for their applications, and introducing ambiguities,
>> that could have been avoided by list patterns in the first place.
> So... you're rejecting an admittedly perfect solution for a much more
> complex one because of a possible theoretical problem in the future?
> Words fail me.
I don't think it's been admitted to be perfect. I believe one of the
key phrases was "reasonably glob-able", and the other key concept
was that just because it ALMOST fits the present carefully crafted
nails, doesn't mean it will fit the less carefully crafted nails that
will start to appear once the idea of using errorcodes catches on.
Glob on a list is flawed. It's as simple as that. And it astounds me
that you can defend the idea of matching a list with at least one piece
of arbitrary text, with glob matching that has no regard for word
boundaries.
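To make the word-boundary concern concrete, here is a small sketch comparing a plain glob on the string form of an errorcode with an element-by-element match. The procedure listGlobMatch and the sample errorcode are assumptions made up for illustration; nothing like it is part of the proposal, and requiring equal list lengths is just one possible design choice.

```tcl
# Hypothetical list-wise matcher: each pattern element is glob-matched
# against the errorcode element in the same position.
proc listGlobMatch {patternList valueList} {
    if {[llength $patternList] != [llength $valueList]} {
        return 0
    }
    foreach pattern $patternList value $valueList {
        if {![string match $pattern $value]} {
            return 0
        }
    }
    return 1
}

set pattern {CHILDKILLED * SIGSEGV *}
# Made-up errorcode whose free-text element happens to mention SIGSEGV.
set ec [list CHILDKILLED 1234 SIGKILL "reported after SIGSEGV handler"]

puts [string match $pattern $ec]     ;# plain glob on the string form
puts [listGlobMatch $pattern $ec]    ;# element-by-element match
```

The plain glob accepts this errorcode because "*" is free to span several list elements, while the list-wise version rejects it at the third element.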
Andreas Leitgeb:
> Anyway, all my arguments have been said. Seems like I just failed to
> prevent what I consider a bad thing from happening to Tcl. Alas. <EOD>
While sometimes I go off on a tangent that even I don't expect to be
followed on, and I freely admit that, a good useful implementation of
this will be immediately practical and useful. We're ALL doing the
kinds of things this command is designed to make easier. We ALL want
to see good error handling. We ALL want to be able to get rid of those
annoying and hard to follow if..elseif and catch+switch blocks. And
[try] has the opportunity to do that in a clean way.
But the way it's being dumbed down to mere triviality, I can't agree
more on this particular sentiment.
--
Fredderic
Debian/unstable (LC#384816) on i686 2.6.23-z2 2007 (up 49 days, 7:01)
|
|
From: Magentus <mag...@gm...> - 2008-11-23 06:29:05
|
On Sat, 22 Nov 2008 15:05:53 +0200,
Twylite <tw...@cr...> wrote:
> Hi,
>> From: Magentus <mag...@gm...>
>> The [finally script] usage is trivial to implement using unset
>> traces (although not quite as clean, mostly since it uses a magic
>> variable name).
> This works for [proc] and [apply], but is not completely reliable.
> There is no guarantee that the magic finally variable will be the
> last to be unset, so a script like 'finally [list close $f]' is safe
> but 'finally { close $f }' may not behave as expected.
Granted. Which is why I generally attach it to the variable holding
the channel descriptor, and always use the [list ...] form. The
statement still stands, as usual, with some caveats.
> Also [try] is not a separate scope for variables, so it would have to
> have a special interaction with the magic finally variable such that
> [finally] scripts added inside the context of [try] are executed at
> the end of the [try].
No arguments that it's a bad way of adding finally scripts to [try]. I
was responding to the idea of a [finally] command in general. As said,
it's a nice idea, but trivial to implement (add: with care), and not
useful enough to worry about at this time. My apologies if the extra
comments didn't make that clear.
In short, I do think that a proper [finally] command would be handy,
the one I offered can optionally be bound to a specific variable which
makes it a lot safer; I was just saying it's not necessary. Easier
persistent local storage for a [proc] would be handy too (and sort of
achieved with continuations, although they're still much more fiddly
than necessary).
> Example:
> proc dostuff {} {
>     set f [open {c:/boot.ini} r]
>     trace add variable --finally--trap-- unset \
>         [list apply [list args { close $f ; puts done }]]
> }
> dostuff
> chan names ;# -> stdout stderr filed27ae8 stdin
As you said, a bad way of doing it. I tend to use [list] for ANYTHING
that's going to be deferred, unless I absolutely have to. Avoids a
whole lot of such problems.
> proc dostuff {} {
>     set f [open {c:/boot.ini} r]
>     trace add variable --finally--trap-- unset \
>         [list apply [list args [list close $f]]]
> }
> dostuff
> chan names ;# -> stdout stderr stdin
Certainly the way I'd do it. Mind you, I wouldn't waste an [apply] on
a hard-coded script like that; unless some of the core wizards here can
think of a reason why it's a good thing. (Which I would be most
interested in hearing.)
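For reference, the unset-trace technique discussed here can be wrapped up as in the sketch below. This is only an illustration under assumptions: the command name finally, the ::events variable and the dostuff body are invented, and the magic variable name is taken from the quoted example. One thing [apply] does buy in this setting is that a variable trace appends three arguments (name1 name2 op) to its command prefix, and the lambda's "args" parameter absorbs them; a bare [list close $f] prefix would pass them on to [close].

```tcl
# Hypothetical [finally]: arrange for a command (given as a list of
# words) to run when the calling procedure's frame is torn down.
proc finally {args} {
    # The trace appends three arguments (name1 name2 op) to its command
    # prefix; the lambda's "args" parameter swallows them.
    uplevel 1 [list trace add variable --finally--trap-- unset \
        [list apply [list args $args]]]
}

proc dostuff {} {
    lappend ::events "start"
    finally lappend ::events "cleanup"
    lappend ::events "body done"
    return ok
}

set ::events {}
dostuff
puts $::events
```

Because the words of the cleanup command are captured as a list when [finally] is called, the command does not depend on any local variable still existing when the trace fires, which is the caveat raised earlier in the thread.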
>> The [try] command for matching on something other than the return
>> code is excellent. Especially if it can match on return values as
>> well as errorcodes. How about this for a twist on the idea...
>>
>> try {
>> script
>> } catch {
>> var ?opts?
>> } then {
>> script
>> } handler .....and so on.....
>>
> This fits with extending [catch], e.g.
> catch { ... } em opts then { ... } handler {...}
> The feedback I've had so far on this approach has not been
> favorable. It seems that developers would prefer to keep the
> args/vars in the context of the handler body.
Hmmm..... Fair enough. My reasoning is this:
- Restricts and confuses the arguments to the individual handlers.
They're obvious, mostly redundant, and get in the way of other more
useful potentially optional arguments.
- What happens if each handler specifies a different set of variables.
Which ones will be defined when the code block completes? Or are they
only defined within the context of the handler being invoked? It's
confusing.
Having them specified up front makes it obvious that they're set within
the current scope, and hence will be available both to the invoked
handler and to code following the [try] block.
>> Regardless, why not have the handler clause evaluate an expression
>> in the context of a [dict with $opts]? Then you can use whatever
>> matching function you wish, the only minor pain is that you have to
>> use some ugly bracketing of the option names { ${-code} == 2 }.
>> But maybe there's a way around that, too, especially if the [dict
>> with] is doable read-only and non-destructively somehow.
> In a word, performance. I have been having conversations with other
> Tcl developers off-list, and proposed exactly this. It is
> unquestionably the most flexible option, but it forces a sequential
> consideration of each handler's expression, preventing any sort of
> heuristic to improve the performance of the construct. Since one of
> the uses of this [try] will be to build other language constructs,
> performance is something that deserves reasonable consideration.
> The tradeoff may be to have "pluggable handler matching" where some
> handlers can use exact matching ( O(1) time), some can use glob, some
> can use expr, etc. Doing this in a manner that maintains a simple
> syntax is quite difficult however.
This is pretty much exactly what I expected, and why I was thinking
that adding it later, to the standard already-specified
most-common-cases forms, would be optimal.
The simple on and handle cases are fast and efficient, and need only
support basic [glob] matching against the return and errorcode values
respectively. (A list-wise glob match would probably be useful, in a
few places.) The expr-based matching is then reserved for making curly
cases readable, able to perform and/or conditionals as well as
extraction with [regexp] and every other form of matching known to
TCL-kind.
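A minimal sketch of what such expr-based matching could look like with today's commands, assuming a made-up helper named optsMatch: the return options are unpacked with [dict with] inside a procedure, so the variables exist only in that procedure's frame and the caller's scope is left untouched.

```tcl
# Hypothetical matcher: evaluate an expression with the return options
# unpacked as local variables of this procedure only.
proc optsMatch {expression opts} {
    dict with opts {}
    # The expression arrives as a single argument and is evaluated here,
    # where ${-code}, ${-errorcode} and friends are local variables.
    expr $expression
}

catch {error "boom" {} {MYAPP IO}} msg opts
puts [optsMatch {${-code} == 1 && [string match "MYAPP *" ${-errorcode}]} $opts]
puts [info exists -code]
```

The ${-code} bracketing is still needed, and every handler expression has to be evaluated in turn, which is the performance concern quoted above.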
>> And finally for over-all syntax, what'd be wrong with tagging the
>> try clauses onto the end of the present [catch] command. Make the
>> options variable mandatory in this usage, and bring it into scope
>> for the evaluations as above.
> See above. I'm not necessarily against it, but it doesn't seem to be
> a popular option.
Yeah. I kind of got that myself. Just seems like a bit of duplication
to me. Nevermind.
>>> handle {code ?resultVar ?optionsVar??} { script }
>> Is there any actual practical use to putting code in the braces?
> Not that I'm aware of, no. My current thinking is that it will be
> outside the brackets, e.g.
> handle code/expr {?resultvar? ?optionsvar?} { body }
That would be _much_ preferable. I do think, though, that being able
to glob-match on a returned value is a requirement to being worth the
effort. Otherwise you'll have a bunch of branches each with an
embedded [switch] and it's going to look worse, be less useful, and
probably less efficient, than what I've sometimes done:
switch -glob -- [catch {...} foo],$foo
hence my personal preference to moving the variables up top, and having
the syntax:
HANDLE errorcode-pattern {...}
ON return-code returnvalue-pattern {...}
One possible thought; a "return" (pending a better name) handler that
matches the return value, and leaving that off from the "on" handler.
So...
HANDLE errorcode-pattern {...}
RETURN returnvalue-pattern {...}
ON return-code {...}
might be better, on the basis that most of the return codes don't allow
you to specify a return value without producing them directly through
[return]. Further on that, the return-code could optionally be a list
of two words with the return value pattern being the second, which
would allow the "on" form to handle it transparently without the
"return" form at all.
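For comparison, the catch+switch idiom mentioned above can be written out as follows. The helper name classify and the result strings are made up for the example; the numeric codes are the standard ones (0 ok, 1 error, 3 break).

```tcl
# Hypothetical helper: classify the outcome of a script by glob-matching
# the string "<return code>,<result>".
proc classify {script} {
    switch -glob -- [catch {uplevel 1 $script} result],$result {
        0,      { return "ok, empty result" }
        0,*     { return "ok: $result" }
        1,*     { return "error: $result" }
        3,*     { return "break" }
        default { return "other" }
    }
}

puts [classify {string length hello}]
puts [classify {error "bad input"}]
puts [classify {break}]
```

The pattern order matters here as well: the empty-result case has to precede the general "0,*" arm.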
>> Something like a:
>> withvars {resultVar ?optionsVar?}
>> following the main try script indicating where to stash the
>> variables.
> One advantage of having the vars with the handler script is that it
> allows you to reuse handlers. e.g.
> set GENERAL_IO_HANDLER {{em opts} { log "Problem: $em" }}
> ...
> try {
> # some IO routine
> } handle error * {*}$GENERAL_IO_HANDLER
> And in this case it's no coincidence that the GENERAL_IO_HANDLER looks
> like an anonymous function that can be used with [apply]
I don't see any advantage to that at all. The handler won't be
compiled or anything of the kind any more than it would be without the
vars, and special magic is still going to need to be added to allow it
to efficiently be re-used with [apply] or [eval] or what-not.
>> For the blending with [if] option, there was chatter a while back
>> about fast [expr]-local variables intended mostly to hold partial
>> results during an expression; the main terms of the options dict
>> could quite readily be pre-loaded as [expr]-local variables.
> I'm very interested in the idea of extending [expr] in various ways,
> especially to make pattern matching easier and somehow bind the error
> options as variables into the expr. It's just not going to happen by
> 10 December, so we can't use any approach that relies on it.
Absolutely. Again, that's why I suggested having the [expr]-based
method in addition to basic glob-matched "handler" and "on code"
forms. Almost every place where I'd use the [try] structure falls into
one of two categories:
try {
    ... open a file and do stuff ...
} finally {
    ... close the file ...
}
and
try {
    ... do some stuff that might error ...
} handle "error BLAH:*" {
    ... handle error blah ...
} handle "error FOO:*" {
    ... handle error foo ...
} on break * {
    ... handle the aborted case ...
} on ok "* *" {
    ... handle two or more word return ...
} on ok "" {
    ... handle empty return ...
} on ok * {
    ... handle single-word or empty return ...
}
Without the [expr]-based match that's a little uglier than needed, but
still marginally better than the usual catch+switch method. The
pluggable handlers might let me do a [proc args] style match, which
would be very useful for several other places as well as here (eg.
useful continuations), but this would suffice for every use case I can
think of. The "return" form, or an optional second word in the "on"
form's code argument, would make that just a little bit neater...
--
Fredderic
Debian/unstable (LC#384816) on i686 2.6.23-z2 2007 (up 45 days, 22:33)
|
|
From: Twylite <tw...@cr...> - 2008-11-23 11:13:16
|
Hi,
>>> The [try] command for matching on something other than the return
>>> code is excellent. Especially if it can match on return values as
>>> well as errorcodes. How about this for a twist on the idea...
>>> try {
>>> script
>>> } catch {
>>> var ?opts?
>>> } then {
>>> script
>>> } handler .....and so on.....
>>>
>> This fits with extending [catch], e.g.
>> catch { ... } em opts then { ... } handler {...}
>> The feedback I've had so far on this approach has not been
>> favorable. It seems that developers would prefer to keep the
>> args/vars in the context of the handler body.
>>
> Hmmm..... Fair enough. My reasoning is this:
>
> - Restricts and confuses the arguments to the individual handlers.
> They're obvious, mostly redundant, and get in the way of other more
> useful potentially optional arguments.
>
> - What happens if each handler specifies a different set of variables.
> Which ones will be defined when the code block completes? Or are they
> only defined within the context of the handler being invoked? It's
> confusing.
>
> Having them specified up front makes it obvious that they're set within
> the current scope, and hence will be available both to the invoked
> handler and to code following the [try] block.
>
Looking at the recent "pluggable matcher/handler" proposals it is
becoming clear that the vars must be defined before the "errorPattern"
is evaluated - in the case of an [expr] type handler the errorPattern
may be an expression that involves the return code, result and options,
so they must be brought into scope before the pattern is checked.
Your second point (what is defined when a block completes) is also a
good one - I was wondering about this last night. Having the vars from
all handlers defined would be unexpected (and in most cases a lot of
extra work); having vars from only the executed handler would mean a lot
of pulling vars into & out of scope until the right handler is found,
and then potential for errors in the code following the [try] as it
dereferences the wrong variable name.
Taken together these are a strong argument in favour of defining the
vars up front.
An unfortunate consequence of this is that handlers become harder to reuse.
>>> Regardless, why not have the handler clause evaluate an expression
>>> in the context of a [dict with $opts]? Then you can use whatever
>>> matching function you wish, the only minor pain is that you have to
>>> use some ugly bracketing of the option names { ${-code} == 2 }.
>>> But maybe there's a way around that, too, especially if the [dict
>>> with] is doable read-only and non-destructively somehow.
>>>
A little more on this one: accessing the return options in the [expr]
would be painful, because it requires using [dict]. It would be fairly
trivial to implement mathfuncs for code(), errorcode() and opts(-what)
that would do the right thing.
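A hedged sketch of such mathfuncs, assuming the options of the exception being matched are stashed in a namespace variable (the ::trydemo namespace and its "current" variable are invented for this example). Procedures created in ::tcl::mathfunc become callable as functions inside [expr] in Tcl 8.5 and later. Note that a bare opts(-what) would not parse as an expression, so the key is passed as a quoted string here.

```tcl
# Assumption: the options of the exception being matched are stashed in
# a namespace variable that the helper functions read.
namespace eval ::trydemo {
    variable current {}
}
proc ::tcl::mathfunc::code {} {
    dict get $::trydemo::current -code
}
proc ::tcl::mathfunc::errorcode {} {
    dict get $::trydemo::current -errorcode
}
proc ::tcl::mathfunc::opts {key} {
    dict get $::trydemo::current $key
}

catch {error "disk full" {} {POSIX ENOSPC full}} em ::trydemo::current
puts [expr {code() == 1 && errorcode() eq "POSIX ENOSPC full"}]
puts [expr {opts("-level")}]
```

These definitions add names to the interpreter-wide function table, so a real implementation would need to consider clashes with application-defined functions.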
>> Not that I'm aware of, no. My current thinking is that it will be
>> outside the brackets, e.g.
>> handle code/expr {?resultvar? ?optionsvar?} { body }
>>
> That would be _much_ preferable. I do think, though, that being able
> to glob-match on a returned value is a requirement to being worth the
> effort. Otherwise you'll have a bunch of branches each with an
> embedded [switch] and it's going to look worse, be less useful, and
> probably less efficient, than what I've sometimes done:
>
As several people have pointed out on this list, matching against the
return value is something we want to discourage. It would be possible
via an [expr]-type handler, but I don't think it should be supported by
the default (probably glob-type) handler.
> One possible thought; a "return" (pending a better name) handler that
> matches the return value, and leaving that off from the "on" handler.
> So...
>
> HANDLE errorcode-pattern {...}
> RETURN returnvalue-pattern {...}
> ON return-code {...}
>
> might be better, on the basis that most of the return codes don't allow
> you to specify a return value without producing them directly through
> [return]. Further on that, the return-code could optionally be a list
> of two words with the return value pattern being the second, which
> would allow the "on" form to handle it transparently without the
> "return" form at all.
>
I'm in favour of pluggable matchers _per handler_, which would allow you
to do this sort of thing (but possibly not with the default handlers).
e.g.
try {
    # stuff
} on error -like "POSIX *" {
} on error -withresult "foo*" {
}
You can define the "withresult" matcher to do whatever you want (in this
case a glob match against the result)
>> One advantage of having the vars with the handler script is that it
>> allows you to reuse handlers. e.g.
>> set GENERAL_IO_HANDLER {{em opts} { log "Problem: $em" }}
>> ...
>> try {
>> # some IO routine
>> } handle error * {*}$GENERAL_IO_HANDLER
>> And in this case it's no coincidence that the GENERAL_IO_HANDLER looks
>> like an anonymous function that can be used with [apply]
>>
> I don't see any advantage to that at all. The handler won't be
> compiled or anything of the kind any more than it would be without the
> vars, and special magic is still going to need to be added to allow it
> to efficiently be re-used with [apply] or [eval] or what-not.
>
Code reuse. If you have a cross-cutting strategy (over several
packages/components in an application or library) for handling a
particular type of error (say IO errors), you can abstract that into a
code snippet that looks like an anon proc. You can only do this if the
proc knows the names of the variables it's going to deal with, hence the
advantage of having the vars and body adjacent.
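To show the reuse being described, the snippet below runs the quoted handler shape through [apply] after a plain [catch]. It is a sketch only: the [log] procedure, the ::logLines variable and the sample error are stand-ins invented for the example.

```tcl
# Hypothetical logger used by the shared handler.
proc log {text} {
    lappend ::logLines $text
}

# Same shape as an [apply] lambda: {argList body}.
set GENERAL_IO_HANDLER {{em opts} { log "Problem: $em" }}

set ::logLines {}
if {[catch {error "disk unplugged" {} {POSIX EIO {I/O error}}} em opts]} {
    apply $GENERAL_IO_HANDLER $em $opts
}
puts $::logLines
```

The handler works unchanged wherever it is applied because its variable names travel with its body.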
The alternative is to call a proc, but it is slightly less convenient:
try {
    # some IO routine
} on error {*}$GENERAL_IO_ERROR_MATCH {
    do_general_io_error $code $em $opts
}
Besides being slightly more verbose, the pattern constant and the proc
are no longer closely associated in code.
Anyway, I don't think this is a particularly important bit of
functionality -- it would be nice to have, but there are stronger
arguments for having the vars at the front of the [try].
Regards,
Twylite
|