From: Kevin K. <kev...@gm...> - 2019-01-02 14:58:20
On Wed, Jan 2, 2019 at 6:35 AM Donal K. Fellows
<don...@ma...> wrote:
> On 02/01/2019 02:15, Kevin Kenny wrote:
> In theory, there's no problem with having multiple CALLFRAMEs within a
> single function once they're distinct entities (due to inlining). I've
> no idea if this is actually practical, of course!
That's something else that I wanted to discuss. Right now, I'm doing
only the very simplest inlining. In effect, the inlined function has
to be trivial enough that I don't need its callframe. That rules out
having the inlined code do anything like [regexp], because without
multiple callframes there's no way for the C implementation of
[regexp] to find the match variables. It also rules out throwing
errors until and unless I figure out what to do with 'procLeave'. But
that's an issue for another day.
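To make that concrete, here's a hypothetical pair of procs (the names
are mine, purely for illustration): the first is trivial enough to be
inlined under the current scheme, while the second uses a match
variable that the C code behind [regexp] has to resolve through a
callframe, so it can't be inlined yet.

    # simple enough to inline today: needs no callframe of its own
    proc double {x} {
        expr {2 * $x}
    }

    # not inlinable yet: [regexp] writes the match variable 'word'
    # through the current callframe
    proc firstWord {s} {
        if {[regexp {\S+} $s word]} {
            return $word
        }
        return {}
    }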
> > The new algorithms will be fairly aggressive about moving code around
> > - so that, for instance, a read-only use of a namespace variable
> > within a loop can be hoisted out of the loop if nothing reads or
> > writes the variable or any alias. In designing this, I realize that
> > the logic is about to founder on directSet, directGet, directArraySet,
> > directArrayGet, etc. - fourteen operations in all. All of these at
> > least notionally modify or access the state of variables, and so need
> > to have the callframe among their parameters and results.
>
> Those operations are essentially designed for handling the cases where
> variables are accessed by qualified names, and CAN TRIGGER TRACES. We
> must handle them correctly, alas; lots of Tcl code is strongly dependent
> on traces. However, in the case that it is possible to prove that a
> variable name does not contain a namespace separator, something more
> efficient can be done; the side effects are much more contained (as we
> explicitly state that we won't support local variable traces).
We do need to constrain scope somewhat! Traces, as they are currently
defined, are a feature about which I say, "if you need the
interpreter, you know where to find it!" Consider the following:
proc tracer {n1 n2 op} {
    uplevel 1 {set b 2}
}
trace add variable ::a write tracer
proc x {} {
    set b 1
    set ::a 1
    return $b
}
puts [x]
In the presence of traces, no assumptions about the semantics of code
are safe: here [x] yields 2, not the 1 that a naive analysis of the
body of 'x' would predict, because the write trace on ::a reaches into
x's frame. And 'tracer', instead of setting a seemingly unrelated
variable, could have redefined 'set' or something equally nasty.
What I suspect will save us is that nobody sane does radical semantic
alterations in performance-sensitive code. There will be some core set
of algorithms (the ones where a program spends its time) that don't
need the full functionality of the interpreter and can be aggressively
optimized.
> You're correct that they should be returning CALLFRAME FAIL STRING in
> the general case.
Right now, on trunk, none of these functions takes the callframe. It's
just 'directSet varName value' and so on. That's the immediate
modification I'm proposing.
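Just to pin down the shape I have in mind (operand names and order
here are illustrative only, not a final design):

    directSet varName value                        # as on trunk today
    directSet callframe2 callframe varName value   # with the callframe threaded
                                                   # through; in the general case
                                                   # the result would really be
                                                   # CALLFRAME FAIL STRING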
> > Beyond this, the next thing to think about will be breaking up direct*
> > into smaller pieces. If 'directLappend result callframe listvar
> > element' could turn into a sequence like
> >
> > directRef listRef callframe var
> > refGet val listRef # don't know if I might
> > lappend val2 val element
> > refPut temp callframe listvar element
> > extractCallframe callframe2 temp
> > retrieveResult result temp
> >
> > (plus whatever error checking goes there)
> >
> > then I could start looking at aggregating all the directRef's to the
> > same variable, hoisting them out of loops, and so on. Many of the
> > instructions in straight-line sequences like this are optimized to
> > zero-cost operations.
>
> Doing this is all about getting the semantics right. That's surprisingly
> difficult. For example, the code for the equivalent of:
>
> lappend ::foo::bar::var $a
> lappend ::foo::bar::var $b
> lappend ::foo::bar::var $c
>
> and
>
> lappend ::foo::bar::var $a $b $c
>
> needs to be able to avoid making extra references to the list in the
> variable if possible or we risk transforming O(N) algorithms into O(N²),
> and nobody will thank us for that. (Early versions of 8.6 had exactly
> this bug due to my getting things slightly wrong in the bytecode.)
Indeed. I was talking about making references to the variable, not the
value. Already, for a local variable, we're able to retain the
variable ref - it's in the callframe, after all - without trouble. As
far as retaining the value ref goes, that's also likely not to be an
issue because of the elimination of common subexpressions.
I imagine a pattern like the following (all the error checking stuff
suppressed):
    getVarRef aRef callframe "::foo::bar::var"
    lappendRef result aRef a
    unset result
    lappendRef result aRef b
    unset result
    lappendRef result aRef c
    unset aRef
The only reference to the list is still in the variable, but we avoid
the cost of resolution on all references except the first.
Essentially, aRef is holding a Var*.
But that's if we think that ::foo::bar::var must be in sync at all
times, so read on...
> Similarly, we *must* get the trace semantics mostly right on global
> variables, at least to the point of having at least one trace call in
> the right places so that access to variables synched into C code in
> extensions will work correctly. Also, I vaguely recall that [append] is
> strange in this regard. Or perhaps that was arguably a bug that is now
> fixed (or ought to be, for consistency's sake)?
I think we're in agreement here, for sufficiently small values of
'mostly right'. Rather than 'all variables in sync at all times', I
already have logic that assesses known effects. There are already some
commands that are identified as 'this might change the value of any
variable, but otherwise have known effects', plus some commands like
[regexp] that have constrained effects: '[regexp] does not read any
variable and writes exactly those variables whose names are passed as
match variables'.
So one definition of 'mostly right' could be that all variables are in
sync before and after calls to C commands that might examine or alter
them, and all namespace variables are in sync on procedure exit.
Dropping traces that would have fired between those points might be
fair game.
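As a purely hypothetical illustration of those sync points (the proc
and variable names are mine):

    namespace eval stats { variable count 0 }

    proc tally {items} {
        foreach item $items {
            incr ::stats::count   ;# write traces that would fire here might be dropped
        }
        update                    ;# event handlers (C code) could examine ::stats::count,
                                  ;# so it must be in sync before this call ...
        return $::stats::count    ;# ... and the namespace variable is in sync again on
                                  ;# procedure exit
    }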
> I suspect that these will reduce your scope for improvement a lot.
> Otherwise, I'm generally happy to do this. I'm guessing that it might be
> easiest to split things so that we have operations that produce a VAR
> type (not a subclass of STRING) that we can then do the other changes
> on. That'd at least let us factor out the (potentially expensive!)
> variable lookup operations. The actual reads and writes are usually much
> cheaper (other than that the values must always be boxed and traces are
> a true issue).
Given the problem that traces have arbitrary and unknowable side
effects, the best we can hope for is that some benign ones work.
In such a world, there's also an opportunity for quadcode that we've
not yet properly addressed. If we allow variables to be out of sync
momentarily (provided that we've restored consistency any time we get
to external code that might use them), then we could mostly eliminate
the need for the K combinator. If we allow 'x' to have a transient
state where its value is lost, then
    set x [lreplace $x $a $b {*}$newelements]
would not incur a performance penalty with respect to the mysterious
    set x [lreplace $x[set x {}] $a $b {*}$newelements]
(where the '[set x {}]' empties x in passing, so that the list's
reference count drops to one and [lreplace] can modify it in place)
because we could detect that the new value of x is always a
newly-created value and avoid the extra ref. (This would also involve
proving either that the [lreplace] cannot throw, or that the value of
$x will never again be used if it does.)
Anyway, the immediate issue is that direct ops don't reference the
CALLFRAME formally, and need to. :)
Kevin