From: Kevin K. <kev...@gm...> - 2019-01-03 20:23:44
I had a few further thoughts about traces.
When we're talking about traces as a feature that MUST be supported
for quadcode to represent an acceptable subset of the language, we're
mostly talking about, "what is needed for Tk to work?" [1] That's a
lot less general than "support all use cases of traces".
Tk uses write traces only, and generally accumulates data about what
has happened and defers acting upon it to the idle loop. Because it
works this way, trace elision actually sounds to me like a fairly good
idea. The concept here would be:
(a) Read traces (if we support them at all) will fire at least once
for any quadcode sequence that reads a value from a namespace
variable. Subsequent reads of a value that is known not to have been
modified since the previous read need not be traced.
(b) Write traces will fire when the value of a program variable is
stored back in the Var data structure. It is not required that every
place that a variable is set in a quadcode sequence must correspond
with a write trace. Rather, all namespace variables must be updated
prior to invoking code that might change them or cede control, and
prior to return to a call thunk. (In other words, require that the
variables actually be in sync only when giving control to something
that isn't quadcode.) In addition, variables in the callframe must be
in sync when invoking any code that may access them. There are a fair
number of Tcl commands that access variables in a controlled,
predictable fashion (for example, [gets], [read], [regexp], [regsub],
[scan] and [binary scan]), and it is permissible to defer
synchronization when invoking a command that does not access the
variable in question.
(c) Unset traces follow the same rule as write traces - they require
synchronization prior to returning to a call thunk or invoking
non-quadcode that might try to access the variable.
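To make the deferral in rule (b) concrete, here's a sketch (variable names hypothetical) of when a dirty namespace variable can stay out of sync across a command call:

```tcl
# Suppose ::log is a traced namespace variable whose current value is
# dirty - held in a register by the compiled code, not yet stored back
# into the Var structure. [regexp] reads no variables and writes only
# the match variables named in the call, so the write-back of ::log
# (and hence its write trace) may be deferred past this call:
regexp {(\d+)-(\d+)} $input -> lo hi
# An unknown command here, by contrast, might read or write any
# variable, so ::log would have to be synchronized - firing its write
# trace - before control is handed over.
```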
Essentially, this makes the tracing rule for quadcode be, "before any
non-quadcode subsystem gets control, whether by call, return, or
coroutine transfer, at least one write or unset trace has fired,
presenting the current value of each traced variable that has been
modified."
That keeps Tk (and other subsystems depending on linked variables)
working, and in fact saves them work if the same variable is hit
multiple times. It also allows us considerable latitude for moving
reads and writes around, for instance by pushing them above or below
loops in which they appear.
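For illustration, here is the Tk-style consumer pattern I mean (procedure names hypothetical): a write trace that only accumulates what changed and defers the real work to the idle loop. Coalescing its firings is harmless by construction.

```tcl
# A write trace that records which variables changed and defers the
# actual work to the event loop, as Tk's linked variables do.
set ::pending {}
proc noteChange {n1 n2 op} {
    lappend ::pending $n1
    after cancel flushChanges   ;# avoid piling up duplicate idle events
    after idle flushChanges
}
proc flushChanges {} {
    puts "changed: [lsort -unique $::pending]"
    set ::pending {}
}
trace add variable ::status write noteChange

proc spin {} {
    for {set i 0} {$i < 1000} {incr i} {
        set ::status $i   ;# interpreted Tcl fires noteChange 1000 times
    }
}
# Under rule (b), compiled [spin] could keep ::status in a register and
# store it back once before ceding control, firing the trace once with
# the final value - which is all this consumer ever looks at.
```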
It's surely a change in language semantics between interpreted Tcl and
quadcode, but I think it's most likely a change for the better, given
what traces appear to be used for 'in the wild'.[2]
----
[1] There's also ::env, but I think a bunch of us concluded quite a
long time ago that the read trace on ::env is NOT desirable. I really
ought to work up the TIP for per-interp env, initialized from the
process environment at initial interp creation time, copied from the
parent interp when creating a child interp, and copied to the process
environment (under mutual exclusion) during [exec] and other C logic
that needs the environment in sync. The fact that neither we nor
POSIX defines any sort of synchronization requirement for environment
variable changes is crazy. End of digression.
[2] I've also seen traces used for debugging instrumentation, in which
case they *are* interested in each individual access or change to a
variable. I'd imagine, though, that we'd want either to have a
debugging option to the code generation to force such traces to be
left in, or else run this sort of instrumentation only on interpreted
code, rather than burden every use of quadcode with the cost of the
additional hash table lookups and assignments.
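For concreteness, the sort of instrumentation trace I mean (names hypothetical) - one that genuinely cares about every individual access, and so would need the debugging option or interpreted execution:

```tcl
# Debugging instrumentation: log every single write or unset of ::x.
# Trace elision would silently drop intermediate events from this log.
proc watch {n1 n2 op} {
    if {$op eq "write"} {
        upvar 1 $n1 v
        puts stderr "write $n1 = $v"
    } else {
        puts stderr "unset $n1"
    }
}
trace add variable ::x {write unset} watch
```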
On Wed, Jan 2, 2019 at 9:58 AM Kevin Kenny <kev...@gm...> wrote:
>
> On Wed, Jan 2, 2019 at 6:35 AM Donal K. Fellows
> <don...@ma...> wrote:
> > On 02/01/2019 02:15, Kevin Kenny wrote:
> > In theory, there's no problem with having multiple CALLFRAMEs within a
> > single function once they're distinct entities (due to inlining). I've
> > no idea if this is actually practical, of course!
>
> That's something else that I wanted to discuss. Right now, I'm doing
> only the very simplest inlining. In effect, the inlined function has
> to be trivial enough that I don't need its callframe. That rules out
> having the inlined code do anything like [regexp] because without
> multiple callframes, there's no way for the C implementation of
> [regexp] to find the match variables. It also rules out throwing
> errors until and unless I figure out what to do with 'procLeave'. But
> that's an issue for another day.
>
> > > The new algorithms will be fairly aggressive about moving code around
> > > - so that, for instance, a read-only use of a namespace variable
> > > within a loop can be hoisted out of the loop if nothing reads or
> > writes the variable or any alias. In designing this, I realize that
> > > the logic is about to founder on directSet, directGet, directArraySet,
> > > directArrayGet, etc. - fourteen operations in all. All of these at
> > > least notionally modify or access the state of variables, and so need
> > > to have the callframe among their parameters and results.
> >
> > Those operations are essentially designed for handling the cases where
> > variables are accessed by qualified names, and CAN TRIGGER TRACES. We
> > must handle them correctly, alas; lots of Tcl code is strongly dependent
> > on traces. However, in the case that it is possible to prove that a
> > variable name does not contain a namespace separator, something more
> > efficient can be done; the side effects are much more contained (as we
> > explicitly state that we won't support local variable traces).
>
> We do need to constrain scope somewhat! Traces, as they are currently
> defined, are a feature about which I say, "if you need the
> interpreter, you know where to find it!" Consider the following:
>
> proc tracer {n1 n2 op} {
> uplevel 1 {set b 2}
> }
> trace add variable ::a write tracer
> proc x {} {
> set b 1
> set ::a 1
> return $b
> }
> puts [x]
>
> In the presence of traces, no assumptions about the semantics of code
> are safe. 'tracer', instead of setting an irrelevant variable, could
> have redefined 'set' or something equally nasty.
>
> What I suspect will save us is that nobody sane does radical semantic
> alterations in performance-sensitive code. There will be some core set
> of algorithms where a program spends its time that don't need the
> full functionality of the interpreter and can be aggressively
> optimized.
>
> > You're correct that they should be returning CALLFRAME FAIL STRING in
> > the general case.
>
> Right now, on trunk, none of these functions takes the callframe. It's
> just 'directSet varName value' and so on. That's the immediate
> modification I'm proposing.
>
> > > Beyond this, the next thing to think about will be breaking up direct*
> > > into smaller pieces. If 'directLappend result callframe listvar
> > > element' could turn into a sequence like
> > >
> > > directRef listRef callframe var
> > > refGet val listRef # don't know if I might
> > > lappend val2 val element
> > > refPut temp callframe listvar val2
> > > extractCallframe callframe2 temp
> > > retrieveResult result temp
> > >
> > > (plus whatever error checking goes there)
> > >
> > > then I could start looking at aggregating all the directRef's to the
> > > same variable, hoisting them out of loops, and so on. Many of the
> > > instructions in straight-line sequences like this are optimized to
> > > zero-cost operations.
> >
> > Doing this is all about getting the semantics right. That's surprisingly
> > difficult. For example, the code for the equivalent of:
> >
> > lappend ::foo::bar::var $a
> > lappend ::foo::bar::var $b
> > lappend ::foo::bar::var $c
> >
> > and
> >
> > lappend ::foo::bar::var $a $b $c
> >
> > needs to be able to avoid making extra references to the list in the
> > variable if possible or we risk transforming O(N) algorithms into O(N²),
> > and nobody will thank us for that. (Early versions of 8.6 had exactly
> > this bug due to my getting things slightly wrong in the bytecode.)
>
> Indeed. I was talking about making references to the variable, not the
> value. Already, for a local variable, we're able to retain the
> variable ref - it's in the callframe after all - without trouble. As
> far as retaining the value ref goes, that's also likely not to be an
> issue because of the elimination of common subexpressions.
>
> I imagine a pattern like the following (all the error checking stuff
> suppressed):
>
> getVarRef aRef callframe "::foo::bar::var"
> lappendRef result aRef a
> unset result
> lappendRef result aRef b
> unset result
> lappendRef result aRef c
> unset aRef
>
> The only reference to the list is still in the variable, but we avoid
> the cost of resolution on all references except the first.
> Essentially, aRef is holding a Var*.
>
> But that's if we think that ::foo::bar::var must be in sync at all
> times, so read on...
>
> > Similarly, we *must* get the trace semantics mostly right on global
> > variables, at least to the point of having at least one trace call in
> > the right places so that access to variables synched into C code in
> > extensions will work correctly. Also, I vaguely recall that [append] is
> > strange in this regard. Or perhaps that was arguably a bug that is now
> > fixed (or ought to be, for consistency's sake)?
>
> I think we're in agreement here, for sufficiently small values of
> 'mostly right'. Rather than 'all variables in sync at all times', I
> already have logic that assesses known effects. There are already some
> commands that are identified as, 'this might change the value of any
> variable, but otherwise have known effects', plus some commands like
> [regexp] that have constrained effects: '[regexp] does not read any
> variable and writes exactly those variables whose names are passed as
> match variables'.
>
> So one definition of 'mostly right' could be that all variables are in
> sync before and after calls to C commands that might examine or alter
> them, and all namespace variables are in sync on procedure exit.
> Dropping traces that would have fired between those points might be
> fair game.
>
> > I suspect that these will reduce your scope for improvement a lot.
>
> > Otherwise, I'm generally happy to do this. I'm guessing that it might be
> > easiest to split things so that we have operations that produce a VAR
> > type (not a subclass of STRING) that we can then do the other changes
> > on. That'd at least let us factor out the (potentially expensive!)
> > variable lookup operations. The actual reads and writes are usually much
> > cheaper (other than that the values must always be boxed and traces are
> > a true issue).
>
> Given the problem that traces have arbitrary and unknowable side
> effects, the best we can hope for is that some benign ones work.
>
> In such a world, there's also an opportunity for quadcode that we've
> not yet properly addressed. If we allow variables to be out of sync
> momentarily (provided that we've restored consistency any time we get
> to external code that might use them), then we could mostly eliminate
> the need for the K combinator. If we allow 'x' to have a transient
> state where its value is lost, then
>
> set x [lreplace $x $a $b {*}$newelements]
>
> would not incur a performance penalty with respect to the mysterious
>
> set x [lreplace $x[set x {}] $a $b {*}$newelements]
>
> because we could detect that the new value of x is always a
> newly-created value and avoid the extra ref. (This would also involve
> proving either that the [lreplace] cannot throw, or that the value of
> $x will never again be used if it does.)
>
> Anyway, the immediate issue is that direct ops don't reference the
> CALLFRAME formally, and need to. :)
>
> Kevin