Re: [tclquadcode] Procedure inlining

Best wishes for a safe, healthy and prosperous 2018!
(Did you know that 2018! has exactly 555 8's in its decimal representation?)

On Sun, Dec 31, 2017 at 9:00 PM, Donal K. Fellows <
don...@ma...> wrote:

> On 30/12/2017 21:48, Kevin Kenny wrote:
>
>> First, and most important by far, is that I can't inline a procedure that
>> might return failure. I've simply not found a way to do it with the
>> existing quadcode instruction repertoire. I can see the 'returnException'
>> instruction in the procedure to be inlined, but I don't see a good way to
>> construct the object that the procedure would have returned. (Ideally,
>> including the stack trace that would have appeared, for which I have all
>> necessary information, I think.) This is blocking all but about a dozen of
>> the potential opportunities for inlining.
>>
>
> Yes, that's important and will require a new opcode. The key things it
> will need to do will be to release the stack frame (I presume we have a
> plan to create new stack frames too?) and process the procedure exit
> sequence. That opcode will need to know what procedure it is actually in
> (currently, 'returnException' uses [namespace tail $cmd]).
>

I'd really like to keep the callframe management separate, because I
need to produce the error information even when the inlined procedure has a
suppressed callframe. There is a fair subset of procedures that don't
actually need a frame.

> We might wish to split our current 'returnException', which would
> coincidentally allow the error exit sequence to use 'return'.
>

That sounds rather like what I had in mind. One caveat:
'return' at present appears to take a FOO and cause the procedure to
return a Just FOO. A cursory reading of the code suggests that it won't
work with a FAIL without some tweaking. (Which would be a good thing
anyway, if it leads to simplification elsewhere.)

> There will also need to be some review of 'param' usage. That opcode's
> implementation accesses the bytecode definition dict in some cases (with
> a warning about "default injection" if it does so; I think we don't spit
> those out at the moment so we might be OK though I don't know how
> thorough our testing is).

In inlined procedures, 'param's get replaced with copies. That part's
working well already. When I'm calling any compiled procedure, default
injection isn't needed; I supply defaults on caller side. And even
'invokeExpanded' will get it right, except that I'm wrestling with a
bug at the moment: it turns out that [upvar] in an 'invokeExpanded'
fails rather nastily. I never contemplated having CALLFRAME FAIL FOO
pass through a phi, and I don't *want* to contemplate it, so I have
to refactor how I'm handling 'invokeExpanded'.

>> A second limitation is that I don't seem to have any way in the
>> @debug quadcodes to indicate that I've brought in code from a
>> different source file. That will mess up stack traces, compile-time
>> messages and so on, if a procedure in one file inlines a procedure
>> defined in a different file.
>>
>
> There will need to be something for the shift in source file
> (spitballing: @debug-file {} {literal /path/to/file}) and something for
> the shift in procedure. (The order of these things is likely to be
> important.)
>
>> I'm hoping that Donal will be able to help me out.
>>
>
> Of course. But it might take quite a bit of tinkering to get the debug
> info exactly correct.

Sure. Let me know how I can help you help me! (By the way,
I don't think that TIP 86 information for a [proc] inside a
[namespace eval] is making it all the way through. Another
mystery to track down.)

>> I suspect that in this specific case, LLVM is smart enough to inline
>> the 'impure' procedure itself and generates approximately the same
>> code.
>>
>
> Reading the (rather long) output of
>
>   tclsh8.6 demos/perftest/tester.tcl -just impure -asm 1
>
> indicates that this is indeed the case. Indeed, it goes further in some
> cases and inlines the result into the thunk function that provides the
> actual Tcl command. (What's curious is that it doesn't when we use an
> external optimiser. I guess the llvmtcl internal one has looser limits
> on that sort of thing.)

ISTR that inlining in the default clang configuration is deferred until
link-time optimization. If we're not looking at output from the LLVM
linker, we may not see that stuff.

I have enough to go a little farther before I get stuck. I'd like to
have non-TCL_OK returns working before tackling the harder
parts, which means that the first thing I need is the separated
'returnException' operation. I can generate it easily enough,
including passing through the '@debug-file', '@debug-line'
and '@debug-script' values that are current, so you don't
have to look them up. (It's easier for me to look them up, since
they follow the dominator tree.) Let me know what you need the
operands to be. (They should not ordinarily, I think, need the
callframe, since we want this part to work in proc's with omitted
frames.)

I suspect you've done more thinking than I have about how you
want to handle push and pop of callframes. I am imagining
that a fair amount of callframe work can turn static if we know
that we're in an invocation of proc B from proc A, and that
the inlining is safe. There's also no need to worry about
recursion at that stage. By that point, every connected component
of more than one node in the call graph has had its head
node marked 'never inline'.
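
To make the recursion rule concrete, here is a rough sketch in Python
rather than in the quadcode sources. It assumes that "connected
component" means a strongly connected component of the call graph, and
that the "head" is whichever member a depth-first search reaches first.
Both are assumptions for illustration; the function name, the graph
layout and the procedure names are hypothetical.

```python
from typing import Dict, List, Set


def never_inline_heads(call_graph: Dict[str, List[str]]) -> Set[str]:
    """Pick one 'head' per strongly connected component of size > 1.

    Uses Tarjan's algorithm. The head chosen here is the DFS root of the
    component, which is an assumption, not necessarily quadcode's choice.
    """
    index: Dict[str, int] = {}
    low: Dict[str, int] = {}
    stack: List[str] = []
    on_stack: Set[str] = set()
    heads: Set[str] = set()
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        index[node] = low[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for callee in call_graph.get(node, []):
            if callee not in index:
                visit(callee)
                low[node] = min(low[node], low[callee])
            elif callee in on_stack:
                low[node] = min(low[node], index[callee])
        if low[node] == index[node]:
            # 'node' is the root of a component; pop its members.
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            # Single-node components are left alone, matching the
            # "more than one node" wording above.
            if len(component) > 1:
                heads.add(node)

    for proc in sorted(call_graph):
        if proc not in index:
            visit(proc)
    return heads


# Hypothetical call graph: 'a' and 'b' call each other.
graph = {"main": ["a", "leaf"], "a": ["b"], "b": ["a"], "leaf": []}
heads = never_inline_heads(graph)
```

In this example only 'a' is marked; 'main' and 'leaf' each sit in a
component of one node and stay eligible for inlining.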

That sort of thing is what allows me to make inlining work now.
The code is copied inline, 'return' is replaced with a copy
and a jump to the procedure exit, 'param' is replaced with
a copy from the arg to 'invoke', and all variables, temps, basic
blocks and split markers are renumbered. That turns
out to be a subtle beauty of SSA. I can have a variable 'x'
in the caller, and a variable 'x' in the callee, and they
simply will have different instance numbers and data
flows, and it all works out. OK, preserving SSA is a little
hairy when doing this sort of code surgery, but I've
gotten good at that.
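
As an illustration only, the rewriting steps above can be sketched on a
toy instruction list in Python. The tuple layout (opcode, destination,
operands), the label naming and the function name are all made up for
this sketch and are not the real quadcode representation. It handles a
callee with a single 'return'; several returns would need a phi at the
exit to stay in SSA form.

```python
from typing import List, Tuple

# A variable is (name, SSA instance number); anything else is a literal.
Quad = Tuple[str, object, tuple]


def inline_call(caller: List[Quad], callee: List[Quad], call_index: int) -> List[Quad]:
    """Replace the 'invoke' at call_index with a renumbered copy of callee."""
    op, result, args = caller[call_index]
    assert op == "invoke"

    # Shift every callee instance number past the largest one in the
    # caller, so 'x' in the callee never collides with 'x' in the caller.
    offset = 1 + max(
        (v[1] for q in caller for v in (q[1], *q[2]) if isinstance(v, tuple)),
        default=0,
    )

    def rename(x):
        return (x[0], x[1] + offset) if isinstance(x, tuple) else x

    exit_label = f"inline_exit_{offset}"  # hypothetical label scheme
    body: List[Quad] = []
    for op, dest, srcs in callee:
        if op == "param":
            # srcs[0] is the parameter position: copy from the invoke's arg.
            body.append(("copy", rename(dest), (args[srcs[0]],)))
        elif op == "return":
            # Copy the value into the invoke's result, then jump to the exit.
            body.append(("copy", result, (rename(srcs[0]),)))
            body.append(("jump", None, (exit_label,)))
        else:
            body.append((op, rename(dest), tuple(rename(s) for s in srcs)))
    body.append(("label", None, (exit_label,)))
    return caller[:call_index] + body + caller[call_index + 1:]


# Caller and callee both use a variable named 'x'.
caller = [
    ("const", ("x", 0), (5,)),
    ("invoke", ("y", 0), (("x", 0),)),
    ("add", ("z", 0), (("y", 0), ("x", 0))),
]
callee = [
    ("param", ("x", 0), (0,)),
    ("add", ("x", 1), (("x", 0), ("x", 0))),
    ("return", None, (("x", 1),)),
]
inlined = inline_call(caller, callee, 1)
```

After the call, the callee's x 0 and x 1 have become x 1 and x 2, while
the caller's x 0 is untouched, which is the instance-number separation
described above.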

Is there any reason I shouldn't merge what I have so far
into trunk? All tests pass, except for some new ones that
are still commented out. It simply inlines a lot less than
it otherwise might.
