I think most committers have given some thought to this question. Three recent wishlist items (lp 554145, 554167 and 554190) make me think we should write down our general philosophy regarding optimisations and rewritings, and save that somewhere in the source tree. This isn't only a question for control/data-flow optimisations, but is also closely related to the memory model and the rewritings we perform on floating-point arithmetic.
On one hand, Python (the compiler) is an interesting and useful platform mostly because of the analyses and rewrites it performs. However, those rewrites tend to be very localised and thus easy to understand, even when multiple rewrites interact. Similarly, the model under which analyses are executed isn't too hard to understand. (I'll write about that in the post scriptum) I've said it before, it's only because of the relative naïveté of our middle-end that we haven't yet had to provide, e.g., a memory model: the optimised output will always perform exactly the side-effects and (non-elided) reads as written in the source, in the same order.
A priori, I am opposed to introducing complex (read "global") rewrites. I find even common subexpression elimination somewhat suspect in a general-purpose compiler, never mind loop optimisations.
Especially when dealing with a language that provides powerful code generation tools like CL, I believe a compiler should really focus on eliminating the abstraction overhead associated with the language. Analysing dynamic types away falls under that goal. Similarly for tail calls, inlining single-use local functions or specialising calling conventions. TAGBODY/GO, on the other hand, is extremely close to our target (assembly language); there is scarce little abstraction to eliminate in loops.
Moreover, I find that being able to construct a mental model of how a compiler will transform one's code is usually much more important that the output's performance. Python already walks a very fine line: while our transforms are simple, the analyses that drive them are complex enough that most people can only guess at what happens thanks to the notes spewed during compilation. This is what makes CSE suspect to me: (G)VN analyses aren't always simple to understand, and it can be hard to tell when CSE will be an improvement. When it comes to transforming a program enough to improve its asymptotic complexity (run-time, space, or any other resource), these optimisations obviously hinder the development of the aforementioned mental model.
I can understand that one wants no-holds-barred optimisations in some situations. An EDSL might be more appropriate then... and definitely easier to analyse than full-blown CL, if designed correctly.
To summarise my position: It should be simple to build a mental model of how SBCL will implement one's code. SBCL should focus on optimising away the overhead associated with the abstractions it provides itself. This is what will then enable the existence of EDSLs that offer domain-specific optimisations and invariants.
Discuss!
Paul Khuong
P.S. I find that the two most important things to understand when it comes to type propagation is that:
1. Python works with variables, exactly as apparent in the source (modulo inlining). The only time references to a given variable will be analysed differently is under a branch.
2. Python tightens types iteratively, from an initial guess of T, or whatever the user provided as a type declaration.
|