Thread: JIT optimization framework (was: Re: [Sablevm-user] bug report...)
From: Andrew P. <apu...@ac...> - 2000-07-28 20:27:00
> "John Leuner" <je...@pi...> wrote: > Andrew, I've been developing a JVM for the past while > (kissme.sourceforge.net), it's implemented in C and targeted at (free) > UNIX. Hi, John. > Unfortunately I had to stop working on this project for two months > due to a sports injury but I've recovered enough to carry on working > now. That's unfortunate. I'm sorry to hear that. > If you're interested in the code or just discussing jits or > whatever, just let me know. Both. :) I've downloaded your 0.07 source and I'll have a look next week when I have time. As for my interest in JIT technology, I've been inspired by the OpenJIT project (www.openjit.org) to implement a JIT framework where the JIT, code generator, and optimizer are all first-class Java objects. Normally I am not taken to duplicating the work of others, but this is cool technology that really should be available under the GPL. I have not looked at the OpenJIT source, so this will be a "clean-room" effort in the respect. I'm investigating if that status also applies to work with Java in general. I note that you generate assembly text and then compile it. I was thinking of doing something more along the lines of the following: - Partition bytecodes into basic blocks while building a control flow graph. - Optionally run the standard Java code safety verification process, if this has been requested, e.g. "jvm -verify ..." - Convert bytecodes into register transfer lists, mapping values on the operand stack to symbolic registers. - Hand off the RTL to "plug-in" optimization modules. What optimizations are actually performed depend very much on how quickly I can get them to operate. Some classic optimizations are very time-intensive. Initially I was thinking: - Constant folding, constant propagation, copy propagation, and value numbering. - Code straightening and jump threading. - Common subexpression elimination (maybe). - Dead code elimination (maybe). - Replacement of certain bytecode patterns with precooked assembly templates, e.g. for x86: aload $value; invokevirtual "ntohl" -> asm ("bswap %0" : "=r" (value) : "0" (value)) aload $value; invokevirtual "ntohs" -> asm ("xchgb %b0,%h0" : "=q" (value) : "0" (value)) or for Alpha: aload $value; invokevirtual "ntohs" -> asm ("insbl %2,1,%1; extbl %2,1,%0; or $2,$1,$2" : "=r"(t1), "=&r"(t2) : "r"(x)) - Finally, hand over the munged RTL to an emitter which emits native code directly into space allocated on the Java heap. The emitter would assign real registers from symbolic ones using one of the graph coloring algorithms. If that kind of thing is too expensive, there are alternatives. - For Alpha, a final peephole optimization pass is required because instruction scheduling is very important for that platform. All of that may be a bit much for a JIT, but whether or not that is true is what I want to find out. :) I would also want to add a module for converting variable references to immediate constants if the values are known at runtime. Perhaps there will also be an API for specifying such values. Code like this: if (case_a_is_true) { /* a long block of code follows */ } else { /* more code */ } could be processed into: /* case_a_is_true is constant and always false */ /* more code */ Implementing all of the optimization framework in Java means everything is very dynamic and -- best of all -- can be self- bootstrapped into native code for maximum performance. Andrew Purtell from home -- apu...@ac... |
From: Etienne M. G. <eg...@j-...> - 2000-07-28 20:45:49
Andrew Purtell wrote:
> I note that you generate assembly text and then compile it. I was
> thinking of doing something more along the lines of the following:
>
> - Partition bytecodes into basic blocks while building a control flow
>   graph.

Have you looked at the "Soot" framework
(http://www.sable.mcgill.ca/soot/)? It already provides more than this,
including most of the analyses you need. Its internal representation is
"stackless" 3-address code, with "typed" local variables. We already have
a few papers about this. It might be a good starting point. Currently,
Soot statically analyses whole applications with no dynamic class loading,
but you could probably adapt it to dynamic code.

> - Optionally run the standard Java bytecode verification process if it
>   has been requested, e.g. "jvm -verify ..."

Assuming SableVM already executes the bytecodes (while you are optimizing
them on another thread), this should have already been done, so you
shouldn't need to care about verification in the JIT.

> - Convert bytecodes into register transfer lists (RTL), mapping values
>   on the operand stack to symbolic registers.
>
> - Hand off the RTL to "plug-in" optimization modules. Which
>   optimizations are actually performed depends very much on how quickly
>   I can get them to run; some classic optimizations are very
>   time-intensive. Initially I was thinking of:
>
>   - Constant folding, constant propagation, copy propagation, and value
>     numbering.
>
>   - Code straightening and jump threading.
>
>   - Common subexpression elimination (maybe).
>
>   - Dead code elimination (maybe).
>
>   - Replacement of certain bytecode patterns with precooked assembly
>     templates, e.g. for x86:
>
>         aload $value; invokevirtual "ntohl" ->
>             asm ("bswap %0" : "=r" (value) : "0" (value))
>
>         aload $value; invokevirtual "ntohs" ->
>             asm ("xchgb %b0,%h0" : "=q" (value) : "0" (value))
>
>     or for Alpha:
>
>         aload $value; invokevirtual "ntohs" ->
>             asm ("insbl %2,1,%1; extbl %2,1,%0; or %1,%0,%0"
>                  : "=r"(t1), "=&r"(t2) : "r"(x))

Some of these are probably better left to a higher-level 3-address
representation (see Soot). Others might fit well in a back-end, including
some peephole optimizations.

> - Finally, hand the munged RTL over to an emitter which emits native
>   code directly into space allocated on the Java heap. The emitter
>   would assign real registers to symbolic ones using one of the graph
>   coloring algorithms. If that kind of thing is too expensive, there
>   are alternatives.
>
> - For Alpha, a final peephole optimization pass is required because
>   instruction scheduling is very important on that platform.
>
> All of that may be a bit much for a JIT, but whether or not that is
> true is what I want to find out. :)

As I said in an earlier message, I do not think that JIT compiling every
method is a good way to go. Ideally, you want the optimizer to pick
frequently executed code and then take the time to crunch it as best as it
can. The idea is that if the interpreter is fast enough, this should be
OK: short programs will execute quickly anyway, and long-running programs
will benefit from an aggressive optimizer.

Etienne

> I would also want to add a module for converting variable references
> to immediate constants if the values are known at runtime. Perhaps
> there will also be an API for specifying such values. Code like this:
>
>     if (case_a_is_true) {
>         /* a long block of code follows */
>     } else {
>         /* more code */
>     }
>
> could be processed into:
>
>     /* case_a_is_true is constant and always false */
>     /* more code */
>
> Implementing all of the optimization framework in Java means everything
> is very dynamic and -- best of all -- can be self-bootstrapped into
> native code for maximum performance.
>
> Andrew Purtell
> from home -- apu...@ac...

-- 
----------------------------------------------------------------------
Etienne M. Gagnon, M.Sc.                e-mail: eg...@j-...
Author of SableVM:                      http://www.sablevm.org/
----------------------------------------------------------------------
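To make the quoted specialization concrete, here is a minimal sketch in
Java over a hypothetical 3-address-style IR (not Soot's representation and
not anything in SableVM): once the runtime value of a condition is
supplied, the conditional branch guarding the dead arm collapses to either
an unconditional goto or a no-op, and a later unreachable-code pass can
then drop the dead block.

    import java.util.*;

    public class BranchSpecializer {

        // Hypothetical IR statements, for illustration only.
        interface Stmt {}
        record Goto(int target) implements Stmt {}
        record IfGoto(String condVar, int target) implements Stmt {}  // taken when condVar != 0
        record Nop() implements Stmt {}
        record Other(String text) implements Stmt {}

        // Rewrite conditional branches whose condition is known at run time.
        static List<Stmt> specialize(List<Stmt> body,
                                     Map<String, Integer> knownValues) {
            List<Stmt> out = new ArrayList<>(body.size());
            for (Stmt s : body) {
                if (s instanceof IfGoto br
                        && knownValues.containsKey(br.condVar())) {
                    // Known condition: the branch is either always taken
                    // (plain goto) or never taken. A Nop keeps statement
                    // indices stable so other branch targets stay valid.
                    out.add(knownValues.get(br.condVar()) != 0
                            ? new Goto(br.target())
                            : new Nop());
                } else {
                    out.add(s);
                }
            }
            return out;
        }
    }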
From: Andrew P. <apu...@ac...> - 2000-07-28 21:19:05
"Etienne M. Gagnon" wrote: > Have you looked at the "Soot" framework > (http://www.sable.mcgill.ca/soot/)? It does already provide more than > this, including most analyses you need. It's internal representation is > "stackless" 3-address codes, with "typed" local variables. We have > already a few papers about this. It might be a good starting point. > Currently, Soot statically analyses whole applications with no dynamic > class loading, but you could probably adapt it to dynamic code. Thanks! And now I guess I know where the name "SableVM" comes from... > Assuming SableVM already executes the bytecodes (while you are > optimizing them on another thread), this should have already been done, > so you shouldn't care about verification in the JIT. That's true. > As I said in an earlier message, I do not think that JIT compiling every > method is a good way to go. Ideally, you want the optimizer to pick > frequently executed code, then take the time to crunch this as best as > it can. I agree. It is the VM's choice when to invoke the JIT anyway. It is free to compile information wrt. method invocation and invoke the JIT only when a particular method meets some selection criteria -- such as a high invocation count, code length, or advance knowledge of performance "importance". It's an integration detail, albiet a big one. Andrew Purtell from home -- apu...@ac... |
From: Etienne M. G. <eg...@j-...> - 2000-07-28 21:32:12
Andrew,

Andrew Purtell wrote:
> "Etienne M. Gagnon" wrote:
> > Have you looked at the "Soot" framework
> > (http://www.sable.mcgill.ca/soot/)? ...
>
> Thanks!
>
> And now I guess I know where the name "SableVM" comes from...

To tell you a little more about Soot: it has not only scalar
optimizations, but also OO analyses and transformations. It does things
like changing virtual method calls into static calls (invokevirtual ->
invokespecial) and changing interface calls into virtual calls, and it
also implements inlining (which is key for good optimization of OO code
with many small methods), etc.

Currently, Soot does .class -> .class optimization. You would think that
Sun's HotSpot would not get any gain from such a static optimizer, but you
would be wrong (as we have unexpectedly discovered): statically optimized
class files were faster on many VMs and many platforms (Linux, WinNT,
...). You can take a look at the CC'2000 paper on the Sable web page
(http://www.sable.mcgill.ca/), under Publications.

OK, that was it for the commercial. ;-) As you are programming the thing,
you get to decide whichever tools you want to use. :-)))

Etienne

-- 
----------------------------------------------------------------------
Etienne M. Gagnon, M.Sc.                e-mail: eg...@j-...
Author of SableVM:                      http://www.sablevm.org/
----------------------------------------------------------------------
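For illustration, here is a minimal sketch in Java of the class-hierarchy
check behind that kind of devirtualization. It is not Soot's code: the
loaded-subclass list is assumed to come from the VM's class table, and
only public methods are considered.

    import java.lang.reflect.Method;
    import java.util.List;

    public class Devirtualizer {

        // True if a virtual call receiverType.name(paramTypes) can resolve
        // to only one target among the classes loaded so far, i.e. no
        // loaded subclass overrides it. (Sketch only: restricted to public
        // methods; loadedSubclasses is assumed to be supplied by the VM.)
        static boolean canDevirtualize(Class<?> receiverType, String name,
                                       Class<?>[] paramTypes,
                                       List<Class<?>> loadedSubclasses)
                throws NoSuchMethodException {
            Method target = receiverType.getMethod(name, paramTypes);
            for (Class<?> sub : loadedSubclasses) {
                if (!receiverType.isAssignableFrom(sub)) continue;
                Method resolved = sub.getMethod(name, paramTypes);
                // A subclass declaring its own override adds a second target.
                if (!resolved.equals(target)) return false;
            }
            return true;
        }
    }

A JIT would have to invalidate such a decision if a later class load
introduces an override, which is one reason dynamic class loading
complicates the static whole-program picture Soot currently assumes.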
From: Andrew P. <apu...@ac...> - 2000-07-28 21:56:31
"Etienne M. Gagnon" wrote: > Soot has not only scalar optimizations, but also OO analyses/ > transformations. Soot does indeed sound like a good starting point. I'll read the available papers -- and code, eventually -- with an eye toward determining how expensive (in terms of time) the current set of analyses and transformations are. I was going to put together something as simple and lightweight as possible, but considering an aggressively optimizing JIT is best applied selectively, there may be some wiggle room. Hmm. I'm getting ahead of myself. Soot (or a subset) may well be appropriate. > You would think that Sun's HotSpot would not get any gain > from such a static optimizer, but you would be wrong (as we have > unexpectedly discovered); statically optimized class files were > faster on many vms and many platforms (Linux,WinNT,...). I truly don't know anything about HotSpot, but I know people who do, and I commonly hear it is constructed of rubber bands and matchsticks. Those same folks claim IBM's JVM is much better. Andrew Purtell from home -- apu...@ac... |