From: Erik H. <eh...@gm...> - 2008-12-30 23:05:03
|
Currently, as indicated by the CompilationPhases page in our wiki (http://trac.common-lisp.net/armedbear/wiki/CompilationPhases), pass2 in the compilation process generates byte code. This pass does mixed attempts to generating efficient byte code. Sometimes it does, sometimes it just chooses codes which "do the job". Later on, in the byte code generation phase (resolve-instructions), several byte codes are treated as one and the same type (like bipush and sipush). The bytecode resolver then chooses the byte code which fits best. In case of bipush and sipush, it only considers these 2 instructions, not the iconst_X instructions which are also available. There's an additional problem with the way the resolver is set up: because it uses existing byte codes from the JVM, it's not possible to use the WIDE instruction-prefix. All instructions will look like a WIDE instruction, however, they don't all have the same width. E.g.: WIDE IINC differs in with from WIDE ALOAD_n. I would like to propose the following resolution: * instruction output from pass2 will only be symbolic opcode indicators * opcode indicators will be exactly that: pseudo opcodes * in contrast with the actual opcodes used now, pseudo opcodes don't have a fixed length What I mean here is that "push-constant-int n" maybe resolved to a single-byte opcode "iconst_1" if n == 1, but to a 2 byte opcode bipush in case n < 128. * the optimization routines will optimize based on these pseudo opcodes * operations with different stack effects need different pseudo opcodes * after all calculations have been done on pseudo opcodes, pseudo opcodes are translated into real opcodes * once opcode lengths have been determined, label positions can be calculated and used to calculate jump distances. The difference from the above with what we have now is that all calculations are done on numerical opcodes, meaning that the steps can be randomly mixed (except, ofcourse for label elimination). My proposal changes that. It lays down an exact order in which the different steps need to happen, which is in my opinion really a clarification of the code. Also, the code will be more self-documenting: the resolvers won't be keyed off on numbers anymore; they'll use symbols, as will the opcode traversal routines. Opinions? Bye, Erik. |
From: Ville V. <vil...@gm...> - 2008-12-30 23:16:55
|
On Wed, Dec 31, 2008 at 1:04 AM, Erik Huelsmann <eh...@gm...> wrote: > I would like to propose the following resolution: > * instruction output from pass2 will only be symbolic opcode indicators > * opcode indicators will be exactly that: pseudo opcodes > * in contrast with the actual opcodes used now, pseudo opcodes don't > have a fixed length Sounds very good to me. This also sounds reasonable, because I believe it will allow p2 handlers to avoid having to know exactly the low-level bytecode to generate, thus making the handlers possibly simpler and easier. > The difference from the above with what we have now is that all > calculations are done on numerical opcodes, meaning that the steps can > be randomly mixed (except, ofcourse for label elimination). My > proposal changes that. It lays down an exact order in which the As you already mention, this is not a bad thing at all. I'm no compiler expert, but we have somewhat rigid ordering in the higher-level phases anyway (precompile -> p1 -> p2), and I see no problem having such ordering for things that are (currently) part of p2. I do see potential for splitting the code up further, this kind of change would play well with that idea, supporting it and making it possible. > Opinions? Seconded, completely. -VJV- |
From: Mark E. <ev...@pa...> - 2008-12-31 09:42:57
|
Erik Huelsmann wrote: […] > > I would like to propose the following resolution: > > * instruction output from pass2 will only be symbolic opcode > indicators > * opcode indicators will be exactly that: pseudo opcodes > * in contrast with the actual opcodes used now, pseudo opcodes don't > have a fixed length What I mean here is that "push-constant-int n" > maybe resolved to a single-byte opcode "iconst_1" if n == 1, but to a > 2 byte opcode bipush in case n < 128. > * the optimization routines will optimize based on these pseudo opcodes > * operations with different stack effects need different pseudo opcodes > * after all calculations have been done on pseudo opcodes, pseudo opcodes are > translated into real opcodes > * once opcode lengths have been determined, label positions can be calculated > and used to calculate jump distances. […] > > Opinions? Emitting symbolic opcodes at the end of pass2 certainly seems like the more forward looking design strategy to accommodate different strategies for actual bytecode resolutions. -- "A screaming comes across the sky. It has happened before, but there is nothing to compare to it now." |
From: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX - 2008-12-31 12:52:59
|
> > * in contrast with the actual opcodes used now, pseudo opcodes don't >> have a fixed length What I mean here is that "push-constant-int n" >> maybe resolved to a single-byte opcode "iconst_1" if n == 1, but to a >> 2 byte opcode bipush in case n < 128. > > * the optimization routines will optimize based on these pseudo opcodes > > * operations with different stack effects need different pseudo opcodes > > * after all calculations have been done on pseudo opcodes, pseudo > opcodes are >> translated into real opcodes > > * once opcode lengths have been determined, label positions can be > calculated >> and used to calculate jump distances. > > […] >> >> Opinions? > > Emitting symbolic opcodes at the end of pass2 certainly seems like the > more forward looking design strategy to accommodate different strategies > for actual bytecode resolutions. Exactly. All that remains is that we design a useable pseudo-bytecode set now. Some instructions come to mind quite easily: * push-constant (parameters: java-type [:int, :long, ...], value) * push-register (parameters: java-type [same], register-number) [corresponds to Xload-instructions] * push-array-element (parameters: java-type) [corresponds to Xaload-instructions] * push-nil * push-t * pop (parameters: java-type [because some unboxed types take more than 1 slot]) * pop-register (parameters: java-type [...], register-number) [corresponds to Xstore-instructions] * pop-array-element (parameters: java-type) [corresponds to Xastore instructions] * dup (parameters: none) * new (parameters: constant-pool index) * newarray (parameters: type-indicator) * arraylength * throw * return (parameters: type) * invokestatic/invokevirtual/invokespecial (parameters: argument list types, return type) [possibly under different names] * compare (parameters: java-type [:int, :long; :ref, ..], condition [:eq, :ne, :lt, :le, :gt, :ge], result-type [:boolean, nil]) [corresponds with if_Xcmp instructions], pushes a boolean result on the stack * compare-with-zero ; same as compare, but uses 1 stack argument * jump-if-true (parameters: true-label) [compares boolean-on-stack and jumps to true-label] * iinc (parameters: register; constant-int) * instanceof (parameters: class-for-check) * checkcast (parameters: class-for-check) * clear-values (paramerets: none) * swap * convert-type (parameters: source-type, target-type) * load-constant (parameters: constant-pool-index) [corresponds to ldc/ldc_w/ldc2_w instructions] * getstatic (parameters: static class, static name, java-type) * putstatic (parameters: static class, static name, java-type) * goto (parameters: jump target label name) * label (parameters: label name) By having compare and jump-if-true instructions, it's easier to reorder and integrate the bytecode resulting from compilation of the IF condition and branching to each of the IF branches. Some of the other instructions integrate iINSTR, lINSTR, aINSTR. I decided to do that because each of these instructions have the same stack and side-effects. Then, in the optimizer/stack analyser, it's not necessary to handle them differently. Only in the byte code generator (resolvers) does it become important what type the resulting argument will be, so that the right instruction can be chosen. The instructions above roughly correspond with the instructions used in our (emit ...) statements. If any instructions are missing, I think it should be easy to add. Each pseudo opcode will be created with some data for use for the code-optimizer: * number of stack arguments used (positive == pushed, negative == popped) * type of side effects (apart from stack effect, which can be ":none" [in case of dup, iconst_null, getstatic, etc], :locals [in case of istore_0] or :full [in case side effects can't be completely determined, for example with invokevirtual/invokestatic]) The side effect information can be used by the bytecode optimizer to re-order instructions for instruction-elimination. It's just my ideas so far. Bye, Erik - who always welcomes your comments but appreciates the fact that it's new years eve. |