From: SourceForge.net <no...@so...> - 2012-03-11 12:30:45
Feature Requests item #1921061, was opened at 2008-03-20 06:44
Message generated for change (Comment added) made by spth
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=350599&aid=1921061&group_id=599

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: z80 port
Group: None
Status: Open
Priority: 5
Private: No
Submitted By: Philipp Klaus Krause (spth)
Assigned to: Nobody/Anonymous (nobody)
Summary: Register parameter passing

Initial Comment:
Similar to RFE #979838, but #979838 is for mcs51, while this one is for Z80.

Passing arguments in registers would reduce call overhead. However, this makes sense for small parameters only: any register pair can be pushed by the caller, but if the arguments are passed in registers, we might have to move them around in registers a lot before the call. In a similar way, the callee would probably spend a lot of time reordering arguments, unless we take the registers used for arguments away from the register allocator. Using the second register set is probably quite complicated.

I see two possible solutions:
* Use register arguments only for functions where the sum of the arguments' sizes is below 24 bits. We could then use a, h, l, which are not used by the register allocator, for arguments.
* Let the user decide. The C standard allows use of the register storage class for function parameters.

Passing arguments in registers would mostly help with small functions, which are currently one of sdcc's weak points.

Philipp

----------------------------------------------------------------------

>Comment By: Philipp Klaus Krause (spth)
Date: 2012-03-11 05:30

Message:
We support the keyword "register".
However, according to the standard (at least last time I looked, which was before the release of the C11 standard - I'll have another look today), we are not allowed to change the calling convention depending upon its presence. I.e. a programmer is allowed to do this:

void f(register char c) { /* ... */ } // Definition
void f(char); // Declaration in another file

Philipp

----------------------------------------------------------------------

Comment By: b-s-a (b-s-a)
Date: 2012-03-11 05:26

Message:
Maybe add the keyword 'register' to the core language (as I understand, it is currently not supported)? Or a more powerful version of it, like:

void func(__register("HL") uint16_t x);

where HL can be replaced by a16 (from 16-bit Accumulator) or r16 (any 16-bit register/pair).

----------------------------------------------------------------------

Comment By: Philipp Klaus Krause (spth)
Date: 2012-03-11 05:08

Message:
No progress directly on this issue. However,
1) I'm not sure this is really worth it
2) There are other things to be done that would improve code quality more

For 1): Passing arguments in registers is only an advantage if the arguments can stay there. If the function is longer, the callee would have to put them on the stack, which most of the time is less efficient than when the caller does it. As a result, register arguments are good for small functions, but bad for big functions. But the calling convention cannot depend on this; it may only depend on parts of the function declaration. The one exception would be static functions that do not have their address taken. But if those are small, inlining them is even better.
On the other hand, stack (and thus parameter) access has been improved somewhat recently for small functions, so the advantage of register parameters even there is no longer as big as it seemed when this feature request was filed. See e.g.
the last column at https://sourceforge.net/apps/trac/sdcc/wiki/Philipp%27s%20TODO%20list, which shows a reduction in code size from 24 bytes to 12 bytes, from revision #6749 to #7347, for the smallest function in the benchmark.

2) In the graph at the page linked above, you can see a significant increase in code size and compilation speed that affects all ports related to the z80 in revision #6761 (bug #3400613). This is due to a bug fix. However, I think that by improving common subexpression elimination, we could regain what was lost.

3) There are compilation time issues with the new register allocator. For some sdcc users it is too slow. It seems this is partially due to my not so efficient implementation of some algorithms (I wanted to make it work first, fast later) and partially due to a problem with Thorup's algorithm. If these issues can be fixed, everyone will benefit, by faster compilation or better code quality for a given compilation time. See the graph at the link above again, to see what is possible and the current situation: as you can see from the red, green, light blue and violet lines, code size is a lot smaller when one uses --max-allocs-per-node 1000000, but compilation takes much longer.

Philipp

----------------------------------------------------------------------

Comment By: b-s-a (b-s-a)
Date: 2012-03-10 12:38

Message:
Also, it is possible to pass the first argument in IY (only if it is a pointer to a variable of complex type, i.e. struct/union).

void memcpy(void*, void*, uint16_t) will accept: HL, DE, BC
void process(struct MyStruct *ms, uint16_t, uint16_t) will accept IY, HL, DE

----------------------------------------------------------------------

Comment By: b-s-a (b-s-a)
Date: 2012-03-10 12:29

Message:
aralbrec, your suggestion is good for passing a large amount of data, because data is always pushed to the stack and popped back. My suggestion is not optimal either, but it makes using external libraries simpler...
spth, is there any progress?
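
The register assignment proposed in the 2012-03-10 12:38 comment (first argument in IY when it is a pointer to a struct/union, remaining arguments in HL, DE, BC in order) can be sketched as a small standalone C model. This is only an illustration of the proposal, not SDCC code: `struct param`, `assign_regs` and the assumption that every argument is 16 bits wide are hypothetical.

```c
#include <stddef.h>

/* Hypothetical description of one parameter; not an SDCC data structure. */
struct param {
    int is_struct_ptr;   /* nonzero if the parameter is a pointer to struct/union */
};

/*
 * Assumption for this sketch: every parameter is 16 bits wide.
 * Writes one register name per parameter into out[0..n-1].
 * Returns 0 on success, -1 if the parameters do not fit in registers.
 */
int assign_regs(const struct param *p, size_t n, const char **out)
{
    static const char *const pairs[] = { "HL", "DE", "BC" };
    size_t next = 0;   /* index of the next free register pair */

    for (size_t i = 0; i < n; i++) {
        if (i == 0 && p[i].is_struct_ptr) {
            out[i] = "IY";           /* first argument, pointer to struct/union */
            continue;
        }
        if (next >= sizeof pairs / sizeof pairs[0])
            return -1;               /* out of register pairs */
        out[i] = pairs[next++];
    }
    return 0;
}
```

With three non-struct parameters this yields HL, DE, BC, and with a leading struct pointer it yields IY, HL, DE, matching the two examples given in the comment.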
----------------------------------------------------------------------

Comment By: alvin (aralbrec)
Date: 2010-11-13 15:18

Message:
You misunderstand, I agree with you :) The way things are now in sdcc-z80 is not adequate for adding external asm functions, including library functions. I am just proposing alternatives to what you suggested.

The idea of specifying parameters to be placed in certain registers will not work well IMO unless it is very few parameters (i.e. one, two, *maybe* three). This is because of all the gymnastics the compiler would have to perform to compute parameter values and get them into the right registers prior to the call. This cost is paid in code size every time the external asm function is called. Also, I would guess that parameters would quite frequently have to be temporarily saved to the stack prior to final register setup, which would also make it slower than the CALLEE alternative I suggested.

The CALLEE alternative has the compiler push parameters onto the stack and the external asm function pop them into the correct registers. The caller (i.e. the compiler) does not have to clean up the stack afterward, as the external asm function does that itself when popping parameters into registers.

The FASTCALL alternative I mentioned works by passing a limited number of parameters in DE,HL. It could be one parameter in HL (16 bit), one in DEHL (32 bit) or two 16-bit in DE,HL. This would be the passing of params in registers that you are requesting, but only for one or two params. This places essentially no overhead on the compiler because quite frequently a parameter will be computed in DE,HL for the call. The external asm function quite frequently wants the one or two parameters in DEHL (due to the nature of the z80 instruction set!), so it is almost a clean transfer of program flow without the compiler needing more information about the internals of an external function.
Lastly, I agree with it being the compiler's job to ensure its temporaries are saved prior to a call. I don't know if it does that now or if a new CALLER-save qualifier needs to be introduced to tell the compiler to save its temporaries.

So my proposal is to add: CALLEE, FASTCALL linkage and a CALLER-save qualifier if necessary. It is just not practical to add external asm functions and even libraries without them.

Longer term, it may be advantageous to pass some metadata about the external function to the compiler, such as which registers are actually destroyed, so the compiler can make more intelligent decisions about placement of the function call and what temporaries actually need to be saved, but this is something that would need a lot of work. I would also like to see sdcc-z80 get away from assigning roles to registers and using ix as a stack frame altogether, but that is also a very long term project if it ever happens... there is a reason why expert z80 programmers almost never use ix as a frame pointer in their hand coded assembler :)

----------------------------------------------------------------------

Comment By: b-s-a (b-s-a)
Date: 2010-11-12 06:38

Message:
As I said before, there are some cases where it is required to call an external function, passing parameters in registers. Currently, to call my functions I need to use inline assembler with a lot of overhead:
1. store IX and other registers
2. load BC
3. load HL
4. load IX (it may be difficult)
5. do call
6. restore IX and other registers
I do not know which registers I should save at any given time. It is the compiler's job, isn't it?

----------------------------------------------------------------------

Comment By: alvin (aralbrec)
Date: 2010-11-11 12:19

Message:
As Philipp mentioned, the z80's small number of registers and non-orthogonal instruction set mean that placing parameters in arbitrary registers prior to a call can involve a lot of overhead that would make it detrimental in most cases involving more than 2 or 3 parameters.
However, there are two calling linkages that we have found greatly improve z80 code: (1) CALLEE linkage, where the callee is responsible for stack cleanup, and (2) FASTCALL linkage, where a small number of parameters (1 or 2) are passed in [DE]HL. The latter is consistent with how values are returned out of functions in all z80 C compilers I have found (specifically, longs are returned in DEHL).

We've used both for library code and supply an additional asm entrypoint in library functions using CALLEE linkage for asm programmers to sidestep the register initialization in function calls.

Here is an example of both:

CALLEE linkage:

"
(CALLEE linkage assumes left-to-right pushing of params on stack)

; long __CALLEE__ strtol(const char * restrict nptr, char ** restrict endptr, int base)

XLIB strtol, asm_strtol    ; export C and asm entrypoints

strtol:
   pop hl
   pop bc
   pop ix
   ex (sp),hl

; enter:
;    bc = base
;    ix = char **endptr
;    hl = char *nptr
;
; exit:
;    dehl = result (could be LONG_MAX or LONG_MIN on overflow)
;    bc = address of next char to examine in nptr[]
;    carry = error (overflow, bad base, empty conversion string)
;    errno set to ERANGE (overflow), EINVAL (bad base / empty conversion string)
;    *endptr set appropriately
;
; uses:
;    af, bc, de, hl, af', bc', de', hl', ix

asm_strtol:
   ........ body continues
"

compiler calls like so:

   ; parameter collection in hl
   push hl          ; nptr
   ; parameter collection in hl
   push hl          ; endptr
   ; parameter collection in hl
   push hl          ; base
   call strtol
   ; note no stack cleanup, this adds up quickly and saves a lot on code size

An example of FASTCALL linkage (incoming parameter always in [DE]HL):

"
; char *strrev(char *s)
; reverses string s
;
; enter:
;    hl = char *s
;
; exit:
;    hl = char *s
;
; uses:
;    af, bc, de

strrev:
   .... body continues
"

compiler calls like so:

   ld hl,parameter
   call strrev

The C and asm entrypoint is shared.
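
The FASTCALL rule described in this thread (one 16-bit parameter in HL, one 32-bit parameter in DEHL, or two 16-bit parameters in DE,HL) can be written down as a small standalone C model. This is a sketch under stated assumptions, not compiler code: the enum, the function name `fastcall_classify`, and the treatment of 8-bit and 24-bit parameters as occupying the next larger slot are all hypothetical.

```c
#include <stddef.h>

/* Hypothetical classification of a parameter list; not an SDCC type. */
enum fastcall_kind {
    FASTCALL_NONE,    /* does not fit: parameters go on the stack */
    FASTCALL_HL,      /* one parameter of up to 16 bits, in HL */
    FASTCALL_DEHL,    /* one parameter of up to 32 bits, in DE:HL */
    FASTCALL_DE_HL    /* two parameters of up to 16 bits, in DE and HL */
};

/*
 * sizes[] holds the parameter sizes in bytes, n is the parameter count.
 * Assumption for this sketch: a 1-byte parameter takes a 16-bit slot and a
 * 3-byte parameter takes the 32-bit slot.
 */
enum fastcall_kind fastcall_classify(const unsigned *sizes, size_t n)
{
    if (n == 1 && sizes[0] >= 1 && sizes[0] <= 2)
        return FASTCALL_HL;
    if (n == 1 && sizes[0] >= 3 && sizes[0] <= 4)
        return FASTCALL_DEHL;
    if (n == 2 && sizes[0] >= 1 && sizes[0] <= 2
               && sizes[1] >= 1 && sizes[1] <= 2)
        return FASTCALL_DE_HL;
    return FASTCALL_NONE;
}
```

Under this model, strrev(char *s) classifies as FASTCALL_HL, while the three-parameter strtol falls through to FASTCALL_NONE and would use the stack (for example with CALLEE linkage).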
----------------------------------------------------------------------

Comment By: b-s-a (b-s-a)
Date: 2010-11-11 02:48

Message:
*fix:
... returns IX, uses AF, BC, DE, HL, IX.
...

----------------------------------------------------------------------

Comment By: b-s-a (b-s-a)
Date: 2010-11-11 02:45

Message:
Is it possible to add a feature for using specified registers as arguments? It is useful for external functions written in assembler.
For example, I have a set of asm procedures which take parameters: IX - some address, HL - some other address/data, B, C, D, E, returns IX, uses all, except AF, BC, DE, HL, IX.
So I want to use these functions (they are time critical) without wrappers.
For example:
extern uint16_t* asm_func(uint16_t*, const uint16_t*, uint8_t, uint8_t) __naked(IX : IX, HL, B, C : AF, BC, DE, HL, IX);

----------------------------------------------------------------------

Comment By: Philipp Klaus Krause (spth)
Date: 2010-05-21 05:33

Message:
I recently learned that the C standard ignores storage class specifiers for the parameters in function declarations; they matter only in the definition. Thus, we cannot make passing in registers depend on the storage class specifier. That makes this feature request unlikely to get implemented soon, if at all.

Philipp

----------------------------------------------------------------------

Comment By: Philipp Klaus Krause (spth)
Date: 2010-05-19 00:48

Message:
Steps needed:

- Make the notUsed() peephole function aware of passing parameters in registers, so assignments to register arguments are not considered dead code. This has to be done exactly, since notUsed() is probably the peephole optimizer's single most powerful tool, and its effectiveness should not be compromised.
- Fix bug #2811521.
- Enable register parameters by changing the default value of --no-reg-params to 0 for the Z80 port.
- This solution would use de and bc for passing parameters that have the register storage class specifier.

Philipp
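
The point made in the thread about the register storage class can be illustrated with a few lines of standard C. In the sketch below, the declaration omits `register` and the definition uses it, yet both name the same function type, which is why a calling convention cannot be derived from the keyword. The function name `add_one` is hypothetical and chosen only for this example.

```c
/* Declaration, as another translation unit might see it: no storage class. */
int add_one(char c);

/*
 * Definition using the register storage class on the parameter.
 * This is compatible with the declaration above, because 'register'
 * is not part of the function's type. A caller that only sees the
 * declaration cannot know the definition used 'register'.
 */
int add_one(register char c)
{
    return c + 1;
}
```

A conforming compiler accepts both lines in the same program, so code generated for a call site that only sees `int add_one(char c);` has to work with this definition.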