From: Michel V. d. B. <vd...@lu...> - 2001-03-08 08:03:55
|
> Hello Scot, As I understand it there is only a problem for recursion since then functions have to be reentrant? If all functions are non-reentrant then a stack isn't necessary and the register allocation can be done by live analysis (as you say this could (should?) actually be done in the linker, but the lack of a processor independent linker makes this unatractive). In case of recursion I don's see how you can calculate a maximum for the size of the stack. (or the similar idea of a linked list). Do I misunderstand what you are saying? Best regards, Michel > > > Message: 1 > Date: Wed, 7 Mar 2001 01:38:15 -0600 (CST) > From: Scott Dattalo <sc...@da...> > To: sdc...@li... > Subject: [sdcc-devel]pCode stack optimization > Reply-To: sdc...@li... > > pCode = post code generation optimization > > The PIC doesn't have a data stack. Bummer. This was made painfully clear by this > simple test: > > void inc(unsigned char k) > { > uchar0 = uchar0 + k; > } > > void nested_call(unsigned char u) > { > > inc(uchar0); > uchar1 = uchar1 + u; > > } > > each of these functions saves a local copy of the input parameter. The logic of > SDCC is such that these get saved into the same registers. Consequently when > `nested_call' calls `inc', the state of nested_call's local variables must be > preserved. This is done quite easily with a PUSH, easily that is if you have a > data stack. > > The way I thought of getting around this issue on the PIC Port was to advantage > of pCode. But I'd like to bounce this idea off of others to see if it's > reasonable. > > Here's how it would work. Instead of statically assigning local registers, use a > dynamic scheme. Designate a block of memory to be used as this `dynamic' > allocation area. This is loosely analogous to designating RAM for a stack. (This > designation is mostly for conceptual purposes, because the algorithm > automatically determines the max amount of storage needed for the`stack'.) > > The code generation logic in ralloc.c/gen.c can remain essentially the same. > However, when the pCode is expanded into assembly, the pCodes for push and pop > will demarcate ranges for which local variables may conflict. So for each push, > a flow analysis will be performed to ascertain the register usage up to the > point of the matching pop. In a sense, this is like the live range calculation > except that it can cross ebb's or function boundaries. Based on this flow > analysis, the push/pop pair can be removed by reassigning the location of the > register being pushed. > > So taking the above example, each function will designate say r0 as the location > for storing the local copy of the parameter being passed in. In the pCode > expansion of `nested call', the PUSH pCode is encountered. The flow is traced > and when the CALL pCode is encountered the flow analysis will be directed > accordingly. Upon analyzing the pCode for the function `inc()', it's determined > that register r0 is being used. When inc()'s return is encountered, the flow > analysis again returns to 'nested_call()' and the pCode for the matching POP is > found. The flow analysis stops. From this it's determined that the flow between > the matching PUSH and POP's uses r0. Consequently, nested_call()'s usage of r0 > is changed to say r1. > > Caveat - this would have to go into the linker. But I'm already convinced that > to get the most out of pCode optimization that linker logic will necessarily be > affected. > > One of the benefits of this approach is that it will automatically determine the > maximum space needed for 'stack' storage. (It may not determine the optimum, > though.) > > Except for the complications of recursion (and the related issue of calling the > same function more than once in a string of calls) does anyone see anything > wrong with this approach? > > I hope this explanation was not too confusing. > > Scott > |