From: JMGross <ms...@gr...> - 2012-04-30 13:00:44
----- Original Message -----
From: Peter Bigot
Sent: 27 Apr 2012 14:13:43

>> The immediate part of an indexed CALLA (or MOVA)
>> instruction therefore has only 15 usable bits, if it is meant
>> to be an (relocated) address.

> I think that behavior is fairly clearly specified in the CPUX
> architecture documentation, but I can see that somebody might not be
> thinking about that when implementing the call operation in a higher
> level language. Thanks for highlighting the issue.

Indeed, it is documented; however, it isn't really obvious. Obviously, it was so well hidden that the CCS coders missed it :)

Personally, I also thought that the register part would be the index (as it is variable) and the relocated immediate part the base address. It took me a second look to see that the static part of the instruction is the index part, which is quite contrary to the intended use.

Remember, I was talking about actual CCS compiler output that didn't work, not hand-crafted assembly code.

----- Original Message -----
From: David Brown
Sent: 27 Apr 2012 15:42:41

> One possible idea is that since r12-r15 are caller-saves rather than
> callee saves, then it might be possible to restrict 20bit usage to those
> registers. Then it is the caller's responsibility to save the register
> over function calls, and the callee doesn't have to have __sr20__. Four
> 20-bit registers could be restrictive if you are doing a lot of 20-bit
> arithmetic, but it should be good enough for most data accesses or far
> function calls.

For function calls this might be a possible solution, especially if the use of the 20-bit pointer actually ends at the function call. But as soon as you have far data, this would be rather inefficient.

> I note that this all talks about 20-bit registers, addresses,
> attributes, etc. I have a slight fear that once you've got everything
> working, TI will extend the range to 24 bits (perhaps to keep up with
> the AVR XMEGA that "supports" 16MB address ranges). Will you then have
> to start from scratch, or can you re-use much of the code?

I don't see that happening, for several reasons: the MSP has no external address bus that could benefit from a larger addressing range; 1M is four times as much as the biggest currently used range; the MSP is a low-power processor that won't get any lower-power with more flash; and the additional bits couldn't be placed in the extension word. It would be a completely different processor then, not backwards-compatible (to MSP430X at least), and therefore a completely new job. I don't think it is necessary to plan the current mspgcc for this.

> One idea for dealing with calling "__c16__" functions from far memory
> would be to handle them indirectly. For every "__c16__" function "foo",
> you could generate the original function "foo" and a trampoline
> "foo_c20" which consists of "calla foo; reta" and is forced to reside in
> the near code area.

Not calla, but call :) But a good idea anyway. It also works for ISRs, with BR then, of course. The only drawback is the additional latency.

> Any functions placed in the "__far" address space
> would call "foo_c20" instead of "foo". Any functions declared with the
> "__c20__" attribute would make "foo_c20" an alias for "foo". This would
> mean that code that is mostly in the lower memory would be smaller
> (unused "foo_c20" trampolines could be garbage-collected by the linker),
> faster, and use less stack space, while the occasional "__far" function
> would still work correctly.

Nice idea. It needs to be sorted out how to handle function pointers (always 20 bit to foo_c20? Always 16 bit to foo? Could cause some confusion), but otherwise a good middle way between compatibility and efficiency.

JMGross
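
P.S. To illustrate the "15 usable bits" point from the first reply, here is a minimal sketch in plain C that models the address arithmetic of an indexed operand. It assumes the 16-bit word in the instruction is sign-extended and added to the 20-bit register, with the sum truncated to 20 bits. The function name indexed_ea is made up for this example, and the model ignores any further wrap-around rules of the real CPU, so treat it as an approximation rather than a hardware reference.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of any toolchain): models "index(Rn)".
 * Assumption: the 16-bit index word is sign-extended, added to the
 * 20-bit register value, and the sum is truncated to 20 bits. */
static uint32_t indexed_ea(uint32_t reg20, uint16_t index16)
{
    assert(reg20 <= 0xFFFFFu);          /* register holds at most 20 bits */

    /* Portable sign extension of the 16-bit index word. */
    int32_t index = (index16 & 0x8000u) ? (int32_t)index16 - 0x10000
                                        : (int32_t)index16;

    /* Unsigned conversion wraps modulo 2^32, then we keep 20 bits. */
    return (uint32_t)((int32_t)reg20 + index) & 0xFFFFFu;
}
```

In this model, a relocated address of 0x7000 in the index word with a register value of 4 gives 0x07004 as expected, but 0x9000 gives 0xF9004 instead of 0x09004, because bit 15 is taken as the sign. That is why only values up to 0x7FFF behave like a base address.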