While doing some test for [patches:#392] I discovered what I think are bugs of PDK backend.
In the test: bug2686159.c
After many tests here is what i found:
After applying my optimization:
Using absolute addresses with "__at" and no init:
unsigned char __at(0x0020) REG_2;
An assembler symbol is used, generating this code:
; ram data
;--------------------------------------------------------
.area DATA
_REG_1_RES = 0x0020
_REG_2 = 0x0021
....
or _REG_2+0, a
with error: <a> machine specific addressing or addressing mode error
Same case without using absolute addresses works fine:
unsigned char REG_2 = 0;
generating a data storage in ram area:
; ram data
;--------------------------------------------------------
.area DATA
_REG_2::
.ds 1
....
or _REG_2+0, a
So the problem seems to be that the assembler is confused and doesn't know how to handle the symbol, it thinks it is an SFR address and shows the error:
The error can also be reproduced manually by inserting the asm instructions:
Instead of:
REG_2 |= 1;
REG_2 |= 2;
Use:
REG_2 |= 1;
__asm__("mov a, #0x02");
__asm__("or _REG_2+0, a");
This is also a very good confirmation that something is confusing the assembler, because if you change the asm "or" or a "xor" it does not show any error and passes the test.
REG_2 |= 1;
__asm__("mov a, #0x02");
__asm__("xor _REG_2+0, a");
The xor instruction is valid for both memory and SFR but "or" is only valid for memory.
So the assembler is taking REG_2 as an SFR when it is not.
Summary:
This shows the error:
# define ADDRESS(x) (0x20 + 0x ## x)
__xdata unsigned char __at(ADDRESS(01)) REG_2;
...
__asm__("or _REG_2+0, a");
; ram data
;--------------------------------------------------------
.area DATA
_REG_1_RES = 0x0020
_REG_2 = 0x0021
...
; gen/pdk15/bug2686159/bug2686159.c: 43: __asm__("or _REG_2+0, a");
; genInline
or _REG_2+0, a
This does not:
# define ADDRESS(x) (0x20 + 0x ## x)
__xdata unsigned char __at(ADDRESS(01)) REG_2 = 0;
...
__asm__("or _REG_2+0, a");
; absolute external ram data
;--------------------------------------------------------
.area DABS (ABS)
.org 0x0021
_REG_2::
.ds 1
...
; gen/pdk15/bug2686159/bug2686159.c: 43: __asm__("or _REG_2+0, a");
; genInline
or _REG_2+0, a
Does pdk have suport for external ram?
Does all this make sense to any expert in the pdk assembler?
Added support for pdk in test bug2686159 : [r12498].
All currently existing pdk devices have internal RAM only. With the small RAM address space supported by these devices having an external RAM chip wouldn't make sense, so I don't think any with external RAM ever existed or will exist.
The architecture only has three address spaces: RAM, ROM, I/O.
After some more tests and learning of pdk asm I have new findings.
It seems that addressing is not done correctly by assembler:
What we get:
What is should be:
For reference (pdk15):
Ram opcode:
IO opcode:
SFR not initialized, opcodes are correct.
SFR initialized, opcodes for RAM!.
Relocatable RAM (initialized or not):
Absolute RAM initialized:
All these cases generate correct opcodes for RAM:
(cases with relocatable variable show different addresses, but same opcodes).
Absolute RAM with no init, we get IO opcodes!:
You have attached the 2 test files I used to get this code.
Side note: the tests pass, because they don't test whether the memory space is the correct one or not, only if the write / read was successful.
Once someone else confirms that this behavior is not the expected one, I will write tests to check that the memory space is the same using the different cases and also that writing one does not affect the others.
I am 90% sure it is not OK, but I am not familiar with this kind of subtle details in SDAS directives, and documentation is not very good / not updated.
It looks like that the .area where the symbol is defined or space is reserved is not taken into account.
It only seems to care about space storage, if the symbol is from a .ds it handles it as RAM, otherwise it is IO.
I think this is incorrect and related to this code in pdkadr.c, look at the default case, constants with no area information?:
I am not familiar with the assemblers, pdk or not. The one most familiar with the pdk assemblers is Nicolas Lesser. But I haven't heard from him in a while.
For pdk, there is the mov instruction and some logic instructions that access memory or I/O. The mnemonic is the same for both versions.
I don't really know what would be the correct way to the assemblers to distinguish between memory and I/O access. Do we need to invent a different mnemonic for the I/O case?
I am also no expert in pdk assembler, but I did a self-taught accelerated course yesterday
I also arrived to that same conclusion about the new mnemonic yesterday.
This is the way it is handled for other backends, different instructions for different opcodes (for example: "in" and "out" for sfr).
The assembler has no easy way of knowing, because the assembler does not store (in some cases) the needed information to use the right opcode.
It should be easy to fix it in gen.c, and check if any rules in peep.h are involved.
I also would involve changing several files of assembler, but should be easily doable.
Do you mind about backwards compatibility?, for people using directly the asm mnemonics.
I don't think it is important, because pdk user base is not very big and most of them are here for the "C", not for the asm details.
With backwards compatibility out of the way, my proposal would be to create only new mnemonics for the IO instructions, these two should be enough:
same names followed by .s (sfr).
And should cover us from future instruction set upgrades, for example covering 'and' or 'or', because we could easily follow the same pattern.
This follows the same pattern as the MSP430 asm I've seen in the past (mov.b mov.w)
I was also able to fix the problem for the SFR with initialization, using the data from the section and checking if it is in a "RSEGx" section, the generation of RAM addressing can be avoided.
But the case with absolute RAM and no initialization is not easily fixable without adding new features to the assembler, that is one of the reasons why I think the best option is new mnemonics.
The other reason as already said is that other backends already do it that way.
Changes in pdkadr.c (would not be needed with new mnemonics):
Last edit: Visenri 2021-06-27
mov.io might be a bit more intuitive (for those that also program in assembler) than .s, but it's not a big deal to me. Besides mov and xor, there is also t0sn, t2sn, set0, set1, wait0, wait1, swapc.
I guess for consistency the .s or .io suffix should be used on all instructions accessing I/O.
This is a backwards compability issue, so I think the moment it changes, we should bump the minor version number (from 4.1.6 to 4.1.7 in .version), mention it in the backwards compability section (manual), and in the list of notable changes for 4.2.0 (wiki).
I was looking right now at those instructions too, there is another one, TOG, for pdk16.
No big deal, just follow the same pattern.
This would be the complete list.
T2SN does not exist, a typo, I suppose :D.
I don't know the current state of the pdk16 assembler in trunk. Maybe the pdk branch also has some improvements that have not yet merged to trunk. AFAIR, the pdk branch was last used to work on a pdk16 port, but it either never got far enough to merge it back to trunk, or most of it has been merged to trunk, but it wasn't good enough to enable the pdk16 port build for trunk.
Yesterday I also found more problems, other backends also have problems with initialization in variables using the __sfr qualifier. But at least, they directly stop with an error.
Z80 for example fails with this error:
Regarding these lines, there is an org with a missing section:
Initializers for absolute addresses seem to work fine.
But that would be the subject for a different ticket.
I have a working version (passes all tests) with the required changes for the new mnemonics, but several questions have arisen:
I have changed all places using io address space in gen.c to use the new ".io" mnemonic, but as I was doing the changes I thought:
Should the assembler show an error in these contradictory cases?:
The alternative would be to tolerate both mnemonics only in those cases, because we can derive the address space from the operand.
What should we do with the incomplete pdk13 and pdk16?
Who knows what are the pending issues?
I can just apply blindly the required changes (very minimal for each pdkxx) to those backends and they should work.
But it will be very useful if someone could show me what are the current issues preventing those backends to be fully operational.
Same thing with the assembler / simulator.
Last edit: Visenri 2021-07-20
From the assembler perspective, p should not be special at all. It has a special role in the compiler only.
For sp and f, I'm not sure if it should be a warning vs. an error.
pdk13 should already work like the other pdk ports. Users report that it works for them (and have in the past reported bugs when it didn't). However, to be considered a stable port it would have to pass regression tests, but pdk13 does not have enough memory for the regression test infrastructure. That might change with future optimizations.
pdk16 is a bit more complicated. A very basic port should be possible once the assemblers and linkers work. But the word size being 16 = 8 * 2 bits, and all known implementations having multiple "cores" makes it somewhat special.
I guess making pdk16 a full port is out of the scope of this bug report.
Implemented in [r12558].
For the case discussed, "sp" and "f" used without .io instruction, I've implemented a warning for now.
I have not implemented these mnemonics for pdk16 because they were not present originally, and pdk16 is not testable right now:
I have modified/implemented this mnemonic (valid for pdk14-16) because it was present originally.
But it is not used by the code generator, so untested (as it was).
I think it would be good for the assembler to support all mnemonics. Even those not used by the compiler and those valid only for some devices. Users might want to write individual functions in assembler and use those.
Them not yet being supported by the pdk16 assembler is probably just due to the pdk16 port being incomplete.
Last edit: Philipp Klaus Krause 2021-07-26
Of course, I completely agree.
It was just not necessary right now.
I will implement the remaining mnemonics ASAP.
After all the patches commited [r12557], [r12558],[r12559], [r12560], [r12561] & [r12566], it seems that pdk is failing:
http://sdcc.sourceforge.net/regression_test_results/aarch64-unknown-freebsd13.0/regression-test-aarch64-unknown-freebsd13.0-20210727-12566.log
http://sdcc.sourceforge.net/regression_test_results/x86_64-apple-macosx/regression-test-x86_64-apple-macosx-20210727-12566.log
But in my machine, all pdk tests finish successfully.
May I see the output of the simulator of those tests?
Can someone else reproduce this issue and give me some extra information to debug the problem?.
The problem seems to be in the simulator, the cycle count is just 1 for many tests.
Last edit: Visenri 2021-07-27
I was not able to reproduce it in my test machine, but I found a problem in the code that could make pdk15 simulation fail randomly.
It was just a missing initialization.
Should be fixed in [r12575].
I can reproduce the issue regarding the python script, but not the pdk test failure.
I was not aware of the valdiag tests!.
I will fix the python script ASAP.
Python script fixed (or I hope so) in [r12574].
But the issue with pdk tests remains unanswered.