Activity for Oleg Endo

  • Oleg Endo Oleg Endo posted a comment on ticket #934

    It would require annotated function forward declarations everywhere. Otherwise the compiler can't know how to generate the call to the function.

  • Oleg Endo Oleg Endo posted a comment on ticket #934

    I dug out my notes on that matter from a while ago. This is what I arrived at for a possibly improved mcs51 ABI: * call clobbered registers: r0-r3, a, b, dptr, carry flag * call saved registers: r4-r7, psw bank select bits * function arguments and return value: r0-r3 instead of { dpl, dph, b, a, r4. r5, r6, r7 } values are stuffed into registers as much as possible, the remaining go onto the stack. can pass/return 4x uint8_t, 2x uint16_t or 1x uint32_t directly in registers pointers are converted...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    I know. But didn't want to do that for the bigger software. The asm diffs become difficult to check or so ;)

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    I had the carry flag working completely transparently, just like any other register. The reg set/use information should be already all there. The lines // If we don't know what it is, assume it might be used. // todo: allow a, and support it in removeDeadMove. // todo: allow dpl, dph, and support it in removeDeadMove. if (!(what[0] == 'r' && isdigit(what[1])) && !(what[0] == 'a' && what[0] == 'r' && isdigit(what[2]))) // Allow r? return (false); .. seem superfluous. scan4op will check it through...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Checked it again. Couldn't reproduce it. Probably my comparison setup was messed up. Your changes to pattern 302 don't seem to be causing this.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    I've checked it again. Adding restart to peephole pattern 302 will give do the following in my software (see attached diff screenshot) In this case it's not a wrong-code. I guess it's actually not pattern 302 but another pattern that kicks in due to the restart. I see also several other improvements (more other patterns kicking in, e.g. for jumps ... ) when restart is added.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    the 2 lines that you've added in [r14474] look buggy if (!strcmp (inst, "jnz") || !strcmp (inst, "jz")) aln->regsWritten = bitVectSetBit (aln->regsRead, A_IDX); should probably be: if (!strcmp (inst, "jnz") || !strcmp (inst, "jz")) aln->regsRead = bitVectSetBit (aln->regsRead, A_IDX);

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    I've spotted it by comparing the asm code of my software with some previous output of SDCC. It appears in the middle of larger functions... here's one example: mov ar2,r4 <<< unused store to ar2 mov ar3,r5 mov a,r3 jnb acc.7,00149$ mov r3,#0xff sjmp 00150$ My guess would be broken operand matching ..... there were quite a couple of cases which I ran into before, I wouldn't exlude any remaining bugs. Sorry, haven't checked though, just a guess.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Not sure how, I just saw it by looking at the diffs of generated asm code of my software, compared to some previous SDCC's output. BTW this thing has to be extended to support also notUsed checks of the carry flag ('C' or 'c'). I wanted to start submitting those patterns around the carry flag first, but if that has to be in place.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    @spth in your commit in r14467 "Replace deadMove by notUsed in non-push/pop roles." you have collapsed peephole patterns 302,303 into a single pattern 302. I'm now seeing cases where it fails to eliminate e.g. this: mov ar2,r4 Not sure what's with that, because theoretically it should work. This pattern now can easily produce wrong-code bugs. E.g. it can eliminate mov sp,a ... which can be observed if we add a restart to the peephole pattern.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Alright then. Frankly, I don't remember the details anymore. So let's start adding new peephole patterns and fix it up as we move along.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Sure, as you like. However ... if (!strcmp (inst, "jnz") || !strcmp (inst, "jz")) aln->regsWritten = bitVectSetBit (aln->regsWritten, A_IDX); "jnz" and "jz" insns don't write A register, but read from it. So it should use regsRead instead of regsWritten. Isn't it? On Wed, 2023-11-29 at 20:34 +0000, Philipp Klaus Krause wrote: For patch #5, IMO something like the attached version would be good enough for now (and it is much shorter). Attachments: * jz.patch (512 Bytes; text/x-patch) [patches:#466]...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Sure, as you like. However ... if (!strcmp (inst, "jnz") || !strcmp (inst, "jz")) aln->regsWritten = bitVectSetBit (aln->regsWritten, A_IDX); jnz and jz insns don't write A register but read from it. So it should use regsRead instead of regsWritten. Isn't it?

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Can you point to he code that results in those two spaces after mov? IMO, that is a bug. Looking through mcs51 .c files, I couldn't find anything. However, there's Peephole 177.a which has two spaces after the mov. But I haven't checked if that pattern is responsible for it. Looking through rtrack.c we see a lot of cases with hardcoded expected asm line formatting, like: if (!strncmp (line, "setb\tb.", 7) || !strncmp (line, "clrb\tb.", 7)) { rtrack_data_unset (B_IDX); return false; } /* bitwise operations...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    @spth @spth could you please look at my original patch#13 again and suggest how to replace lines likealn->regsRead = bitVectUnion (aln->regsRead, ln->ic->rCallUsed);? Regarding this issue, it just occurred to me that it's probably going to be difficult to figure out the real regs that are used by a call/jump insn of banked calls. The function symbol is not part of the call/jump insn line, and is loaded in some way into other registers, possibly farther away from the actual call/jump insn. Essentially...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    From what I remember, I ran across a few spots where the code seems to expect a certain whitespace format, but don't remember where it was. That's why I'm mentioning this. That has to be weaned out. I believe all mcs51 peephole patterns will re-emit the code without the space. So it will be even more inconsistent. I guess we could say "sieht irgendwie aus wie larifari" ;) Another thing I remember ... peephole patterns need to have the actual TAB chars in the correct place, or the/some patterns will...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    @spth, @maartenbrock I've noticed that the latest svn trunk version changes white space output in the asm code. For example: 000019 159 _bleh_0: 000019 AD 82 [24] 160 mov r5, dpl << space after comma 00001B AE 83 [24] 161 mov r6, dph 00001D AF F0 [24] 162 mov r7, b 163 ; bug.c:27: return (uint32_t)x == 0; 00001F ED [12] 164 mov a,r5 000020 4E [12] 165 orl a,r6 000021 4F [12] 166 orl a,r7 000022 B4 01 00 [24] 167 cjne a,#0x01,00103$ 000025 168 00103$: 000025 92*00 [24] 169 mov b0,c << 2 spaces after...

  • Oleg Endo Oleg Endo modified a comment on ticket #466

    @spth I think the necessary mcs51 codegen changes are in trunk now. so we could go back to improving the mcs51 peephole optimizer. Great, thanks! IMO, a good starting point would be a minimal implementation of notUsed; then we could build on that in two ways: 1) improving the accuracy of notUsed, 2) introduce rules building on notUsed. This is where I was couple months ago.... but since there seems still no consensus on some of the open points, it might be the best way to wade through it together...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    @spth I think the necessary mcs51 codegen changes are in trunk now. so we could go back to improving the mcs51 peephole optimizer. Great, thanks! IMO, a good starting point would be a minimal implementation of notUsed; then we could build on that in two ways: 1) improving the accuracy of notUsed, 2) introduce rules building on notUsed. This is where I was couple months ago.... but since there seems still no consensus on some of the open points, it might be the best way to wade through it together...

  • Oleg Endo Oleg Endo posted a comment on ticket #3672

    If you already have that in place then why are you asking about placing "hello bogus nullptr check" at code address 0x0000? If the reset vector is always present at 0x0000 no matter which code bank is active then how can you have a text string there? Yes, you're right. The example was a bad one. Also note that a generic pointer with value 0x060000 points to __xdata, not to __code. Yes, I know that. Forgot to set the MSB when writing it. It should have been 0x860000. And even then 0x060000 would still...

  • Oleg Endo Oleg Endo posted a comment on ticket #3672

    Well yeah, that's obviously already in place. Otherwise it wouldn't work at all (almost). My case was actually having only part of the flash used for banked code while some other parts used for other data. I was trying to use void gptr to have a more efficient representation of the 23-bit pointer. E.g. something would return either 0x060000 or nullptr and it would always fail the != nullptr check. Using uint32_t is the better choice here.

  • Oleg Endo Oleg Endo posted a comment on ticket #3672

    Are you saying that you have a construction where you swap out the reset and interrupt vectors of the mcs51? Yes, in fact I'm working on/with such a system where the whole 64 KByte bank is switched (although I'm not sure how it matters here). The common bank thing has to be implemented in software. It's been working ok-ish, although SDCC could definitely use some improvements to make it less painful to use. Also note that this code is only generated for comparison against a constant null pointer....

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    What is the point of patch 9. ? The peepholes are meant to optimize generated asm for C code Yes, it's supposed to optimize things like mov a,_ACC ... in a safe way. I.e. if the symbol is known to map to one of the standard SFR addresses, then this can be optimized. Usually in C code we'd use some SFR name like ... sdcc/device/include/mcs51/8051.h:__sfr __at (0xE0) ACC ; ... but it could be named anything actually. So it's better to check for known standard address mappings instead. One thing is...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    but I now doubt if they should not lose those leading underscores altogether. These functions should not exist in the C namespace. According to the C standard... All identifiers that begin with an underscore and either an uppercase letter or another underscore are always reserved for any use. So either 3 underscores as Philipp suggested before, or no underscores at all would probably be most "correct". However, I think it's more important to be consistent and apply the same patterns for all runtime...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Why is there no regsWritten variant of this piece of code? If a write to b0 implies a write to bits then a write to any bit b0-b7 implies a write to any other bit. This will prevent it from eliminating dead stores of unrelated bits. That's what I recall. I know I introduced b0 - b7 in the past, but am now wondering if it could not be better handled by bits.0 - bits.7 directly. It would make this patch simpler I believe. I thought b0-b7 are some kind of allocatable "special 1-bit registers"?

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Thanks!

  • Oleg Endo Oleg Endo posted a comment on ticket #3672

    So a memory space of __code in bank 0 is considered a different memory space than __code in bank 1? I can hardly imagine. Intuitively I'd think of the whole __code address space as a single space, but it's just larger than 64 KByte. With code banking enabled, if we do something like this: bool check_ptr (const void* gptr) __banked { return gptr != 0; } void some_other_func (void) __banked { return check_ptr ("hello bogus nullptr check"); } If the string literal constant falls onto a 64K bank boundary...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    I've tried to rebase my local version (with a couple more accumulated peephole changes since the original submission) on to SVN r14412, that's about before @spth started the 'aop Arg' stuff. Something must have changed and it's now wrongly eliminating function return values. Not sure why it happens, would need to investigate. I guess it's wasted effort, because those parts will be done differently anyway after the 'aopArg' pieces are in place. However, I'd like to post two more extracted patches...

  • Oleg Endo Oleg Endo posted a comment on ticket #3286

    Issue still present as of SVN r14412. @spth I guess this will evaporate with the new register alloactor?

  • Oleg Endo Oleg Endo created ticket #893

    Transform for loops into do-while

  • Oleg Endo Oleg Endo created ticket #3672

    mcs51: gptr comparison ignores high byte

  • Oleg Endo Oleg Endo posted a comment on ticket #462

    I'd really appreciate some feedback where we are with this one.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    IMO, it would make sense for notUsed to not give an error if it gets something that it can't really handle. It could just err on the safe side, returning false for that case. That's what it already does with my patch series. It prints a warning-level thing returns "false" (so 'notUsed' will not eliminate the register) and goes on. The warnings are extremely useful when working on peephole patterns, because there are no compile time checks of the peephole pattern definitions. It's very easy to have...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    We're not quite there yet, but you're welcome!

  • Oleg Endo Oleg Endo modified a comment on ticket #466

    I've just tried that, but noticed that later on, I'm also adding dbglog_scan4op in patch 14 and it's used like that: dbglog_scan4op ( { port->peep.getRegsRead (*pl); port->peep.getRegsWritten (*pl); printf (" regs rd: "); bitVectPrint (stdout, port->peep.getRegsRead (*pl)); printf ("\n regs wr: "); bitVectPrint (stdout, port->peep.getRegsWritten (*pl)); printf ("\n regs call rd: "); if ((*pl)->ic) bitVectPrint (stdout, (*pl)->ic->rCallUsed); printf ("\n"); }); While the D macro is already there,...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    I've just tried that, but noticed that later on, I'm also adding dbglog_scan4op in patch 14 and it's used like that: dbglog_scan4op ( { port->peep.getRegsRead (*pl); port->peep.getRegsWritten (*pl); printf (" regs rd: "); bitVectPrint (stdout, port->peep.getRegsRead (*pl)); printf ("\n regs wr: "); bitVectPrint (stdout, port->peep.getRegsWritten (*pl)); printf ("\n regs call rd: "); if ((*pl)->ic) bitVectPrint (stdout, (*pl)->ic->rCallUsed); printf ("\n"); }); While the D macro is already there,...

  • Oleg Endo Oleg Endo posted a comment on ticket #3640

    Can anybody reproduce/confirm/refute this?

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Also, there probably still will be some banked programs with no optimizeable tail calls. I'd suggest to put the tailcall trampoline into a separate assembler file (so the linker will only link it into programs that actually use it). I know what you mean. However... - Right now the other two existing functions __sdcc_banked_calland__sdcc_banked_ret are all in the same file, so I've added it there - The file lib/mcs51/crtbank.asm is just an example implementation. As far as I can see, it's not even...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    But the main point is that, IMO we cannot accurately trace the ic in the peephole optimizer, we can only try to approximate there. And if that approximation ends up assigning the wrong ic to some line, this should not affect the correctness of the generated code. I guess it might be a good thing to ensure the correct synced association of a couple of key entities. For example in GCC we can easily get the basic block within which an RTL insn resides. This is very useful crucial information for low-level...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    OK. I'll replace the patch in my local tree with your version.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Is this worth it? Well, I think it is! First, we have normal function calls being converted into tailcalls. Why should not the same be done for banked calls? Tailcalls have some interesting properties. The additional code size of the trampoline argument is kinda moot. Non-banked programs will not include those trampolines and helper functions at all, so there is no code size regression. As for banked code programs, those systems having > 64 KByte code space usually don't have code size issues and...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    What I have there instead at the moment is this: replace restart { mov ar%1,%2 } by { ; Peephole 303 mov ar%1,%2 removed } if notVolatile(%2), deadMove(%1) replace restart { mov %1,%2 } by { ; Peephole 304 mov %1,%2 removed } if notVolatile(%2), same(%1 'a' 'dpl' 'dph' 'b' 'r0' 'r1' 'r2' 'r3' 'r4' 'r5' 'r6' 'r7'), notUsed(%1) // pattern 304 catches most dead loads of immediates. // however direct addresses of variables might not always pass the notVolatile // it seems and will get stuck around. //...

  • Oleg Endo Oleg Endo modified a comment on ticket #466

    Honestly, I don't understand what's wrong with having a bitvector attached to the call insn that records the registers it uses and sets. Why not record this exact information when the call insn is emitted (e.g. via aopArg & et al or whatever) and avoid having to wade through it over and over again and resort to some "approximation"? What's the benefit? amendment: Sorry I got this mixed up with patch #11, but it's sort of related. Perhaps we should throw patch #4 and patch#11 into the same discussion...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Honestly, I don't understand what's wrong with having a bitvector attached to the call insn that records the registers it uses and sets. Why not record this exact information when the call insn is emitted (e.g. via aopArg & et al or whatever) and avoid having to wade through it over and over again and resort to some "approximation"? What's the benefit?

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    And what about function calls where the label might not not be in the symbol table?

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Got it. I will then merge patch 8 and patch 9, OK?

  • Oleg Endo Oleg Endo modified a comment on ticket #466

    11: At first sight, I don't see why you want to attach this information to the ic. It seems redundant: The ic would be a CALL or PCALL; the type of the function called is known there. I've tried to re-use the exising 'rUsed' bit vector, but it somehow conflicted with the way it's used in other places. It somehow ended up having the wrong information in some cases (and would generate wrong code). One particular case I remember strongly is function 'mcs51/gen.c (getFreePtr)'. It would somehow wrongly...

  • Oleg Endo Oleg Endo modified a comment on ticket #466

    11: At first sight, I don't see why you want to attach this information to the ic. It seems redundant: The ic would be a CALL or PCALL; the type of the function called is known there. I've tried to re-use the exising 'rUsed' bit vector, but it somehow conflicted with the way it's used in other places. It somehow ended up having the wrong information in some cases (and would generate wrong code). One particular case I remember strongly is function 'mcs51/gen.c (getFreePtr)'. It would somehow wrongly...

  • Oleg Endo Oleg Endo modified a comment on ticket #466

    11: At first sight, I don't see why you want to attach this information to the ic. It seems redundant: The ic would be a CALL or PCALL; the type of the function called is known there. I've tried to re-use the exising 'rUsed' bit vector, but it somehow conflicted with the way it's used in other places. It somehow ended up having the wrong information in some cases (and would generate wrong code). One particular case I remember strongly is function 'mcs51/gen.c (getFreePtr)'. It would somehow wrongly...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    11: At first sight, I don't see why you want to attach this information to the ic. It seems redundant: The ic would be a CALL or PCALL; the type of the function called is known there. I've tried to re-use the exising 'rUsed' bit vector, but it somehow conflicted with the way it's used in other places. It somehow ended up having the wrong information in some cases (and would generate wrong code). One particular case I remember strongly is function 'mcs51/gen.c (getFreePtr)'. It would somehow wrongly...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    8: Looks ok to me, but I didn't apply it yet, since it is apparently needed for 9 only. That's right. 9: The idea is ok, but IMO the patch needs some minor changes: The MCS-51 SFR are a special way of doing I/O (separate I/O space that does not support indirect addressing). Not just some "special register". Some other architectures supported by SDCC have similar concepts in the hardware or the port, others don't. Honestly, I did not consider how other targets/ports might interpret the naming of this...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    5: This patch makes a lot of effort to handle the read of acc by jz/jnz. Why not just check for jz/jnz in asmLineNodeFromLineNode and then set the A_IDX bit? It becomes relevant later on. The patches in the set build on top of each other.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    4: I disagree with this patch. IMO it is a rather crude approach that in many cases will work worse than the current one: the patch basically just checks if there is a CALL, PCALL or RET around, and if yes assigns all lines with the first CALL/PCALL/RET encountered (if no it falls back to the current approach). I just clobbered it together somehow to avoid generating wrong-code. Not sure what you mean by "work worse" ... since the current approach doesn't work at all for call insns. If you have an...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    1: While I agree with the goal of patch 1, I didn't like the macro use, so I rewrote it. Thanks for looking at it! Since you've squashed the patches into a single commit, it's now difficult for me to see which parts you have modified. It'd be easier if you applied the patches as individual SVN commits. Then I can also update/rewrite the patches in my own local tree with your version.

  • Oleg Endo Oleg Endo modified a comment on ticket #883

  • Oleg Endo Oleg Endo posted a comment on ticket #883

    Although that works as expected, since bits is handled as call-preserved and the dead-move scanning stops at function calls, it won't be able to optimize those temporary stores away if a function call is found after the dead-store. It would need to be able to look through calls to do a better job.

  • Oleg Endo Oleg Endo posted a comment on ticket #79

    assuming the default is the new ABI, all declarations of the old-ABI function will need to have __sdcccall(0) added (also all function pointers to old-ABI functions) Sounds workable!

  • Oleg Endo Oleg Endo posted a comment on ticket #79

    I do have a situation where newer ABI code would need to call the older ABI code (for some compatibility reason). It'd be good if the backend was a bit more flexible regarding handling of different calling conventions and could eventually support both (new + old). It kinda does already ... with e.g. callee_saves and also special libcalls. The fact that register based calling conventions are generally are a good thing (given enough registers) is actually nothing new, I think. For Z80, we found that...

  • Oleg Endo Oleg Endo posted a comment on ticket #79

    When looking at some of my larger firmware running on mcs51, yes, this is a real bummer. mcs51 is not good at accessing variables on the stack. And that tends to accumulate as code bloat. In some cases I even resort to passing gptr as void* or uint32_t just to avoid ferrying the args through the stack. The stack calling also prevents tail call optimizations, or just simple functions that only forward arguments to another function (common for wrappers/ adapter functions). Although DPTR is convenient...

  • Oleg Endo Oleg Endo modified a comment on ticket #253

    Similar to RFE #979838, but #979838 is for mcs51, while this one is for Z80. @spth I can't find said RFE #979838 for mcs51, where is that? Highly interested in this one. Looking through the code generated for my mcs51 software, a different calling convention could improve things quite a bit.

  • Oleg Endo Oleg Endo posted a comment on ticket #253

    Similar to RFE #979838, but #979838 is for mcs51, while this one is for Z80. I can't find said RFE #979838 for mcs51, where is that? HIghly interested in it. Looking through the code generated for my mcs51 software, a different calling convention could improve things quite a bit.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Thanks Philipp. I think the things I'm touching there could be of interest for you, since you wanted to port the mcs51 backend to your new register allocator. Since my original submission of the patches, I've accumulated a couple of follow up patches that build on top. Mainly it's adding more peephole patterns and adjusting some of the existing ones. But I don't want to make the initial change set even bigger than it already is.

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    Any feedback on this would be highly appreciated.

  • Oleg Endo Oleg Endo posted a comment on ticket #462

    @maartenbrock any updates on this one?

  • Oleg Endo Oleg Endo posted a comment on ticket #3644

    Thanks for the clarification!

  • Oleg Endo Oleg Endo created ticket #3644

    mcs51: implicit promotion of 8-bit -> 16-bit in variadic args not working

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    updated patch 0001 -- fixed non-working case 'ab'

  • Oleg Endo Oleg Endo posted a comment on ticket #892

    I think the cprop code in that spot is already trying to do something with known bits or such. But I didn't understand it at first glance. Perhaps @spth knows this part best.

  • Oleg Endo Oleg Endo modified a comment on ticket #892

    Thanks. What I've been doing in my software for some parts which can access the external 24 MByte address space directly: #pragma codeseg BANK3 // works #pragma constseg BANK3 // is effectively ignored #define constseg_ptr(x) ((const void*)((const char*)0 + ((uintptr_t)x | (BANK3 & 0xFF0000UL)))) ... and then use that macro to make a user-defined 24-bit pointer with the correct high byte. This is for e.g. transferring data included in the C source file by DMA or other flash reading routines, not...

  • Oleg Endo Oleg Endo posted a comment on ticket #892

    Thanks. What I've been doing in my software for some parts which can access the external 24 MByte address space directly: #pragma codeseg BANK3 // works #pragma constseg BANK3 // is effectively ignored #define constseg_ptr(x) ((const void*)((const char*)0 + ((uintptr_t)x | (BANK3 & 0xFF0000UL)))) ... and then use that macro to make a user-defined 24-bit pointer with the correct high byte. This is for e.g. transferring data by DMA or flash reading routines, not directly by CPU code. However, compiling...

  • Oleg Endo Oleg Endo posted a comment on ticket #466

    related: https://sourceforge.net/p/sdcc/feature-requests/884/ https://sourceforge.net/p/sdcc/feature-requests/883/

  • Oleg Endo Oleg Endo created ticket #466

    mcs51: improve peephole optimizations

  • Oleg Endo Oleg Endo posted a comment on ticket #3640

    These are the code size stats (in bytes) for one of my apps. Before: SVN r12575 + my mcs51 peephole work After: SVN r14340 + my mcs51 peephole work CSEG 13310 -> 13368 +58 BANK1 38195 -> 41658 +3463 BANK2 32634 -> 32593 -41 BANK3 44236 -> 44263 +27 BANK4 9922 -> 11536 +1614 BANK5 10309 -> 11781 +1472

  • Oleg Endo Oleg Endo posted a comment on ticket #2221

    Seems to be not a problem anymore on current SVN r14334

  • Oleg Endo Oleg Endo posted a comment on ticket #3554

    This looks like [#3607] to me, which has been fixed.

  • Oleg Endo Oleg Endo posted a comment on discussion Open Discussion

    This looks like [#3607] to me, which has been fixed.

  • Oleg Endo Oleg Endo posted a comment on ticket #3622

    In the large memory model the memory is assumed to be <=64kB. Especially the const and xdata is supposed to be in that range. Functions can be marked as banked and so can function pointers. But if you want your variables to live in banked memory as well you must use the huge memory model. @maartenbrock is there a working example somewhere that demonstrates how to use that? From what I have tried and see, there is no way to get the correct full 23-bit pointer to a const data. Please reconsider to...

  • Oleg Endo Oleg Endo created ticket #3640

    mcs51 regression: missed constant evaluations

  • Oleg Endo Oleg Endo posted a comment on ticket #462

    @maartenbrock While rebasing the patch onto the current SVN trunk, I've noticed that there is a problem with the --all-callee-saves option. The following test code: void bleh_10 (uint32_t x); void bleh_11 (uint32_t x) { bleh_10 (0x11223344ul); bleh_10 (x); } Compiled with sdcc -c --model-large --std-sdcc2x -mmcs51 --stack-auto --nogcse --fverbose-asm --stack-probe --all-callee-saves 000000 120 _bleh_11_stack_probe_err: 000000 12r00r00 [24] 121 lcall __stack_probe_err 000003 122 _bleh_11: 000007 123...

  • Oleg Endo Oleg Endo posted a comment on ticket #462

    @maartenbrock Sorry, got worked up with another thing. I've revised the patch, to address your point: You can at least take into account that you pushed ACC here. If more pushes were done in between, it would be best if they are taken into account as well. Who knows, it might have overflown back into valid stack space! The patch is a bit shorter now.

  • Oleg Endo Oleg Endo posted a comment on ticket #883

    After extending the peephole code to support the virtual bit registers, I can confirm that simple peephole patterns like // dead store of virtual bit registers replace restart { mov %1,c } by { ; Peephole 305 mov %1,c removed } if same(%1 'b0' 'b1' 'b2' 'b3' 'b4' 'b5' 'b6' 'b7'), deadMove(%1) will remove the temporary virtual bit register stores within the function, unless they are otherwise used (e.g. by more complex bit arithmetic). However, after the peephole optimizations, the function will still...

  • Oleg Endo Oleg Endo posted a comment on ticket #884

    Looking at a smaller snippet of a larger function: .... mov dptr,#(_var + 0x0014) movx a,@dptr mov r6,a cjne a,#0x02,00276$ sjmp 00124$ 00276$: mov dptr,#(_var + 0x0014) clr a movx @dptr,a sjmp 00124$ 00109$: mov dptr,#(_var + 0x000d) movx a,@dptr mov r6,a <<<<<<<<<<<<<< dead move cjne a,#0x01,00277$ sjmp 00124$ 00277$: mov dptr,#(_var + 0x000d) clr a movx @dptr,a 00124$: mov dpl,r7 ret When the above marked mov r6,a is checked whether it's a dead move or not, the doTermScan / scan4op code visits...

  • Oleg Endo Oleg Endo posted a comment on ticket #3622

    When compiling the example as huge-model it yields s the same result. It doesn't pass the correct __code pointer to the print function. The top byte is always stuck at 0x80. Also as per SDCC manual: The huge model compiles all functions as banked4.1.3 and is otherwise equal to large for now. All other models compile the functions without bankswitching by default. Please consider to re-open this issue.

  • Oleg Endo Oleg Endo posted a comment on ticket #883

    @maartenbrock I've been wading through the peephole code of mcs51. It's actually already tracking register read/write for each insn in the function. The code in mcs51/peep.c scan4op is using that. However, although the array inreg_info regs8051[] = { in mcs51/ralloc contains 26 entries, mcs51_nRegs is set to only 16 (or sometimes 8). Not sure why that is, but it prevents the dead-store code from checking those registers above index 16. The virtual bit registers are recorded there, too, as well as...

  • Oleg Endo Oleg Endo posted a comment on ticket #462

    @maartenbrock replying in new message (it's getting too cluttered here) Specifying noreturn should prevent the user from creating a function that does return since it has no proper return address. It should probably also be added to the intrinsic functions prototypes in SDCC so it will always protest when noreturn is left out. OK, makes sense to me. am also not sure about checking stack end != 0xFF. It can happen, but is unlikely. But at least by using l_SSEG it is possible. There is no __end__stack...

  • Oleg Endo Oleg Endo created ticket #884

    mcs51: deadMove check in peephole not always working

  • Oleg Endo Oleg Endo posted a comment on ticket #883

    Do you also consider r0-r7 to be allocating global __data memory? I do not and to me bits is no different. Remember that bits is in overlayed memory. Well ... if you put it that way ... ;) My sole point here being, in this case it's triggering the allocation of 1 byte in the bits space (if I'm not mistaken), even though it's a dead store. Unfortunately, no it isn't. Remember that mcs-51 still uses the old register allocator and is single pass, unlike most other ports. It has no idea what following...

  • Oleg Endo Oleg Endo posted a comment on ticket #883

    b0 is not a global bit variable, it is a virtual bit register. There are 8 virtual bit registers (b0-b7) in bits that can be pushed to the stack. Yes, but effectively, those end up being allocated as "global bit variables". I've seen that it's caused by virtual bit register usage. The B register cannot be used unless the compiler would know that B is free. (Even PSW would be a better suggestion.) B is e.g. used for passing 24/32 bit parameters/results, multiplication and most importantly in this...

  • Oleg Endo Oleg Endo posted a comment on ticket #3286

    A slightly reduced case: extern void func_1 (void) __banked; __xdata uint32_t g_x = 0; void func_0_ng (uint64_t x) __banked { if (g_x == 0) if ((uint32_t)x - g_x > 10) func_1 (); }

  • Oleg Endo Oleg Endo created ticket #3623

    mcs51: unnecessary stores of intermediate results to global bit variables

  • Oleg Endo Oleg Endo posted a comment on ticket #3622

    It seems there is no way to edit the original description. There is a mistake: ; bug.c:15: const char* c = "test moon -- please ignore\n"; ; bug.c:16: print_something (c); ; genCall mov dpl,#___str_0 mov dph,#(___str_0 >> 8) mov b,#0x80 <<<< high byte of pointer to const data is wrong should be #((___str_0 >> 8) + 0x80) should be ; bug.c:15: const char* c = "test moon -- please ignore\n"; ; bug.c:16: print_something (c); ; genCall mov dpl,#___str_0 mov dph,#(___str_0 >> 8) mov b,#0x80 <<<< high byte...

  • Oleg Endo Oleg Endo posted a comment on ticket #462

    Usually, in such cases, we go with the solution that yields lower code size by default, but use the faster one if --opt-code-size is specified by the user. -opt-code-speed vs. --opt-code-sizetypo? Like I wrote in my other reply, I would suggest to add an alternative space optimizing implementation of stack probing afterwards. I'm not going to propose to enable stack probing by default, so there will be no code size regression/increase at all, unless stack probing is enabled by the user specifica...

  • Oleg Endo Oleg Endo posted a comment on ticket #462

    Instead of __start__stack you can also use s_SSEG (they are equal). It is accompanied by l_SSEG which holds the length of the stack segment. And since the linker doesn't support adding two relocatable symbols either, I consider adding e_SSEG. I'm not sure how to use the area/section/segment names? What would that improve? Can you give an example? OK, I've tried it out, I can replace __start__stack with s_SSEG. But as such, it doesn't change much for the current implementation. Support for stack end...

  • Oleg Endo Oleg Endo posted a comment on ticket #462

    The epilogue is what happens at the end (of the function). The prologue is what happens at the beginning. Your code is in the prologue. I know. Just a typo. Fixed. The _stack_probe_err can be marked _Noreturn or [[ noreturn ]] to make sure it will not return. Yes, that's what I have in my app code. I wasn't sure if that's OK for the device libs. I've copied the code from assert.c which doesn't specify noreturn. Since the call site always uses 'lcall' I think specifying noreturn will do effectively...

  • Oleg Endo Oleg Endo posted a comment on ticket #3622

    Another case: extern const char* bleh_ptr; void bleh (void) { bleh_ptr = "bleh\n"; } compiles to: _bleh: ar7 = 0x07 ar6 = 0x06 ar5 = 0x05 ar4 = 0x04 ar3 = 0x03 ar2 = 0x02 ar1 = 0x01 ar0 = 0x00 ; bug.c:40: bleh_ptr = "bleh\n"; ; genCast mov dptr,#_bleh_ptr mov a,#___str_0 movx @dptr,a mov a,#(___str_0 >> 8) inc dptr movx @dptr,a mov a,#0x80 <<<< wrong high byte inc dptr movx @dptr,a 00101$: ; bug.c:41: } ret .area BANK1 (CODE) .area BANK1 (CODE) .area BANK1 (CODE) ___str_0: .ascii "bleh" .db 0x0a...

  • Oleg Endo Oleg Endo created ticket #3622

    mcs51: codeseg/constseg lost high address byte

  • Oleg Endo Oleg Endo posted a comment on ticket #3563

    Could it be related to the recent https://sourceforge.net/p/sdcc/bugs/3607/ ?

  • Oleg Endo Oleg Endo modified a comment on ticket #462

    @maartenbrock thanks for checking it! There is no 'push a' and 'push acc' is 2 bytes It assembles and links just fine here it seems. The assembler seems to take care of it. But I in other places SDCC emits 'acc', so perhaps it's good to be consistent. I know that that insn is 2 bytes in size. stack_probe_restore_acc_code_line is still unused Oops. I forgot to remove this leftover piece. Internal stack can also be probed when xstack is specified as it is still used Oh, OK. I've never used xstack myself,...

  • Oleg Endo Oleg Endo modified a comment on ticket #462

    Let's count all bytes. OK, perhaps easier to follow. The non-probing case for i>9 is not right. Good point. I've refined it. Code can be shortened a bit. And similarly in the stack probing case, when A is pushed because there is no free reg, the offset changes. Should be fixed. Come to think of it, all R0-R7 registers should always be free at function entry, unless it's switching register bank. All my software never hits those cases when accIsFree == false || freereg == NULL. I also thought it's...

  • Oleg Endo Oleg Endo posted a comment on ticket #462

    Let's count all bytes. OK, perhaps easier to follow. The non-probing case for i>9 is not right. Good point. I've refined it. Code can be shortened a bit. And similarly in the stack probing case, when A is pushed because there is no free reg, the offset changes. Should be fixed. Come to think of it, all R0-R7 registers should always be free at function entry, unless it's switching register bank. All my software never hits those cases when accIsFree == false || freereg == NULL. I also thought it's...

1 >