Menu

#2045 mcs51 long switch/case blocks cause jmp into oblivion

closed-fixed
5
2013-05-25
2012-06-14
No

I have the following snippet of code at the end of several switch/case blocks:
#ifdef SDCC
/*
* This is here for what appears to be a bug in SDCC.
* I suspect a problem with the jump table optimisation for switch
* statements, the disparate case number will result in less optimised
* code.
* Without this workaround multiple calls to the function lock up the
* controller. The first call always works. As far as I've been able to
* determine the end of the function is reached, but the line after
* the second function call never is.
*/
case 0xFF:
RESET_RMAP();
break;
#endif

It reads a little weird.

My details:
> sdcc -v
SDCC : mcs51/gbz80/z80/z180/r2k/ds390/pic16/pic14/TININative/ds400/hc08 3.1.0 #7066 (Feb 12 2012) (FreeBSD)

Compile:
sdcc -mmcs51 --xram-loc 0xF000 --xram-size 3072 -I inc/ -o bin.sdcc/hsk_flash/hsk_flash.rel -c src/hsk_flash/hsk_flash.c
[...]

Link:
sdcc -mmcs51 --xram-loc 0xF000 --xram-size 3072 -I inc/ -o bin.sdcc/main.hex bin.sdcc/hsk_timers/hsk_timer01.rel bin.sdcc/hsk_icm7228/hsk_icm7228.rel bin.sdcc/hsk_pwc/hsk_pwc.rel bin.sdcc/hsk_pwm/hsk_pwm.rel bin.sdcc/hsk_adc/hsk_adc.rel bin.sdcc/hsk_can/hsk_can.rel bin.sdcc/hsk_boot/hsk_boot.rel bin.sdcc/main.rel bin.sdcc/hsk_isr/hsk_isr.rel bin.sdcc/hsk_flash/hsk_flash.rel

The code runs on the Infineon XC878-16FF.

Discussion

  • Kamikaze Dominic Fandrey

    What I forgot, I cannot debug this, because the debugger looses contact to the µC when the issue occurs.

     
  • Maarten Brock

    Maarten Brock - 2012-06-22

    How are we to check this bug if you only provide small snippets of code?
    Can you please attach an example file that reproduces the bug you mentioned?

     
  • Kamikaze Dominic Fandrey

    Hmm, that's strange, I never got an e-mail that there was a comment.

    I'll wrap something up in a couple of days.

     
  • Kamikaze Dominic Fandrey

    Example code, returns right into oblivion after the second hsk_ex_port_open(call)

     
  • Erik Petrich

    Erik Petrich - 2012-09-29
    • milestone: --> non_bugs
    • assigned_to: nobody --> epetrich
     
  • Erik Petrich

    Erik Petrich - 2012-09-29

    I've carefully looked through this for many hours and as far as I can tell, SDCC's handling of the switch statement is correct for an 8051/2 compatible processor.

    I suspect the problem you are seeing is due to a slight incompatibility introduced with how the XC800 series processors implement their banked memory support (the XC878 only has a single bank implemented, but the documentation shows the banking registers are still there and presumably the associated hardware is as well). The processor has a secondary stack (Memory Extension Stack) and stack pointer register (MEXSP) that are updated by the LCALL, ACALL, RET, and RETI instructions. This allows the processor to seamlessly transition between functions spread over a 1M address space (again, I realize the XC878 only implements a small portion of this memory).

    In the switch statements that you've encountered problems, SDCC has pushed the address of the appropriate case onto the regular stack and then executed a RET instruction. The RET instruction forms an address from the top two bytes popped from the regular stack and then jumps to this address. However, the XC800 series additionally pops a byte from the Memory Extension Stack and stores this to the MEX1 register, which potentially changes the currently selected bank of memory. The problem is that this RET instruction is not matched with a previous ACALL or LCALL instruction that would have pushed the correct byte onto the Memory Extension Stack (and since this does not exist on an 8051/2, SDCC has no idea that this needs to be taken into account).

    This is consistent with your description of the function working correctly on the first invocation, but failing to return on the second. There has been a call from the start-up code to main(), from main() to hsk_ex_port_open(), and perhaps another function in between. So by the time the switch statement is processed, there are several bytes on the Memory Extension Stack. The RET instructions in the two switch statements each take a byte of this stack, as does the RET at the end of the function. Although the MEX1 register is being reloaded in each instance, the value is always the same (indicating bank 0 is selected) so the correct code continues to execute. However, after the first call to hsk_ex_port_open(), MEXSP is pointing closer to the start of the Memory Extension Stack than it should. During the second invocation, MEXSP underflows past the start of the Memory Extension Stack, and MEX1 is loaded with some indeterminate value while processing one of the RET instructions, thus selecting a bank of memory that does not exist.

    SDCC uses a similar code sequence to invoke a function via a function pointer and would probably result in the same sort of crash.

    I think you could probably make SDCC compatible with the XC800 series by using a few custom peephole rules. This one should keep the Memory Extension Stack from underflowing during a switch statement:
    replace {
    push acc
    ret
    } with {
    push acc
    lcall .+3
    dec sp
    dec sp
    ret
    }

    I think the dispatching via a function pointer could be handled similarly, but I've been staring at this too long today to attempt a peephole rule for that right now. Please try out the above peephole rule and confirm whether or not it fixed the problem you are seeing.

     
  • Erik Petrich

    Erik Petrich - 2012-09-29

    Oops, that peephole rule should be:
    replace {
    push acc
    ret
    } by {
    push acc
    lcall .+3
    dec sp
    dec sp
    ret
    }

     
  • Kamikaze Dominic Fandrey

    Will try this next week (I'm visiting my folks and didn't take any hardware with me). And thanks for the insightful reply!

     
  • Maarten Brock

    Maarten Brock - 2012-09-29

    Hmm, this is another good reason to want to use JMP @A+DPTR in crtcall.asm for all function pointers. But that requires ACC and DPTR to be free for the call and currently they're used for passing the first parameter. Thus crtcall.asm can only used for (void) functions.

    The jump table probably can always use JMP @A+DPTR but will give slightly larger code. The <112 case can use MOV DPL/DPH,A instead of PUSH ACC, but requires an extra CLR A. The <256 case requires PUSH ACC and an extra POP DPL on top of that. I have no objections against such a modification.

    Maarten

     
  • Kamikaze Dominic Fandrey

    @epetrich
    I have removed my workaround from all files and can confirm that your peephole rule works. Without it the system freezes, with it works as intended.

    Thank you very much!

    As for making SDCC XC800 compatible, that's the only issue I've encountered so far. Will you add an -mxc800 flag for the xc800 specific peephole rules? It seems a little harsh to inflict the burden of additional cycles on unaffected platforms.

     
  • Kamikaze Dominic Fandrey

    Also, I'm making excessive use of function pointers in ISRs. There is no issue with them. Or if there is, it's masked by the reti.

     
  • Erik Petrich

    Erik Petrich - 2012-10-04

    Calling via function pointer should work fine for functions that have no parameters. In this case, there is a lcall to a library function _sdcc_call_dptr() which then does a jmp to the actual function entry point; this lcall will keep the Memory Extension Stack balanced with the ret at the end of the function.

    The problematic case that's not covered by my peephole rule is using a function pointer with functions that have parameters. As Maarten points out, the above strategy can not be used in this case due to conflicting register needs. So the compiler instead uses the same method that was used to jump to the cases in the larger switch statements and is failing for this microcontroller.

    It looks like currently the function pointers in hsk use no parameters, so the one peephole rule I've given you is sufficient for now. However, I'd like to make sure we also have a working solution for function pointers with parameters so that all the cases are covered. So I will leave this report open a little longer.

    Thanks for confirming that the peephole rule fixed the problem; this was certainly obscure.

     
  • Kamikaze Dominic Fandrey

    quote: epetrich
    ...the XC878 only has a single bank implemented, but the
    documentation shows the banking registers are still there and presumably
    the associated hardware is as well ...

    The banking actually has a purpose. E.g. on the 16-FF variant (64k flash) you can access the XRAM and bootloader from bank 2. So you can execute functions provided by the boot loader and execute code from XRAM. This is useful to bootstrap the µC for live flashing.

    There's probably a way to bank in external RAM/Flash, too. But I never looked into this. 64k of flash are plenty for my purposes (so far I'm only using up to 20k).

     
  • Erik Petrich

    Erik Petrich - 2012-11-17
    • labels: --> mcs51(8051) target
    • status: open --> closed-fixed
     
  • Erik Petrich

    Erik Petrich - 2012-11-17

    There's now (as of revision #8221) a new option for the mcs51 backend "--no-ret-without-call" that should ensure that everywhere that SDCC uses ret to do a computed jump it will also use a call instruction to keep the Memory Extension Stack from underflowing. With this option you should no longer need the custom peephole rule (it's been integrated into the code generator) and it takes care of both the switch/case problem you were seeing as well as the call via function pointer to function with parameters problem that you hadn't encountered yet (and would not have been fixed by the peephole rule)

     
  • Maarten Brock

    Maarten Brock - 2012-11-17

    I think the function pointer case is not fixed yet because there are two RET's involved.

    mov a,#LSB(return_label)
    push acc
    mov a,#MSB(return_label)
    push acc
    push LSB(destination)
    push MSB(destination)
    lcall .+3
    dec sp
    dec sp
    ret
    ---> destination: ret
    return_label:

    I suggest to replace this code with the following which is even smaller!

    lcall make_call
    ljmp return_label
    make_call:
    lcall .+3
    dec sp
    dec sp
    push LSB(destination)
    push MSB(destination)
    ret
    ---> destination: ret
    return_label:

    The labels should all be temporaries of course, but have names for clarity here.
    The peephole optimizer will normally replace LJMP with JR.
    Also by pushing destination after the dummy call there is less stack space needed.

    And even though a little bit bigger I think I prefer to always use JMP @A+DPTR for switch statements.

     
  • Maarten Brock

    Maarten Brock - 2012-11-17
    • status: closed-fixed --> open-fixed
     
  • Maarten Brock

    Maarten Brock - 2012-11-17

    I just had another idea about the parameter passing for function pointers. Currently functions being called with function pointers are required to be reentrant when they have multiple parameters so these are passed on the stack. Only functions with a single parameter can be non-reentrant and still be used for a function pointer because the first parameter is passed in DPTR,B,A.

    Now if we change the calling convention for reentrant functions to pass ALL parameters on stack, we can always use LCALL __sdcc_call_dptr for function pointers. Along with my proposed switch implmentation this means this new command line option can dissapear again.

     
  • Maarten Brock

    Maarten Brock - 2012-11-18

    Should be fixed now in SDCC 3.2.1 #8223.

    I've changed the switch to always use JMP @A+DPTR.
    I've also replaced the pushes for the return address with an LCALL as explained below. This required a bit more work as the peephole optimer didn't recognize it.

    I hope I haven't offended Erik for taking this over and I also hope the users won't mind the slight code increase.

     
  • Maarten Brock

    Maarten Brock - 2012-11-18
    • status: open-fixed --> closed-fixed
     
  • Erik Petrich

    Erik Petrich - 2012-11-18

    It's fine with me; I tried to implement this last night, but ran into problems with the lcall confusing the peephole dead code elimination routines and was pondering how to proceed. So I am glad you found a solution.

    I think it would be a bad idea to change the calling convention for reentrant functions to pass all parameters on the stack just to eliminate the --no-ret-without-call option. Besides breaking backwards compatibility, this would also increase code size and run time in many cases.

     
  • Kamikaze Dominic Fandrey

    Just out of curiosity, why is only the first parameter passed in registers, why not simple the first four bytes of whatever parameters are there?

     

Log in to post a comment.