Compiled Code Different for No Reason

wdolson1
2011-04-04
2013-03-12
  • wdolson1
    2011-04-04

    I am using SDCC to compile code for the Cypress FX2 (8051 code set).  I have been using it for several months without problems.  The other day I made a couple of small changes, recompiled and the firmware no longer worked properly.  I reverted all the code and the makefile to a backup known to work, but when I compile that, the resulting .hex file differs from the known working one.

    I am compiling under Windows XP with the following environment variables set for SDCC:

    SDCCLIB_AS as80511 -los
    SDCCLIB_CC sdcc -c
    Path = E:\Program Files\SDCC\bin

    The two SDCCLIB environment variables above are called out in the make file.  The original project was a port from the Keil compiler I found on someone's blog about how to use the SDCC compiler for the FX2.  All of Cypress's examples are for the Keil compiler.

    I made no changes to any configuration files since the last good build.  I have rolled back all source and all make related files to those from the known working backup.  I reinstalled the latest released version of SDCC (3.0).  The differences between the known working .hex and the one currently being built are the same whether I build the most recent code or the backup, and both before and after the SDCC reinstall.

    Three lines, 301 through 303, are different, and only a handful of bytes on those lines.

    Has anybody ever seen anything like this and how do I fix it?

    Thanks,
    Bill

     
  • Do you have a copy of the last good hex file? Have you compared (or better yet - do a diff) between the one that works and the one that doesn't?
    If you have copies of the last good asm and/or object files, you can do the same - do a diff and describe what the differences are.
    If you revert back to a previous version of SDCC does everything work as expected?

    One other thought, split up the compiling, assembly, and linking stages…
    SDCC <your flags> -S myCode.c
    gpasm -o myCode.o -c myCode.asm
    gplink -o myCode.hex -m myCode.o

    how does that compare with doing it all in one and allowing SDCC to determine libraries and such…
    sdcc --verbose <your flags> myCode.c

    I might not be able to help, but any info you can weed out the better.

     
  • wdolson1
    2011-04-04

    Yes, I have a complete backup of the last known working version.  I have done a diff and there are three lines different between the two versions of the hex files.  The differences are only on lines 301, 302, and 303.
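Since the diff is confined to three lines of the hex file, it can help to decode those Intel HEX records into load addresses and then look the addresses up in the build's map or listing files. What follows is only a minimal sketch, assuming the standard record layout `:LLAAAATT<data>CC` (length, 16-bit address, type, data, checksum). The names `hex_record`, `parse_hex_record`, and `first_diff_address` are made up for this illustration and are not part of SDCC.

```c
#include <assert.h>
#include <stddef.h>

/* One decoded Intel HEX record (hypothetical helper type). */
struct hex_record {
    unsigned length;          /* number of data bytes */
    unsigned address;         /* 16-bit load address of the first data byte */
    unsigned type;            /* 0x00 = data, 0x01 = end of file */
    unsigned char data[255];  /* a record carries at most 255 data bytes */
};

/* Value of one hex digit, or -1 if the character is not a hex digit. */
static int hex_digit(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Value of two hex digits, or -1. Stops at the first bad character,
   so it never reads past the end of a truncated line. */
static int hex_byte(const char *s)
{
    int hi = hex_digit(s[0]);
    int lo;
    if (hi < 0) return -1;
    lo = hex_digit(s[1]);
    if (lo < 0) return -1;
    return hi * 16 + lo;
}

/* Parse one record. Returns 0 on success, -1 on malformed input
   or a checksum mismatch. */
int parse_hex_record(const char *line, struct hex_record *rec)
{
    unsigned sum = 0, i;
    int header[4], v;

    assert(line != NULL && rec != NULL);
    if (line[0] != ':') return -1;

    /* Header: length, address high, address low, record type. */
    for (i = 0; i < 4; i++) {
        header[i] = hex_byte(line + 1 + 2 * i);
        if (header[i] < 0) return -1;
        sum += (unsigned)header[i];
    }
    rec->length  = (unsigned)header[0];
    rec->address = ((unsigned)header[1] << 8) | (unsigned)header[2];
    rec->type    = (unsigned)header[3];

    for (i = 0; i < rec->length; i++) {
        v = hex_byte(line + 9 + 2 * i);
        if (v < 0) return -1;
        rec->data[i] = (unsigned char)v;
        sum += (unsigned)v;
    }

    /* All bytes including the checksum must sum to 0 modulo 256. */
    v = hex_byte(line + 9 + 2 * rec->length);
    if (v < 0) return -1;
    sum += (unsigned)v;
    return (sum & 0xFFu) == 0 ? 0 : -1;
}

/* Address of the first differing data byte, -1 if the data match,
   -2 if the two records do not cover the same address range. */
long first_diff_address(const struct hex_record *a, const struct hex_record *b)
{
    unsigned i;
    if (a->address != b->address || a->length != b->length) return -2;
    for (i = 0; i < a->length; i++)
        if (a->data[i] != b->data[i]) return (long)(a->address + i);
    return -1;
}
```

Feeding the matching lines from the good and the bad hex file through `parse_hex_record` and then `first_diff_address` gives a code address that can be compared against the listing to see which function the changed bytes belong to.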

    The project has a couple of assembly files and two C files.  The Build directory has differences for the C files only.  For one file the only difference is the date of compile in the ASM and LST files created.  The other file has some differences in the registers used for some lines of code.

    I just found the problem, but I don't know why it was happening.  The source (c files) between the backup and the development version was identical, but when I looked at the ASM file, it had one line different from the source file.  It was an area of the code where I was having some problems. 

    This brings up another issue that I've been basically working around for a couple of weeks.

    Having compared the ASM files, the primary difference was in which register was used.  The piece of code reads from an I/O port and does a short delay via a for loop that does nothing.  When I had the read before the for loop, the firmware crashes.  When it reads the register in the for loop, the compiler assigns r2, r3, and r4 differently than when the read is before and it works. 

    It doesn't look like the ASM code saves any of the registers when it does a function call, though the state of the r registers does not appear to be critical.  The issue appears in a function called ReadByte.  The calling code becomes:

    ; MT.c:274: ret = ReadByte(SETUPDAT);
    mov dptr,#(_SETUPDAT + 0x0005)
    movx a,@dptr
    mov dpl,a
    lcall _ReadByte
    mov r2,dpl

    The critical section of ReadByte looks like this on the working version:
    ; MT.c:405: for(i = 0; i < 4; i++)  //Delay: 1 tick here is approximately 1 us, this will be slightly more than 4 us
    mov r2,#0x04
    mov r3,#0x00
    00103$:
    ; MT.c:406: ret = IOB;
    mov r4,_IOB
    dec r2
    cjne r2,#0xff,00109$
    dec r3
    00109$:
    ; MT.c:405: for(i = 0; i < 4; i++)  //Delay: 1 tick here is approximately 1 us, this will be slightly more than 3 us
    mov a,r2
    orl a,r3

    On the non-working version it looks like this:
    ; MT.c:405: ret = IOB;
    mov r2,_IOB
    ; MT.c:406: for(i = 0; i < 4; i++);  //Delay: 1 tick here is approximately 1 us, this will be slightly more than 4 us
    mov r3,#0x04
    mov r4,#0x00
    00103$:
    dec r3
    cjne r3,#0xff,00109$
    dec r4
    00109$:
    mov a,r3
    orl a,r4

    The problem appears to be with the registers used rather than the extra delay caused from reading IOB multiple times.  The loop originally was i<3, increasing it to 4 made no difference.  The problem appears to be happening at this point in this function right now, but I have had mysterious crashes when other parts of the code were changed.

    Could this be a compiler bug?  I was assuming there was something I did, but looking at the assembly, it looks like it's much more dependent on the way the r registers are used by the compiler. 

    Bill

     
  • wdolson1
    2011-04-04

    Just an update.  I took the ASM file created by the code that didn't work and hand edited it to assign the registers the same way as the version that worked.  The C code was the same as the version that did not work; the only change in the assembly is which register does what.  That version works.

    There is something weird going on with which register does what.

    Bill

     
  • Maarten Brock
    2011-04-04

    Bill,

    Using loops for delays in a high level language is always dangerous. The loop does "nothing" and could be completely optimized out by the compiler. Furthermore I'm pretty sure that it is the value from IOB that crashes your application as a result of being read too early. How are IOB and i declared?

    Maarten
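To illustrate the point about empty loops: a loop whose counter is a plain `int` has no observable effect, so an optimizer is allowed to drop it. A minimal sketch of one common workaround is to make the counter `volatile`, which obliges the compiler to perform every read and write of it. The function name `delay_ticks` is made up for this example, and the time per iteration still depends on the compiler version and settings, so this does not make a loop a reliable timing source.

```c
/* Busy-wait for `ticks` loop iterations and return the final count.
   The volatile qualifier keeps the compiler from removing the loop,
   but it does not fix how long one iteration takes. */
unsigned char delay_ticks(unsigned char ticks)
{
    volatile unsigned char i;

    for (i = 0; i < ticks; i++) {
        /* intentionally empty */
    }
    return i;
}
```

A call such as `delay_ticks(4)` runs the loop body four times on any conforming compiler, whereas the duration has to be re-measured whenever the toolchain changes.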

     
  • wdolson1
    2011-04-05

    i is just an int.  The Cypress FX2 has 40 I/O lines organized into five 8-bit registers labeled A-E.  IOB is the B port register.  Its declaration is
    __sfr __at 0x90 IOB;

    The problem is not IOB being read too early.  I hand edited the ASM file created from the .c and changed which r register did what in that block of code.  IOB was read in the same place in the code as it was in the version that didn't work.  The only difference was that r4 was used to read IOB instead of r2 and the loop was done with r2 and r3 instead of r3 and r4.   With just those changes and nothing else, the firmware ran.

    There is something going on in what registers the compiler chooses.

    I don't usually use for loops for delays, but I analyzed the assembly and it worked out to almost exactly 1us.  The code converted from Cypress's examples uses NOPs for delays, but that didn't work as well for some reason I can't remember now.  The code is not getting optimized because I can see the delay from the loop when I watch activity with the logic analyzer.

    Bill

     
  • Maarten Brock
    2011-04-06

    In general using NOP is better as the compiler can see that your intention is to create a delay as it has no other purpose.

    Are you planning to always use the same compiler and the same version of it to compile your code? Or are you going to inspect the generated assembly after every compiler upgrade or change in settings? I stand by my comment not to use simple loops for creating delays.

    In the calling code what are the registers r3 and r4 used for? (Apparently r2==ret) Are they alive across the function call? And are they saved around the function call?

    Maarten
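A NOP-based delay of the kind suggested above could be wrapped in macros roughly as follows. This is a sketch under assumptions, not code from the thread: it assumes an SDCC release that accepts the `__asm__("nop")` form of inline assembly and defines `__SDCC` or `SDCC`, and the macro names are invented here. When built with any other compiler, the fallback branch only counts the NOPs so the macro expansion can be checked on a PC.

```c
#if defined(__SDCC) || defined(SDCC)
/* Target build (assumption: this SDCC version supports __asm__("...")). */
#define NOP() __asm__("nop")
#else
/* Host build: count NOPs instead of executing them. */
static volatile unsigned long nop_count;
#define NOP() (nop_count++)
#endif

/* Build longer delays out of fixed-size groups so the count is explicit. */
#define NOP4()  do { NOP(); NOP(); NOP(); NOP(); } while (0)
#define NOP16() do { NOP4(); NOP4(); NOP4(); NOP4(); } while (0)

/* Hypothetical 33-NOP delay: 16 + 16 + 1. */
#define DELAY_33_NOPS() do { NOP16(); NOP16(); NOP(); } while (0)
```

Because each NOP is an explicit instruction, the delay length no longer depends on how the compiler allocates registers or arranges a loop.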

     
  • wdolson1
    2011-04-07

    I changed it to use NOPs.  It appears to work, though it took 33 of them to do it (I put them in a macro).  The calling code doesn't appear to push or pop anything when calling sub-functions.  r4 is not used at all by the calling function and r2 is only used after ReadByte has returned.  r3 is used in several places, but everywhere I see it used after the call to ReadByte, it is first loaded with something, so I don't see where it might be getting corrupted.

    If r3 was getting corrupted by a function call, that would be a compiler bug.  The code is C code and the compiler should be smart enough to not corrupt the registers on function calls.
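For checking a delay like the 33 NOPs mentioned above against the logic analyzer, the arithmetic can be sketched as a small host-side helper. It assumes the FX2's 8051 core uses four clocks per instruction cycle and that a NOP takes one instruction cycle. The CPU clock is not stated in this thread, so it is a parameter here, and the function name `cycles_to_ns` is made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: FX2 8051 core, four clocks per instruction cycle. */
#define FX2_CLOCKS_PER_CYCLE 4u

/* Convert a count of instruction cycles to nanoseconds for a given
   CPU clock in Hz. Intended to run on a PC, not on the 8051. */
uint32_t cycles_to_ns(uint32_t cycles, uint32_t cpu_hz)
{
    uint64_t clocks;

    assert(cpu_hz != 0);
    clocks = (uint64_t)cycles * FX2_CLOCKS_PER_CYCLE;
    return (uint32_t)((clocks * 1000000000ull) / cpu_hz);
}
```

Under those assumptions, 33 one-cycle instructions take 2750 ns at 48 MHz, 5500 ns at 24 MHz, and 11000 ns at 12 MHz, not counting the call and return overhead around them.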