#203 _divslong, _divulong, etc is placed at wrong locations by SDCC

Milestone: None
Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2025-01-21
Created: 2025-01-19
Creator: bengalack
Private: No

As reported in thread: https://sourceforge.net/p/sdcc/discussion/1864/thread/6569af6748/ here is a super short sample that demonstrates how some _CODE lands after _DATA.

Example code app.c:

// Lots of uninitialized variables, yes I know
unsigned long l1, l2;
unsigned short n1, n2;
float f1, f2;
unsigned char u1, u2;

#pragma disable_warning 283
unsigned char main()
{
    return (unsigned char)( l1/u2/0.5*f1/n1*u1*f1/f2 );
}

crt file:

    .module crt0

    .globl  _main

    .area _HEADER (ABS)
    .org    0x4000

;----------------------------------------------------------
;   ROM Header
    .db     #0x41               ; ROM ID
    .db     #0x42               ; ROM ID
    .dw     #init               ; Program start
    .dw     #0x0000             ; BASIC's CALL instruction not expanded
    .dw     #0x0000             ; BASIC's IO DEVICE not expanded
    .dw     #0x0000             ; BASIC program
    .dw     #0x0000             ; Reserved
    .dw     #0x0000             ; Reserved
    .dw     #0x0000             ; Reserved

init::                          ; will enter in DI initially!
    jp      _main               

;----------------------------------------------------------
;   Segments order
;----------------------------------------------------------
    .area _HOME
    .area _CODE
    .area _GSINIT
    .area _GSFINAL
    .area _INITIALIZER
    .area _BSEG
    .area _DATA
    .area _INITIALIZED
    .area _HEAP

Built with:

sdasz80 -o -s -p -w objs\rom4000_crt.rel rom4000_crt.s
sdcc --code-loc 0x4013 --data-loc 0xC000 -mz80 --no-std-crt0 --opt-code-speed objs\rom4000_crt.rel app.c -o objs\app.ihx

This simple example for z80 shows these being located after --data-loc 0xC000:

Area                                    Addr        Size        Decimal Bytes (Attributes)
--------------------------------        ----        ----        ------- ----- ------------
_HOME                               0000C016    000007D7 =        2007. bytes (REL,CON)

      Value  Global                              Global Defined In Module
      -----  --------------------------------   ------------------------
     0000C016  ___fsmul                           _fsmul
     0000C415  ___ulong2fs                        _ulong2fs
     0000C4BF  __divulong                         _divulong
     0000C537  ___uint2fs                         _uint2fs
     0000C53E  ___fsdiv                           _fsdiv
     0000C5FE  ___uchar2fs                        _uchar2fs
     0000C606  ___fs2uchar                        _fs2uchar
     0000C618  ___fs2ulong                        _fs2ulong
     0000C6EF  ___fslt                            _fslt

But the real world example also showed others, like:

     0000C3E6  ___fssub                           _fssub
     0000C854  ___fsadd                           _fsadd
     0000CC8E  __divslong                         _divslong
     0000CD97  ___sint2fs                         _sint2fs
     0000CD9F  ___fs2uint                         _fs2uint
     0000CE81  ___slong2fs                        _slong2fs

Experienced on these SDCC versions:
SDCC : mcs51/z80/z180/r2k/r2ka/r3ka/sm83/tlcs90/ez80_z80/z80n/ds390/pic16/pic14/TININative/ds400/hc08/s08/stm8/pdk13/pdk14/pdk15/mos6502 4.2.2 #13476 (MINGW64)
SDCC : mcs51/z80/z180/r2k/r2ka/r3ka/sm83/tlcs90/ez80_z80/z80n/r800/ds390/pic16/pic14/TININative/ds400/hc08/s08/stm8/pdk13/pdk14/pdk15/mos6502/mos65c02/f8 TD- 4.5.0 #15224 (MINGW64)

Sample code with map file++ in attachment.

1 Attachments

Discussion

  • Tony Pavlov

    Tony Pavlov - 2025-01-19

    that's because the C library of SDCC is written inconsistently and messily. all those support functions like __fssub() and so on must be marked nonbanked, or written in assembly in assembly source files which explicitly specify .AREA _HOME, instead of garbage like this:

    //float __fssub (float a, float b) __reentrant
    static void dummy(void) __naked
    {
        __asm
        .globl  ___fssub
    ___fssub:
        mov r0, sp
        dec r0
        dec r0
        xch a, @r0
        cpl acc.7
        xch a, @r0
        ljmp    ___fsadd
        __endasm;
    }
    

    other third-party libraries, like gbdk-2020, carefully handle the location of each C support function. nonbanked stuff lands in _HOME, and that MAKES SENSE. so it has nothing to do with the compiler itself, only with the library.

     
    • Philipp Klaus Krause

      Well, the support functions are currently marked __nonbanked, and put into _HOME.

      However, whether they should be in _HOME is a different question. I guess it boils down to "What is _HOME?". To me, _HOME was always some kind of header-like thing, i.e. a small amount of stuff to be placed in a fixed location.

       
      • Tony Pavlov

        Tony Pavlov - 2025-01-19

        yes it is. _HOME is always some kind of header-like thing, i.e. a small amount of stuff to be placed in a fixed location, and that is how it works in the GBDK-2020 library. for example, BANK TRAMPOLINE functions land there, and so do interrupt service routines. stuff that is called from other banks and that you have no control over, like ___fssub(), must land there too, but as little of it as possible (thanks to smart linking). you write c = a - b; in some (banked) code, the compiler calls ___fssub(), and it MUST BE AVAILABLE. on the other hand, the _CODE section, which normally contains the USER code and where main() lands, may be in switchable memory, depending on the architecture of the library. i can write a library in such a way that main() is banked. can't imagine why this is not obvious!

         

        Last edit: Tony Pavlov 2025-01-19
        • Aoineko

          Aoineko - 2025-01-19

          This is “your” use of the _HOME section, but that doesn't mean it's the only “right” one. For me _HOME is the place for a program's entry point (as its name suggests), and has nothing to do with the bank system.

          In any case, as Philipp has already said, the best thing to do would be to review the management of the sections and create those that seem necessary to satisfy everyone's needs.

           
          • bbbbbr

            bbbbbr - 2025-01-20

            HOME does have an explicit intended purpose which is not ambiguous, it is described in the SDCC manual: to locate code in a non-banked common area so it is always accessible.

            https://sdcc.sourceforge.net/doc/sdccman.pdf#subsubsection.4.1.3.2
            Bankswitching -> Software

            Normally all functions you write end up in the segment CSEG. If you want a function explicitly to reside in the common area put it in segment HOME. This applies for instance to interrupt service routines as they should not be banked.

             
            • bengalack

              bengalack - 2025-01-20

              Thank you for pointing out this one. But in this case, I had already planned to have all my _CODE in page 1 as non-banked. The segment will stay there through the lifespan of the application. That is why it is strange that the compiler moved some code away from the rest of my code and placed it after _DATA (eating up my precious RAM above 0xC000); it just seemed out of my control. But that is also why I asked, to get hints about how to steer away from this magical move :)

               
              • Janko Stamenović

                Just don't combine compilation and linking (see my answers), then _HOME will not appear after data.

                 
              • Tony Pavlov

                Tony Pavlov - 2025-01-20

                COMPILER does not "move your code". NEVER. the only case when it assigns a section other than _CODE(_<N>) is when you explicitly mark your function as __nonbanked or use specific pragmas, which you did not. all other sections come from the LIBRARY, including crt0, and their order and location are completely controlled by the LINKER. the sdld* linker family has its own quirks and design limitations. you should carefully read the linker manual before designing your own libraries and runtime, because there is a lot of unobvious behavior; for example, the order of the files being linked matters, and so on. read the manual.

                 
            • Philipp Klaus Krause

              However, section 4.1.3.2 is part of 4.1, i.e. the mcs51-specific part of the manual. We need to clean up this banking vs. section name stuff, and come up with something that works reasonably well for multiple ports, but that will take time.

               
          • Tony Pavlov

            Tony Pavlov - 2025-01-20

            the gbdk-2020 design is quite convenient, and has proven flexible and extensible when porting to a number of platforms (consoles and computers). _CODE is for the user code and the library code. _HOME is for the non-switchable code (trampolines, C support functions, ISRs, user non-switchable code). the entry point and interrupt vectors, if applicable, are in the fixed _HEADER section (marked with the absolute attribute). such a design makes sense, at least for most of the 8-bit platforms.

             

            Last edit: Tony Pavlov 2025-01-20
  • Tony Pavlov

    Tony Pavlov - 2025-01-19

    as for the bug itself, i suggest the reporter not use --code-loc 0x4013 --data-loc 0xC000, which are ugly workarounds, but give direct instructions to the linker itself with

    -b <section>=<address>
    

    also, the linker is the part responsible for gluing the sections and section parts together and positioning them. the linker is very old and has some related problems in its design, so something may be improved on that side as well.

     

    Last edit: Tony Pavlov 2025-01-19
    • bengalack

      bengalack - 2025-01-19

      Thanks!

      As a simple user of SDCC I'm unable to follow all the details here, but I'm very interested in finding a good way to keep all my code separated from the data. The contents of my CRT came out of a best effort after looking at other CRTs for MSX. For example, I have _HEADER in there, but I don't really know what it does, because I have not found any documentation on it.

      --code-loc and --data-loc, on the other hand, are documented, so at a glance those seem safer to use. And to be frank, all the SDCC code I have seen from fellow MSX programmers uses these parameters too.

      Right now, I need the first section (ROM header) to be in place at 0x4000. It does not matter whether _HOME or _CODE comes next. If I set -Wl-b_HOME=0x00004000 I must also set where _CODE starts, but I don't really know the size of _HOME. And vice versa (if I wanted to have _HOME right after _CODE, using the -b parameter).

      One solution may be to guess that my _HOME never becomes bigger than ~0x1000, and start _CODE after that, potentially wasting lots of space and needing to always keep an eye on potential overlaps which may sneak in?

      I now have an "empty" crt like this:

          .module crt0
          .area _HEADER (ABS)
          .area _HOME
          .area _CODE
          .area _GSINIT
          .area _GSFINAL
          .area _INITIALIZER
          .area _BSEG
          .area _DATA
          .area _INITIALIZED
          .area _HEAP
      

      and I put my msx rom header in a separate file (using area _HEADER and _HOME) and a commandline like below, which seems to work:

      sdcc -mz80 --no-std-crt0 -Wl-b_HOME=0x00004000 -Wl-b_CODE=0x00005000 -Wl-b_DATA=0x0000C000 objs\rom4000_crt.rel objs\msx_rom_header.rel app.c -o objs\app.ihx
      

      This does not seem very dynamic or future-proof, but it will work for my current situation, and I can progress ahead :-)
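The worry above about overlaps sneaking in when areas are pinned to fixed -b addresses could be checked after every link by reading the area table in the map file. What follows is only a minimal sketch, not something SDCC provides: the function names area_extent and area_fits_below are hypothetical, and it assumes the area lines look like the ones quoted in this ticket (name, hex address, hex size, "=", decimal size, "bytes").

```shell
#!/bin/sh
# Hypothetical helpers (not part of SDCC) that read a linker map file.
# Assumed area line layout, as quoted in this ticket:
#   _HOME    0000C016    000007D7 =    2007. bytes (REL,CON)

# Print "ADDR SIZE" (hex, without 0x) for the first matching area line.
area_extent() {  # usage: area_extent MAPFILE AREA
    awk -v area="$2" \
        '$1 == area && $4 == "=" && $6 == "bytes" { print $2, $3; exit }' "$1"
}

# Succeed if AREA ends at or before LIMIT (hex, without 0x).
# Returns 1 if the area crosses LIMIT, 2 if the area is not in the map.
area_fits_below() {  # usage: area_fits_below MAPFILE AREA LIMIT
    extent=$(area_extent "$1" "$2")
    if [ -z "$extent" ]; then
        echo "area $2 not found in $1" >&2
        return 2
    fi
    addr=${extent% *}   # text before the space
    size=${extent#* }   # text after the space
    [ $(( 0x$addr + 0x$size )) -le $(( 0x$3 )) ]
}
```

With the command line above, where _CODE is pinned at 0x5000, a post-link check might read `area_fits_below objs/app.map _HOME 5000 || echo "_HOME runs into _CODE"`. The map file path objs/app.map is an assumption here; adjust it to wherever your build writes the map.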

       

      Last edit: bengalack 2025-01-19
  • Aoineko

    Aoineko - 2025-01-19
     

    Last edit: Aoineko 2025-01-19
    • bengalack

      bengalack - 2025-01-19

      For sure it looks like it!

      Looking forward to the day this is improved and it has made it to the SDCC manual :)

       
  • Janko Stamenović

    The "landing after _DATA" of _HOME happens only because the compilation and linking aren't split in that example. As soon as these steps are separate, as here:

    sdasz80 -o -s -p -w objs/rom4000_crt.rel rom4000_crt.s
    sdcc  -mz80 --opt-code-speed -o objs/app.rel -c app.c
    sdcc --code-loc 0x4013 --data-loc 0xC000 -mz80 --no-std-crt0 --opt-code-speed -o objs/app.ihx objs/rom4000_crt.rel objs/app.rel
    

    the resulting locations are:

    .  .ABS.                            00000000    00000000 =           0. bytes (ABS,CON)
    _CODE                               00004013    0000034D =         845. bytes (REL,CON)
    _HEADER0                            00000000    00000013 =          19. bytes (ABS,CON)
    _HOME                               00004360    0000068B =        1675. bytes (REL,CON)
    _DATA                               0000C000    00000016 =          22. bytes (REL,CON)
    _CODE = 0x4013
    _DATA = 0xc000
    
     
    • Tony Pavlov

      Tony Pavlov - 2025-01-20

      because the order of linking matters: the order in which you pass the object files to the linker, through the command line or the link file. on almost every system, including modern ones, the order of linking matters.

       

      Last edit: Tony Pavlov 2025-01-20
      • Janko Stamenović

        Yes, even though his "problem" example has a command line in which the .rel file is specified first:

        sdcc --code-loc 0x4013 --data-loc 0xC000 -mz80 --no-std-crt0 --opt-code-speed objs\rom4000_crt.rel app.c -o objs\app.ihx
        

        the implicit order produced by SDCC when it first has to compile that .c file puts "app.rel" first. When only .rel files are the input:

        sdcc --code-loc 0x4013 --data-loc 0xC000 -mz80 --no-std-crt0 --opt-code-speed -o objs/app.ihx objs/rom4000_crt.rel objs/app.rel
        

        then the linking order is the order in which the .rel files are specified.

        This can be clearly seen by comparing the generated app.lk files:

        bad .lk has:

        objs/app.rel
        objs/rom4000_crt.rel
        

        good .lk has:

        objs/rom4000_crt.rel
        objs/app.rel
        

        So why then is _HOME after _DATA in the "bad" case? Because the linker saw app.rel before the crt's .rel, and the area order within app.rel can be seen in the generated .asm: the first occurrence of _HOME is clearly after the first .area _DATA:

        ;--------------------------------------------------------
        ; File Created by SDCC : free open source ISO C Compiler
        ; Version 4.5.0 #15224 (MINGW64)
        ;--------------------------------------------------------
            .module app
        
            .optsdcc -mz80 sdcccall(1)
        ;--------------------------------------------------------
        ; Public variables in this module
        ;--------------------------------------------------------
            .globl _main
            .globl _u2
            .globl _u1
            .globl _f2
            .globl _f1
            .globl _n2
            .globl _n1
            .globl _l2
            .globl _l1
        ;--------------------------------------------------------
        ; special function registers
        ;--------------------------------------------------------
        ;--------------------------------------------------------
        ; ram data
        ;--------------------------------------------------------
            .area _DATA
        _l1::
            .ds 4
        _l2::
            .ds 4
        _n1::
            .ds 2
        _n2::
            .ds 2
        _f1::
            .ds 4
        _f2::
            .ds 4
        _u1::
            .ds 1
        _u2::
            .ds 1
        ;--------------------------------------------------------
        ; ram data
        ;--------------------------------------------------------
            .area _INITIALIZED
        ;--------------------------------------------------------
        ; absolute external ram data
        ;--------------------------------------------------------
            .area _DABS (ABS)
        ;--------------------------------------------------------
        ; global & static initialisations
        ;--------------------------------------------------------
            .area _HOME
            .area _GSINIT
            .area _GSFINAL
            .area _GSINIT
        ;--------------------------------------------------------
        ; Home
        ;--------------------------------------------------------
            .area _HOME
            .area _HOME
        ;--------------------------------------------------------
        ; code
        ;--------------------------------------------------------
            .area _CODE
        ;app.c:8: unsigned char main()
        ;   ---------------------------------
        ; Function main
        ; ---------------------------------
        _main::
        ;app.c:10: return (unsigned char)( l1/u2/0.5*f1/n1*u1*f1/f2 );
        etc.
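The .lk comparison described above (crt0 object first in the good case, app.rel first in the bad one) can be turned into a quick check to run after a build. This is a rough sketch under stated assumptions: link_order and crt_linked_first are hypothetical helper names, and the script assumes that each object file sits on its own line ending in ".rel", as in the excerpts quoted above, and that option lines start with "-".

```shell
#!/bin/sh
# Hypothetical helpers (not part of SDCC) for inspecting a generated .lk file.
# Assumption: each object file sits on its own line ending in ".rel",
# as in the excerpts quoted above; option lines start with "-".

# Print the object files in the order the linker will see them.
link_order() {  # usage: link_order LKFILE
    tr -d '\r' < "$1" | grep -v '^-' | grep '\.rel$'
}

# Succeed only if CRT_REL is the first object file listed.
crt_linked_first() {  # usage: crt_linked_first LKFILE CRT_REL
    [ "$(link_order "$1" | head -n 1)" = "$2" ]
}
```

For example, `crt_linked_first objs/app.lk objs/rom4000_crt.rel || echo "crt0 is not linked first"` would flag the bad ordering, assuming the .lk file is written to objs/app.lk with forward-slash paths.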
        
         

        Last edit: Janko Stamenović 2025-01-20
        • bengalack

          bengalack - 2025-01-20

          Oh oh!

          This fixes everything. Thank you so much! Everything fell into place now, following the ordering I wrote in the crt!

          Happy chap here now!

          As for the bug itself... We see from the forum thread, as well as this bug, that there are lots of ideas of how to improve this "area". This bug does not really describe anything about that, so it should prolly be closed and replaced by a proper task.

           
          • Janko Stamenović

            IMO the SDCC behavior of pushing the .rel that results from the compilation in front of the sequence specified on the command line is something that could be changed. The current behavior results in a provably wrong and unusable ordering whenever a .rel for crt0 is specified in a command line like your initial one, and I can't think of anybody who would need the current behavior.

            When the linking phase (just .rels) is separate, the problem, luckily, doesn't happen. Still, changing the behavior would allow simpler command lines for small examples that use a non-default crt0.

            Independently of that, I suggest you always keep the C compilations and the linking invocations separate in your project from now on, which also means that you don't have to wait for any change in SDCC to be implemented.
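For a project with more than one source file, the advice above could be wrapped in a small script. The sketch below only prints the commands instead of running them, so the link order can be inspected first. build_cmds is a hypothetical helper; the flags are the ones already used in this thread, while the objs/ directory and the app.ihx output name are assumptions to adapt. It does not handle paths containing spaces.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the commands for a split build.
# Flags and paths are the ones used in this thread; the objs/ directory
# and the app.ihx output name are assumptions to adapt.

build_cmds() {  # usage: build_cmds CRT_REL SOURCE.c...
    rels=$1      # the crt0 object always comes first in the link order
    shift
    for src in "$@"; do
        rel="objs/$(basename "${src%.c}").rel"
        # One compile-only step per source file.
        echo "sdcc -mz80 --opt-code-speed -c $src -o $rel"
        rels="$rels $rel"
    done
    # Single link step that takes only .rel files, crt0 first.
    echo "sdcc -mz80 --no-std-crt0 --opt-code-speed --code-loc 0x4013 --data-loc 0xC000 -o objs/app.ihx $rels"
}
```

Running `build_cmds objs/rom4000_crt.rel app.c` prints one compile command followed by the link command; piping that output to `sh -e` would execute them in order and stop at the first failure.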

             

            Last edit: Janko Stamenović 2025-01-20
            • Benedikt Freisen

              That can indeed be done in the long run, but for the time being, I have added a few lines to the manual to clarify the current behavior. A change to the manual is much more likely to make it into the upcoming release, after all.

               
              • Janko Stamenović

                The "clarification" is still confusing: where you mention a custom CRT, the example is given without that crt0.rel file, so it conflicts with the text. (It would also be better to write "custom crt0", as that's the convention in the rest of the document, and it is easier on the eyes than the name clash with "cathode ray tube" CRT.)

                instead of

                sdcc foomain.rel foo1.rel foo2.rel
                

                then, as long as it continues the explanation of use with a custom crt0, as it does now, it should be:

                sdcc custom_crt0.rel foomain.rel foo1.rel foo2.rel
                
                 
  • Philipp Klaus Krause

    Ticket moved from /p/sdcc/bugs/3819/

    Can't be converted:

    • _category: other
     
