GC linker issue

2022-02-10
2022-02-11
  • Gregory A Failing

    Ubuntu 20.04 LTS

    GnuCOBOL:
    cobc (GnuCOBOL) 3.2-dev.0
    Built Feb 09 2022 07:29:58
    Packaged Feb 07 2022 12:36:51 UTC
    C version "9.3.0"

    I have written a small function that pauses execution for milliseconds instead of seconds. However, the test program fails with a module not found error. Running 'ldd' on the test program shows that the shared object (libapasutil.so), which contains the timer code, is not in the list displayed.

    Here are some clues:

    The compile command:

    ... building test-gnupause
    LD_RUN_PATH=/test/bin; export LD_RUN_PATH
    /usr/local/bin/cobc -x -I /home/fcsisat/APAS/src -L /test/bin -L /usr/local/lib -Q "-Wl,-rpath=/test/bin" -o /test/bin/test-gnupause -lapaslibs -lmesg8583 -ltext8583 -lapasdeta -lapasutil -ldetaxcob -lrt -lncurses /home/fcsisat/APAS/src/test-gnupause.cbl
    chmod 775 /test/bin/test-gnupause
    

    The list of shared objects in the compiled test program:

    ldd /test/bin/test-gnupause
            linux-vdso.so.1 (0x00007ffe33bf8000)
            libcob.so.4 => /usr/local/lib/libcob.so.4 (0x00007f798e725000)
            libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f798e533000)
            libgmp.so.10 => /lib/x86_64-linux-gnu/libgmp.so.10 (0x00007f798e4af000)
            libxml2.so.2 => /lib/x86_64-linux-gnu/libxml2.so.2 (0x00007f798e2f5000)
            libncurses.so.6 => /lib/x86_64-linux-gnu/libncurses.so.6 (0x00007f798e2cc000)
            libtinfo.so.6 => /lib/x86_64-linux-gnu/libtinfo.so.6 (0x00007f798e29c000)
            libdb-18.1.so => /usr/local/BerkeleyDB.18.1/lib/libdb-18.1.so (0x00007f798e090000)
            libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f798e06d000)
            libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f798e067000)
            /lib64/ld-linux-x86-64.so.2 (0x00007f798e7c1000)
            libicuuc.so.66 => /lib/x86_64-linux-gnu/libicuuc.so.66 (0x00007f798de81000)
            libz.so.1 => /lib/x86_64-linux-gnu/libz.so.1 (0x00007f798de65000)
            liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 (0x00007f798de3c000)
            libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f798dceb000)
            libicudata.so.66 => /lib/x86_64-linux-gnu/libicudata.so.66 (0x00007f798c22a000)
            libstdc++.so.6 => /lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f798c048000)
            libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f798c02d000)
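
    The ldd check above can be repeated for one library at a time, as sketched below. The helper name lists_lib is hypothetical, and two lines copied from the listing stand in for live ldd output so that the sketch runs anywhere.

    ```shell
    # lists_lib: hypothetical helper. Reads ldd-style output on stdin and
    # succeeds only if some line starts with the given library name.
    lists_lib() {
        grep -q "^[[:space:]]*$1\.so"
    }

    # Sample input: two lines taken from the ldd listing above.
    sample=$(printf '%s\n' \
        'libcob.so.4 => /usr/local/lib/libcob.so.4 (0x00007f798e725000)' \
        'libncurses.so.6 => /lib/x86_64-linux-gnu/libncurses.so.6 (0x00007f798e2cc000)')

    for lib in libncurses libapasutil; do
        if printf '%s\n' "$sample" | lists_lib "$lib"; then
            echo "$lib: linked"
        else
            echo "$lib: missing"
        fi
    done
    ```

    Against the real binary, the same helper could be fed with: ldd /test/bin/test-gnupause | lists_lib libapasutil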
    

    Partial list of modules in libapasutil.so:

    nm /test/bin/libapasutil.so | grep ' T '
    0000000000002ac7 T apasnodoff
    0000000000002b7b T getisrct
    ...
    0000000000002a0f T gnupause
    ...
    0000000000003120 T reverse_str
    00000000000030f5 T strTrim
    

    What Ubuntu's 'file' thinks of the test program:

    file /test/bin/test-gnupause
    test-gnupause: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=e8e4a51ef4a1f2aaccca72d9c9a1bf6e2a039016, for GNU/Linux 3.2.0, not stripped
    

    Note: I'm not sure why Ubuntu's 'file' reports the test program as a shared object ... maybe that is the issue, but I don't think so.

    The test program and the timer code that is in libapasutil.so are in the enclosed zip file.

    I appreciate any guidance. I am sure it is something simple ...

    Gregory

     
    • Vincent (Bryan) Coen

      When was the last time you ran sudo ldconfig?

      It is worth a try  :)

      Vince

      mod edit for some reply-to

       

      Last edit: Brian Tiffin 2022-02-11
      • Gregory A Failing

        many, many times ...

         
    • Simon Sobisch

      Simon Sobisch - 2022-02-11

      Many system linkers check if you actually (seem to) need the link and drop it otherwise.

      Just change your CALL to CALL STATIC and this likely will work.
      If not, tell the linker not to do this with the no-as-needed flag (see the FAQ).

      And after testing both for learning purposes: replace your call with CONTINUE AFTER 0.0004 SECONDS (COBOL 202x) or CALL 'CBL_GC_NANOSLEEP' (or its OC counterpart), depending on the GnuCOBOL version in use.

       
      • Brian Tiffin

        Brian Tiffin - 2022-02-11

        Yeah, CONTINUE AFTER, nice phrase. ;-)

        This is kind of a brute-force, don't-think-about-it-anymore fix, Gregory, but my dev boxes all have:

        ::sh
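        # GnuCOBOL on Ubuntu: keep every explicitly linked library in the image
        if [[ ! $COB_LDFLAGS =~ as-needed ]]; then
            # the leading space keeps the flag separate from any existing COB_LDFLAGS content
            COB_LDFLAGS+=' -Wl,--no-as-needed'
            export COB_LDFLAGS
        fi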
        

        in .bashrc for login inits. That setting means all libraries mentioned will be dragged into the ldd display and the image's list of needed libraries (the DT_NEEDED entries), including ones that are referenced in other shared libraries but maybe not used (well, needed, for symbol lookups). Linking to -lgtk, for instance, can cause quite a few unexpected names to show up in that list. Image size is rarely affected much, as these are all shared libraries anyway, but the actual symbol resolution might have to manage a bigger list of entries when searching by string name. With binary search, that might be one or two extra steps in a loop. Might show up in a performance graph, but it'll be a blip.

        As far as I know, Ubuntu is still the only big player GNU/Linux distro that changed the binutils build settings for ld so that unreferenced libraries are left out of the needed list inside ELF binaries by default.

        Supposedly "good" developers are supposed to specifically set the --no-as-needed and --as-needed flags in the linkage phase for inclusive library needs. But that team overlooked the very valid use case of CALL by string name with dlsym, and not just by linker symbol name when they made the change back in v14ish. ;-) A valid attempt at DCE, Dead Code Elimination, but a little over aggressive for cobc dynamic calls by string name compared to cc which is better able to test actual linker symbol names during ld.

        Took months of binging on google and then some well spent nerd points on a StackOverflow bounty to figure out[1] that Canonical had made that change to GNU binutils default build settings.

        We could change cobc tool chain calls to properly order and wrap all programmer explicit -l library entries to cobc commands with --no-as-needed ... --as-needed pairs passed to the cc and ld phases on Ubuntu, but we haven't, yet (if ever)[2]. --no-as-needed can be a little wasteful when -ling to top level libraries that have huge dependency chains, like GTK, Qt, JVM or things like libPython. But again, mostly in the needed-library list of the ELF object code (and the few extra compares that might occur during symbol lookups at compile, link, and run time).
        I think I'm explaining this right. Never have traced through the entire technical spec on as-needed linkage in GCC as it relates to Ubuntu and the hit on size/performance with the --no-as-needed mode when it's global to the CC phase. Initial freaking out glances during discovery of a potential fix, and then a little bit of deeper study left me to believe I'm explaining this right. There is much voodoo in link loaders. ;-)
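
        The wrapping idea can be sketched in shell. This only illustrates the argument order and is not what cobc does today: wrap_libs is a hypothetical helper, and whether a given cobc invocation hands the result to the linker in this position is an assumption to verify.

        ```shell
        # wrap_libs: hypothetical helper, not part of cobc.
        # Prints the given library names as -l options enclosed in a
        # --no-as-needed ... --as-needed pair for the GNU linker.
        wrap_libs() {
            out='-Wl,--no-as-needed'
            for lib in "$@"; do
                out="$out -l$lib"
            done
            printf '%s\n' "$out -Wl,--as-needed"
        }

        # Force libapasutil to stay in the image; leave the rest as-needed.
        wrapped=$(wrap_libs apasutil)
        printf '%s\n' "$wrapped -lrt -lncurses"
        # prints: -Wl,--no-as-needed -lapasutil -Wl,--as-needed -lrt -lncurses
        ```

        Only the libraries that are reached by string name at run time need to sit inside the pair, so large dependency chains elsewhere on the command line keep the default as-needed treatment.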

        [1] "figured out" by being explicitly told by a friendly, informed passerby on SO. :-)
        [2] We may never change the internal commands used by cobc, Gregory. During and since the kerfuffle, Simon and team figured out some codegen and tool chain techniques that stabilized static linkage, providing a friendlier-to-use CALL STATIC than earlier OpenCOBOLs sported.

        Have good, make well,
        Blue

         

        Last edit: Brian Tiffin 2022-02-11
        • Gregory A Failing

          Brian,

          Very cool suggestion about installing that option in .bashrc.

          IMHO the startup time penalty is not as important as saving time in the big loop where the work gets done. As our big app processes several hundred thousand online requests daily, it is a big deal. A few milliseconds working thru a library list during start-up is not an issue.

          Thanks

          G

           
      • Gregory A Failing

        Simon,

        Thank you for the suggestions. Here is a recap ...

        1) Adding '--no-as-needed' allows the original code to work.

        2) Changing source code to 'CALL STATIC' without '--no-as-needed' also works.

        3) 'CONTINUE AFTER' seems to work, but the delay varies a fair amount. I presume that how busy the system is may come into play.

        4) 'CALL "CBL_GC_NANOSLEEP" ' also works.

        I'm not sure which way I will go yet. [1] allows my existing code to work with only minor changes. [2] is the least attractive as there are a LOT of CALLs in our big app programs.

        In any case it now works. It is good to have options, Thanks!

        Gregory

        PS: how about a 'CALL "CBL_GC_MILLISLEEP"' ... a little less confusing when setting delay values. Also, any delay under 100 ms, regardless of how it is invoked, is virtually undetectable.
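
        For reference, the conversion behind that confusion is a factor of 1,000,000. A small arithmetic sketch follows; ms_to_ns is a hypothetical helper, and the nanosecond unit is an assumption taken from the routine's name.

        ```shell
        # ms_to_ns: hypothetical helper that converts whole milliseconds
        # to nanoseconds (1 ms = 1,000,000 ns).
        ms_to_ns() {
            echo $(( $1 * 1000000 ))
        }

        ms_to_ns 100   # 100 ms, the threshold mentioned above
        ms_to_ns 250   # 250 ms
        ```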

         
        • Simon Sobisch

          Simon Sobisch - 2022-02-11

          For 2: you'd only change the CALLs to the entries of the libraries you link to, likely not that many.

          Actually, for this kind of CALL I commonly prefer not linking at all and instead adding the necessary libraries to COB_PRE_LOAD.
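
          A minimal sketch of that preload setup, assuming the library stays in /test/bin and that the runtime accepts the module name without its .so extension. COB_LIBRARY_PATH is added here on the assumption that the runtime needs it to locate the module.

          ```shell
          # Assumption: libapasutil.so lives in /test/bin, as in the nm command earlier in the thread.
          COB_LIBRARY_PATH=/test/bin    # directory the runtime searches for modules
          COB_PRE_LOAD=libapasutil      # module to load at startup (name form is assumed)
          export COB_LIBRARY_PATH COB_PRE_LOAD

          # The program could then be built without -lapasutil and run as usual:
          # /test/bin/test-gnupause
          ```

          With this approach the link line no longer mentions the library at all, so the as-needed behaviour of the linker does not come into play.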

          3+4 use the same code internally with 3 being "more direct". There shouldn't be much variance if you don't go too precise.

          We won't add another sleep option of our own: there is already our own CALL routine, C$SLEEP for RM/ACU/MF compatibility, and, for code that is not limited by dialect, the COBOL 202x format of the CONTINUE statement.

           
