#60 use -flto

Time Permitting
closed
None
5
2014-11-06
2014-07-16
No

evaluate the changes required to support -flto & co.

Discussion

  • Justyn

    Justyn - 2014-10-04

    I've had some success getting my STM32F0 project to compile using Link Time Optimisation with -flto.

    My first attempts, passing the options "-flto -Ox" (where x was 1, 2 or 3) to the compile stage (the linker stage will work it out), appeared to compile and link correctly, but the result wouldn't actually run because the Reset and ISR handlers were being optimised away by the LTO process, as explained here:
    http://www.coocox.org/forum/topic.php?id=3002

    To fix this, I added the attribute "used" to all the functions in cortexm/exception_handlers.c.
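    A minimal sketch of that kind of change, not the actual contents of exception_handlers.c. The handler body and the fault_count / fault_count_get names are hypothetical, added only so the example can be compiled and exercised on its own:

```c
#include <stdint.h>

/* Hypothetical counter, present only so the sketch has observable behaviour. */
static volatile uint32_t fault_count;

/* "used" tells GCC to emit this function even when it sees no C-level
 * reference to it, so LTO cannot discard it as dead code. */
void __attribute__((used)) HardFault_Handler(void)
{
    fault_count++;
}

/* Hypothetical accessor for the counter above. */
uint32_t fault_count_get(void)
{
    return fault_count;
}
```

    In real firmware the handler is only reached through the vector table, which is why the optimiser sees it as unreferenced unless it is marked.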

    This now works when I compile with "-flto -O1" and "-flto -O2", and I see a great improvement in code size.

    However it fails for "-flto -O3" with the error (shortened for post):

    In function `Reset_Handler': ../system/src/cortexm/exception_handlers.c:36:(.after_vectors+0x10): relocation truncated to fit: R_ARM_THM_JUMP11 against symbol `_start' defined in .after_vectors section in /tmp/ccgJ4h7q.ltrans0.ltrans.o

    I'm not sure why.

    I should point out that although my project was originally based on a gnuarmeclipse STM32F031 template project it is using a custom Makefile, so may differ from a default setup.

    I'm using arm-none-eabi-gcc 4.8.4 20140526 from https://launchpad.net/gcc-arm-embedded.

     
  • Justyn

    Justyn - 2014-10-04

    Forgot to mention that I also changed the line:
    __attribute__ ((section(".isr_vector")))

    to:
    __attribute__ ((used,section(".isr_vector")))

    in cmsis/vectors_stm32f0xx.c
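    For context, a rough sketch of how such an attribute sits on a vector table. This is not the real vectors_stm32f0xx.c: the Demo_* handlers and the demo_isr_vectors name are hypothetical, and the initial stack pointer entry that a real Cortex-M table starts with is left out for brevity:

```c
typedef void (*handler_t)(void);

/* Hypothetical stub handlers so the sketch compiles on its own. */
void Demo_Reset_Handler(void) { }
void Demo_NMI_Handler(void) { }
void Demo_HardFault_Handler(void) { }

/* "used" keeps the table alive under LTO even though no C code reads it;
 * "section" names the section that the linker script places in flash. */
__attribute__((used, section(".isr_vector")))
handler_t const demo_isr_vectors[] = {
    Demo_Reset_Handler,
    Demo_NMI_Handler,
    Demo_HardFault_Handler,
};
```

    Only the hardware reads the table, so without "used" nothing in the program appears to reference it.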

     
  • Liviu Ionescu (ilg)

    thank you for your suggestions, I'll consider them when testing the -flto option.

     
  • Justyn

    Justyn - 2014-10-04

    I ran into the Reset_Handler "relocation truncated to fit: R_ARM_THM_JUMP11 against symbol `_start'" error again, even with -O1 and -O2.

    There are two versions of Reset_Handler depending on whether you are compiling with DEBUG defined or not.

    The DEBUG version is just a pure C call to _start().

    The non-DEBUG (Release) version contains embedded assembly:

    "b _start \n"

    With -flto enabled at least, _start may be too far away for a direct branch (b) to reach it, depending on your program.

    I think the pure C version allows the linker to compensate for this (a quick look at the disassembly shows Reset_Handler contains push, bl, and nop), which it cannot do with the assembly.

    The easy fix is to ensure the pure C version is always used.
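    To put a number on "too far away": R_ARM_THM_JUMP11 is the relocation for the 16-bit Thumb b instruction, whose 11-bit signed offset field counts halfwords, so the target has to lie within about ±2 KB of the branch. A small sketch of that range check follows; fits_thumb_b16 is a hypothetical helper name, and the offset is assumed to be already taken relative to the PC value the instruction sees (the instruction address plus 4):

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if a byte offset can be encoded in a 16-bit Thumb "b".
 * The 11-bit signed field is in halfwords, giving -2048..+2046 bytes,
 * and the offset must be even. */
bool fits_thumb_b16(int32_t byte_offset)
{
    return (byte_offset % 2 == 0)
        && byte_offset >= -2048
        && byte_offset <= 2046;
}
```

    The bl seen in the disassembly of the C version is a 32-bit encoding with a much larger range, which is consistent with that version linking without the error.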

     
  • Liviu Ionescu (ilg)

    before giving up the assembly jump optimisation, can you check what the code would look like when using a long jump in assembly? for example, loading the pc via another register?

    the C call is there simply to help the debugger identify the proper stack frame, otherwise it is just a waste of stack space.

     
  • Liviu Ionescu (ilg)

    I just added -flto as an explicit configuration option in the common Optimizations section.

    I also explicitly marked the vectors as "used" and I updated the branch in Reset_Handler to a long jump.

    Could you test the beta version available from updates-test?

     
  • Justyn

    Justyn - 2014-10-13

    Great, I will test this as soon as I can and get back to you.

     
  • Liviu Ionescu (ilg)

    • status: open-accepted --> closed
     
  • Liviu Ionescu (ilg)

    implemented since 2.4.1-201410142110.

     
  • Justyn

    Justyn - 2014-11-06

    Sorry for the delay in following up. Just to confirm: I updated the system folder in my custom-makefile STM32F0 project with the latest release, and LTO seems to be working fine with no errors.

    Thanks for implementing!

     
  • Liviu Ionescu (ilg)

    please be aware that LTO support is not yet fully functional, and you may still encounter cases where the linker fails.

    also please see: https://bugs.launchpad.net/gcc-arm-embedded/+bug/1383856

     

    Last edit: Liviu Ionescu (ilg) 2014-11-06