The implementation of MAD_F_MLA for FPM_PPC with OPT_ACCURACY is slightly buggy.
The output operands should be marked as read/write instead of write-only.
The write-only modifiers result in a register scheduling bug when compiled with a recent gcc from the RTEMS project:
powerpc-rtems4.11-gcc (GCC) 4.7.2
The buggy sequence is below. The 64-bit accumulator is in R25|R23 (hi word in R25, lo word in R23):
mulhw r30,r27,r28 // get hi word of product
mullw r31,r27,r28 // get lo word of product
addc r25,r23,r31 // accumulate lo words .. OOPS! stored result in hi word of the accumulator!
adde r24,r25,r30 // attempt to accumulate hi words - but R25 is hosed :(
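To make the clobber concrete, the four instructions can be modelled in plain C. This is only a sketch: `ppc_state`, `run_reported` and `run_expected` are hypothetical names, and the "expected" register assignment (lo sum written back to r23, hi sum written back to r25) is an assumption about what a correct allocation could look like, not actual compiler output.

```c
#include <stdint.h>

/* Tiny model of the state the sequence touches: 32 GPRs plus the carry bit. */
typedef struct {
    uint32_t r[32];
    unsigned ca;
} ppc_state;

/* mulhw: high 32 bits of the signed 64-bit product */
static void mulhw(ppc_state *s, int d, int a, int b) {
    int64_t p = (int64_t)(int32_t)s->r[a] * (int32_t)s->r[b];
    s->r[d] = (uint32_t)((uint64_t)p >> 32);
}

/* mullw: low 32 bits of the product */
static void mullw(ppc_state *s, int d, int a, int b) {
    s->r[d] = s->r[a] * s->r[b];
}

/* addc: add and record the carry out */
static void addc(ppc_state *s, int d, int a, int b) {
    uint64_t sum = (uint64_t)s->r[a] + s->r[b];
    s->r[d] = (uint32_t)sum;
    s->ca = (unsigned)(sum >> 32);
}

/* adde: add with carry in, record the carry out */
static void adde(ppc_state *s, int d, int a, int b) {
    uint64_t sum = (uint64_t)s->r[a] + s->r[b] + s->ca;
    s->r[d] = (uint32_t)sum;
    s->ca = (unsigned)(sum >> 32);
}

static void setup(ppc_state *s, uint32_t hi, uint32_t lo, int32_t x, int32_t y) {
    s->r[25] = hi;            /* accumulator hi word */
    s->r[23] = lo;            /* accumulator lo word */
    s->r[27] = (uint32_t)x;
    s->r[28] = (uint32_t)y;
}

/* The sequence from the report: the lo sum overwrites r25 before it is read. */
static uint64_t run_reported(uint32_t hi, uint32_t lo, int32_t x, int32_t y) {
    ppc_state s = { {0}, 0 };
    setup(&s, hi, lo, x, y);
    mulhw(&s, 30, 27, 28);
    mullw(&s, 31, 27, 28);
    addc(&s, 25, 23, 31);     /* r25 (old hi word) is lost here */
    adde(&s, 24, 25, 30);     /* reads the lo sum instead of the hi word */
    return ((uint64_t)s.r[24] << 32) | s.r[25];
}

/* Assumed correct allocation: each half is updated in its own register. */
static uint64_t run_expected(uint32_t hi, uint32_t lo, int32_t x, int32_t y) {
    ppc_state s = { {0}, 0 };
    setup(&s, hi, lo, x, y);
    mulhw(&s, 30, 27, 28);
    mullw(&s, 31, 27, 28);
    addc(&s, 23, 23, 31);
    adde(&s, 25, 25, 30);
    return ((uint64_t)s.r[25] << 32) | s.r[23];
}
```

With an accumulator of hi=1, lo=2 and x=3, y=4, the reported sequence ends up with 14 in both words, whereas the assumed correct allocation gives hi=1, lo=14.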
Hope that's enough info. I'm not an inline asm guru, so all I can say is that the patch makes 'sense' (for some value of 'sense'), and fixes the register scheduling issue.
I can only assume that earlier versions of gcc (which did not produce the buggy sequence) were not as aggressive with optimisation.
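For readers less familiar with GCC extended asm, the difference between write-only and read/write operands can be sketched as follows. This is not the patch and not libmad's macro: `mla64` is a hypothetical helper, the `"xer"` clobber name is assumed to be accepted by the PowerPC GCC port in use, and the portable branch exists only so the sketch also builds on non-PowerPC hosts. The point is that `"+r"` tells the compiler each accumulator half is both read and written, so it cannot reuse the register holding the hi word as the destination of the lo sum.

```c
#include <stdint.h>

/* Hypothetical helper: adds the signed 64-bit product x*y to the accumulator hi:lo. */
static inline void mla64(int32_t *hi, uint32_t *lo, int32_t x, int32_t y)
{
#if defined(__GNUC__) && defined(__powerpc__)
    int32_t  phi;   /* hi word of the product */
    uint32_t plo;   /* lo word of the product */

    __asm__ ("mullw %0,%1,%2" : "=r" (plo) : "%r" (x), "r" (y));
    __asm__ ("mulhw %0,%1,%2" : "=r" (phi) : "%r" (x), "r" (y));

    /* "+r": operands 0 and 1 are read AND written, so each keeps its own register. */
    __asm__ ("addc %0,%0,%2\n\t"
             "adde %1,%1,%3"
             : "+r" (*lo), "+r" (*hi)
             : "r" (plo), "r" (phi)
             : "xer");
#else
    /* Portable fallback with the same arithmetic, done in 64 bits. */
    uint64_t acc = ((uint64_t)(uint32_t)*hi << 32) | *lo;
    acc += (uint64_t)((int64_t)x * y);
    *hi = (int32_t)(uint32_t)(acc >> 32);
    *lo = (uint32_t)acc;
#endif
}
```

An earlyclobber output (`"=&r"`) on a write-only operand is another way to stop GCC from sharing an output register with an input that is still needed; which form the patch uses is not shown here.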