#8 GCC 4.5.0 bugs

open
nobody
None
5
2010-10-29
2010-10-29
Franck Charlet
No

Any conditional branch or dbcc placed inside inline assembly results in the error message "Error: unknown pseudo-op: `.stad'" or "Error: unknown pseudo-op: `.stbd'".

Also, it looks like the FPU is turned on by default when the selected CPU is 68020, which wasn't the case before. When it is turned off with -msoft-float, there are undefined symbol errors such as "undefined reference to `___floatunsidf'" (I think the Amiga IEEE ROM libraries were used before).

Discussion

  • Bernd Roesch
    2010-10-30

    >Any conditional branch or dbcc placed inside inline assembly results in the error message "Error: unknown pseudo-op: `.stad'" or "Error: unknown pseudo-op: `.stbd'".

    Can you post example code that does not work?

    With GCC 3 and GCC 4 the GCC team changed the inline asm syntax: you always need to add \n\t after each instruction.

    Here is some 68k asm code from the GCC 4.5 libstdc++ (which is itself compiled with GCC 4.5), and it works.

    If you change your code in the same way, it should work:

    "jne 1b"

    I think you can also write jne 1, and the appended b only indicates that a short branch should be used.

    _Atomic_word
    __exchange_and_add(volatile _Atomic_word* __mem, int __val) throw ()
    {
      register _Atomic_word __result = *__mem;
      register _Atomic_word __temp;
      __asm__ __volatile__ ("1: move%.l %0,%1\n\t"
                            "add%.l %3,%1\n\t"
                            "cas%.l %0,%1,%2\n\t"
                            "jne 1b"
                            : "=d" (__result), "=&d" (__temp), "=m" (*__mem)
                            : "d" (__val), "0" (__result), "m" (*__mem));
      return __result;
    }
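
    For readers who do not know the 68k cas instruction, the retry loop in this routine can be sketched in portable C11. This is only an illustration under assumptions, not the libstdc++ code: the function name exchange_and_add_sketch is made up here, and it relies on <stdatomic.h> being available. As a side note, in GNU as a reference such as 1b means "the nearest definition of local label 1 looking backwards", so "jne 1b" jumps back to the "1:" at the start of the asm block.

    ```c
    #include <stdatomic.h>

    /* Hypothetical portable counterpart of the m68k routine: atomically add
       val to *mem and return the value *mem held before the addition. */
    static int exchange_and_add_sketch(atomic_int *mem, int val)
    {
        /* Corresponds to the initial read of *__mem in the asm version. */
        int result = atomic_load(mem);

        /* Try to swap in result + val. If another thread changed *mem in the
           meantime, the call fails, stores the current value of *mem into
           result, and the loop retries. That retry is the job of "jne 1b"
           after cas in the asm version. */
        while (!atomic_compare_exchange_weak(mem, &result, result + val)) {
            /* nothing to do, result was refreshed by the failed exchange */
        }
        return result;
    }
    ```

    Calling exchange_and_add_sketch(&counter, 3) on a counter holding 5 returns 5 and leaves 8 in the counter.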

    As for using GCC 4.5.0 without an FPU, I can't say anything: I think every active Amiga user has an FPU, so I haven't done anything to get GCC working with soft-float.

    IEEE float is too slow. If you want to write a program for Amigas without an FPU, it is better to use the FFP library of AmigaOS, which is a lot faster than IEEE software float.

    I think GCC is broken in general when software float is used in complex programs, because a -msoft-float build of ffmpeg does not work correctly with GCC 3.4.0 either.

    And if you are thinking of porting a Linux program so that it works on a 68020/68881 system, that is really, really unusable, because on other desktop systems programmers can use the FPU without any speed loss.

     
  • Bernd Roesch
    2010-10-30

    I forgot to say: without CPU options, GCC 4.x creates a 68040 build. I did this because then you need no extra parameters in configure or makefiles, and you can't forget to set the CPU parameters for a fast build.

     
  • Franck Charlet
    2010-11-01

    I'm using -m68020 in the parameters.

    Here is a simple piece of code which produces an error when it shouldn't:

    asm("copy:\n"
    "\tdbf d7,copy\n"
    );

    "jne" isn't a 68k opcode, and any conditional branch produces the same error.

    I'm working on a game for an FPU-less Amiga, so I can't use any FPU code. The GCC version I'm currently using (3.4.0) automatically uses the ROM libraries to handle floating point. Why isn't this one doing the same?

     
  • Bernd Roesch
    2010-11-01

    Please copy exactly this example into a function. In the programs I tested, it works with GCC 4.5.0.

    asm("copy:\n"
    "\tdbf d7,copy\n"
    );

    >I'm working on a game for an FPU-less Amiga, so I can't use any FPU code. The
    >GCC version I'm currently using (3.4.0) automatically uses the ROM libraries to handle floating point. Why isn't this one doing the same?

    GCC uses the Amiga math libraries only for sin, cos, etc. Maybe you can write a test program to check this. For add, sub, mul and div, GCC uses its own code, which is in libgcc.

    I hope your game runs fast enough. In earlier days, game programmers used fixed-point values with 16 bits for the integer part and 16 bits after the point.
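
    The 16.16 fixed-point format mentioned here can be sketched in C roughly as follows. The type name fix16 and the helper names are made up for illustration and are not taken from any Amiga or GCC library. The multiply and divide helpers assume a 64-bit intermediate is acceptable, and fix16_to_int assumes the compiler uses an arithmetic right shift for negative values.

    ```c
    #include <stdint.h>

    /* 16.16 fixed point: the upper 16 bits hold the integer part, the lower
       16 bits hold the fraction. All names here are illustrative. */
    typedef int32_t fix16;

    #define FIX16_ONE ((fix16)65536)   /* the value 1.0 */

    static fix16 fix16_from_int(int n) { return (fix16)(n * 65536); }

    /* Drops the fraction bits (assumes arithmetic shift for negatives). */
    static int fix16_to_int(fix16 x) { return (int)(x >> 16); }

    /* Addition and subtraction are plain integer operations. */
    static fix16 fix16_add(fix16 a, fix16 b) { return a + b; }

    static fix16 fix16_mul(fix16 a, fix16 b)
    {
        /* The raw product carries 32 fraction bits; widen to 64 bits so it
           cannot overflow, then shift 16 of the fraction bits away. */
        return (fix16)(((int64_t)a * (int64_t)b) >> 16);
    }

    static fix16 fix16_div(fix16 a, fix16 b)
    {
        /* Pre-scale the dividend by 2^16 so the quotient keeps 16 fraction
           bits. The caller must make sure b is not zero. */
        return (fix16)(((int64_t)a * 65536) / b);
    }
    ```

    For example, fix16_mul(fix16_from_int(3), FIX16_ONE / 2) gives 0x18000, which is 1.5 in this format. Every operation is ordinary integer arithmetic, so no FPU or soft-float call is involved.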

    Here is the asm code that is executed for every add, sub and multiplication inside GCC. I hope a 68020 has enough power, since every float multiplication can take over 140 clock cycles.

    But if you want to use floats, it is best to use the Amiga FFP library.
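
    To give an idea of what such soft-float routines have to do before any arithmetic happens, here is a small C sketch that splits a single-precision value into sign, exponent and mantissa and puts the hidden bit back with the mask 0x00800000, the same mask __mulsf3 loads for that purpose. It is an illustration only, not the libgcc code: the struct and function names are made up, and it assumes float is a 32-bit IEEE 754 value.

    ```c
    #include <stdint.h>
    #include <string.h>

    /* Illustrative container for the three fields of an IEEE 754 single. */
    struct sf_parts {
        uint32_t sign;      /* 0 for positive, 1 for negative */
        uint32_t exponent;  /* biased exponent, 0..255 */
        uint32_t mantissa;  /* 23 fraction bits, plus hidden bit if normal */
    };

    static struct sf_parts sf_unpack(float f)
    {
        struct sf_parts p;
        uint32_t bits;

        /* Copy the raw bit pattern; assumes sizeof(float) == 4. */
        memcpy(&bits, &f, sizeof bits);

        p.sign     = bits >> 31;
        p.exponent = (bits >> 23) & 0xffu;
        p.mantissa = bits & 0x007fffffu;

        /* Normal numbers carry an implicit leading 1 that is not stored.
           Zero and denormals (exponent 0) and infinity or NaN (exponent 255)
           do not get it. */
        if (p.exponent != 0 && p.exponent != 0xffu)
            p.mantissa |= 0x00800000u;

        return p;
    }
    ```

    For 1.0f this yields sign 0, exponent 127 and mantissa 0x800000. All of this bookkeeping, plus exponent alignment, rounding and repacking, runs in software on every single operation, which is where the cycle counts come from.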

    | __subsf3
    |=============================================================================

    | float __subsf3(float, float);
    FUNC(__subsf3)
    SYM (__subsf3):
    bchg IMM (31),sp@(8) | change sign of second operand
    | and fall through
    |=============================================================================
    | __addsf3
    |=============================================================================

    | float __addsf3(float, float);
    FUNC(__addsf3)
    SYM (__addsf3):
    #ifndef __mcoldfire__
    link a6,IMM (0) | everything will be done in registers
    moveml d2-d7,sp@- | save all data registers but d0-d1
    #else
    link a6,IMM (-24)
    moveml d2-d7,sp@
    #endif
    movel a6@(8),d0 | get first operand
    movel a6@(12),d1 | get second operand
    movel d0,a0 | get d0's sign bit '
    addl d0,d0 | check and clear sign bit of a
    beq Laddsf$b | if zero return second operand
    movel d1,a1 | save b's sign bit '
    addl d1,d1 | get rid of sign bit
    beq Laddsf$a | if zero return first operand

    | Get the exponents and check for denormalized and/or infinity.

    movel IMM (0x00ffffff),d4 | mask to get fraction
    movel IMM (0x01000000),d5 | mask to put hidden bit back

    movel d0,d6 | save a to get exponent
    andl d4,d0 | get fraction in d0
    notl d4 | make d4 into a mask for the exponent
    andl d4,d6 | get exponent in d6
    beq Laddsf$a$den | branch if a is denormalized
    cmpl d4,d6 | check for INFINITY or NaN
    beq Laddsf$nf
    swap d6 | put exponent into first word
    orl d5,d0 | and put hidden bit back
    Laddsf$1:
    | Now we have a's exponent in d6 (second byte) and the mantissa in d0. '
    movel d1,d7 | get exponent in d7
    andl d4,d7 |
    beq Laddsf$b$den | branch if b is denormalized
    cmpl d4,d7 | check for INFINITY or NaN
    beq Laddsf$nf
    swap d7 | put exponent into first word
    notl d4 | make d4 into a mask for the fraction
    andl d4,d1 | get fraction in d1
    orl d5,d1 | and put hidden bit back
    Laddsf$2:
    | Now we have b's exponent in d7 (second byte) and the mantissa in d1. '

    | Note that the hidden bit corresponds to bit #FLT_MANT_DIG-1, and we
    | shifted right once, so bit #FLT_MANT_DIG is set (so we have one extra
    | bit).

    movel d1,d2 | move b to d2, since we want to use
    | two registers to do the sum
    movel IMM (0),d1 | and clear the new ones
    movel d1,d3 |

    | Here we shift the numbers in registers d0 and d1 so the exponents are the
    | same, and put the largest exponent in d6. Note that we are using two
    | registers for each number (see the discussion by D. Knuth in "Seminumerical
    | Algorithms").
    #ifndef __mcoldfire__
    cmpw d6,d7 | compare exponents
    #else
    cmpl d6,d7 | compare exponents
    #endif
    beq Laddsf$3 | if equal don't shift '
    bhi 5f | branch if second exponent largest
    1:
    subl d6,d7 | keep the largest exponent
    negl d7
    #ifndef __mcoldfire__
    lsrw IMM (8),d7 | put difference in lower byte
    #else
    lsrl IMM (8),d7 | put difference in lower byte
    #endif
    | if difference is too large we don't shift (actually, we can just exit) '
    #ifndef __mcoldfire__
    cmpw IMM (FLT_MANT_DIG+2),d7
    #else
    cmpl IMM (FLT_MANT_DIG+2),d7
    #endif
    bge Laddsf$b$small
    #ifndef __mcoldfire__
    cmpw IMM (16),d7 | if difference >= 16 swap
    #else
    cmpl IMM (16),d7 | if difference >= 16 swap
    #endif
    bge 4f
    2:
    #ifndef __mcoldfire__
    subw IMM (1),d7
    #else
    subql IMM (1), d7
    #endif
    3:
    #ifndef __mcoldfire__
    lsrl IMM (1),d2 | shift right second operand
    roxrl IMM (1),d3
    dbra d7,3b
    #else
    lsrl IMM (1),d3
    btst IMM (0),d2
    beq 10f
    bset IMM (31),d3
    10: lsrl IMM (1),d2
    subql IMM (1), d7
    bpl 3b
    #endif
    bra Laddsf$3
    4:
    movew d2,d3
    swap d3
    movew d3,d2
    swap d2
    #ifndef __mcoldfire__
    subw IMM (16),d7
    #else
    subl IMM (16),d7
    #endif
    bne 2b | if still more bits, go back to normal case
    bra Laddsf$3
    5:
    #ifndef __mcoldfire__
    exg d6,d7 | exchange the exponents
    #else
    eorl d6,d7
    eorl d7,d6
    eorl d6,d7
    #endif
    subl d6,d7 | keep the largest exponent
    negl d7 |
    #ifndef __mcoldfire__
    lsrw IMM (8),d7 | put difference in lower byte
    #else
    lsrl IMM (8),d7 | put difference in lower byte
    #endif
    | if difference is too large we don't shift (and exit!) '
    #ifndef __mcoldfire__
    cmpw IMM (FLT_MANT_DIG+2),d7
    #else
    cmpl IMM (FLT_MANT_DIG+2),d7
    #endif
    bge Laddsf$a$small
    #ifndef __mcoldfire__
    cmpw IMM (16),d7 | if difference >= 16 swap
    #else
    cmpl IMM (16),d7 | if difference >= 16 swap
    #endif
    bge 8f
    6:
    #ifndef __mcoldfire__
    subw IMM (1),d7
    #else
    subl IMM (1),d7
    #endif
    7:
    #ifndef __mcoldfire__
    lsrl IMM (1),d0 | shift right first operand
    roxrl IMM (1),d1
    dbra d7,7b
    #else
    lsrl IMM (1),d1
    btst IMM (0),d0
    beq 10f
    bset IMM (31),d1
    10: lsrl IMM (1),d0
    subql IMM (1),d7
    bpl 7b
    #endif
    bra Laddsf$3
    8:
    movew d0,d1
    swap d1
    movew d1,d0
    swap d0
    #ifndef __mcoldfire__
    subw IMM (16),d7
    #else
    subl IMM (16),d7
    #endif
    bne 6b | if still more bits, go back to normal case
    | otherwise we fall through

    | Now we have a in d0-d1, b in d2-d3, and the largest exponent in d6 (the
    | signs are stored in a0 and a1).

    Laddsf$3:
    | Here we have to decide whether to add or subtract the numbers
    #ifndef __mcoldfire__
    exg d6,a0 | get signs back
    exg d7,a1 | and save the exponents
    #else
    movel d6,d4
    movel a0,d6
    movel d4,a0
    movel d7,d4
    movel a1,d7
    movel d4,a1
    #endif
    eorl d6,d7 | combine sign bits
    bmi Lsubsf$0 | if negative a and b have opposite
    | sign so we actually subtract the
    | numbers

    | Here we have both positive or both negative
    #ifndef __mcoldfire__
    exg d6,a0 | now we have the exponent in d6
    #else
    movel d6,d4
    movel a0,d6
    movel d4,a0
    #endif
    movel a0,d7 | and sign in d7
    andl IMM (0x80000000),d7
    | Here we do the addition.
    addl d3,d1
    addxl d2,d0
    | Note: now we have d2, d3, d4 and d5 to play with!

    | Put the exponent, in the first byte, in d2, to use the "standard" rounding
    | routines:
    movel d6,d2
    #ifndef __mcoldfire__
    lsrw IMM (8),d2
    #else
    lsrl IMM (8),d2
    #endif

    | Before rounding normalize so bit #FLT_MANT_DIG is set (we will consider
    | the case of denormalized numbers in the rounding routine itself).
    | As in the addition (not in the subtraction!) we could have set
    | one more bit we check this:
    btst IMM (FLT_MANT_DIG+1),d0
    beq 1f
    #ifndef __mcoldfire__
    lsrl IMM (1),d0
    roxrl IMM (1),d1
    #else
    lsrl IMM (1),d1
    btst IMM (0),d0
    beq 10f
    bset IMM (31),d1
    10: lsrl IMM (1),d0
    #endif
    addl IMM (1),d2
    1:
    lea pc@(Laddsf$4),a0 | to return from rounding routine
    PICLEA SYM (_fpCCR),a1 | check the rounding mode
    #ifdef __mcoldfire__
    clrl d6
    #endif
    movew a1@(6),d6 | rounding mode in d6
    beq Lround$to$nearest
    #ifndef __mcoldfire__
    cmpw IMM (ROUND_TO_PLUS),d6
    #else
    cmpl IMM (ROUND_TO_PLUS),d6
    #endif
    bhi Lround$to$minus
    blt Lround$to$zero
    bra Lround$to$plus
    Laddsf$4:
    | Put back the exponent, but check for overflow.
    #ifndef __mcoldfire__
    cmpw IMM (0xff),d2
    #else
    cmpl IMM (0xff),d2
    #endif
    bhi 1f
    bclr IMM (FLT_MANT_DIG-1),d0
    #ifndef __mcoldfire__
    lslw IMM (7),d2
    #else
    lsll IMM (7),d2
    #endif
    swap d2
    orl d2,d0
    bra Laddsf$ret
    1:
    moveq IMM (ADD),d5
    bra Lf$overflow

    Lsubsf$0:
    | We are here if a > 0 and b < 0 (sign bits cleared).
    | Here we do the subtraction.
    movel d6,d7 | put sign in d7
    andl IMM (0x80000000),d7

    subl d3,d1 | result in d0-d1
    subxl d2,d0 |
    beq Laddsf$ret | if zero just exit
    bpl 1f | if positive skip the following
    bchg IMM (31),d7 | change sign bit in d7
    negl d1
    negxl d0
    1:
    #ifndef __mcoldfire__
    exg d2,a0 | now we have the exponent in d2
    lsrw IMM (8),d2 | put it in the first byte
    #else
    movel d2,d4
    movel a0,d2
    movel d4,a0
    lsrl IMM (8),d2 | put it in the first byte
    #endif

    | Now d0-d1 is positive and the sign bit is in d7.

    | Note that we do not have to normalize, since in the subtraction bit
    | #FLT_MANT_DIG+1 is never set, and denormalized numbers are handled by
    | the rounding routines themselves.
    lea pc@(Lsubsf$1),a0 | to return from rounding routine
    PICLEA SYM (_fpCCR),a1 | check the rounding mode
    #ifdef __mcoldfire__
    clrl d6
    #endif
    movew a1@(6),d6 | rounding mode in d6
    beq Lround$to$nearest
    #ifndef __mcoldfire__
    cmpw IMM (ROUND_TO_PLUS),d6
    #else
    cmpl IMM (ROUND_TO_PLUS),d6
    #endif
    bhi Lround$to$minus
    blt Lround$to$zero
    bra Lround$to$plus
    Lsubsf$1:
    | Put back the exponent (we can't have overflow!). '
    bclr IMM (FLT_MANT_DIG-1),d0
    #ifndef __mcoldfire__
    lslw IMM (7),d2
    #else
    lsll IMM (7),d2
    #endif
    swap d2
    orl d2,d0
    bra Laddsf$ret

    | If one of the numbers was too small (difference of exponents >=
    | FLT_MANT_DIG+2) we return the other (and now we don't have to '
    | check for finiteness or zero).
    Laddsf$a$small:
    movel a6@(12),d0
    PICLEA SYM (_fpCCR),a0
    movew IMM (0),a0@
    #ifndef __mcoldfire__
    moveml sp@+,d2-d7 | restore data registers
    #else
    moveml sp@,d2-d7
    | XXX if frame pointer is ever removed, stack pointer must
    | be adjusted here.
    #endif
    unlk a6 | and return
    rts

    Laddsf$b$small:
    movel a6@(8),d0
    PICLEA SYM (_fpCCR),a0
    movew IMM (0),a0@
    #ifndef __mcoldfire__
    moveml sp@+,d2-d7 | restore data registers
    #else
    moveml sp@,d2-d7
    | XXX if frame pointer is ever removed, stack pointer must
    | be adjusted here.
    #endif
    unlk a6 | and return
    rts

    | If the numbers are denormalized remember to put exponent equal to 1.

    Laddsf$a$den:
    movel d5,d6 | d5 contains 0x01000000
    swap d6
    bra Laddsf$1

    Laddsf$b$den:
    movel d5,d7
    swap d7
    notl d4 | make d4 into a mask for the fraction
    | (this was not executed after the jump)
    bra Laddsf$2

    | The rest is mainly code for the different results which can be
    | returned (checking always for +/-INFINITY and NaN).

    Laddsf$b:
    | Return b (if a is zero).
    movel a6@(12),d0
    cmpl IMM (0x80000000),d0 | Check if b is -0
    bne 1f
    movel a0,d7
    andl IMM (0x80000000),d7 | Use the sign of a
    clrl d0
    bra Laddsf$ret
    Laddsf$a:
    | Return a (if b is zero).
    movel a6@(8),d0
    1:
    moveq IMM (ADD),d5
    | We have to check for NaN and +/-infty.
    movel d0,d7
    andl IMM (0x80000000),d7 | put sign in d7
    bclr IMM (31),d0 | clear sign
    cmpl IMM (INFINITY),d0 | check for infty or NaN
    bge 2f
    movel d0,d0 | check for zero (we do this because we don't '
    bne Laddsf$ret | want to return -0 by mistake
    bclr IMM (31),d7 | if zero be sure to clear sign
    bra Laddsf$ret | if everything OK just return
    2:
    | The value to be returned is either +/-infty or NaN
    andl IMM (0x007fffff),d0 | check for NaN
    bne Lf$inop | if mantissa not zero is NaN
    bra Lf$infty

    Laddsf$ret:
    | Normal exit (a and b nonzero, result is not NaN nor +/-infty).
    | We have to clear the exception flags (just the exception type).
    PICLEA SYM (_fpCCR),a0
    movew IMM (0),a0@
    orl d7,d0 | put sign bit
    #ifndef __mcoldfire__
    moveml sp@+,d2-d7 | restore data registers
    #else
    moveml sp@,d2-d7
    | XXX if frame pointer is ever removed, stack pointer must
    | be adjusted here.
    #endif
    unlk a6 | and return
    rts

    Laddsf$ret$den:
    | Return a denormalized number (for addition we don't signal underflow) '
    lsrl IMM (1),d0 | remember to shift right back once
    bra Laddsf$ret | and return

    | Note: when adding two floats of the same sign if either one is
    | NaN we return NaN without regard to whether the other is finite or
    | not. When subtracting them (i.e., when adding two numbers of
    | opposite signs) things are more complicated: if both are INFINITY
    | we return NaN, if only one is INFINITY and the other is NaN we return
    | NaN, but if it is finite we return INFINITY with the corresponding sign.

    Laddsf$nf:
    moveq IMM (ADD),d5
    | This could be faster but it is not worth the effort, since it is not
    | executed very often. We sacrifice speed for clarity here.
    movel a6@(8),d0 | get the numbers back (remember that we
    movel a6@(12),d1 | did some processing already)
    movel IMM (INFINITY),d4 | useful constant (INFINITY)
    movel d0,d2 | save sign bits
    movel d1,d3
    bclr IMM (31),d0 | clear sign bits
    bclr IMM (31),d1
    | We know that one of them is either NaN or +/-INFINITY
    | Check for NaN (if either one is NaN return NaN)
    cmpl d4,d0 | check first a (d0)
    bhi Lf$inop
    cmpl d4,d1 | check now b (d1)
    bhi Lf$inop
    | Now comes the check for +/-INFINITY. We know that both are (maybe not
    | finite) numbers, but we have to check if both are infinite whether we
    | are adding or subtracting them.
    eorl d3,d2 | to check sign bits
    bmi 1f
    movel d0,d7
    andl IMM (0x80000000),d7 | get (common) sign bit
    bra Lf$infty
    1:
    | We know one (or both) are infinite, so we test for equality between the
    | two numbers (if they are equal they have to be infinite both, so we
    | return NaN).
    cmpl d1,d0 | are both infinite?
    beq Lf$inop | if so return NaN

    movel d0,d7
    andl IMM (0x80000000),d7 | get a's sign bit '
    cmpl d4,d0 | test now for infinity
    beq Lf$infty | if a is INFINITY return with this sign
    bchg IMM (31),d7 | else we know b is INFINITY and has
    bra Lf$infty | the opposite sign

    |=============================================================================
    | __mulsf3
    |=============================================================================

    | float __mulsf3(float, float);
    FUNC(__mulsf3)
    SYM (__mulsf3):
    #ifndef __mcoldfire__
    link a6,IMM (0)
    moveml d2-d7,sp@-
    #else
    link a6,IMM (-24)
    moveml d2-d7,sp@
    #endif
    movel a6@(8),d0 | get a into d0
    movel a6@(12),d1 | and b into d1
    movel d0,d7 | d7 will hold the sign of the product
    eorl d1,d7 |
    andl IMM (0x80000000),d7
    movel IMM (INFINITY),d6 | useful constant (+INFINITY)
    movel d6,d5 | another (mask for fraction)
    notl d5 |
    movel IMM (0x00800000),d4 | this is to put hidden bit back
    bclr IMM (31),d0 | get rid of a's sign bit '
    movel d0,d2 |
    beq Lmulsf$a$0 | branch if a is zero
    bclr IMM (31),d1 | get rid of b's sign bit '
    movel d1,d3 |
    beq Lmulsf$b$0 | branch if b is zero
    cmpl d6,d0 | is a big?
    bhi Lmulsf$inop | if a is NaN return NaN
    beq Lmulsf$inf | if a is INFINITY we have to check b
    cmpl d6,d1 | now compare b with INFINITY
    bhi Lmulsf$inop | is b NaN?
    beq Lmulsf$overflow | is b INFINITY?
    | Here we have both numbers finite and nonzero (and with no sign bit).
    | Now we get the exponents into d2 and d3.
    andl d6,d2 | and isolate exponent in d2
    beq Lmulsf$a$den | if exponent is zero we have a denormalized
    andl d5,d0 | and isolate fraction
    orl d4,d0 | and put hidden bit back
    swap d2 | I like exponents in the first byte
    #ifndef __mcoldfire__
    lsrw IMM (7),d2 |
    #else
    lsrl IMM (7),d2 |
    #endif
    Lmulsf$1: | number
    andl d6,d3 |
    beq Lmulsf$b$den |
    andl d5,d1 |
    orl d4,d1 |
    swap d3 |
    #ifndef __mcoldfire__
    lsrw IMM (7),d3 |
    #else
    lsrl IMM (7),d3 |
    #endif
    Lmulsf$2: |
    #ifndef __mcoldfire__
    addw d3,d2 | add exponents
    subw IMM (F_BIAS+1),d2 | and subtract bias (plus one)
    #else
    addl d3,d2 | add exponents
    subl IMM (F_BIAS+1),d2 | and subtract bias (plus one)
    #endif

    | We are now ready to do the multiplication. The situation is as follows:
    | both a and b have bit FLT_MANT_DIG-1 set (even if they were
    | denormalized to start with!), which means that in the product
    | bit 2*(FLT_MANT_DIG-1) (that is, bit 2*FLT_MANT_DIG-2-32 of the
    | high long) is set.

    | To do the multiplication let us move the number a little bit around ...
    movel d1,d6 | second operand in d6
    movel d0,d5 | first operand in d4-d5
    movel IMM (0),d4
    movel d4,d1 | the sums will go in d0-d1
    movel d4,d0

    | now bit FLT_MANT_DIG-1 becomes bit 31:
    lsll IMM (31-FLT_MANT_DIG+1),d6

    | Start the loop (we loop #FLT_MANT_DIG times):
    moveq IMM (FLT_MANT_DIG-1),d3
    1: addl d1,d1 | shift sum
    addxl d0,d0
    lsll IMM (1),d6 | get bit bn
    bcc 2f | if not set skip sum
    addl d5,d1 | add a
    addxl d4,d0
    2:
    #ifndef __mcoldfire__
    dbf d3,1b | loop back
    #else
    subql IMM (1),d3
    bpl 1b
    #endif

    | Now we have the product in d0-d1, with bit (FLT_MANT_DIG - 1) + FLT_MANT_DIG
    | (mod 32) of d0 set. The first thing to do now is to normalize it so bit
    | FLT_MANT_DIG is set (to do the rounding).
    #ifndef __mcoldfire__
    rorl IMM (6),d1
    swap d1
    movew d1,d3
    andw IMM (0x03ff),d3
    andw IMM (0xfd00),d1
    #else
    movel d1,d3
    lsll IMM (8),d1
    addl d1,d1
    addl d1,d1
    moveq IMM (22),d5
    lsrl d5,d3
    orl d3,d1
    andl IMM (0xfffffd00),d1
    #endif
    lsll IMM (8),d0
    addl d0,d0
    addl d0,d0
    #ifndef __mcoldfire__
    orw d3,d0
    #else
    orl d3,d0
    #endif

    moveq IMM (MULTIPLY),d5

    btst IMM (FLT_MANT_DIG+1),d0
    beq Lround$exit
    #ifndef __mcoldfire__
    lsrl IMM (1),d0
    roxrl IMM (1),d1
    addw IMM (1),d2
    #else
    lsrl IMM (1),d1
    btst IMM (0),d0
    beq 10f
    bset IMM (31),d1
    10: lsrl IMM (1),d0
    addql IMM (1),d2
    #endif
    bra Lround$exit

    Lmulsf$inop:
    moveq IMM (MULTIPLY),d5
    bra Lf$inop

    Lmulsf$overflow:
    moveq IMM (MULTIPLY),d5
    bra Lf$overflow

    Lmulsf$inf:
    moveq IMM (MULTIPLY),d5
    | If either is NaN return NaN; else both are (maybe infinite) numbers, so
    | return INFINITY with the correct sign (which is in d7).
    cmpl d6,d1 | is b NaN?
    bhi Lf$inop | if so return NaN
    bra Lf$overflow | else return +/-INFINITY

    | If either number is zero return zero, unless the other is +/-INFINITY,
    | or NaN, in which case we return NaN.
    Lmulsf$b$0:
    | Here d1 (==b) is zero.
    movel a6@(8),d1 | get a again to check for non-finiteness
    bra 1f
    Lmulsf$a$0:
    movel a6@(12),d1 | get b again to check for non-finiteness
    1: bclr IMM (31),d1 | clear sign bit
    cmpl IMM (INFINITY),d1 | and check for a large exponent
    bge Lf$inop | if b is +/-INFINITY or NaN return NaN
    movel d7,d0 | else return signed zero
    PICLEA SYM (_fpCCR),a0 |
    movew IMM (0),a0@ |
    #ifndef __mcoldfire__
    moveml sp@+,d2-d7 |
    #else
    moveml sp@,d2-d7
    | XXX if frame pointer is ever removed, stack pointer must
    | be adjusted here.
    #endif
    unlk a6 |
    rts |

    | If a number is denormalized we put an exponent of 1 but do not put the
    | hidden bit back into the fraction; instead we shift left until bit 23
    | (the hidden bit) is set, adjusting the exponent accordingly. We do this
    | to ensure that the product of the fractions is close to 1.
    Lmulsf$a$den:
    movel IMM (1),d2
    andl d5,d0
    1: addl d0,d0 | shift a left (until bit 23 is set)
    #ifndef __mcoldfire__
    subw IMM (1),d2 | and adjust exponent
    #else
    subql IMM (1),d2 | and adjust exponent
    #endif
    btst IMM (FLT_MANT_DIG-1),d0
    bne Lmulsf$1 |
    bra 1b | else loop back

    Lmulsf$b$den:
    movel IMM (1),d3
    andl d5,d1
    1: addl d1,d1 | shift b left until bit 23 is set
    #ifndef __mcoldfire__
    subw IMM (1),d3 | and adjust exponent
    #else
    subql IMM (1),d3 | and adjust exponent
    #endif
    btst IMM (FLT_MANT_DIG-1),d1
    bne Lmulsf$2 |
    bra 1b | else loop back

    |=============================================================================
    | __divsf3
    |=============================================================================