|
From: Carl E. L. <ce...@li...> - 2012-01-19 19:36:28
|
On Thu, 2012-01-19 at 19:12 +0100, Julian Seward wrote:
> > If there is value in adding the DFP type, I am willing to change my
> > preliminary POWER7 implementation to add the DFP types rather then using
> > the floating point types as I currently have. Having the DFP type may
> > make the overall code clearer then having Iops with the D32, D64 and D128
> > in the name but actually operate on F32, F64 and F128 values.
> >
> > Thoughts on adding DFP type?
>
> Yes, please do. It does cause the code to be a little longer, but it's
> also clearer and easier to verify as being correct. In the long run the
> latter points are much more important.
>
>
>
> > The POWER7 and s390 machines support the additional four rounding modes for
> > DFP (total of 8 not 9 modes) in the IEEE specification. After some
> > thought, I think it would be better to explicitly change the rounding mode
> > specifier from IRRoundingMode(I32) to IRRoundingModeDFP(I32). The concern
> > is if we extend the existing rounding mode specifier then there will be
> > rounding modes for which the existing binary floating point instructions
> > are not specified to support. Yes, it is some replication of code to
> > create a second super-set of rounding modes for DFP but from an overall
> > consistency, clarity and accuracy standpoint I think it might be best.
>
> I agree. +1 for IRRoundingModeDFP. It will have 8 possible values,
> yes?
>
>
>
> > OK, I had not specifically read the document above but was aware that
> > Valgrind only had limited rounding and exception support. I included
> > the full rounding and exception support for completeness. I thought it
> > better to include it at this point then to gloss over it. I added a
> > comment to the document ahead of the field definitions to that effect.
> > So, if there is no chance that the exception support will be added then
> > we can just drop the exception specification part.
> >
> > Thoughts?
>
> Drop the exception specification part. If we ever need to support FP
> exceptions then we will need to design a solution which works cleanly
> for both regular FP and DFP.
>
>
>
> > The immediate exponent value in the instruction is specific to the POWER
> > processor. I have spent some time reconsidering this. We really don't
> > need it so I will remove the IRRoundingModeAndEponent(I32).
>
> Good.
>
>
>
> > Comment from Florian:
> > > You did not describe InvOperationModes. But looking at insn LDETR
> > > (which is D32 -> D64 conversion) I gather that InvOperationModes
> > > controls whether or not the IEEE-invalid-operation-exception is
> > > delivered. It's essentially a Boolean value. I propose to ignore it
> > > and deliver the exception unconditionally (which is what we do for
> > > binary floating point).
> >
> > Julian do you agree we should remove the InvOperationModes(I32) from
> > the Iop specification? I am OK with it.
>
> Yes, please remove.
>
>
>
> > > > InvOperationModes(I32) x D64 -> D128
> > > > Iop_D64toD128
> > >
> > > ditto
> > >
> > > > IRRoundingExceptionModes(I32) x D64 -> D32
> > > > Iop_RoundD64toD32
> > > >
> > > > IRRoundingExceptionModes(I32) x D128 -> D64
> > > > Iop_RoundD128toD64
> > > >
> > > > IRRoundingExceptionModes(I32) x I64 -> D64
> > > > Iop_I64StoD64
> > >
> > > ok
> >
> > Comment from Florian:
> > > These two should be renamed to Iop_D64toD32 and Iop_D128toD64 for
> > > symmetry in naming with binary floating point ops.
> >
> > I disagree. The drsp instruction is analogous to the binary FP frsp
> > instruction, which (for ppc64) uses the Iop Iop_RoundF64ToF32. The
> > distinction is that an Iop that does not have "Round" in the name is
> > a conversion from one general type (i.e., floating point) to a different
> > general type (i.e., integer); whereas an Iop that *does* have "Round"
> > in the name is a narrowing of the same general type.
>
> Florian is correct, assuming that the types for these operations
> (D64 -> D32 etc) are really what you (Carl) intended. Problem is
> that it's unclear what behaviour you want. Your options are
>
> round and convert to a different format
> in which case you want the (eg) Iop_D64toD32 name style
> and the IR types for src and dst will then be different
> (eg D64 and D32 in this case)
>
> round (to a smaller range) but remain in the same format
> in which case you want the Iop_RoundD64toD32 names
> in this case the src and dst types are the same (D64)
> and D32 merely indicates the range to which the value is
> rounded
>
> so which is it?
>
>
The intention is the Iop_RoundD128toD64 Iop takes a value with the
physical D128 bit encoding and generates a result with the physical D64
bit encoding. Similarly for the Iop_RoundD64toD32 Iop. So based on
your definitions above, the first one (round and convert to a different
format) is correct and the Iops need to be renamed. The above
clarification was really helpful to understand the naming convention.
Thanks. I will make the Iop name changes.
>
>
> > Comment from Florian:
> > > For s390 we also need:
> > > IROp description s390 insn
> > >
> > > Iop_I64StoD128 IRRoundingMode(I32) x signed I64 -> D128 CXGTR
> > > Iop_I32StoD64 signed I32 -> D64 CDFTR
> > > Iop_I32StoD128 signed I32 -> D128 CXFTR
> > > Iop_I64UtoD64 IRRoundingMode(I32) x unsigned I64 -> D64 CDLGTR
> > > Iop_I64UtoD128 IRRoundingMode(I32) x unsigned I64 -> D128 CXLGTR
> > > Iop_I32UtoD64 unsigned I32 -> D64 CDLFTR
> > > Iop_I32UtoD128 unsigned I32 -> D128 CXLFTR
> > >
> > > We need both: conversion to signed and unsigned int
> > >
> > > IROp description s390 insn
> > >
> > > Iop_D64toI64S IRRoundingMode(I32) x D64 -> signed I64 CGDTR(A)
> > > Iop_D128toI64S IRRoundingMode(I32) x D128 -> signed I64 CGXTR(A)
> > > Iop_D64toI32S IRRoundingMode(I32) x D64 -> signed I32 CFDTR
> > > Iop_D128toI32S IRRoundingMode(I32) x D128 -> signed I32 CFXTR
> > > Iop_D64toI64U IRRoundingMode(I32) x D64 -> unsigned I64 CLGDTR
> > > Iop_D128toI64U IRRoundingMode(I32) x D128 -> unsigned I64 CLGXTR
> > > Iop_D64toI32U IRRoundingMode(I32) x D64 -> unsigned I32 CLFDTR
> > > Iop_D128toI32U IRRoundingMode(I32) x D128 -> unsigned I32 CLFXTR
> > >
> > > Note, the new IRops for conversion to 32-bit wide results and from D128.
> >
> > Comment from Christian Borntraege:
> > > > The PFPO insn is used to convert between binary floating point and
> > > > decimal floating point. Since we have 3 formats each, that makes 9
> > > > conversion ops for each direction:
> > > >
> > > > Iop_D32toF32 IRRoundingMode(I32) x D32 -> F32
> > > > Iop_D32toF64 IRRoundingMode(I32) x D32 -> F64
> > > > Iop_D32toF128 IRRoundingMode(I32) x D32 -> F128
> > > > Iop_D64toF32 IRRoundingMode(I32) x D64 -> F32
> > > > Iop_D64toF64 IRRoundingMode(I32) x D64 -> F64
> > > > Iop_D64toF128 IRRoundingMode(I32) x D64 -> F128
> > > > Iop_D128toF32 IRRoundingMode(I32) x D128 -> F32
> > > > Iop_D128toF64 IRRoundingMode(I32) x D128 -> F64
> > > > Iop_D128toF128 IRRoundingMode(I32) x D128 -> F128
> > > >
> > > > Iop_F32toD32 IRRoundingMode(I32) x F32 -> D32
> > > > Iop_F32toD64 IRRoundingMode(I32) x F32 -> D64
> > > > Iop_F32toD128 IRRoundingMode(I32) x F32 -> D128
> > > > Iop_F64toD32 IRRoundingMode(I32) x F64 -> D32
> > > > Iop_F64toD64 IRRoundingMode(I32) x F64 -> D64
> > > > Iop_F64toD128 IRRoundingMode(I32) x F64 -> D128
> > > > Iop_F128toD32 IRRoundingMode(I32) x F128 -> D32
> > > > Iop_F128toD64 IRRoundingMode(I32) x F128 -> D64
> > > > Iop_F128toD128 IRRoundingMode(I32) x F128 -> D128
> > >
> > > If you look at pfpo, then the instruction has the same tricky
> > > behaviour as EXecute. Since a self checking prefix and 18 Iops is
> > >
> > > pretty expensive I think that pfpo qualifies for a helper.
> >
> > These conversion modes are specifically support by s390 but not POWER.
> > POWER supports a subset of the conversions supported by s390. This is
> > one of the areas where I feel we should defer the decision on adding more
> > Iops, using a helper function or emulating these conversions with the
> > subset of conversions that have been proposed until the s390 team is ready
> > to add the s390 support to Valgrind. This is beyond the scope of what is
> > needed for me to add the POWER support. But I think it is important that
> > the issue has been raised to understand the full scope of what is needed
> > by both architectures.
>
> Ok. I agree .. just add the conversions needed for POWER, unless
> Christian/Florian want to also add the S390 needed conversions now
> (I am unclear if you do, or not)
I have no need to support all these on POWER. I will put the following
into version 2 of the proposal so it is clear we may need additional
Iops in the future for s390.
"Additional format conversion Iops for converting to/from
decimal floating point and binary floating point may need to be
added later for the s390 support."
I think it is important that we state some additional Iops maybe needed
for other architectures. But I would like to defer the final decision
on exactly how these conversions are supported via additional Iops or a
helper function to when the s390 support is done.
>
>
>
> >
> > > > IRRoundingExceptionModes(I32) x D64 -> I64
> > > > Iop_D64toI64
> > >
> > > this is underspecified .. you need to decide whether that's a
> > > conversion to signed or unsigned I64 (or maybe you need both)
> > > and call them Iop_D64toI64S or Iop_D64toI64U respectively.
> > > (I think you comment about this further down in the doc.)
> > > I mention this partly because sorting out such ambiguity in the
> > > past for the Fxx->Ixx conversions required a lot of hoop
> > > jumping, so we might as well get it straightened out up front.
> >
> > OK, should be convert to signed integer. Changed the name to
> > Iop_D64toI64S.
>
> Good.
>
>
>
> > > > ROUNDING INSTRUCTIONS
> > > > -----------------------
> > > > IRRoundingMode(I32) x D64 -> D64
> > > > Iop_RoundD64
> > > >
> > > > IRRoundingMode(I32) x D128 -> D128
> > > > Iop_RoundD128
> > >
> > > ok
> >
> > Comment from Florian:
> > > These should be named Iop_RoundD64toInt and Iop_RoundD128toInt for
> > > symmetry in naming with binary floating point ops.
> >
> > I am OK with the name change. Comments?
>
> /me agrees with Florian.
>
>
>
> >
> > > > COMPARE INSTRUCTIONS
> > > > -----------------------
> > > > D64 x D64 -> IRCmpF64Result(I32)
> > > > Iop_CmpD64
> > > >
> > > > D128 x D128 -> IRCmpF64Result(I32)
> > > > Iop_CmpD128
> > >
> > > ok
> >
> > Comment from Florian:
> > > OK. I would use IRCmpD64Result and IRCmpD128Result. That allows
> > > us to use a different encoding, which may be desirable.
> >
> > I am OK with the name change. Comments?
>
> Fine by me.
>
>
>
> > > > QUANTIZE AND ROUND INSTRUCTIONS
> > > > -------------------------------
> > > > IRRoundingMode(I32) x D64-> D64
> > > > Iop_QuantizeID64,
> > > >
> > > > IRRoundingMode(I32) x D128-> D128
> > > > Iop_QuantizeID128
> > > >
> > > > IRRoundingMode(I32) x D64 x D64 -> D64
> > > > Iop_QuantizeD64
> > > >
> > > > IRRoundingMode(I32) x D128 x D128 -> D128
> > > > Iop_QuantizeD128
> > >
> > > I'm not clear what the ID vs D signifies in these names. Can
> > > they instead be called Iop_Quantize{Un,Bin}{D64,D128} to denote
> > > unary vs binary ness (ignoring the rounding mode arg which is
> > > present in all 4 cases).
> >
> > The first two with the I are for immediate value of the exponent for the
> > quantization operation. Remember above, I had the
> > IRRoundingModeAndEponent(I32) specification. It was also there for
> > specifying the immediate value. I didn't realize that I had a redundant
> > way of specifying the immediate exponent value. So, as said above, we drop
> > IRRoundingModeAndEponent(I32) specification.
> >
> > > What is quantization, anyway (in the context of DFP I mean)?
> > > Does it have any analogue in traditional IEEE754 FP ?
> >
> > Quantization is the process of rounding a number to a specified exponent as
> > mentioned earlier. For example, the U.S. specifies a dollar amount with at
> > most two fractional digits, for example $12.98. When doing a computation
> > involving money, the quantization instruction would be used to round the
> > result to two fractional digits.
> >
> > The immediate value to specify the desired exponent is specific to POWER.
> > Both s390 and POWER have instructions for changing the exponent of
> > (quantizing) a value in register A to match the exponent of a value in
> > register B. I think we should remove the Iops Iop_QuantizeID64 and
> > Iop_QuantizeID128 for specify an immediate exponent. For the POWER
> > instruction that specifies the immediate exponent, I can just generate a
> > DFP number with the desired exponent and pass that as the value in
> > register B as the target exponent. This will help minimize the number of
> > new Iops.
>
> Ok by me.
>
>
>
> > > > IRRoundingMode(I32) x D64 x D64 -> D64
> > > > Iop_SignificanceRoundD64
> > > >
> > > > IRRoundingMode(I32) x D128 x D128 -> D128
> > > > Iop_SignificanceRoundD128
> > > >
> > > >
> > > > EXTRACT AND INSERT INSTRUCTIONS
> > > > -------------------------------
> > > > D64 -> I64
> > > > Iop_ExtractExpD64
> > > >
> > > > D128 -> I64
> > > > Iop_ExtractExpD128
> > >
> > > The exponent really needs 64 bits? Can it be 32 bits? That
> > > might allow for more efficient code generation for 32 bit targets.
> >
> > The biased exponent could be stored as a 32 bit biased integer. The
> > functionality of the specific instruction that this Iop was intended for
> > specifies the exponent will be stored in a floating point register in a
> > signed integer format. I agree we can make the target be I32 and then
> > just handle the I32 to I64 conversion as needed for the specific
> > instruction. The exponent for any DFP number will easily fit in less then
> > 31 bits.
> >
> > I will change the above two Iops to return an I32 value.
>
> Good.
>
>
>
>
> >
> > Comment from Florian:
> > > Do we need to support these at all? In other words, does GCC issue these
> > > or do they show up in hand crafted assembler shipped with GCC/GLIBC?
> > > I don't know but will find out (for s390).
> >
> > Yes, the libdfp and binutils use them.
> >
> > > > I64 x I64 -> D64
> > > > Iop_InsertExpD64
> > > >
> > > > I64 x I128 -> D128
> > > > Iop_InsertExpD128
> > >
> > > ditto comment
> >
> > ditto above that we do need these Iops for instructions that are used.
>
> So (unclear) as with ExtractExp, you'll change the exp value to be an I32
> instead of an I64, yes?
Sorry, missed that. Yes, we should change this to be consistent with
the ExtractExp. So, it will be:
I32 x I64 -> D64
Iop_InsertExpD64
I32 x I128 -> D128
Iop_InsertExpD128
where the I32 is the exponent and the I64/128 is the significand.
> > > > SHIFT SIGNIFICAND INSTRUCTIONS
> > > > -------------------------------
> > > > U16 x D64 -> D64
> > > > Iop_ShlD64, Iop_ShrD64
> > > >
> > > > U16 x D128 -> D128
> > > > Iop_ShlD128, Iop_ShrD128
> > >
> > > two things: (1) does the shift amount need to be 16 bits?
> > > For all the other shifting style ops we have, the shift amount
> > > is encoded in 8 bits (Ity_I8) and I would prefer to stick with
> > > that for consistency, if possible. (2) pls put the shift amount
> > > as the second argument, not the first, as that too is consistent
> > > with all other shift ops we have (eg, Iop_Shr64)
> >
> > OK, changed shift to U8 and made it D64 x U8 -> D64; D128 x U8 -> D128
>
> Good.
>
>
>
> > > ---------------
> > >
> > > > This section give the detailed mapping of Power and s390
> > > >
> > > > instructions to the proposed DFP Iops or how they would be implemented
> > > > with the existing Iops and the proposed DFP Iops.
> > >
> > > I'll comment on this second half of the proposal tomorrow.
> > >
> > > J
> >
> > I have not seen any additional comments. It is probably best to focus on
> > resolving the above issues first as the second part is just a more detailed
> > discussion of the above summary of the Iops. Once we have agreement on the
> > above discussion points, I will update and post version 2 of the proposal.
>
> Yes, good plan.
>
> J
|