|
From: Carl E. L. <ce...@li...> - 2011-12-28 17:46:54
|
On Tue, 2011-12-27 at 23:17 +0100, Julian Seward wrote:
> Thanks for making a plausible looking proposal. Looks like it's
> heading in the right direction. There are some details to iron
> out, though. Please look at all of them.
Julian:
Thanks for taking the time to review the document. I have read over the
comments and they all seem very reasonable and easily addressed. I will
work on them and get back to you soon. Just an FYI, we did run the
proposal by the s390 team for a sanity check. It looked reasonable to
them but as they said, they have not really dug into the details of
doing the work so there may be some lingering issues. I have been
working on the POWER implementation. Currently I have 44 of the 49
POWER instructions implemented. I don't foresee any issues with the
remaining instructions. I wanted to have a good proof of concept
implementation to go with the proposal as a sanity check.
Thanks again and take care. Talk with you soon.
Carl Love
>
> Comments in order of reading the doc:
>
> General: how much has this been checked out by the s390 folks
> (Florian, Christian, Divya) ?
>
> General: please give a reference, including URL, to a publically available
> standard that defines the basic arithmetic (IEEE 754-2008 ?)
>
>
> > The IBM Power and s390 systems support 32-bit, 64-bit and 128-bit
> > decimal floating point (DFP) numbers. The DFP instructions use the
> > existing floating point registers. The DFP 128-bit operands are stored in
> > registers i and i+1 where i must be even. For example the DFP 128-bit add
> > is done as follows:
> >
> > lfd 12,48(31) // load the upper 64 bits of the first 128 value
> > in reg 12 lfd 13,56(31) // load the lower 64 bits of the first 128
> > value in reg 13 addi 0,31,64
> > mr 9,0
> > lfd 0,0(9) // load the upper 64 bits of the 128 second value
> > in reg 0 lfd 1,8(9) // load the lower 64 bits of the 128 second
> > value in reg 1 daddq 0,12,0 // do the add of the 128 DFP numbers,
> > result is stored in // registers 0 and 1. Note, registers 1 and 13 are
> > not // explicitly listed but are implied
>
> It feels like there's possible some confusion between types in the IR
> (that is, IRType) and how values are represented in ppc registers. These
> concepts are distinct, and are related only in the sense that it is necessary
> to choose types that don't cause the ppc->IR and IR->ppc translations to
> be inefficient.
>
> AFAICS (also, from reading the rest of the doc) you want three new types,
> Ity_D32, Ity_I64 and Ity_D128. (yes? that sounds right to me)
>
> Note that many of the back ends already convert F32-typed expression
> trees into 64-bit floating point code (eg, the host_ppc_isel.c) so
> doing so for D32 would be considered "normal".
>
>
>
> > Notation Description
> >
> > --------------------------------------------------------------------------
> > - DFP: Decimal Floating Point number possibly 32-bit, 64-bit, or
> > 128-bit format
> >
> > D32: 32-bit decimal floating point format These values
> > use F64 registers. The D32 term is used to distinguish the value from the
> > standard 32-bit floating point value.
> >
> > D64: 64-bit decimal floating point format. These values
> > use F64 registers. The D64 term is used to distinguish the value from the
> > standard 64-bit floating point value.
> >
> > D128: 128-bit decimal floating point format. These
> > values use a pair of two 64 bit floating point registers (F64). The
> > instruction only references the first register of the register pair. The
> > second register is implied.
>
> As per comments above, the comments re the PPC register layouts isn't
> directly relevant to what you need in the IR. (I don't care; I just want
> to be sure we have our concepts straight here)
>
>
> > IRRoundingMode(I32): Indicates the I32 argument is used
> > to hold the bits that specify the rounding mode to be used by the
> > instruction.
>
> Fine; as per existing code.
>
>
> > IRRoundingExceptionModes(I32): Indicates the I32 argument is used
> > to hold the bits that specify the rounding mode and the bits that specify
> > the exception mode to be used by the instruction. The s390 instructions
> > specify both modes whereas POWER does not specify either mode.
>
> Hmm. Have you read (in detail) the limitations re floating point
> described at
> http://valgrind.org/docs/manual/manual-core.html#manual-core.limits
>
> The point is that IR and Valgrind generally doesn't provide support
> for non-default exception modes, and silently assumes that exceptions
> are to be fixed up using the default IEEE fixup actions. So there's
> no point at the moment in adding exception action information into
> the IR. None of the other front ends (xxx_to_IR.c) do it.
>
>
> > IRRoundingModeAndEponent(I32): Indicates the I32 argument will
> > contain bits to specify the rounding mode. POWER also has bits to specify
> > the desired exponent. The s390 instructions only specify the rounding
> > mode.
>
> Euh, can you elaborate on the encoding/meaning of "desired exponent" ?
> Sounds a bit like wiring a POWER-ism into the IR spec, which isn't good.
>
>
> > ARITHMETIC INSTRUCTIONS
> > -----------------------
> > IRRoundingMode(I32) X D64 X D64 -> D64
> > Iop_AddD64, Iop_SubD64, Iop_MulD64, Iop_DivD64
> >
> > IRRoundingMode(I32) X D128 X D128 -> D128
> > Iop_AddD128, Iop_SubD128, Iop_MulD128, Iop_DivD128
>
> fine
>
>
> > FORMAT CONVERSION INSTRUCTIONS
> > ------------------------------
> > InvOperationModes(I32) x D32 -> D64
> > Iop_D32toD64
>
> what's InvOperationModes? It's not specified anywhere in your doc.
>
> > InvOperationModes(I32) x D64 -> D128
> > Iop_D64toD128
>
> ditto
>
> > IRRoundingExceptionModes(I32) x D64 -> D32
> > Iop_RoundD64toD32
> >
> > IRRoundingExceptionModes(I32) x D128 -> D64
> > Iop_RoundD128toD64
> >
> > IRRoundingExceptionModes(I32) x I64 -> D64
> > Iop_I64StoD64
>
> ok
>
> > IRRoundingExceptionModes(I32) x D64 -> I64
> > Iop_D64toI64
>
> this is underspecified .. you need to decide whether that's a
> conversion to signed or unsigned I64 (or maybe you need both)
> and call them Iop_D64toI64S or Iop_D64toI64U respectively.
> (I think you comment about this further down in the doc.)
> I mention this partly because sorting out such ambiguity in the
> past for the Fxx->Ixx conversions required a lot of hoop
> jumping, so we might as well get it straightened out up front.
>
>
> > ROUNDING INSTRUCTIONS
> > -----------------------
> > IRRoundingMode(I32) x D64 -> D64
> > Iop_RoundD64
> >
> > IRRoundingMode(I32) x D128 -> D128
> > Iop_RoundD128
>
> ok
>
>
> > COMPARE INSTRUCTIONS
> > -----------------------
> > D64 x D64 -> IRCmpF64Result(I32)
> > Iop_CmpD64
> >
> > D128 x D128 -> IRCmpF64Result(I32)
> > Iop_CmpD128
>
> ok
>
>
> > D64 x D64 -> 1 if the condition is TRUE, 0 otherwise
> > Iop_CmpEQD64, Iop_CmpLTD64, Iop_CmpGTD64
> >
> > D128 x D128 -> 1 if the condition is TRUE; 0 otherwise;
> > Iop_CmpEQD128, Iop_CmpLTD128, Iop_CmpGTD128
>
> why are these 6 necessary? Isn't their functionality a subset of
> Iop_CmpD64 and Iop_CmpD128 ?
>
>
> > QUANTIZE AND ROUND INSTRUCTIONS
> > -------------------------------
> > IRRoundingMode(I32) x D64-> D64
> > Iop_QuantizeID64,
> >
> > IRRoundingMode(I32) x D128-> D128
> > Iop_QuantizeID128
> >
> > IRRoundingMode(I32) x D64 x D64 -> D64
> > Iop_QuantizeD64
> >
> > IRRoundingMode(I32) x D128 x D128 -> D128
> > Iop_QuantizeD128
>
> I'm not clear what the ID vs D signifies in these names. Can
> they instead be called Iop_Quantize{Un,Bin}{D64,D128} to denote
> unary vs binary ness (ignoring the rounding mode arg which is
> present in all 4 cases).
>
> What is quantization, anyway (in the context of DFP I mean)?
> Does it have any analogue in traditional IEEE754 FP ?
>
> > IRRoundingMode(I32) x D64 x D64 -> D64
> > Iop_SignificanceRoundD64
> >
> > IRRoundingMode(I32) x D128 x D128 -> D128
> > Iop_SignificanceRoundD128
> >
> >
> > EXTRACT AND INSERT INSTRUCTIONS
> > -------------------------------
> > D64 -> I64
> > Iop_ExtractExpD64
> >
> > D128 -> I64
> > Iop_ExtractExpD128
>
> The exponent really needs 64 bits? Can it be 32 bits? That
> might allow for more efficient code generation for 32 bit targets.
>
> > I64 x I64 -> D64
> > Iop_InsertExpD64
> >
> > I64 x I128 -> D128
> > Iop_InsertExpD128
>
> ditto comment
>
> > SHIFT SIGNIFICAND INSTRUCTIONS
> > -------------------------------
> > U16 x D64 -> D64
> > Iop_ShlD64, Iop_ShrD64
> >
> > U16 x D128 -> D128
> > Iop_ShlD128, Iop_ShrD128
>
> two things: (1) does the shift amount need to be 16 bits?
> For all the other shifting style ops we have, the shift amount
> is encoded in 8 bits (Ity_I8) and I would prefer to stick with
> that for consistency, if possible. (2) pls put the shift amount
> as the second argument, not the first, as that too is consistent
> with all other shift ops we have (eg, Iop_Shr64)
>
> ---------------
>
> > This section give the detailed mapping of Power and s390
> > instructions to the proposed DFP Iops or how they would be implemented
> > with the existing Iops and the proposed DFP Iops.
>
> I'll comment on this second half of the proposal tomorrow.
>
> J
>
|