|
From: Julian S. <js...@ac...> - 2011-12-27 22:18:23
|
Thanks for making a plausible looking proposal. Looks like it's heading in the right direction. There are some details to iron out, though. Please look at all of them. Comments in order of reading the doc: General: how much has this been checked out by the s390 folks (Florian, Christian, Divya) ? General: please give a reference, including URL, to a publically available standard that defines the basic arithmetic (IEEE 754-2008 ?) > The IBM Power and s390 systems support 32-bit, 64-bit and 128-bit > decimal floating point (DFP) numbers. The DFP instructions use the > existing floating point registers. The DFP 128-bit operands are stored in > registers i and i+1 where i must be even. For example the DFP 128-bit add > is done as follows: > > lfd 12,48(31) // load the upper 64 bits of the first 128 value > in reg 12 lfd 13,56(31) // load the lower 64 bits of the first 128 > value in reg 13 addi 0,31,64 > mr 9,0 > lfd 0,0(9) // load the upper 64 bits of the 128 second value > in reg 0 lfd 1,8(9) // load the lower 64 bits of the 128 second > value in reg 1 daddq 0,12,0 // do the add of the 128 DFP numbers, > result is stored in // registers 0 and 1. Note, registers 1 and 13 are > not // explicitly listed but are implied It feels like there's possible some confusion between types in the IR (that is, IRType) and how values are represented in ppc registers. These concepts are distinct, and are related only in the sense that it is necessary to choose types that don't cause the ppc->IR and IR->ppc translations to be inefficient. AFAICS (also, from reading the rest of the doc) you want three new types, Ity_D32, Ity_I64 and Ity_D128. (yes? that sounds right to me) Note that many of the back ends already convert F32-typed expression trees into 64-bit floating point code (eg, the host_ppc_isel.c) so doing so for D32 would be considered "normal". > Notation Description > > -------------------------------------------------------------------------- > - DFP: Decimal Floating Point number possibly 32-bit, 64-bit, or > 128-bit format > > D32: 32-bit decimal floating point format These values > use F64 registers. The D32 term is used to distinguish the value from the > standard 32-bit floating point value. > > D64: 64-bit decimal floating point format. These values > use F64 registers. The D64 term is used to distinguish the value from the > standard 64-bit floating point value. > > D128: 128-bit decimal floating point format. These > values use a pair of two 64 bit floating point registers (F64). The > instruction only references the first register of the register pair. The > second register is implied. As per comments above, the comments re the PPC register layouts isn't directly relevant to what you need in the IR. (I don't care; I just want to be sure we have our concepts straight here) > IRRoundingMode(I32): Indicates the I32 argument is used > to hold the bits that specify the rounding mode to be used by the > instruction. Fine; as per existing code. > IRRoundingExceptionModes(I32): Indicates the I32 argument is used > to hold the bits that specify the rounding mode and the bits that specify > the exception mode to be used by the instruction. The s390 instructions > specify both modes whereas POWER does not specify either mode. Hmm. Have you read (in detail) the limitations re floating point described at http://valgrind.org/docs/manual/manual-core.html#manual-core.limits The point is that IR and Valgrind generally doesn't provide support for non-default exception modes, and silently assumes that exceptions are to be fixed up using the default IEEE fixup actions. So there's no point at the moment in adding exception action information into the IR. None of the other front ends (xxx_to_IR.c) do it. > IRRoundingModeAndEponent(I32): Indicates the I32 argument will > contain bits to specify the rounding mode. POWER also has bits to specify > the desired exponent. The s390 instructions only specify the rounding > mode. Euh, can you elaborate on the encoding/meaning of "desired exponent" ? Sounds a bit like wiring a POWER-ism into the IR spec, which isn't good. > ARITHMETIC INSTRUCTIONS > ----------------------- > IRRoundingMode(I32) X D64 X D64 -> D64 > Iop_AddD64, Iop_SubD64, Iop_MulD64, Iop_DivD64 > > IRRoundingMode(I32) X D128 X D128 -> D128 > Iop_AddD128, Iop_SubD128, Iop_MulD128, Iop_DivD128 fine > FORMAT CONVERSION INSTRUCTIONS > ------------------------------ > InvOperationModes(I32) x D32 -> D64 > Iop_D32toD64 what's InvOperationModes? It's not specified anywhere in your doc. > InvOperationModes(I32) x D64 -> D128 > Iop_D64toD128 ditto > IRRoundingExceptionModes(I32) x D64 -> D32 > Iop_RoundD64toD32 > > IRRoundingExceptionModes(I32) x D128 -> D64 > Iop_RoundD128toD64 > > IRRoundingExceptionModes(I32) x I64 -> D64 > Iop_I64StoD64 ok > IRRoundingExceptionModes(I32) x D64 -> I64 > Iop_D64toI64 this is underspecified .. you need to decide whether that's a conversion to signed or unsigned I64 (or maybe you need both) and call them Iop_D64toI64S or Iop_D64toI64U respectively. (I think you comment about this further down in the doc.) I mention this partly because sorting out such ambiguity in the past for the Fxx->Ixx conversions required a lot of hoop jumping, so we might as well get it straightened out up front. > ROUNDING INSTRUCTIONS > ----------------------- > IRRoundingMode(I32) x D64 -> D64 > Iop_RoundD64 > > IRRoundingMode(I32) x D128 -> D128 > Iop_RoundD128 ok > COMPARE INSTRUCTIONS > ----------------------- > D64 x D64 -> IRCmpF64Result(I32) > Iop_CmpD64 > > D128 x D128 -> IRCmpF64Result(I32) > Iop_CmpD128 ok > D64 x D64 -> 1 if the condition is TRUE, 0 otherwise > Iop_CmpEQD64, Iop_CmpLTD64, Iop_CmpGTD64 > > D128 x D128 -> 1 if the condition is TRUE; 0 otherwise; > Iop_CmpEQD128, Iop_CmpLTD128, Iop_CmpGTD128 why are these 6 necessary? Isn't their functionality a subset of Iop_CmpD64 and Iop_CmpD128 ? > QUANTIZE AND ROUND INSTRUCTIONS > ------------------------------- > IRRoundingMode(I32) x D64-> D64 > Iop_QuantizeID64, > > IRRoundingMode(I32) x D128-> D128 > Iop_QuantizeID128 > > IRRoundingMode(I32) x D64 x D64 -> D64 > Iop_QuantizeD64 > > IRRoundingMode(I32) x D128 x D128 -> D128 > Iop_QuantizeD128 I'm not clear what the ID vs D signifies in these names. Can they instead be called Iop_Quantize{Un,Bin}{D64,D128} to denote unary vs binary ness (ignoring the rounding mode arg which is present in all 4 cases). What is quantization, anyway (in the context of DFP I mean)? Does it have any analogue in traditional IEEE754 FP ? > IRRoundingMode(I32) x D64 x D64 -> D64 > Iop_SignificanceRoundD64 > > IRRoundingMode(I32) x D128 x D128 -> D128 > Iop_SignificanceRoundD128 > > > EXTRACT AND INSERT INSTRUCTIONS > ------------------------------- > D64 -> I64 > Iop_ExtractExpD64 > > D128 -> I64 > Iop_ExtractExpD128 The exponent really needs 64 bits? Can it be 32 bits? That might allow for more efficient code generation for 32 bit targets. > I64 x I64 -> D64 > Iop_InsertExpD64 > > I64 x I128 -> D128 > Iop_InsertExpD128 ditto comment > SHIFT SIGNIFICAND INSTRUCTIONS > ------------------------------- > U16 x D64 -> D64 > Iop_ShlD64, Iop_ShrD64 > > U16 x D128 -> D128 > Iop_ShlD128, Iop_ShrD128 two things: (1) does the shift amount need to be 16 bits? For all the other shifting style ops we have, the shift amount is encoded in 8 bits (Ity_I8) and I would prefer to stick with that for consistency, if possible. (2) pls put the shift amount as the second argument, not the first, as that too is consistent with all other shift ops we have (eg, Iop_Shr64) --------------- > This section give the detailed mapping of Power and s390 > instructions to the proposed DFP Iops or how they would be implemented > with the existing Iops and the proposed DFP Iops. I'll comment on this second half of the proposal tomorrow. J |