Re: [Open64-devel] bug912 fix review request resent
Brought to you by:
ributzka,
suneeljain
From: Sun C. <sun...@gm...> - 2011-11-29 09:02:40
|
I thought whether CVT is needed is decided with the table (I forgot exactly where that is now). This is a machine dependent operation (zero or sign extended when loaded from memory). E.g. IA is zero extended, Mips is sign extended. Or am I in the wrong context? Sun On Tue, Nov 29, 2011 at 4:29 PM, Gang Yu <yug...@gm...> wrote: > Hi, list and Sun > > Let me re-organize the review request for bug912 fix. The bug case: > > struct go7007 { > int sensor_framerate; > int fps_scale; > }; > int vti_bitlen(struct go7007 *go) > { > unsigned int i, max_time_incr = go->sensor_framerate / go->fps_scale; > for (i = 31; (max_time_incr & ((1 << i) - 1)) == max_time_incr; --i); > return i + 1; > } > > opencc O2 compile will assert: > > ### Assertion failure at line 2109 of > /fc/proj/ctires/open64/o64guru/src/Sat/trunk/osprey/be/cg/cgemit.cxx: > ### Compiler Error in file bug912-cutdown.c during Assembly phase: > ### literal for operand 1 is not in range > > If we check the whirl output, we get the initial condition check for for > loop is as follows: > > U4U4LDID 50 <1,8,.preg_U4> T<8,.predef_U4,4> # max_time_incr > U4U4LDID 50 <1,8,.preg_U4> T<8,.predef_U4,4> # max_time_incr > I8INTCONST -2147483649 (0xffffffff7fffffff) > U4BAND > I4U4EQ > FALSEBR L2562 {line: 0/0} > > this is already wrong, the expected whirl output for the loop entry > condition should be: > > U4U4LDID 50 <1,8,.preg_U4> T<8,.predef_U4,4> # max_time_incr > U4INTCONST 2147483647 > U4BAND > I4U4EQ > FALSEBR L2562 {line: 0/0} > and investigation shows before the phase of "Create CODEMAP > Representation" > > this condition can be expressed as: > U4U4LDID 0 <st 19> T<8,.predef_U4,4> > I4INTCONST 1 (0x1) > I4I4LDID 0 <st 20> T<4,.predef_I4,4> > I4SHL > I4INTCONST -1 (0xffffffffffffffff) > I4ADD > U4I4CVT > U4U4LDID 0 <st 19> T<8,.predef_U4,4> > U4BAND > I4U4EQ > FALSEBR L1538 {line: 0/0} {freq: 0, ln: 0, col: 0} > st19 is max_time_incr, st20 is i, this is still right. > > After the CODEMAP phase, the expression is transformed to > > LDID U4 U4 sym3v3 0 ty=802 <u=4 cr12> flags:0x0 b=-1 #max_time_incr > > LDID U4 U4 sym3v3 0 ty=802 <u=4 cr12> flags:0x0 b=-1 #max_time_incr > > LDC I8 -2147483649 <u=1 cr17> flags:0x0 b=-1 > > U4BAND <u=1 cr18> isop_flags:0x0 flags:0x0 b=-1 > > I4U4EQ <u=1 cr19> isop_flags:0x0 flags:0x0 b=-1 > >OPR_FALSEBR 1538 b=-1 flags:0x2 pj2 Sid-1 > > The expression > I4INTCONST 1 (0x1) > I4I4LDID 0 <st 20> T<4,.predef_I4,4> > I4SHL > I4INTCONST -1 (0xffffffffffffffff) > I4ADD > U4I4CVT > U4U4LDID 0 <st 19> T<8,.predef_U4,4> > is unexpected tranformed to > > LDC I8 -2147483649 <u=1 cr17> flags:0x0 b=-1 > > it should rightly transformed be > > > LDC U4 2147483647 <u=1 cr19> flags:0x0 b=-1 > > So, we can make sure there is something unexpected done in codemap phase. > > More investigation shows when control goes to Canon_CVT, it get the > representation of > I4INTCONST 1 (0x1) > I4I4LDID 0 <st 20> T<4,.predef_I4,4> > I4SHL > I4INTCONST -1 (0xffffffffffffffff) > I4ADD > to > LDC I8 -2147483649 <u=1 cr17> flags:0x0 b=-1 > > This is still right, instead of get I4 CONST, we get I8 const because a > canonical const scale is 64bit default, while fold the coderep > > LDC U4 1 <u=1 cr15> flags:0x0 b=-1 > > LDC U4 31 <u=2 cr13> flags:0x0 b=-1 > >I4SHL <u=0 cr0> isop_flags:0x0 flags:0x0 b=-1 > is sign extended to 0xffffffff10000000, so we get > > LDC I8 -2147483649 <u=1 cr17> flags:0x0 b=-1 > > The right solution here is we should not delete the U4I4CVT in > this expression. > > So, my suggested patch is to change the control flow in Canon_cvt to avoid > ignore U4I4CVT in such case. > > The patch below: > > osprey/be/opt/opt_htable.cxx -- 1fc05d0..0b7a14a 100644 > --- a/osprey/be/opt/opt_htable.cxx > +++ b/osprey/be/opt/opt_htable.cxx > @@ -2271,7 +2271,11 @@ CODEMAP::Canon_cvt(WN *wn, > #endif > if ((Get_mtype_class(OPCODE_rtype(op)) & > Get_mtype_class(OPCODE_desc(op))) != 0 && > - MTYPE_size_min(OPCODE_rtype(op)) == MTYPE_size_min(OPCODE_desc(op))) > + MTYPE_size_min(OPCODE_rtype(op)) == MTYPE_size_min(OPCODE_desc(op)) > && > + // bug912 open64.net. Do not delete U4I4CVT if his kid is a > constant > + (!(OPCODE_rtype(op) == MTYPE_U4 && > + OPCODE_desc(op) == MTYPE_I4 && > + ccr->Tree() == NULL))) > return propagated; > if ( WOPT_Enable_Cvt_Folding && > > This new patch does not lost the chances to U4I4CVT optimisation, while it > makes the case of U4I4CVT const right. > > If you still have other questions, let me know. > > Regards > Gang > |