Re: [Open64-devel] FW: one problem about software pipelined loop body
From: shuxin y. <sx...@ic...> - 2004-01-04 11:52:52
On Saturday 03 January 2004 06:39, Ju, Roy wrote:

o. Loop invariant code motion is performed automatically by PRE in WOPT.
   However, some loop invariants are created in the process of IR lowering.
   In ORC 2.1, there is a phase in CG dedicated to LICM.

o. When the edge profile is not present, the LICM-er simply skips
   performing LICM because BB_freq(loop-head) = 0.

o. When the edge profile is present (the loop iteration count = some 1380),
   the LICM-er still cannot move the load out of its enclosing loop, because
   CG_DEP_Mem_Ops_Alias() says the load aliases with the only store in
   the loop.
   - It may be a defect of the alias analysis; however, I have not taken
     a closer look at that component, and I am also *EXTREMELY*
     clumsy/stupid at that phase.

sxyang.

> I suspect that the compiler thinks there is an alias between the load
> and the store in the loop body so it does not treat the load as LICM.
> This is a good case arguing for data speculation in swp loop.
>
> Roy
>
>     -----Original Message-----
>     From: ope...@li... [mailto:ope...@li...] On Behalf Of Lin, Jason H
>     Sent: Thursday, January 01, 2004 10:30 PM
>     To: ipf...@li...; Open64
>     Subject: [Open64-devel] FW: one problem about software pipelined loop body
>
>     Best regards,
>     Jason
>
>     -----Original Message-----
>     From: Kevin Li [mailto:liw...@ma...]
>     Sent: Wednesday, December 31, 2003 10:30 AM
>     To: Lin, Jason H
>     Subject: Fw: one problem about software pipelined loop body
>
>     Why can't I send messages to these two mailing lists? I subscribed
>     to them, but every time I send mail, it is rejected.
>
>     ----- Original Message -----
>     From: Kevin Li <mailto:liw...@ma...>
>     To: ipf...@li...
>     Cc: ope...@li...
>     Sent: Wednesday, December 31, 2003 10:25 AM
>     Subject: one problem about software pipelined loop body
>
>     Hi, guys,
>
>     I found a problem in a software pipelined loop body: there is an
>     operation that does the same thing in each loop iteration. This kind
>     of operation should be moved outside the loop, since executing it
>     once is enough. The following is the scenario:
>
>     for (i=1;i<=nx;i++)
>       for (j=0;j<=ny;j++)
>         chanx_track[i][j] = -1;
>
>     This loop is extracted from spec2000, and the following is the
>     corresponding assembly code.
>
>     //<swps>
>     //<swps>    4 cycles per 1 iteration in steady state
>     //<swps>    1 pipeline stages
>     //<swps>
>     //<swps>  min 1 cycles required by resources
>     //<swps>  min 4 cycles required by recurrences
>     //<swps>  min 4 cycles required by resources/recurrence
>     //<swps>  min 4 cycles (actual 4 cycles) required to schedule one iteration
>
>     { .mfi
>         .loc 1 989 0
>         (p17) st4 [r8]=r3,4                     // [0*II+0] id:177
>         nop.f 0                                 // [0*II+0]
>         nop.i 0 ;;                              // [0*II+0]
>     } { .mfi
>         (p17) ld4 r32=[r2]                      // [0*II+1] id:174 ny+0x0
>         nop.f 0                                 // [0*II+1]
>         nop.i 0 ;;                              // [0*II+1]
>     } { .mfi
>         (p17) adds r33=1,r34                    // [0*II+2]
>         nop.f 0                                 // [0*II+2]
>         nop.i 0 ;;                              // [0*II+2]
>     }
>     .tag_23_135:
>     { .mfb
>         (p17) cmp4.le p16,p0=r33,r32            // [0*II+3]
>         nop.f 0                                 // [0*II+3]
>         (p16) br.wtop.dptk.few .Lt_23_102 ;;    // [0*II+3]
>     }
>
>     The highlighted instruction [the ld4 of ny] is executed in every
>     iteration and does the same thing each time. If it were removed, the
>     performance of this loop could be further improved, right?
>
>     Why does this happen in ORC? (I tested several versions of ORC; the
>     results are the same.)
>
>     There are a lot of such scenarios, especially in the SPEC int2000
>     benchmarks.
>
>     Best regards
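
To make the aliasing point in this thread concrete, here is a minimal source-level sketch in C. It is not ORC code and not the benchmark source: the types (global int nx and ny, chanx_track as int **) and the names init_reload, init_hoisted, alloc_grid and n_cols are assumptions made for illustration. With those types, a store through chanx_track[i][j] could legally write to ny, so a compiler that cannot prove otherwise has to reload ny each iteration. Copying the bound into a local whose address is never taken removes that possibility by hand.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed declarations; the original post does not show them. */
int nx, ny;
int **chanx_track;

/* Hypothetical helper: allocate a zero-filled rows x cols grid. */
int **alloc_grid(int rows, int cols) {
    int **grid = calloc((size_t)rows, sizeof *grid);
    assert(grid != NULL);
    for (int r = 0; r < rows; r++) {
        grid[r] = calloc((size_t)cols, sizeof **grid);
        assert(grid[r] != NULL);
    }
    return grid;
}

/* Loop as quoted in the post: the bound ny is a global int, so an
   int store through chanx_track[i][j] may alias it. */
void init_reload(void) {
    for (int i = 1; i <= nx; i++)
        for (int j = 0; j <= ny; j++)
            chanx_track[i][j] = -1;
}

/* Hand-hoisted variant: n_cols is a local whose address is never
   taken, so no store in the loop can modify it. The row pointer is
   hoisted out of the inner loop for the same reason. */
void init_hoisted(void) {
    const int n_cols = ny;
    for (int i = 1; i <= nx; i++) {
        int *row = chanx_track[i];
        for (int j = 0; j <= n_cols; j++)
            row[j] = -1;
    }
}
```

The two functions behave identically only when the stores really do not touch ny or the row pointers, which is the programmer's knowledge that the compiler lacks here. Whether this rewrite changes the schedule ORC produces has not been checked.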