Re: [Open64-devel] FW: one problem about software pipelined loop body

On Saturday 03 January 2004 06:39, Ju, Roy wrote:

o. Loop-invariant code motion (LICM) is performed automatically by PRE in
   WOPT. However, some loop invariants are created in the process of IR
   lowering. In ORC 2.1, there is a phase in CG dedicated to LICM.

o. When no edge profile is present, the LICM-er simply skips performing
   LICM because BB_freq(loop-head) = 0.

o. When an edge profile is present (loop iteration count = some 1380), the
   LICM-er still cannot move the load out of its enclosing loop because
   CG_DEP_Mem_Ops_Alias() says the load aliases with the only store in
   the loop. This may be a defect in the alias analysis; however, I have
   not taken a closer look at that component, and I am also *EXTREMELY*
   clumsy/stupid at that phase.
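
To make the aliasing point concrete, here is a minimal C sketch. It is only
an illustration under assumptions: the declarations (global int nx and ny,
an int ** chanx_track), the sizes NX and NY, and the helper names setup,
count_filled and self_check are hypothetical and not taken from the SPEC
source or from ORC. The first function is the loop as posted; the second
shows the hand-hoisted form that a successful LICM would effectively produce.

```c
#include <assert.h>
#include <string.h>

#define NX 4   /* hypothetical sizes, for illustration only */
#define NY 5

/* Assumed declarations: the bounds are globals and the array is reached
   through a pointer, so the store and the load of ny both touch ints. */
int nx = NX, ny = NY;
static int storage[NX + 1][NY + 1];
static int *rows[NX + 1];
int **chanx_track = rows;

/* Loop as posted. Unless the compiler can prove that chanx_track[i][j]
   never overlaps ny, it has to reload ny for every bound check. */
void fill_reload(void) {
    for (int i = 1; i <= nx; i++)
        for (int j = 0; j <= ny; j++)
            chanx_track[i][j] = -1;
}

/* Hand-hoisted variant: the bounds are copied into locals whose address
   is never taken, so no store in the loop can alias them. */
void fill_hoisted(void) {
    const int n_x = nx, n_y = ny;
    for (int i = 1; i <= n_x; i++)
        for (int j = 0; j <= n_y; j++)
            chanx_track[i][j] = -1;
}

/* Point each row pointer at its backing row and clear the array. */
void setup(void) {
    for (int i = 0; i <= NX; i++)
        rows[i] = storage[i];
    memset(storage, 0, sizeof storage);
}

/* Count the cells that were set to -1. */
int count_filled(void) {
    int n = 0;
    for (int i = 0; i <= NX; i++)
        for (int j = 0; j <= NY; j++)
            n += (storage[i][j] == -1);
    return n;
}

/* Both variants must write exactly the same cells. */
void self_check(void) {
    setup();
    fill_reload();
    int a = count_filled();
    setup();
    fill_hoisted();
    assert(a == count_filled());
}
```

The two variants give the same result only when the store really does not
overlap the bounds, which is exactly the fact the alias analysis would have
to prove before hoisting the load on its own.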

sxyang.

> I suspect that the compiler thinks there is an alias between the load
> and the store in the loop body so it does not treat the load as LICM.
> This is a good case arguing for data speculation in swp loop.
>
> Roy
>
>     -----Original Message-----
>     From: ope...@li...
>     [mailto:ope...@li...] On Behalf Of Lin, Jason H
>     Sent: Thursday, January 01, 2004 10:30 PM
>     To: ipf...@li...; Open64
>     Subject: [Open64-devel] FW: one problem about software pipelined loop body
>
>     Best regards,
>     Jason
>     -----Original Message-----
>     From: Kevin Li [mailto:liw...@ma...]
>     Sent: Wednesday, December 31, 2003 10:30 AM
>     To: Lin, Jason H
>     Subject: Fw: one problem about software pipelined loop body
>
>
>     Why can't I send messages to these two mailing lists? I am subscribed
>     to them, but every time I send mail, it is rejected.
>
>
>     ----- Original Message -----
>     From: Kevin Li <mailto:liw...@ma...>
>     To: ipf...@li...
>     Cc: ope...@li...
>     Sent: Wednesday, December 31, 2003 10:25 AM
>     Subject: one problem about software pipelined loop body
>
>     Hi, guys,
>
>     I found a problem in a software-pipelined loop body: there is an
>     operation that does the same thing in every loop iteration. An
>     operation of this kind should be moved outside the loop, since
>     executing it once is enough. The scenario is as follows:
>
>     for (i=1;i<=nx;i++)
>            for (j=0;j<=ny;j++)
>               chanx_track[i][j] = -1;
>
>     This loop is extracted from SPEC2000, and the following is the
>     corresponding assembly code.
>
>     //<swps>
>     //<swps>   4 cycles per 1 iteration in steady state
>     //<swps>   1 pipeline stages
>     //<swps>
>     //<swps>      min 1 cycles required by resources
>     //<swps>      min 4 cycles required by recurrences
>     //<swps>      min 4 cycles required by resources/recurrence
>     //<swps>      min 4 cycles (actual 4 cycles) required to schedule one iteration
>     { .mfi
>            .loc  1     989  0
>            (p17) st4 [r8]=r3,4                   // [0*II+0] id:177
>                  nop.f 0                         // [0*II+0]
>                  nop.i 0 ;;                      // [0*II+0]
>      } { .mfi
>            (p17) ld4 r32=[r2]                    // [0*II+1] id:174 ny+0x0
>                  nop.f 0                         // [0*II+1]
>                  nop.i 0 ;;                      // [0*II+1]
>      } { .mfi
>            (p17) adds r33=1,r34                  // [0*II+2]
>                  nop.f 0                         // [0*II+2]
>                  nop.i 0 ;;                      // [0*II+2]
>      }
>     .tag_23_135:
>      { .mfb
>            (p17) cmp4.le p16,p0=r33,r32          // [0*II+3]
>                  nop.f 0                         // [0*II+3]
>            (p16) br.wtop.dptk.few .Lt_23_102 ;;  // [0*II+3]
>      }
>
>
>
>     The highlighted instruction is executed in every iteration, and it
>     does the same thing each time. If it were removed, the performance of
>     this loop could be improved further, right?
>
>     Why does this happen in ORC? (I tested several versions of ORC, and
>     the results are the same.)
>
>     There are a lot of such cases, especially in the SPEC int2000
>     benchmarks.
>
>     Best regards