#118 Loop unrolling: improve rampup when bounds are constants

closed
nobody
6
2012-09-21
2005-02-25
Dave Grove
No

OPT_LoopUnrolling doesn't special case rampup loop
generation when the loop bounds are compile time
constants (and thus the number of iterations of rampup
before enterting the unrolled loop can be determined at
compile time). In particular, consider the following
simple test loop:

int test(int x) {
for (int i=0; i<1000; i++) {
x += 10;
}
return x;
}

We currently generate the following final HIR:

* START OF IR DUMP Final HIR FOR
LoopTest.test (I)I
-13 LABEL0 Frequency: 1.0
-2 EG ir_prologue

l0sa(LLoopTest;,x,d), t27pi(I,d) =
0 G yieldpoint_prologue
-1 int_move t16si(I) = 3
-1 int_move t19psi(I) = 4
-9 int_move t26pi(Z) = 0
-1 bbend BB0 (ENTRY)
5 LABEL1 Frequency: 1.8
5 G yieldpoint_backedge
5 int_add t27pi(I) = t27pi(I), 10
8 int_add t26pi(I) = t26pi(I), 1
-1 int_ifcmp t24sv(GUARD) =
t26pi(I), t19psi(I), <, LABEL1, Probability: 0.5
-1 bbend BB1
5 LABEL2 Frequency: 0.9
-1 int_ifcmp t25sv(GUARD) =
t26pi(I), 1000, >=, LABEL4, Probability: 0.100000024
-1 bbend BB2
5 LABEL3 Frequency: 2.0249994
5 G yieldpoint_backedge
5 int_add t27pi(I) = t27pi(I), 40
8 int_add t26pi(I) = t26pi(I), 4
-1 int_ifcmp t36sv(GUARD) =
t26pi(I), 1000, <, LABEL3, Probability: 0.5999999
-1 bbend BB3
18 LABEL4 Frequency: 1.0
-10 G yieldpoint_epilogue
-3 return t27pi(I)
-1 bbend BB4
*
END OF IR DUMP Final HIR FOR
LoopTest.test (I)I

Since the loop bounds are known at compile time (0,
1000) we should know 1000 % 4 == 0, therefore we can
generate something more or less like: :

* START OF IR DUMP Final HIR FOR
LoopTest.test (I)I
-13 LABEL0 Frequency: 1.0
-2 EG ir_prologue

l0sa(LLoopTest;,x,d), t27pi(I,d) =
0 G yieldpoint_prologue
-1 int_move t16si(I) = 3
-1 int_move t19psi(I) = 4
-9 int_move t26pi(Z) = 0
-1 bbend BB0 (ENTRY)
5 LABEL3 Frequency: 2.0249994
5 G yieldpoint_backedge
5 int_add t27pi(I) = t27pi(I), 40
8 int_add t26pi(I) = t26pi(I), 4
-1 int_ifcmp t36sv(GUARD) =
t26pi(I), 1000, <, LABEL3, Probability: 0.5999999
-1 bbend BB3
18 LABEL4 Frequency: 1.0
-10 G yieldpoint_epilogue
-3 return t27pi(I)
-1 bbend BB4
*
END OF IR DUMP Final HIR FOR
LoopTest.test (I)I

I think we should detect when # of iterations is a
compile time constant and at least special case needing
0 or 1 iterations of the rampup loop. We should also
avoid the test to see if the rampup loop has done all
of the iterations. Furthermore, if the number of
iterations is known to be small, we should completely
unroll the loop.

General scope of the feature: notice when # of
iterations is known at compile time and be less stupid
about the unrolled loop + associated rampup/rampdown code.

Discussion

  • Ian Rogers

    Ian Rogers - 2005-03-04

    Logged In: YES
    user_id=308843

    I and Jisheng Zhou are working on a number of loop
    optimizations for the Jikes RVM and have implemented a more
    advanced loop unroller. In particular it addresses this
    issue as well as optimizing loops of the form:

    for(int i=0; i < N; i++){
    A[i] = -1;
    }

    where N is unknown to

    int i=0;
    for(; i < (N-3);){
    A[i] = -1; i++;
    A[i] = -1; i++;
    A[i] = -1; i++;
    A[i] = -1; i++;
    }
    if(i < (N-1)){
    A[i] = -1; i++;
    A[i] = -1; i++;
    }
    if(i < N){
    A[i] = -1; i++;
    }

    we are working on this work as part of a loop optimization
    framework and using the SSA form. This will currently have a
    higher compilation cost than is desirable.

     
  • Dave Grove

    Dave Grove - 2007-07-05

    Logged In: YES
    user_id=1215435
    Originator: YES

    closing SF tracker as out of date. We should open JIRA items for the O3 opts (and new opts) that we are planning to pull into O2.

     

Log in to post a comment.

Get latest updates about Open Source Projects, Conferences and News.

Sign up for the SourceForge newsletter:





No, thanks