OPT_LoopUnrolling doesn't special-case rampup loop generation when the
loop bounds are compile-time constants (and thus the number of rampup
iterations before entering the unrolled loop can be determined at
compile time). In particular, consider the following simple test loop:

  int test(int x) {
    for (int i=0; i<1000; i++) {
      x += 10;
    }
    return x;
  }

We currently generate the following final HIR:

***** START OF IR DUMP Final HIR FOR LoopTest.test (I)I
-13 LABEL0 Frequency: 1.0
-2 EG ir_prologue l0sa(LLoopTest;,x,d), t27pi(I,d) =
0 G yieldpoint_prologue
-1 int_move t16si(I) = 3
-1 int_move t19psi(I) = 4
-9 int_move t26pi(Z) = 0
-1 bbend BB0 (ENTRY)
5 LABEL1 Frequency: 1.8
5 G yieldpoint_backedge
5 int_add t27pi(I) = t27pi(I), 10
8 int_add t26pi(I) = t26pi(I), 1
-1 int_ifcmp t24sv(GUARD) = t26pi(I), t19psi(I), <, LABEL1, Probability: 0.5
-1 bbend BB1
5 LABEL2 Frequency: 0.9
-1 int_ifcmp t25sv(GUARD) = t26pi(I), 1000, >=, LABEL4, Probability: 0.100000024
-1 bbend BB2
5 LABEL3 Frequency: 2.0249994
5 G yieldpoint_backedge
5 int_add t27pi(I) = t27pi(I), 40
8 int_add t26pi(I) = t26pi(I), 4
-1 int_ifcmp t36sv(GUARD) = t26pi(I), 1000, <, LABEL3, Probability: 0.5999999
-1 bbend BB3
18 LABEL4 Frequency: 1.0
-10 G yieldpoint_epilogue
-3 return t27pi(I)
-1 bbend BB4
*** END OF IR DUMP Final HIR FOR LoopTest.test (I)I

Since the loop bounds are known at compile time (0, 1000), we should
know that 1000 % 4 == 0, so no rampup iterations are needed and we can
generate something more or less like:

***** START OF IR DUMP Final HIR FOR LoopTest.test (I)I
-13 LABEL0 Frequency: 1.0
-2 EG ir_prologue l0sa(LLoopTest;,x,d), t27pi(I,d) =
0 G yieldpoint_prologue
-1 int_move t16si(I) = 3
-1 int_move t19psi(I) = 4
-9 int_move t26pi(Z) = 0
-1 bbend BB0 (ENTRY)
5 LABEL3 Frequency: 2.0249994
5 G yieldpoint_backedge
5 int_add t27pi(I) = t27pi(I), 40
8 int_add t26pi(I) = t26pi(I), 4
-1 int_ifcmp t36sv(GUARD) = t26pi(I), 1000, <, LABEL3, Probability: 0.5999999
-1 bbend BB3
18 LABEL4 Frequency: 1.0
-10 G yieldpoint_epilogue
-3 return t27pi(I)
-1 bbend BB4
*** END OF IR DUMP Final HIR FOR LoopTest.test (I)I
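
For readers who don't want to trace the HIR by hand, the desired code corresponds roughly to the source-level loop below. This is only an illustrative sketch, not compiler output: the class name LoopTestSketch and the method testUnrolled are made up for this example, and the single "x += 40" stands for the four folded copies of "x += 10" seen in LABEL3.

```java
// Hypothetical illustration only; not generated by the compiler.
final class LoopTestSketch {
    // The original test loop from this report.
    static int test(int x) {
        for (int i = 0; i < 1000; i++) {
            x += 10;
        }
        return x;
    }

    // Source-level view of the desired HIR: no rampup loop and no
    // post-rampup test, because 1000 % 4 == 0 is known at compile time.
    static int testUnrolled(int x) {
        int i = 0;
        do {
            x += 40;   // four folded copies of x += 10
            i += 4;
        } while (i < 1000);
        return x;
    }
}
```

Both methods add 10 to x exactly 1000 times, so they return the same value for any input (including inputs where the int addition wraps around).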

I think we should detect when the number of iterations is a
compile-time constant and at least special-case the situations where
0 or 1 iterations of the rampup loop are needed. We should also
avoid the test that checks whether the rampup loop has already done
all of the iterations. Furthermore, if the number of iterations is
known to be small, we should completely unroll the loop.
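
The decision described above could be sketched along the following lines. Everything here is hypothetical and for discussion only: the class ConstantTripCountPlan, the Kind enum, and the FULL_UNROLL_LIMIT threshold of 8 are assumptions made for this example and do not exist in OPT_LoopUnrolling. The sketch only handles loops of the form "for (i = lo; i < hi; i += stride)" with a positive constant stride.

```java
// Hypothetical helper; not part of the actual OPT_LoopUnrolling code.
final class ConstantTripCountPlan {
    // Assumed threshold below which the loop is unrolled completely.
    static final int FULL_UNROLL_LIMIT = 8;

    enum Kind {
        FULL_UNROLL,          // trip count is small: no loop at all
        UNROLLED_LOOP_ONLY,   // 0 rampup iterations: skip rampup entirely
        STRAIGHT_LINE_RAMPUP, // 1 rampup iteration: emit the body once, no loop
        RAMPUP_LOOP           // 2 or more rampup iterations: keep a rampup loop
    }

    // Trip count of: for (int i = lo; i < hi; i += stride), with stride > 0.
    // Computed in long arithmetic so that hi - lo cannot overflow.
    static long tripCount(int lo, int hi, int stride) {
        if (stride <= 0) {
            throw new IllegalArgumentException("stride must be positive");
        }
        if (lo >= hi) {
            return 0;
        }
        return ((long) hi - lo + stride - 1) / stride;
    }

    // Choose a code shape from a known trip count and an unroll factor.
    static Kind plan(long trips, int factor) {
        if (trips <= FULL_UNROLL_LIMIT) {
            return Kind.FULL_UNROLL;
        }
        long rampup = trips % factor; // iterations left over for rampup
        if (rampup == 0) {
            return Kind.UNROLLED_LOOP_ONLY;
        }
        if (rampup == 1) {
            return Kind.STRAIGHT_LINE_RAMPUP;
        }
        return Kind.RAMPUP_LOOP;
    }
}
```

For the test loop in this report, tripCount(0, 1000, 1) is 1000 and plan(1000, 4) yields UNROLLED_LOOP_ONLY, which matches the desired HIR. Since the trip count is known to exceed the rampup count in every case other than FULL_UNROLL, the test after the rampup code would never be needed.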

General scope of the feature: notice when the number of iterations is
known at compile time and be less stupid about the unrolled loop and
its associated rampup/rampdown code.