#5 packet waits in LS_FRONT_END_FAN_IN for 4000+ cycles

closed-fixed
Fuat Keceli
7
2008-12-13
2008-12-10
George Caragea
No

This is the jacobi benchmark. A TCU executes a loop. At each iteration, 4 load requests are sent. The values are added, then one store is sent.

In this particular version, loop prefetching is enabled, and thus 4 prefetch instructions are executed at each iteration as well.

I am looking in particular at the execution of TCU 50. In the generated tracedump (tcu50_memoperations.tracedump.txt), a packet for instr 111:pref is put in CLSTR_2_LS_FRONT_END_FAN_IN at clock cycle 7180. The packet then stays there until cycle 11878, when it reaches INW_SEND_INPUT_PORT_2: it is stuck there for 4698 cycles!

However, in the meantime, other instructions from the same TCU are sent through the LS unit without any wait: 113:pref (packet created @7182), 114:pref (packet created @7183), 111:pref (packet created @7217).
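
For anyone checking the numbers, here is a minimal Python sketch of the stall computation. It is not part of xmtsim. The `TraceEvent` record and the `dwell_cycles` helper are hypothetical, and the events are entered by hand from the cycles quoted above, since the actual tracedump line format is not reproduced in this ticket.

```python
from typing import List, NamedTuple, Optional


class TraceEvent(NamedTuple):
    """Hypothetical record: one packet entering one stage (not the real tracedump format)."""
    cycle: int    # clock cycle at which the packet entered the stage
    packet: str   # instruction label of the packet, e.g. "111:pref"
    stage: str    # stage name as quoted in this ticket


def dwell_cycles(events: List[TraceEvent], packet: str, stage: str) -> Optional[int]:
    """Return how many cycles `packet` spent in `stage` before entering its next stage.

    Returns None if the packet never entered `stage` or never left it.
    """
    # Keep only this packet's events, ordered by cycle (first tuple field).
    own = sorted(e for e in events if e.packet == packet)
    for current, following in zip(own, own[1:]):
        if current.stage == stage:
            return following.cycle - current.cycle
    return None


# The two events reported above for the delayed packet of TCU 50.
events = [
    TraceEvent(7180, "111:pref", "CLSTR_2_LS_FRONT_END_FAN_IN"),
    TraceEvent(11878, "111:pref", "INW_SEND_INPUT_PORT_2"),
]

stall = dwell_cycles(events, "111:pref", "CLSTR_2_LS_FRONT_END_FAN_IN")
print(stall)  # 11878 - 7180
```

Running the same helper over the packets created at 7182, 7183 and 7217 would need their exit cycles from the tracedump, which are not listed here.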

How to reproduce:
$ xmtsim -v
XMT Simulator
Simulator Version: 0.81.98.r6216
Java Version: 1.6.0_10
(I used the devel simulator.)

$ xmtsim -cycle -count -timer 30 -conf ./fpga8 -trace=directives -traceout jacobi_int_byrow.drampref.tracedump jacobi_int_byrow.drampref.sim -binload jacobi_int_byrow.drampref.b

Discussion

  • George Caragea
    2008-12-10

    Assembly and data files. Also the C source and xbo files.

     
    Attachments
  • George Caragea
    2008-12-13

    Fixed by revisions 6219 and 6221 in the keceli-devel branch. Propagated to the power and george-devel branches.

     
  • George Caragea
    2008-12-13

    • status: open --> closed-fixed