|
From: Dominic A. <zer...@go...> - 2009-04-18 10:39:38
Attachments:
dm_merging_patch.txt
|
Hi,

I have prepared a patch (see the attachment) and tested it against the simple "test.c" I posted earlier. I also ran the Cachegrind regression suite successfully. The patch was made against the current svn version.

The patch should fix the way Dm events are handled. Currently, Dm (data modify) events are treated as if they were Dr events, so fewer writes (Dw events) are recorded than actually occur. How inaccurate the results become depends on the instruction mix and on how frequently modify instructions occur.

If you look at the function "cg_fini" in "cg_main.c", which dumps the cache statistics, you will see that "D_total" is derived from "Dw_total". Thus the total number of writes may be off, as well as the reported miss rates.

I hope I did not introduce new bugs here. Please review my patch.

Ciao
Dominic
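
To illustrate the accounting problem described in this message, here is a minimal sketch in C. It is not Cachegrind code: the names `DataEvent`, `Totals`, `count_events` and `d_total` are hypothetical, and computing the combined total as reads plus writes is an assumption made for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event kinds: data read, data write, data modify. */
typedef enum { EV_DR, EV_DW, EV_DM } DataEvent;

/* Hypothetical access totals, loosely modelled on Dr_total/Dw_total. */
typedef struct { unsigned long Dr_total, Dw_total; } Totals;

/* Count accesses for a sequence of events.
 * dm_counts_as_write == false: a modify is counted as a read only.
 * dm_counts_as_write == true:  a modify is counted as a read and a write. */
static Totals count_events(const DataEvent *ev, size_t n, bool dm_counts_as_write)
{
    Totals t = {0, 0};
    for (size_t i = 0; i < n; i++) {
        switch (ev[i]) {
        case EV_DR: t.Dr_total++; break;
        case EV_DW: t.Dw_total++; break;
        case EV_DM:
            t.Dr_total++;
            if (dm_counts_as_write)
                t.Dw_total++;
            break;
        }
    }
    return t;
}

/* Assumed for this sketch: combined data accesses = reads + writes. */
static unsigned long d_total(Totals t)
{
    return t.Dr_total + t.Dw_total;
}
```

For a mix of two reads, one write and two modifies, the read-only treatment records one write, while counting each modify as a read plus a write records three, so every total derived from the write count shifts accordingly.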
|
From: Josef W. <Jos...@gm...> - 2009-04-18 22:56:10
|
On Saturday 18 April 2009, Dominic Account wrote:
> I hope I did not introduce new bugs here. Please, review
> my patch.
Just a minor remark: inlining the patch would have made it easier.
> +void log_1I_1Dm_cache_access(InstrInfo* n, Addr data_addr, Word data_size)
> +{
> + //VG_(printf)("1I_1Dm: CCaddr=0x%010lx, iaddr=0x%010lx, isize=%lu\n"
> + // " daddr=0x%010lx, dsize=%lu\n",
> + // n, n->instr_addr, n->instr_len, data_addr, data_size);
> + cachesim_I1_doref(n->instr_addr, n->instr_len,
> + &n->parent->Ir.m1, &n->parent->Ir.m2);
> + n->parent->Ir.a++;
> +
> + cachesim_D1_doref(data_addr, data_size,
> + &n->parent->Dr.m1, &n->parent->Dr.m2);
> + cachesim_D1_doref(data_addr, data_size,
> + &n->parent->Dw.m1, &n->parent->Dw.m2);
Given the cache model and the fact that no other thread can access the cache
in between, the second call into the simulator should not be needed, as it will
always be an L1 hit. The same applies to the other handlers.
It would be interesting to see the performance hit introduced by your patch.
Josef
> +
> + n->parent->Dr.a++;
> + n->parent->Dw.a++;
> +}
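
The point about the redundant second simulator call can be checked with a small stand-alone sketch. This is not the patch itself: `toy_D1_doref` is a hypothetical, much simplified direct-mapped cache that ignores access size and L2, and `CacheCC`, `LineCC`, `log_Dm_two_calls` and `log_Dm_one_call` are illustrative names. Under those assumptions, dropping the second call leaves every counter unchanged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counters: a = accesses, m1/m2 = L1/L2 misses. */
typedef struct { unsigned long a, m1, m2; } CacheCC;
typedef struct { CacheCC Ir, Dr, Dw; } LineCC;

/* Toy direct-mapped D1 cache (assumption: 64 lines of 64 bytes). */
enum { N_LINES = 64, LINE_SIZE = 64 };
static unsigned long d1_tags[N_LINES];
static bool d1_valid[N_LINES];

static void toy_D1_reset(void)
{
    for (size_t i = 0; i < N_LINES; i++)
        d1_valid[i] = false;
}

/* Simulate one data reference; bump *m1 on an L1 miss. */
static void toy_D1_doref(unsigned long addr, unsigned long *m1, unsigned long *m2)
{
    unsigned long block = addr / LINE_SIZE;
    size_t set = block % N_LINES;
    (void)m2;                      /* L2 is not modelled in this sketch */
    if (d1_valid[set] && d1_tags[set] == block)
        return;                    /* hit: nothing to record */
    d1_valid[set] = true;          /* miss: fill the line */
    d1_tags[set] = block;
    (*m1)++;
}

/* As in the quoted hunk: simulate the read, then the write. */
static void log_Dm_two_calls(LineCC *cc, unsigned long addr)
{
    toy_D1_doref(addr, &cc->Dr.m1, &cc->Dr.m2);
    toy_D1_doref(addr, &cc->Dw.m1, &cc->Dw.m2);  /* always hits */
    cc->Dr.a++;
    cc->Dw.a++;
}

/* Simplified variant: simulate once, count both accesses. */
static void log_Dm_one_call(LineCC *cc, unsigned long addr)
{
    toy_D1_doref(addr, &cc->Dr.m1, &cc->Dr.m2);
    cc->Dr.a++;
    cc->Dw.a++;
}
```

Running the same address trace through both handlers, with `toy_D1_reset()` in between, gives identical access and miss counts, because the line touched by the read is still resident when the write is simulated.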
|
|
From: Dominic A. <zer...@go...> - 2009-04-19 10:59:20
|
Hi Josef,
Yes, thank you. The second call to the cache model is indeed not necessary, as
it will always hit. Thus the performance loss should be even lower.
My extended cache model will still need to be called though - but that is a
different story.
Performance will probably stay about the same unless your compiler/architecture
moves to a vastly different operating point ;-) Something I have noticed
recently with different gcc options!
-Dominic
On Sun, Apr 19, 2009 at 12:55 AM, Josef Weidendorfer
<Jos...@gm...> wrote:
> On Saturday 18 April 2009, Dominic Account wrote:
>> I hope I did not introduce new bugs here. Please, review
>> my patch.
>
> Just a minor remark: inlining the patch would have made it easier.
>
>> +void log_1I_1Dm_cache_access(InstrInfo* n, Addr data_addr, Word data_size)
>> +{
>> + //VG_(printf)("1I_1Dm: CCaddr=0x%010lx, iaddr=0x%010lx, isize=%lu\n"
>> + // " daddr=0x%010lx, dsize=%lu\n",
>> + // n, n->instr_addr, n->instr_len, data_addr, data_size);
>> + cachesim_I1_doref(n->instr_addr, n->instr_len,
>> + &n->parent->Ir.m1, &n->parent->Ir.m2);
>> + n->parent->Ir.a++;
>> +
>> + cachesim_D1_doref(data_addr, data_size,
>> + &n->parent->Dr.m1, &n->parent->Dr.m2);
>> + cachesim_D1_doref(data_addr, data_size,
>> + &n->parent->Dw.m1, &n->parent->Dw.m2);
>
> Given the cache model and the fact that no other thread can access the cache
> inbetween, the second call into the simulator should not be needed, as it always
> will be a L1 hit. Same for other handlers.
> It would be interesting to see the performance hit introduced by your patch.
>
> Josef
>
>> +
>> + n->parent->Dr.a++;
>> + n->parent->Dw.a++;
>> +}
>
|