|
From: Paul Y. <yin...@gm...> - 2009-08-14 09:31:23
|
Hi all,
I used Cachegrind to evaluate the cache behavior of a small test program,
but the Dw (data write) counts it reports look very strange.
//test1.c
#include <stdio.h>
#include <stdlib.h>

#define N 20000
int *parray[N];

int main ()
{
  int i;

  for (i = 0; i < N; i++)
    parray[i] = (int *) malloc (10 * sizeof (int));
  for (i = 0; i < N; i++)
    parray[i] = (int *) (i);

  return 0;
}

valgrind --tool=cachegrind ./test1
1) On 64-bit Intel Xeon CPU 3050 2.13GHz dual core: cache line size
is 64 bytes.
gcc -O2 -g -o test1 test1.c
sizeof (int*) = 8.
    Ir  I1mr  I2mr  Dr  D1mr  D2mr      Dw   D1mw   D2mw
--------------------------------------------------------
39,998     0     0   0     0     0       0      0      0  for (i = 0; i < N; i++)
80,000     1     1   0     0     0  40,000  2,501  2,501    parray[i] = (int *) malloc (10 * sizeof (int));
39,998     0     0   0     0     0       0      0      0  for (i = 0; i < N; i++)
40,000     0     0   0     0     0  20,000  2,501      0    parray[i] = (int *) (i);
Question: Why is Dw 40,000 for the malloc() line? The corresponding
assembly is "movq %rax, parray(,%rbx,8)", a single store per iteration,
so I expected 20,000. The D1mw is reasonable: 20,000 * 8 / 64 = 2500.
2) On Dual Core AMD Opteron Processor 270: cache line size is 64 bytes.
gcc -m32 -O2 -g -o test1 test1.c
sizeof (int*) = 4.
    Ir  I1mr  I2mr  Dr  D1mr  D2mr      Dw   D1mw   D2mw
--------------------------------------------------------
80,002     0     0   0     0     0       0      0      0  for (i = 0; i < N; i++)
80,000     0     0   0     0     0  60,000  1,250  1,250    parray[i] = (int *) malloc (10 * sizeof (int));
60,002     0     0   0     0     0       0      0      0  for (i = 0; i < N; i++)
20,000     0     0   0     0     0  20,000  1,250      0    parray[i] = (int *) (i);
Here both the Ir and Dw numbers look wrong to me. The D1mw is reasonable:
20,000 * 4 / 64 = 1250.
Any suggestion is welcome.
--
Regards,
Paul Yuan (袁鹏)
|
|
From: Nicholas N. <n.n...@gm...> - 2009-08-14 11:05:53
|
On Fri, Aug 14, 2009 at 7:31 PM, Paul Yuan<yin...@gm...> wrote:
> Hi all,
>
> I used cachegrind to evaluate the cache behavior. But the Dw number is
> very strange.

I suspect the assembly code doesn't look like you think it does -- that it
is doing more memory writes than you think, particularly for the malloc()
calls; perhaps this is due to an argument being passed on the stack?

I suggest annotating the assembly code rather than the C code to really
understand what's happening; see
http://www.valgrind.org/docs/manual/cg-manual.html#cg-manual.assembler
for details how.

Nick
|