|
From: Felipe A. <fel...@gm...> - 2008-06-10 04:54:08
|
Hello everybody,
I have a problem with Valgrind's Cachegrind tool, but first I'd like to
clarify a few points:
- I'm a computer science student (UFSM - Brazil, which explains my limited
English) who is only now getting familiar with Linux (yes, I know how
ridiculous it is for a CS professional not to use Linux), so please excuse
any basic mistakes I might be making.
- I have been reading the manual and searching the internet for hours, so
please excuse me if I missed some basic step. Also, I didn't find this
mailing list's search feature very user-friendly :P It doesn't seem to
recognize some characters (e.g. underscore).
So here is the question:
I have to test 3 kinds of cache mapping (Direct Mapped Cache, Fully
Associative Cache, N-Way Set Associative Cache; I think those are their
respective names in English) while running different kinds of C/C++
programs, and I have to build a comparison chart with all the data I collect.
So I compiled a C program and used the following command:
valgrind --tool=cachegrind --I1=65535,2,64 --D1=65535,2,64 --L2=262144,8,64 ./prog.out
where prog.out is the compiled program.
It seems to work, and I'm going to work on the results later (I1 605
misses, D1 1,007 misses, L2 1,612 misses; it's a program that sets each
position i of a 20-element array a to i*10, with i increasing by 1 per
loop iteration), but I was trying to use cg_annotate and it doesn't seem
to work.
I tried the following line:
cg_annotate output ./prog.out
It said "Cannot open output for reading", so I created a blank file named
output and tried again. The following error message was returned:
"Line 1: missing command line"
When I tried cg_annotate output prog.c (the source code file), I got a
file with some strange data, and I don't know whether it is correct or not:
17 1 0 0 0 0 0 1 0 0
18 1 0 0 0 0 0 0 0 0
19 1 0 0 0 0 0 0 0 0
22 1 0 0 0 0 0 0 0 0
23 1 0 0 1 0 0 0 0 0
24 1 0 0 0 0 0 0 0 0
25 1 0 0 0 0 0 0 0 0
41 1 0 0 0 0 0 1 0 0
42 1 0 0 0 0 0 0 0 0
43 1 0 0 0 0 0 1 0 0
44 1 0 0 0 0 0 0 0 0
45 1 0 0 0 0 0 0 0 0
48 1 0 0 0 0 0 0 0 0
fl=/build/buildd/glibc-2.7/build-tree/i386-libc/csu/crtn.S
fn=???
10 1 0 0 1 0 0 0 0 0
11 1 0 0 1 0 0 0 0 0
12 1 0 0 1 0 0 0 0 0
13 1 0 0 1 0 0 0 0 0
21 1 0 0 1 0 0 0 0 0
22 1 0 0 1 0 0 0 0 0
23 1 0 0 1 0 0 0 0 0
24 1 0 0 1 0 0 0 0 0
fl=???
fn=(below main)
0 64 4 4 19 0 0 26 0 0
fn=???
0 112527 566 565 40557 848 807 15466 152 151
fn=_Exit
0 3 1 1 2 0 0 1 0 0
fn=__cxa_atexit
0 27 2 2 8 0 0 9 0 0
fn=__libc_csu_init
0 21 1 1 5 0 0 6 0 0
fn=__libc_memalign
0 799 4 4 236 0 0 187 4 4
fn=_dl_allocate_tls_init
0 93 8 8 35 0 0 26 0 0
fn=_dl_debug_state
0 8 1 1 4 0 0 2 0 0
fn=_setjmp
0 16 2 2 5 0 0 7 0 0
fn=calloc
0 208 2 2 56 0 0 56 1 1
fn=exit
0 55 5 5 14 0 0 13 0 0
fn=main
0 217 0 0 85 0 0 24 0 0
fn=malloc
0 280 2 2 80 0 0 120 1 1
fn=rindex
0 68 6 6 8 1 0 2 0 0
The .c source code:
#include <stdio.h>

int main(void)
{
    int a[20], i;

    /* Store i*10 in each of the 20 array elements. */
    for (i = 0; i < 20; i++)
    {
        a[i] = i * 10;
    }
    return 0;
}
What am I missing?
Thanks in advance,
Felipe.
--
Felipe
|
|
From: Felipe A. <fel...@gm...> - 2008-06-12 18:36:32
|
Oops, math mistake :P

I just finished my work. Thank you to both of you who helped me.

Regards~

2008/6/12 Hien Le <Hi...@me...>:

> There were 115773 instruction refs total, of which 611 references were I1
> misses. 611 / 115773 = 0.52%.

--
Felipe
|
From: Felipe A. <fel...@gm...> - 2008-06-13 03:30:21
|
Oh well, here I am again, with the math issue again:

==6464== L2 refs:        1,674  ( 1,510 rd + 164 wr)
==6464== L2 misses:      1,586  ( 1,430 rd + 156 wr)
==6464== L2 miss rate:     0.9% (   0.9%   +  0.9%  )

L2 refs: 1674
L2 misses: 1586
How come it says a 0.9% miss rate?

--
Felipe
|
From: Nicholas N. <nj...@cs...> - 2008-06-13 04:51:19
|
On Fri, 13 Jun 2008, Felipe Athayde wrote:

> Oh well, here I am again, again with the math issue:
> ==6464== L2 refs: 1,674 ( 1,510 rd + 164 wr)
> ==6464== L2 misses: 1,586 ( 1,430 rd + 156 wr)
> ==6464== L2 miss rate: 0.9% ( 0.9% + 0.9% )
>
> l2 refs 1674
> l2 misses 1586, how come it says 0,9% miss rate?

L2 miss rate is (L2 misses / total refs), not (L2 misses / L2 refs).

Nick
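
As a worked check of the rule above, here is a small Python sketch that recomputes the rates from the counts quoted in this thread. It assumes that "total refs" means I refs plus D refs, which is consistent with the printed 0.9%. The helper name `percent` is purely illustrative.

```python
# Counts copied from the Cachegrind summary quoted in this thread.
I_REFS, D_REFS = 115_773, 57_878
I1_MISSES, D1_MISSES = 611, 1_063
L2_REFS, L2_MISSES = 1_674, 1_586


def percent(misses, refs):
    """Return misses as a percentage of refs."""
    return 100.0 * misses / refs


i1_rate = percent(I1_MISSES, I_REFS)           # I1 misses / instruction refs
d1_rate = percent(D1_MISSES, D_REFS)           # D1 misses / data refs
l2_rate = percent(L2_MISSES, I_REFS + D_REFS)  # L2 misses / all refs (assumed meaning of "total refs")
l2_local = percent(L2_MISSES, L2_REFS)         # L2 misses / L2 refs, which is NOT what is reported

print(f"I1 miss rate:        {i1_rate:.3f}%")
print(f"D1 miss rate:        {d1_rate:.3f}%")
print(f"L2 miss rate:        {l2_rate:.3f}%")
print(f"L2 misses / L2 refs: {l2_local:.3f}%")
```

With these inputs, every L2 reference is an L1 miss (611 + 1,063 = 1,674). Dividing L2 misses by L2 refs therefore gives a very high figure, while dividing by all references gives the value under 1% shown in the summary.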
|
From: Nicholas N. <nj...@cs...> - 2008-06-10 22:20:07
|
On Tue, 10 Jun 2008, Felipe Athayde wrote:

> So I compiled a C program and used the following command:
> valgrind --tool=cachegrind --I1=65535,2,64 --D1=65535,2,64 --L2=262144,8,64
> ./prog.out
> prog.out is the compiled program
>
> I tried the following lines:
> cg_annotate output ./prog.out
> It said "Cannot open output for reading" so I made a blank file named output
> and tried again. The following error message returned:
> "Line 1: missing command line"
> When I tried using: cg_annotate output prog.c (the source code file) I got a
> file with some strange data which I don't know if is correct or not:
> 17 1 0 0 0 0 0 1 0 0
> 18 1 0 0 0 0 0 0 0 0
> 19 1 0 0 0 0 0 0 0 0
> 22 1 0 0 0 0 0 0 0 0
> 23 1 0 0 1 0 0 0 0 0
> 24 1 0 0 0 0 0 0 0 0
> 25 1 0 0 0 0 0 0 0 0
> 41 1 0 0 0 0 0 1 0 0
> 42 1 0 0 0 0 0 0 0 0
> 43 1 0 0 0 0 0 1 0 0
> 44 1 0 0 0 0 0 0 0 0
> 45 1 0 0 0 0 0 0 0 0
> 48 1 0 0 0 0 0 0 0 0
> fl=/build/buildd/glibc-2.7/build-tree/i386-libc/csu/crtn.S

This "strange data" is the profiling data that Cachegrind collects. By
default it goes in a file named "cachegrind.out.<pid>", where <pid> is the
process ID of the running program. You can choose a different name with
Cachegrind's --cachegrind-out-file option.

To inspect the data, you use cg_annotate. Its usage is:

  cg_annotate [options] output-file [source-files]

If your output file was cachegrind.out.12345, the simplest use would be:

  cg_annotate cachegrind.out.12345

That gives basic info like total cache hits/misses, and per-function
hits/misses.

You can also specify the names of source files you want annotated on a
line-by-line basis. E.g. if you want to annotate files a.c and b.c:

  cg_annotate cachegrind.out.12345 a.c b.c

Or, if you just want to annotate every source file in your program:

  cg_annotate --auto=yes cachegrind.out.12345

All this assumes you have Valgrind 3.3.0 or later. In earlier versions,
the cg_annotate command line invocation was slightly different, but the
above info should be enough for you to work it out (see also 'cg_annotate
--help').

Hopefully this makes it clear. See section 5.2 of the manual for more
details.

Nick
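
To show what the raw profiling data contains, here is a rough Python sketch that sums the counts per function from lines shaped like the ones pasted in this thread. It is not a replacement for cg_annotate. The function name `totals_per_function` is made up for illustration. The default column names (Ir, I1mr, I2mr, Dr, D1mr, D2mr, Dw, D1mw, D2mw) are an assumption about the event order, and the sketch uses the file's own "events:" header line instead when one is present.

```python
from collections import defaultdict

# Assumed column order when no "events:" header is present; it matches the
# nine counts per line in the data pasted in this thread.
DEFAULT_EVENTS = ["Ir", "I1mr", "I2mr", "Dr", "D1mr", "D2mr", "Dw", "D1mw", "D2mw"]


def totals_per_function(lines):
    """Sum the event counts of every (file, function) pair."""
    events = list(DEFAULT_EVENTS)
    totals = defaultdict(lambda: defaultdict(int))
    current_file = current_fn = "???"
    for raw in lines:
        line = raw.strip()
        if line.startswith("events:"):
            events = line.split()[1:]      # column names declared by the file
        elif line.startswith("fl="):
            current_file = line[3:]        # source file for the following lines
        elif line.startswith("fn="):
            current_fn = line[3:]          # function for the following lines
        elif line[:1].isdigit():
            # First field is the source line number, the rest are counts.
            counts = [int(field) for field in line.split()[1:]]
            for event, count in zip(events, counts):
                totals[(current_file, current_fn)][event] += count
    return {key: dict(value) for key, value in totals.items()}


# A few lines taken from the data pasted in this thread.
sample = [
    "fl=???",
    "fn=main",
    "0 217 0 0 85 0 0 24 0 0",
    "fn=malloc",
    "0 280 2 2 80 0 0 120 1 1",
]
for (source, function), counts in totals_per_function(sample).items():
    print(source, function, counts)
```

Under the assumed column order, the `main` line reads as 217 instruction reads, 85 data reads and 24 data writes, with no cache misses attributed to it.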
|
From: Felipe A. <fel...@gm...> - 2008-06-10 22:41:25
|
Hello,

I can't express how thankful I am!

Would you mind answering one last question? How do I choose which kind of
mapping (Direct Mapped Cache, Fully Associative Cache, N-Way Set Associative
Cache) is going to be used? I have to make a comparison between all of them.

Thanks again!

--
Felipe
|
From: Nicholas N. <nj...@cs...> - 2008-06-11 22:11:25
|
On Tue, 10 Jun 2008, Felipe Athayde wrote:

> Would you mind to answer me one last question? How do I choose which kind of
> mapping (Direct Mapped Cache, Fully Associative Cache, N-Way Set Associative
> Cache) is going to be used? Because I have to make a comparison between of
> al them.

The 2nd value in the --I1/--D1/--L2 options is the associativity. For fully
associative, just set it high enough according to the size and line-size of
your cache.

Nick
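
The relationship between the three mapping types and the associativity value can be sketched in Python as follows. The helper name `cache_option` and the 64 KiB cache with 64-byte lines are assumptions chosen for illustration. The option layout is `<size>,<associativity>,<line size>`, as in the commands quoted in this thread. Whether a particular Cachegrind release accepts a very large associativity value is not confirmed here.

```python
def cache_option(level, size, line_size, kind, ways=None):
    """Return (option string, number of sets) for one cache configuration.

    kind is "direct" (1 way), "set" (N ways, given by `ways`) or
    "full" (as many ways as the cache has lines, so there is a single set).
    """
    lines = size // line_size            # total number of cache lines
    if kind == "direct":
        assoc = 1
    elif kind == "full":
        assoc = lines
    elif kind == "set":
        if ways is None:
            raise ValueError("N-way set associative needs a value for ways")
        assoc = ways
    else:
        raise ValueError(f"unknown mapping kind: {kind}")
    if size % (assoc * line_size) != 0:
        raise ValueError("size must be a multiple of associativity * line size")
    sets = size // (assoc * line_size)   # lines are grouped into this many sets
    return f"--{level}={size},{assoc},{line_size}", sets


# Hypothetical 64 KiB data cache with 64-byte lines.
for kind, ways in (("direct", None), ("set", 2), ("full", None)):
    option, sets = cache_option("D1", 65536, 64, kind, ways)
    print(f"{kind:6s} {option}  ({sets} sets)")
```

In this sketch a direct-mapped cache is the 1-way case, and a fully associative cache is the case where the associativity equals size / line size, which is one way to read "high enough according to the size and line-size".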
|
From: Felipe A. <fel...@gm...> - 2008-06-12 13:35:33
|
Thanks once again!

Now, at the end of the test, I have to analyze all the data collected, and
I have hit another obstacle.

I couldn't figure out how the miss rates work: if there were 115773
references on I, there is no way 611 is 0.52%; it should be 1.89%. The same
goes for all the other miss rates. Am I missing something?

==6464== I refs:        115,773
==6464== I1 misses:         611
==6464== L2i misses:        609
==6464== I1 miss rate:     0.52%  *-> this should be 1,89%, isn't it? x=115773/(611*100)*
==6464== L2i miss rate:    0.52%
==6464==
==6464== D refs:         57,878  (41,851 rd + 16,027 wr)
==6464== D1 misses:       1,063  (   899 rd +    164 wr)
==6464== L2d misses:        977  (   821 rd +    156 wr)
==6464== D1 miss rate:      1.8% (   2.1%   +    1.0%  )
==6464== L2d miss rate:     1.6% (   1.9%   +    0.9%  )
==6464==
==6464== L2 refs:         1,674  ( 1,510 rd +    164 wr)
==6464== L2 misses:       1,586  ( 1,430 rd +    156 wr)
==6464== L2 miss rate:      0.9% (   0.9%   +    0.9%  )

Kind regards

--
Felipe