|
From: Spikey D. <se...@ca...> - 2012-05-15 20:39:31
|
I initially forgot to use --prefix=/usr/bin when I invoked ./configure. I did that later, and then used make and make install. It would appear I typed in something wrong, as valgrind is actually installed in /usr/bin/bin. However, I've added that path to my .profile for bash so it finds the program.

Now, I am getting an error:

valgrind: Unknown/uninstalled VG_PLATFORM 'amd64-darwin'

I've hunted around through some forums and a similar problem having to do with brew suggested that the permissions on various vgpreload_* files may not be correct. I found two instances of these files, both in:

/usr/bin/lib/valgrind/

and also in

/usr/bin/lib/valgrind/vgpreload_*.so.dSYM/Contents/Resources/DWARF/vgpreload*.so

(occurs for each preload file). All of the ones in the first directory mentioned already have execute permissions; I forcefully added execute permissions to the ones in the latter directories and the error remains the same...

Suggestions?
|
|
From: David C. <dcc...@ac...> - 2012-05-12 19:13:06
|
On 5/12/2012 8:21 AM, Michael Andronov wrote:
> David -
>
> Thank you for the speedy reply.
>
>> ...like 11.5% of the time, the value is not in the cache and must be
>> retrieved from main memory.
> That is my understanding too. I do not know how much gain I can get by
> removing those 11.5%, but I would like to do that.

As I said, without restructuring the code to make every data structure fit into cache, you're going to have that 11.5% cache miss rate. The only question is how much it slows down your program. Restructuring the code or data can reduce the miss rate, likely improving performance too.

>
>> ...dereferencing the variable named /right/ immediately after the
>> prefetch, so the CPU will stall 11.5% of the time.
> Ok…
> Do I understand correctly that the 11.5% misses at the point
> of dereferencing the variable named /right /means UNDOUBTEDLY that the
> variable /right/ itself is not in L1 at this moment?
> (In other words, the 64 bytes at $1 = (IndexValue &) @0x7ffff5ad13e0:
> are not prefetched? ).
> Also, do I understand correctly, that that if the /right /were
> prefetched, putting additional prefetch command would NOT have stalled
> CPU?

Strictly speaking, the variable /right/ is a pointer to a data structure (the "&" is syntactic sugar). 11.5% of the time, the specific data structure you want at that moment (referenced by /right/) is not in the L1 cache. It is very likely that the pointer itself is in the L1 cache, because it had to be pushed onto the stack or loaded into a register.

> Also, if - for testing purposes - I put some code between
>>> __builtin_prefetch(&right);
> and
>>> __builtin_prefetch(*((void **)&right)); //?!! Why D1mr are observed there?
>
> it should eliminate the stall. ( Though not going to improve overall
> performance, just eliminate the stall (D1mr misses)…)
> I tried that, putting different length of code… But still getting the
> same 11.5% at the same de-referrencing point…
> That is when I get puzzled… ;)

Depending on the number of data structures you have and their layout, an 11.5% rate may be not too bad. Cache memories (especially L1) are small. It could be very hard to improve that rate.

Putting instructions between the prefetch and the usage will not reduce the miss rate; only a change in code or data layout can do that. But you can reduce the stall time, performing useful work while the miss is resolved. See http://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Other-Builtins.html - "if the prefetch is done early enough before the access then the data will be in the cache by the time it is accessed." If you can perform useful work during this period, overall performance will increase. If not, work to reduce the miss rate. Prefetching does not reduce cache misses.

>
> Finally, I modified the code like:
> "
> …
> static SListOp compare(const IndexValue& left,IndexValue&
> right,ulong ity) {
> __builtin_prefetch(*((void **)&right)); //?!! Why D1mr
> are observed there?
> int cmp=left.key.cmp(right.key,(TREE_KT)ity); return
> cmp<0?SLO_LT:cmp>0?SLO_GT:SLO_NOOP;
> }
> …
> "
> and put /__builtin_prefetch(&right)/ before calling /compare(), /upper
> the calling stack. I even tried different `distance` from the
> compare() call… But still stuck with 11.5% at the same point.
>
> And that is where I'm really get lost. ;)

Not knowing how the IndexValue objects are organized, I can't really help here. To reduce the miss percentage, you somehow have to organize the accessed set so that it fits into the L1 cache. If they are not in an array, or if the size of that array is larger than the L1 cache, it will be difficult or impossible to reduce the rate further.

>
>
>> Only a cache-friendly reorganization of your data or access patterns
>> can reduce this…
> I understand that. The problem is - I do not understand what is
> happening with cache at the point of dereferencing. That is why it is
> difficult to adjust the code.

The cache has a set of recently used memory lines. A miss means that block of data hasn't been accessed recently. No more. If you group the data so that it is all close together, or access pieces that are very close together (moving on to a new section only rarely), then you have a cache-friendly organization. This may not be possible in your program.

>
>
>> ...if your working set size exceeds the cache size,
> I do not think it is the case with that program.
>
> Thanks for your help. I'm still trying to 'get control' over those D1mr.
> Michael.
>
> On 2012-05-12, at 12:43 AM, David Chapman <dcc...@ac...
> <mailto:dcc...@ac...>> wrote:
>
>> On 5/11/2012 8:04 PM, Michael Andronov wrote:
>>> I'm looking for some help/hint to explain the results I'm observing with the following peace of code:
>>> "
>>> …
>>> struct CmpIndexValue {
>>> static SListOp compare(const IndexValue& left,IndexValue& right,ulong ity) {
>>> __builtin_prefetch(&right);
>>> __builtin_prefetch(*((void **)&right)); //?!! Why D1mr are observed there?
>>> int cmp=left.key.cmp(right.key,(TREE_KT)ity); return cmp<0?SLO_LT:cmp>0?SLO_GT:SLO_NOOP;
>>> }
>>> };
>>> …
>>> "
>>> where the `right` is the reference for some structure like:
>>> (gdb) print right
>>> $1 = (IndexValue&) @0x7ffff5ad13e0: {key = {ptr = {p = 0x7ffff5ad13d8, l = 5}, ... }
>>> (gdb) x /4x (void **)&right
>>> 0x7ffff5ad13e0: 0xf5ad13d8 0x00007fff 0x00000005 0x00000000
>>> (gdb) x /4x *(void **)&right
>>> 0x7ffff5ad13d8: 0x048056a1 0x01000016 0xf5ad13d8 0x00007fff
>>> (gdb)
>>>
>>> The expected behaviour:
>>> - the line - __builtin_prefetch(*((void **)&right)); - should trigger the prefetch (64 bytes of my machine) from address {key = {ptr = {p = 0x7ffff5ad13d8… .
>>> - when the actual access to right.key.ptr.p will occur - within left.key.cmp() - the values should be within Ld1…
>>>
>>> The observed behaviour is different, however:
>>> - the kcachegrind is reporting significant 11.5% D1mr misses at __builtin_prefetch(*((void **)&right)); line. Which looks a bit strange and unexpected for me...
>>>
>>> Actually, if that line is commented out, then approximately the same amount of D1mr losses is shown later, within left.key.cmp() function, at attempt to access right.key.ptr.p values.
>>> So, to the 'certain degree', putting the __builtin_prefetch() function is doing the job and performing prefetching…
>>> But the idea was to eliminate the D1mr misses, not 'to move' them around the code. ;)
>>>
>>> I would be very grateful for any hint for direction to understand and/or explanation where my expectations are wrong and/or what I'm doing wrong.
>>>
>>>
>>
>> The goal of prefetching is to get the data you are going to need
>> while your code is doing something else. You have told the compiler
>> that a specific value is going to be used very soon, so it should ask
>> the CPU to ensure that the value is in the cache. It sounds like
>> 11.5% of the time, the value is not in the cache and must be
>> retrieved from main memory. Only a cache-friendly reorganization of
>> your data or access patterns can reduce this, and if your working set
>> size exceeds the cache size, you're more or less stuck with that miss
>> rate. You could move the misses around in the code, but you can't
>> get rid of them by prefetching.
>>
>> Alternatively, you could rewrite the code to do more work between the
>> prefetch request and the actual use of the data. Right now you are
>> dereferencing the variable named /right/ immediately after the
>> prefetch, so the CPU will stall 11.5% of the time. If there was more
>> work to do, the CPU would not be idle while the cache line was being
>> loaded.
>> --
>> David Cha...@ac...
>> Chapman Consulting -- San Jose, CA
>> Software Development Done Right.
>> www.chapman-consulting-sj.com
>

--
David Chapman dcc...@ac...
Chapman Consulting -- San Jose, CA
Software Development Done Right.
www.chapman-consulting-sj.com
|
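The "cache-friendly organization" described in this reply (data grouped close together and visited in order) can be illustrated with a small sketch. Everything in it (`Entry`, `sum_scattered`, `sum_contiguous`) is hypothetical and unrelated to the real `IndexValue` layout, which is not shown in the thread. Both functions return the same result; they differ only in where the data lives.

```cpp
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical 8-byte record used only for illustration.
struct Entry {
    std::uint64_t key;
};

// Each Entry is a separate heap allocation, so consecutive elements may sit
// on unrelated cache lines and every step can touch a new line.
std::uint64_t sum_scattered(const std::vector<std::unique_ptr<Entry>>& items) {
    std::uint64_t total = 0;
    for (const auto& item : items) total += item->key;
    return total;
}

// Entries are stored back to back. With a 64-byte line and 8-byte entries,
// one fetched line serves up to eight consecutive elements.
std::uint64_t sum_contiguous(const std::vector<Entry>& items) {
    std::uint64_t total = 0;
    for (const Entry& item : items) total += item.key;
    return total;
}
```

Whether such a reorganization is possible for a given program depends on how its objects are allocated and traversed, which is exactly the information missing here.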
|
From: Michael A. <in...@sd...> - 2012-05-12 15:22:07
|
David -
Thank you for the speedy reply.
> ...like 11.5% of the time, the value is not in the cache and must be retrieved from main memory.
That is my understanding too. I do not know how much gain I can get by removing those 11.5%, but I would like to do that.
> ...dereferencing the variable named right immediately after the prefetch, so the CPU will stall 11.5% of the time.
Ok…
Do I understand correctly that the 11.5% misses at the point of dereferencing the variable named right means UNDOUBTEDLY that the variable right itself is not in L1 at this moment?
(In other words, the 64 bytes at $1 = (IndexValue &) @0x7ffff5ad13e0: are not prefetched? ).
Also, do I understand correctly that if right were prefetched, putting in an additional prefetch command would NOT have stalled the CPU?
Also, if - for testing purposes - I put some code between
>> __builtin_prefetch(&right);
and
>> __builtin_prefetch(*((void **)&right)); //?!! Why D1mr are observed there?
it should eliminate the stall. (Though it is not going to improve overall performance, just eliminate the stall (D1mr misses)…)
I tried that, putting in different lengths of code… But I am still getting the same 11.5% at the same dereferencing point…
That is when I get puzzled… ;)
Finally, I modified the code like:
"
…
static SListOp compare(const IndexValue& left,IndexValue& right,ulong ity) {
    __builtin_prefetch(*((void **)&right)); //?!! Why D1mr are observed there?
    int cmp=left.key.cmp(right.key,(TREE_KT)ity);
    return cmp<0?SLO_LT:cmp>0?SLO_GT:SLO_NOOP;
}
…
"
and put __builtin_prefetch(&right) before calling compare(), higher up the call stack. I even tried different `distance` from the compare() call… But I am still stuck with 11.5% at the same point.
And that is where I really get lost. ;)
> Only a cache-friendly reorganization of your data or access patterns can reduce this…
I understand that. The problem is - I do not understand what is happening with cache at the point of dereferencing. That is why it is difficult to adjust the code.
> ...if your working set size exceeds the cache size,
I do not think it is the case with that program.
Thanks for your help. I'm still trying to 'get control' over those D1mr.
Michael.
On 2012-05-12, at 12:43 AM, David Chapman <dcc...@ac...> wrote:
> On 5/11/2012 8:04 PM, Michael Andronov wrote:
>>
>> I'm looking for some help/hint to explain the results I'm observing with the following piece of code:
>> "
>> …
>> struct CmpIndexValue {
>> static SListOp compare(const IndexValue& left,IndexValue& right,ulong ity) {
>> __builtin_prefetch(&right);
>> __builtin_prefetch(*((void **)&right)); //?!! Why D1mr are observed there?
>> int cmp=left.key.cmp(right.key,(TREE_KT)ity); return cmp<0?SLO_LT:cmp>0?SLO_GT:SLO_NOOP;
>> }
>> };
>> …
>> "
>> where the `right` is the reference for some structure like:
>> (gdb) print right
>> $1 = (IndexValue &) @0x7ffff5ad13e0: {key = {ptr = {p = 0x7ffff5ad13d8, l = 5}, ... }
>> (gdb) x /4x (void **)&right
>> 0x7ffff5ad13e0: 0xf5ad13d8 0x00007fff 0x00000005 0x00000000
>> (gdb) x /4x *(void **)&right
>> 0x7ffff5ad13d8: 0x048056a1 0x01000016 0xf5ad13d8 0x00007fff
>> (gdb)
>>
>> The expected behaviour:
>> - the line - __builtin_prefetch(*((void **)&right)); - should trigger the prefetch (64 bytes of my machine) from address {key = {ptr = {p = 0x7ffff5ad13d8… .
>> - when the actual access to right.key.ptr.p will occur - within left.key.cmp() - the values should be within Ld1…
>>
>> The observed behaviour is different, however:
>> - the kcachegrind is reporting significant 11.5% D1mr misses at __builtin_prefetch(*((void **)&right)); line. Which looks a bit strange and unexpected for me...
>>
>> Actually, if that line is commented out, then approximately the same amount of D1mr losses is shown later, within left.key.cmp() function, at attempt to access right.key.ptr.p values.
>> So, to the 'certain degree', putting the __builtin_prefetch() function is doing the job and performing prefetching…
>> But the idea was to eliminate the D1mr misses, not 'to move' them around the code. ;)
>>
>> I would be very grateful for any hint for direction to understand and/or explanation where my expectations are wrong and/or what I'm doing wrong.
>>
>>
>
> The goal of prefetching is to get the data you are going to need while your code is doing something else. You have told the compiler that a specific value is going to be used very soon, so it should ask the CPU to ensure that the value is in the cache. It sounds like 11.5% of the time, the value is not in the cache and must be retrieved from main memory. Only a cache-friendly reorganization of your data or access patterns can reduce this, and if your working set size exceeds the cache size, you're more or less stuck with that miss rate. You could move the misses around in the code, but you can't get rid of them by prefetching.
>
> Alternatively, you could rewrite the code to do more work between the prefetch request and the actual use of the data. Right now you are dereferencing the variable named right immediately after the prefetch, so the CPU will stall 11.5% of the time. If there was more work to do, the CPU would not be idle while the cache line was being loaded.
> --
> David Chapman dcc...@ac...
> Chapman Consulting -- San Jose, CA
> Software Development Done Right.
> www.chapman-consulting-sj.com
|
|
From: David C. <dcc...@ac...> - 2012-05-12 06:02:01
|
On 5/11/2012 8:04 PM, Michael Andronov wrote:
> I'm looking for some help/hint to explain the results I'm observing with the following piece of code:
> "
> …
> struct CmpIndexValue {
> static SListOp compare(const IndexValue& left,IndexValue& right,ulong ity) {
> __builtin_prefetch(&right);
> __builtin_prefetch(*((void **)&right)); //?!! Why D1mr are observed there?
> int cmp=left.key.cmp(right.key,(TREE_KT)ity); return cmp<0?SLO_LT:cmp>0?SLO_GT:SLO_NOOP;
> }
> };
> …
> "
> where the `right` is the reference for some structure like:
> (gdb) print right
> $1 = (IndexValue&) @0x7ffff5ad13e0: {key = {ptr = {p = 0x7ffff5ad13d8, l = 5}, ... }
> (gdb) x /4x (void **)&right
> 0x7ffff5ad13e0: 0xf5ad13d8 0x00007fff 0x00000005 0x00000000
> (gdb) x /4x *(void **)&right
> 0x7ffff5ad13d8: 0x048056a1 0x01000016 0xf5ad13d8 0x00007fff
> (gdb)
>
> The expected behaviour:
> - the line - __builtin_prefetch(*((void **)&right)); - should trigger the prefetch (64 bytes of my machine) from address {key = {ptr = {p = 0x7ffff5ad13d8… .
> - when the actual access to right.key.ptr.p will occur - within left.key.cmp() - the values should be within Ld1…
>
> The observed behaviour is different, however:
> - the kcachegrind is reporting significant 11.5% D1mr misses at __builtin_prefetch(*((void **)&right)); line. Which looks a bit strange and unexpected for me...
>
> Actually, if that line is commented out, then approximately the same amount of D1mr losses is shown later, within left.key.cmp() function, at attempt to access right.key.ptr.p values.
> So, to the 'certain degree', putting the __builtin_prefetch() function is doing the job and performing prefetching…
> But the idea was to eliminate the D1mr misses, not 'to move' them around the code. ;)
>
> I would be very grateful for any hint for direction to understand and/or explanation where my expectations are wrong and/or what I'm doing wrong.
>
>
The goal of prefetching is to get the data you are going to need while
your code is doing something else. You have told the compiler that a
specific value is going to be used very soon, so it should ask the CPU
to ensure that the value is in the cache. It sounds like 11.5% of the
time, the value is not in the cache and must be retrieved from main
memory. Only a cache-friendly reorganization of your data or access
patterns can reduce this, and if your working set size exceeds the cache
size, you're more or less stuck with that miss rate. You could move the
misses around in the code, but you can't get rid of them by prefetching.
Alternatively, you could rewrite the code to do more work between the
prefetch request and the actual use of the data. Right now you are
dereferencing the variable named /right/ immediately after the prefetch,
so the CPU will stall 11.5% of the time. If there was more work to do,
the CPU would not be idle while the cache line was being loaded.
--
David Chapman dcc...@ac...
Chapman Consulting -- San Jose, CA
Software Development Done Right.
www.chapman-consulting-sj.com
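The second suggestion in this reply, doing useful work between the prefetch request and the use of the data, might look like the sketch below. The `Node` type, the `sum_with_prefetch` function, and the lookahead distance of 8 are hypothetical and not taken from Michael's code; `__builtin_prefetch` is the GCC builtin already used in the thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record type; stands in for whatever the pointers refer to.
struct Node {
    long value;
};

// Sums the values behind a list of pointers. While element i is processed,
// the element `distance` steps ahead is requested from memory, so the fetch
// can overlap with useful work instead of stalling right before the use.
long sum_with_prefetch(const std::vector<const Node*>& nodes,
                       std::size_t distance = 8) {
    long total = 0;
    for (std::size_t i = 0; i < nodes.size(); ++i) {
        if (i + distance < nodes.size()) {
            // A hint only: it never changes the result of the program.
            __builtin_prefetch(nodes[i + distance]);
        }
        total += nodes[i]->value;
    }
    return total;
}
```

As the reply says, this does not remove any misses; at best it hides part of their latency. A suitable distance depends on the machine and the work done per element, so the default used here is only a placeholder.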
|
|
From: Michael A. <in...@sd...> - 2012-05-12 04:04:47
|
I'm looking for some help/hint to explain the results I'm observing with the following piece of code:
"
…
struct CmpIndexValue {
    static SListOp compare(const IndexValue& left,IndexValue& right,ulong ity) {
        __builtin_prefetch(&right);
        __builtin_prefetch(*((void **)&right)); //?!! Why D1mr are observed there?
        int cmp=left.key.cmp(right.key,(TREE_KT)ity);
        return cmp<0?SLO_LT:cmp>0?SLO_GT:SLO_NOOP;
    }
};
…
"
where `right` is a reference to a structure like:
(gdb) print right
$1 = (IndexValue &) @0x7ffff5ad13e0: {key = {ptr = {p = 0x7ffff5ad13d8, l = 5}, ... }
(gdb) x /4x (void **)&right
0x7ffff5ad13e0: 0xf5ad13d8 0x00007fff 0x00000005 0x00000000
(gdb) x /4x *(void **)&right
0x7ffff5ad13d8: 0x048056a1 0x01000016 0xf5ad13d8 0x00007fff
(gdb)
The expected behaviour:
- the line - __builtin_prefetch(*((void **)&right)); - should trigger the prefetch (64 bytes on my machine) from the address in {key = {ptr = {p = 0x7ffff5ad13d8… .
- when the actual access to right.key.ptr.p occurs - within left.key.cmp() - the values should already be in L1d…
The observed behaviour is different, however:
- kcachegrind reports a significant 11.5% of D1mr misses at the __builtin_prefetch(*((void **)&right)); line, which looks a bit strange and unexpected to me...
Actually, if that line is commented out, then approximately the same number of D1mr misses is shown later, within the left.key.cmp() function, at the attempt to access the right.key.ptr.p values.
So, to a 'certain degree', the __builtin_prefetch() call is doing its job and performing prefetching…
But the idea was to eliminate the D1mr misses, not 'to move' them around the code. ;)
I would be very grateful for any hint or explanation of where my expectations are wrong and/or what I'm doing wrong.
Thanks.
Michael.
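One possible reading of the 11.5% figure, offered as an assumption and not confirmed anywhere in this thread: the argument expression `*((void **)&right)` is itself an ordinary load. It reads the first pointer-sized word of the object that `right` refers to (the value 0x7ffff5ad13d8 in the gdb output) before any prefetch hint is issued, so if that object is not in L1, the miss can be attributed to this very line. The sketch below uses hypothetical stand-in types (`Ptr`, `Key`, `IndexValueSketch`) that only mimic the layout shown by gdb.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins mimicking the layout in the gdb output:
// the object starts with key.ptr.p, followed by key.ptr.l.
struct Ptr { void* p; std::size_t l; };
struct Key { Ptr ptr; };
struct IndexValueSketch { Key key; };

// Returns the address that the second prefetch in the original code targets.
// Computing it requires reading right.key.ptr.p from memory, which is a
// normal (demand) load of the object itself.
inline void* prefetch_target(IndexValueSketch& right) {
    void* via_cast = *reinterpret_cast<void**>(&right);  // same as *((void **)&right)
    void* via_member = right.key.ptr.p;                   // the readable spelling
    assert(via_cast == via_member);
    return via_member;
}

inline void prefetch_key_bytes(IndexValueSketch& right) {
    // The load happens while the argument is evaluated; only then is the hint issued.
    __builtin_prefetch(prefetch_target(right));
}
```

If this reading applies, it would also be consistent with the misses moving into left.key.cmp() when the line is commented out, since the same first read of the object then happens there.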
|
|
From: Dave G. <go...@mc...> - 2012-05-08 02:27:20
|
On May 7, 2012, at 2:45 PM CDT, Martin Kalany wrote:

> On 07.05.2012 21:34, Dave Goodell wrote:
>>
>> What does "ldd YOUR_BINARY_HERE" give you? You should see lines that look like this in the output:
>>
>> ----8<----
>> libmpich.so.6 => /sandbox/goodell/mpich2-installed/lib/libmpich.so.6 (0x00007fb786ffa000)
>> libopa.so.1 => /sandbox/goodell/mpich2-installed/lib/libopa.so.1 (0x00007fb786df8000)
>> libmpl.so.1 => /sandbox/goodell/mpich2-installed/lib/libmpl.so.1 (0x00007fb786bf1000)
>> ----8<----
>
> ldd gives me libmpi.so.0, but no mpich-related .so files
>
> And I guess that's the problem, right? I already reinstalled valgrind
> using ./configure --with-mpicc=path/to/mpich/

What is the ldd output when run on an executable supposedly built with an unmodified MPICH2 installation?

My guess is that you've got MPICH2 and Open MPI (or some other MPI implementation with a "libmpi.so") installed on the same machine and you're doing one or more of:

1) using the wrong mpicc
2) incorrectly setting LD_LIBRARY_PATH

If the ldd from an unmodified MPICH2 doesn't show proper linking against libmpich.so, then try using the absolute path to "mpicc" when building your application.

> (as you suggested on
> stackoverflow. That question there is from me, if you haven't noticed yet).

Yes, I noticed. I originally saw it there, but decided that it would be better to avoid duplicate work and save some of Philippe's time by switching to this thread once I saw it here.

-Dave
|
|
From: Dave G. <go...@mc...> - 2012-05-08 02:14:47
|
On May 7, 2012, at 3:54 PM CDT, Philippe Waroquiers wrote:

> For what concerns the original problem: I understand it is because
> Valgrind was configured with a different mpi that the one you are using
> and that created a mixup in the libs. Is that the explanation ?

I don't think this has been confirmed, but something along these lines seems like the most likely explanation to me. I just tried the Valgrind 3.7.0 MPI wrapper with MPICH2 on Linux and it worked just fine without any library renaming.

-Dave
|
|
From: Philippe W. <phi...@sk...> - 2012-05-07 20:54:37
|
On Mon, 2012-05-07 at 21:15 +0200, Martin Kalany wrote:
> Nevertheless, valgrind doesn't print anything similar to
> "valgrind MPI wrappers 31901: Active for pid 31901
> valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options"
> as stated in the documentation. How do I know whether or not the mpi
> wrappers now work?
If you add the option --trace-redir=yes to your Valgrind args,
Valgrind will trace all the actions related to redirection/wrapping:
* it will trace the creation of the redir specifications
(e.g. when loading the libmpiwrap which is part of Valgrind)
* it will trace the resulting "active" redirections or wrappings.
As for the original problem: I understand it is because
Valgrind was configured with a different MPI than the one you are using
and that created a mixup in the libs. Is that the explanation?
Philippe
|
|
From: Martin K. <m.k...@gm...> - 2012-05-07 19:44:52
|
On 07.05.2012 21:34, Dave Goodell wrote:
> On May 7, 2012, at 2:15 PM CDT, Martin Kalany wrote:
>
>> On 06.05.2012 00:54, Philippe Waroquiers wrote:
>>> On Sun, 2012-05-06 at 00:24 +0200, Martin Kalany wrote:
>>>>> Valgrind documentation states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."
>>> Note that the documentation is slightly out of date, as the code
>>> contains the pattern libmpi*.so*
>>> (so as to match a.o. libmpich.so.1.0).
>>>
>>> Philippe
>> Thanks a lot for your help Philippe! You finally led me in the right
>> direction:
>> I found a workaround for the problem: I simply renamed libmpich.so to
>> libmpi.so and the error was gone.
> That sounds like it will cause other problems. Do your applications still run correctly after the rename?

To be more precise: I did a copy+rename, so the original still exists.

>
>> Nevertheless, valgrind doesn't print anything similar to
>> "valgrind MPI wrappers 31901: Active for pid 31901
>> valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options"
>> as stated in the documentation. How do I know whether or not the mpi
>> wrappers now work?
> I'm guessing that they aren't working. I've hit this (not the "libmpi.so" name issue) in the past when linking statically instead of dynamically.
>
> How are you compiling and linking your code? With "mpicc"?

Yes, I'm using mpicc to compile and link.

>
> What does "ldd YOUR_BINARY_HERE" give you? You should see lines that look like this in the output:
>
> ----8<----
> libmpich.so.6 => /sandbox/goodell/mpich2-installed/lib/libmpich.so.6 (0x00007fb786ffa000)
> libopa.so.1 => /sandbox/goodell/mpich2-installed/lib/libopa.so.1 (0x00007fb786df8000)
> libmpl.so.1 => /sandbox/goodell/mpich2-installed/lib/libmpl.so.1 (0x00007fb786bf1000)
> ----8<----
>
> -Dave
>
>

ldd gives me libmpi.so.0, but no mpich-related .so files

And I guess that's the problem, right? I already reinstalled valgrind using ./configure --with-mpicc=path/to/mpich/ (as you suggested on stackoverflow. That question there is from me, if you haven't noticed yet).

Thanks for your help!
Martin
|
|
From: Dave G. <go...@mc...> - 2012-05-07 19:34:58
|
On May 7, 2012, at 2:15 PM CDT, Martin Kalany wrote:
> On 06.05.2012 00:54, Philippe Waroquiers wrote:
>> On Sun, 2012-05-06 at 00:24 +0200, Martin Kalany wrote:
>>>> Valgrind documentation states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."
>> Note that the documentation is slightly out of date, as the code
>> contains the pattern libmpi*.so*
>> (so as to match a.o. libmpich.so.1.0).
>>
>> Philippe
> Thanks a lot for your help Phillippe! You finally led me in the right
> direction:
> I found a workaround for the problem: I simply renamed libmpich.so to
> libmpi.so and the error was gone.
That sounds like it will cause other problems. Do your applications still run correctly after the rename?
> Nevertheless, valgrind doesn't print anything similar to
> "valgrind MPI wrappers 31901: Active for pid 31901
> valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options"
> as stated in the documentation. How do I know whether or not the mpi
> wrappers now work?
I'm guessing that they aren't working. I've hit this (not the "libmpi.so" name issue) in the past when linking statically instead of dynamically.
How are you compiling and linking your code? With "mpicc"?
What does "ldd YOUR_BINARY_HERE" give you? You should see lines that look like this in the output:
----8<----
libmpich.so.6 => /sandbox/goodell/mpich2-installed/lib/libmpich.so.6 (0x00007fb786ffa000)
libopa.so.1 => /sandbox/goodell/mpich2-installed/lib/libopa.so.1 (0x00007fb786df8000)
libmpl.so.1 => /sandbox/goodell/mpich2-installed/lib/libmpl.so.1 (0x00007fb786bf1000)
----8<----
-Dave
|
|
From: Martin K. <m.k...@gm...> - 2012-05-07 19:15:04
|
On 06.05.2012 00:54, Philippe Waroquiers wrote:
> On Sun, 2012-05-06 at 00:24 +0200, Martin Kalany wrote:
>>> Valgrind documentation states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."
> Note that the documentation is slightly out of date, as the code
> contains the pattern libmpi*.so*
> (so as to match a.o. libmpich.so.1.0).
>
> Philippe

Thanks a lot for your help Philippe! You finally led me in the right direction:
I found a workaround for the problem: I simply renamed libmpich.so to libmpi.so and the error was gone.

Nevertheless, valgrind doesn't print anything similar to
"valgrind MPI wrappers 31901: Active for pid 31901
valgrind MPI wrappers 31901: Try MPIWRAP_DEBUG=help for possible options"
as stated in the documentation. How do I know whether or not the mpi wrappers now work?

Martin
|
|
From: Oliver S. <ol...@f-...> - 2012-05-07 13:15:39
|
Hello Philippe,

thank you for your response. I think you gave me an idea.

// Oliver

On 2012-05-05 14:22, Philippe Waroquiers wrote:
> On Fri, 2012-05-04 at 17:32 +0000, Oliver Schneider wrote:
>> Hi folks,
>>
>> I've got a question about Valgrind and its Memcheck tool. Is it possible
>> to take a snapshot of a program under Valgrind, kinda similar to the way
>> a fork() clones the process space, and then continue again from that
>> snapshot with Valgrind? Could fork() perhaps be the answer?
> This is not possible with Valgrind.
> Having a fully general "snapshot" solution looks close to impossible
> e.g. you have to re-create the exact "system state":
> opened files and seek position
> tcp/ip connections
> pwd
> ...
>
> The closest to what you describe here that I know of is the "unexec"
> feature of emacs: emacs is first compiled, it has no lisp loaded.
> As part of the build, it then loads a whole bunch of lisp files
> and then "unexec" itself (i.e. creates a dumped executable)
> After that, the dumped file is the one which is installed, with
> loaded lisp files being part of the initialised data.
>
> So, I guess you better work in that direction (or have a data
> structure that you can e.g. dump to a file to just mmap at startup).
>
> Philippe
|
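The last suggestion quoted in this message, keeping the expensive-to-build state in a data structure that is dumped to a file and mapped back at startup, might look like the following POSIX sketch. The `Table` type and the function names are hypothetical. The approach only works for plain data without pointers into the old process, and error handling is reduced to a boolean or null result.

```cpp
#include <cstddef>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical precomputed state: trivially copyable, no internal pointers.
struct Table {
    int count;
    int values[256];
};

// Writes the table to `path` as raw bytes.
bool dump_table(const char* path, const Table& t) {
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    bool ok = std::fwrite(&t, sizeof t, 1, f) == 1;
    return (std::fclose(f) == 0) && ok;
}

// Maps a previously dumped table read-only. Returns nullptr on failure.
const Table* map_table(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) return nullptr;
    struct stat st;
    void* p = MAP_FAILED;
    if (fstat(fd, &st) == 0 &&
        static_cast<std::size_t>(st.st_size) == sizeof(Table)) {
        p = mmap(nullptr, sizeof(Table), PROT_READ, MAP_PRIVATE, fd, 0);
    }
    close(fd);  // the mapping stays valid after the descriptor is closed
    return p == MAP_FAILED ? nullptr : static_cast<const Table*>(p);
}

// Releases a mapping obtained from map_table.
void unmap_table(const Table* t) {
    if (t) munmap(const_cast<Table*>(t), sizeof(Table));
}
```

A dump written this way is only meaningful to a program built with the same `Table` layout, so it is not a general snapshot of process state; it covers only the data placed in the structure.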
|
From: Bhattiprolu R. <rav...@gm...> - 2012-05-07 09:33:46
|
All,
I tried to compile this for ARM11 MPCore but was unsuccessful when running it. It gave undefined instruction errors. Has anyone tried that before?
regards,
Ravi
On Mon, May 7, 2012 at 2:04 PM, Julian Seward <js...@ac...> wrote:
> > I would like to be able to use Valgrind on an ARM platform -- specifically
> > a Cortex-A8. It appears that the support for this platform is not
> > complete. Is it "almost complete" where we could do something to finish it
> > or are there major problems, etc.?
>
> It works pretty well on Cortex A8 and A9 now. Give it a try, either 3.7.0 or
> (better) the svn trunk.
>
> J
|
|
From: Julian S. <js...@ac...> - 2012-05-07 09:04:44
|
> I would like to be able to use Valgrind on an ARM platform -- specifically
> a Cortex-A8. It appears that the support for this platform is not
> complete. Is it "almost complete" where we could do something to finish it
> or are there major problems, etc.?
It works pretty well on Cortex A8 and A9 now. Give it a try, either 3.7.0
or (better) the svn trunk.
J
|
|
From: Philippe W. <phi...@sk...> - 2012-05-05 22:54:27
|
On Sun, 2012-05-06 at 00:24 +0200, Martin Kalany wrote:
> >Valgrind documentation states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."
Note that the documentation is slightly out of date, as the code
contains the pattern libmpi*.so*
(so as to match, among others, libmpich.so.1.0).
Philippe
|
|
From: Philippe W. <phi...@sk...> - 2012-05-05 22:43:51
|
On Sun, 2012-05-06 at 00:24 +0200, Martin Kalany wrote:
> I'm already doing that, but in a slightly different way:
> LD_PRELOAD=~/valgrind/valgrind-3.7.0/mpi/libmpiwrap-x86-linux.so \
> mpirun -np 2 valgrind ./foo
>
> (This is suggested in the mpi section of the valgrind documentation).
> This way, mpirun will launch two processes; each process starts valgrind
> which will in turn execute the actual program.
>
> I did exactly as the documentation does; I think the main issue is this:
>
> >Valgrind documentation states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."
> How and where do I change that?
From what I can see, to change this, you must edit the "Z encoding"
of the mpi wrappers.
I think this is the macro in the file libmpiwrap.c:
#define I_WRAP_FNNAME_U(_name) \
I_WRAP_SONAME_FNNAME_ZU(libmpiZaZdsoZa,_name)
However, the current Z encoding is the following pattern:
"libmpi*.so*"
so that it will wrap the mpi functions from all sonames matching
the above pattern.
In particular, it will wrap the functions in a soname:
libmpich.so.1.0
From what I can see, the problem is not in the wrapping but
rather that for one reason or another, the dynamic loader under
Valgrind does not find the relevant lib.
Maybe Valgrind args -v -v -v -d -d -d --trace-redir=yes could give a
hint ?
Sorry for not being able to help more
Philippe
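To see how the encoded soname in that macro turns into the pattern quoted above, here is a minimal sketch of a decoder. It is illustrative only and is not Valgrind code: `decode_z` is a hypothetical helper name, the `Za` and `Zd` escapes are taken from the thread itself (`libmpiZaZdsoZa` corresponds to `libmpi*.so*`), and the `Zu` and `ZZ` escapes are an assumption that should be checked against the comments in Valgrind's `pub_tool_redir.h`.

```cpp
#include <cstddef>
#include <string>

// Decode a small subset of Valgrind's "Z encoding" (hypothetical helper).
// Za -> '*' and Zd -> '.' follow from the thread; Zu -> '_' and ZZ -> 'Z'
// are assumptions. Any other escape is copied through unchanged.
std::string decode_z(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] != 'Z' || i + 1 == in.size()) {
            out += in[i];          // ordinary character, or a trailing 'Z'
            continue;
        }
        const char code = in[++i]; // character following the 'Z' escape
        switch (code) {
            case 'a': out += '*'; break;
            case 'd': out += '.'; break;
            case 'u': out += '_'; break;
            case 'Z': out += 'Z'; break;
            default:  out += 'Z'; out += code; break;
        }
    }
    return out;
}
```

Calling `decode_z("libmpiZaZdsoZa")` yields `libmpi*.so*`, which is the pattern the wrappers are registered against.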
|
|
From: Martin K. <m.k...@gm...> - 2012-05-05 22:24:54
|
On 05.05.2012 18:59, Philippe Waroquiers wrote:
> On Sat, 2012-05-05 at 18:14 +0200, Martin Kalany wrote:
>>> Is the 'cannot open' error only there when running under Valgrind ?
>> Yes. When I use mpirun, it's fine.
>>
>> What I think is strange is that valgrind apparently tries to load
>> libmpi.so, although it should load libmpich.so.1.0
>>
>>> Maybe a problem related to the dynamic loader ?
>> I'm rather new to MPI so I'm not sure about this.
> Valgrind is not supposed to change which shared lib are used:
> the dynamic loader is executed by Valgrind and should behave
> the same (and so load the same shared libs as a native run).
>
> I know nothing about MPI and so might not understand
> what you are doing.
> But I believe mpirun is a shell script
> which might be needed to setup some required env variables.
>
> So, to be sure, mpirun should be used also when using Valgrind
> e.g.
> valgrind --trace-children=yes mpirun ....
>
> (or is this what you are doing already ?)
>
>
> Philippe
I'm already doing that, but in a slightly different way:
LD_PRELOAD=~/valgrind/valgrind-3.7.0/mpi/libmpiwrap-x86-linux.so \
mpirun -np 2 valgrind ./foo
(This is suggested in the mpi section of the valgrind documentation).
This way, mpirun will launch two processes; each process starts valgrind
which will in turn execute the actual program.
I did exactly as the documentation does; I think the main issue is this:
>Valgrind documentation states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."
How and where do I change that?
Martin
|
|
From: Dan K. <da...@ke...> - 2012-05-05 22:14:31
|
On Sat, May 5, 2012 at 11:38 AM, Geoff Alexander <gal...@nc...> wrote:
> You don’t show the code that’s calling rand_permute::rand_permute(long,
> int). In particular, the passed in value of stride could be uninitialized.
It might be even more helpful to see a small program we can actually
compile so we can reproduce the problem here. Make it as small as possible.
- Dan
|
|
From: Philippe W. <phi...@sk...> - 2012-05-05 22:10:16
|
On Sat, 2012-05-05 at 14:45 -0400, Zheng Da wrote:
>
> The corresponding code is shown below. I don't understand
> which variable isn't initialized?
If you upgrade to Valgrind 3.7.0, you can use gdb to debug
your program under Valgrind.
With this, you have GDB monitor commands to ask if an address
is initialised (or not).
See user manual, sections
3.2. Debugging your program using Valgrind gdbserver and GDB
and 4.6. Memcheck Monitor Commands
This might make it easier to understand where the problem
is coming from.
Philippe
|
|
From: Zheng Da <zhe...@gm...> - 2012-05-05 18:45:42
|
157 class global_rand_permute_workload: public workload_gen
158 {
159 long start;
160 long end;
161 static const rand_permute *permute;
162 public:
163 global_rand_permute_workload(long num, int stride, long start, long end) {
164 if (permute == NULL) {
165 permute = new rand_permute(num, stride);
166 }
167 this->start = start;
168 this->end = end;
169 }
In rand-read.cc,
300 int entry_size = 128;
399 int num_entries = npages * (PAGE_SIZE / entry_size);
473 case RAND_PERMUTE:
474 gen = new global_rand_permute_workload(num_entries,
475 entry_size, start, end);
476 break;
npages is also initialized. Every variable should have been initialized. I
don't know where the problem is.
Thanks,
Da
On Sat, May 5, 2012 at 2:38 PM, Geoff Alexander <gal...@nc...> wrote:
> Da,
>
> You don’t show the code that’s calling rand_permute::rand_permute(long,
> int). In particular, the passed in value of stride could be
> uninitialized.
>
> Geoff
>
> -----Original Message-----
> From: Zheng Da [mailto:zhe...@gm...]
> Sent: Saturday, May 05, 2012 2:26 PM
> To: Philippe Waroquiers
> Cc: val...@li...
> Subject: Re: [Valgrind-users] valgrind prints out a lot of error
> messages pointing to the standard library
>
> hello,
>
> > ==32701== Conditional jump or move depends on uninitialised value(s)
> > ==32701== at 0x4FB3D9: fillin_rpath
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701== by 0x4FDBCB: _dl_init_paths
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701== by 0x4CCC58: _dl_non_dynamic_init
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701== by 0x4CD762: __libc_init_first
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701== by 0x47F795: (below main)
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701==
>
> The above error for example looks to somewhat match a suppression in
> glibc-2.3.supp
>
> It is however not clear what is the cause of all these errors
> not being suppressed.
> Note that usually, having more info such as Valgrind version,
> OS and distribution version, cpu etc might only help to guess
> what it is :).
>
> Sorry, I forget.
> I use valgrind-3.6.1,
> ubuntu 11.04,
> Xeon(R) CPU E5405
> Linux 2.6.38.8
>
> Other than the errors in the standard library, it also shows many errors
> in my own program and I found the error messages are also very misleading.
>
> ==21746== Use of uninitialised value of size 8
> ==21746== at 0x40D168: rand_permute::rand_permute(long, int)
> (workload.h:63)
> ==21746== by 0x410624:
> global_rand_permute_workload::global_rand_permute_workload(long, int, long,
> long) (workload.h:165)
> ==21746== by 0x40F3FD: main (rand-read.cc:475)
> ==21746==
>
> The corresponding code is shown below. I don't understand which variable
> isn't initialized?
>
> 54 class rand_permute
> 55 {
> 56 off_t *offset;
> 57 long num;
> 58
> 59 public:
> 60 rand_permute(long num, int stride) {
> 61 offset = (off_t *) valloc(num * sizeof(off_t));
> 62 for (int i = 0; i < num; i++) {
> 63 offset[i] = ((off_t) i) * stride;
> 64 }
> 65
> 66 for (int i = num - 1; i >= 1; i--) {
> 67 int j = random() % i;
> 68 off_t tmp = offset[j];
> 69 offset[j] = offset[i];
> 70 offset[i] = tmp;
> 71 }
> 72 }
>
> Thanks,
> Da
|
|
From: Geoff A. <gal...@nc...> - 2012-05-05 18:38:44
|
Da,
You don't show the code that's calling rand_permute::rand_permute(long,
int). In particular, the passed in value of stride could be uninitialized.
Geoff
-----Original Message-----
From: Zheng Da [mailto:zhe...@gm...]
Sent: Saturday, May 05, 2012 2:26 PM
To: Philippe Waroquiers
Cc: val...@li...
Subject: Re: [Valgrind-users] valgrind prints out a lot of error messages
pointing to the standard library
hello,
> ==32701== Conditional jump or move depends on uninitialised value(s)
> ==32701== at 0x4FB3D9: fillin_rpath
> (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701== by 0x4FDBCB: _dl_init_paths
> (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701== by 0x4CCC58: _dl_non_dynamic_init
> (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701== by 0x4CD762: __libc_init_first
> (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701== by 0x47F795: (below main)
> (in /home/zhengda/Dropbox/research/read-test/rand-read)
> ==32701==
The above error for example looks to somewhat match a suppression in
glibc-2.3.supp
It is however not clear what is the cause of all these errors
not being suppressed.
Note that usually, having more info such as Valgrind version,
OS and distribution version, cpu etc might only help to guess
what it is :).
Sorry, I forget.
I use valgrind-3.6.1,
ubuntu 11.04,
Xeon(R) CPU E5405
Linux 2.6.38.8
Other than the errors in the standard library, it also shows many errors in
my own program and I found the error messages are also very misleading.
==21746== Use of uninitialised value of size 8
==21746== at 0x40D168: rand_permute::rand_permute(long, int)
(workload.h:63)
==21746== by 0x410624:
global_rand_permute_workload::global_rand_permute_workload(long, int, long,
long) (workload.h:165)
==21746== by 0x40F3FD: main (rand-read.cc:475)
==21746==
The corresponding code is shown below. I don't understand which variable
isn't initialized?
54 class rand_permute
55 {
56 off_t *offset;
57 long num;
58
59 public:
60 rand_permute(long num, int stride) {
61 offset = (off_t *) valloc(num * sizeof(off_t));
62 for (int i = 0; i < num; i++) {
63 offset[i] = ((off_t) i) * stride;
64 }
65
66 for (int i = num - 1; i >= 1; i--) {
67 int j = random() % i;
68 off_t tmp = offset[j];
69 offset[j] = offset[i];
70 offset[i] = tmp;
71 }
72 }
Thanks,
Da
|
|
From: Zheng Da <zhe...@gm...> - 2012-05-05 18:25:56
|
hello,
> > ==32701== Conditional jump or move depends on uninitialised value(s)
> > ==32701== at 0x4FB3D9: fillin_rpath
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701== by 0x4FDBCB: _dl_init_paths
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701== by 0x4CCC58: _dl_non_dynamic_init
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701== by 0x4CD762: __libc_init_first
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701== by 0x47F795: (below main)
> > (in /home/zhengda/Dropbox/research/read-test/rand-read)
> > ==32701==
> The above error for example looks to somewhat match a suppression in
> glibc-2.3.supp
>
> It is however not clear what is the cause of all these errors
> not being suppressed.
> Note that usually, having more info such as Valgrind version,
> OS and distribution version, cpu etc might only help to guess
> what it is :).
>
Sorry, I forget.
I use valgrind-3.6.1,
ubuntu 11.04,
Xeon(R) CPU E5405
Linux 2.6.38.8
Other than the errors in the standard library, it also shows many errors in
my own program and I found the error messages are also very misleading.
==21746== Use of uninitialised value of size 8
==21746== at 0x40D168: rand_permute::rand_permute(long, int)
(workload.h:63)
==21746== by 0x410624:
global_rand_permute_workload::global_rand_permute_workload(long, int, long,
long) (workload.h:165)
==21746== by 0x40F3FD: main (rand-read.cc:475)
==21746==
The corresponding code is shown below. I don't understand which variable
isn't initialized?
54 class rand_permute
55 {
56 off_t *offset;
57 long num;
58
59 public:
60 rand_permute(long num, int stride) {
61 offset = (off_t *) valloc(num * sizeof(off_t));
62 for (int i = 0; i < num; i++) {
63 offset[i] = ((off_t) i) * stride;
64 }
65
66 for (int i = num - 1; i >= 1; i--) {
67 int j = random() % i;
68 off_t tmp = offset[j];
69 offset[j] = offset[i];
70 offset[i] = tmp;
71 }
72 }
Thanks,
Da
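As a side note on the shuffle itself (this is not a diagnosis of the Memcheck report above): the posted loop draws `j` from `random() % i`, i.e. from [0, i-1], so element `i` can never stay in place. That is Sattolo's variant, which yields only cyclic permutations. The following is a minimal sketch of the same offset table built with a plain Fisher-Yates shuffle, assuming a uniform permutation is what is wanted. The function name `make_permuted_offsets`, the `seed` parameter, and the use of `std::vector` and `std::mt19937` instead of `valloc` and `random()` are all choices made for this illustration, not part of the original program.

```cpp
#include <cstddef>
#include <random>
#include <utility>
#include <vector>

// Build the offsets 0, stride, 2*stride, ... and shuffle them uniformly.
// Hypothetical helper; std::vector owns the storage, so every element is
// written before it is read.
std::vector<long long> make_permuted_offsets(std::size_t num, long long stride,
                                             unsigned seed) {
    std::vector<long long> offset(num);
    for (std::size_t i = 0; i < num; ++i) {
        offset[i] = static_cast<long long>(i) * stride;
    }
    std::mt19937 gen(seed);
    // Fisher-Yates: i runs from num-1 down to 1, and j is drawn from [0, i]
    // inclusive, so element i is allowed to stay where it is.
    for (std::size_t i = num; i-- > 1;) {
        std::uniform_int_distribution<std::size_t> pick(0, i);
        std::swap(offset[i], offset[pick(gen)]);
    }
    return offset;
}
```

For example, `make_permuted_offsets(8, 128, 42)` returns the eight multiples of 128 from 0 to 896 in some shuffled order.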
|
|
From: Stephen H. N. <st...@fl...> - 2012-05-05 17:38:32
|
I would like to be able to use Valgrind on an ARM platform -- specifically a Cortex-A8. It appears that the support for this platform is not complete. Is it "almost complete" where we could do something to finish it or are there major problems, etc.?
Thanks,
Steve
--
Steve
Stephen Hicks, N5AC, AAR6AM
VP Engineering
FlexRadio Systems™
4616 W Howard Ln
Ste 1-150
Austin, TX 78728
Phone: 512-535-4713 x205
Email: st...@fl...
Web: www.flexradio.com
*Tune In Excitement™*
PowerSDR™ is a trademark of FlexRadio Systems
|
|
From: Philippe W. <phi...@sk...> - 2012-05-05 16:59:07
|
On Sat, 2012-05-05 at 18:14 +0200, Martin Kalany wrote:
> > Is the 'cannot open' error only there when running under Valgrind ?
> Yes. When I use mpirun, it's fine.
>
> What I think is strange is that valgrind apparently tries to load
> libmpi.so, although it should load libmpich.so.1.0
>
> > Maybe a problem related to the dynamic loader ?
> I'm rather new to MPI so I'm not sure about this.
Valgrind is not supposed to change which shared libs are used:
the dynamic loader is executed by Valgrind and should behave
the same (and so load the same shared libs as a native run).
I know nothing about MPI and so might not understand
what you are doing.
But I believe mpirun is a shell script
which might be needed to set up some required env variables.
So, to be sure, mpirun should be used also when using Valgrind
e.g.
valgrind --trace-children=yes mpirun ....
(or is this what you are doing already ?)
Philippe
|
|
From: Philippe W. <phi...@sk...> - 2012-05-05 16:14:44
|
On Sat, 2012-05-05 at 14:25 +0200, Martin Kalany wrote:
> Hello,
>
> I'm trying to use valgrind to debug an mpich2 program. Unfortunately, I get the following error:
>
> libmpi.so.0: cannot open shared object file: No such file or directory
>
> I found out that libmpich.so.1.0 should be linked to instead (see libmpiwrap.c). Valgrind documentation states that "The MPI functions to be wrapped are assumed to be in an ELF shared object with soname matching libmpi.so*. This is known to be correct at least for Open MPI and Quadrics MPI, and can easily be changed if required."
>
> How do I change that?
Is the 'cannot open' error only there when running under Valgrind ?
The Z encoding used in libmpiwrap.c is a pattern which matches
one or the other library:
#define I_WRAP_FNNAME_U(_name) \
I_WRAP_SONAME_FNNAME_ZU(libmpiZaZdsoZa,_name)
i.e. it is libmpi*.so*.
So, I guess your problem is not the Valgrind wrapping.
Maybe a problem related to the dynamic loader ?
Philippe
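The claim that libmpi*.so* covers both libmpi.so.0 and libmpich.so.1.0 can be checked with a small wildcard matcher. This is only a sketch under stated assumptions: `glob_match` is a hypothetical name, it is not Valgrind's own matching routine, and it assumes the usual semantics where `*` matches any run of characters (including none) and `?` matches exactly one.

```cpp
#include <cstddef>
#include <string>

// Return true if `name` matches `pattern`, where '*' matches any run of
// characters (possibly empty) and '?' matches exactly one character.
// Illustrative helper only; not taken from Valgrind.
bool glob_match(const std::string& pattern, const std::string& name) {
    std::size_t p = 0, n = 0;
    std::size_t star = std::string::npos; // position of the last '*' seen
    std::size_t mark = 0;                 // name position that '*' resumes from
    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            star = p++;                   // remember the star, try empty match
            mark = n;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || pattern[p] == name[n])) {
            ++p;
            ++n;
        } else if (star != std::string::npos) {
            p = star + 1;                 // backtrack: let '*' absorb one more
            n = ++mark;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') {
        ++p;                              // trailing stars match the empty rest
    }
    return p == pattern.size();
}
```

With this matcher, `glob_match("libmpi*.so*", "libmpich.so.1.0")` and `glob_match("libmpi*.so*", "libmpi.so.0")` both return true, while a name such as `libfoo.so` does not match.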
|