From: Philippe W. <phi...@sk...> - 2014-04-13 11:37:03
|
On Sun, 2014-04-06 at 19:29 -0700, janjust wrote:
> Hi,
> I'm trying to profile valgrind using perftools-lite and when compiling
> the tools I get an undefined reference error.
> VEX and coregrind build, but during linking it errs (error is below).
>
> I think it has to do with static linking, the build command in memcheck was:
>
> ../coregrind/link_tool_exe_linux 0x38000000 cc -Wno-long-long -gdwarf-3
> -gstrict-dwarf -Wwrite-strings -fno-stack-protector -o
> memcheck-amd64-linux -m64 -O2 -g -Wall -Wmissing-prototypes -Wshadow
> -Wpointer-arith -Wstrict-prototypes -Wmissing-declarations
> -Wno-format-zero-length -fno-strict-aliasing -fno-builtin
> -fno-omit-frame-pointer -O2 -nodefaultlibs -nostartfiles -u _start
> -Wl,--build-id=none -m64 memcheck_amd64_linux-mc_leakcheck.o
> memcheck_amd64_linux-mc_malloc_wrappers.o memcheck_amd64_linux-mc_main.o
> memcheck_amd64_linux-mc_translate.o memcheck_amd64_linux-mc_machine.o
> memcheck_amd64_linux-mc_errors.o ../coregrind/libcoregrind-amd64-linux.a
> ../VEX/libvex-amd64-linux.a -lgcc
>
> Does anyone have any suggestions on how to resolve this?
Valgrind does not use any libs (it does not even use glibc), and I guess
for a very good reason, as not using any lib has quite some consequences.
To my knowledge, 2 techniques are working to profile valgrind:
1. oprofile
2. self-hosting (i.e. running valgrind under a valgrind tool
such as callgrind or cachegrind).
For more info about self-hosting, search for "Self-hosting" in
README_DEVELOPERS
Philippe
|
|
From: Paul S. <pa...@ma...> - 2014-04-12 23:16:35
|
On Sat, 2014-04-12 at 10:56 +0200, Matthias Schwarzott wrote:
> On 10.04.2014 21:16, Paul Smith wrote:
> > On Thu, 2014-04-10 at 11:15 -0700, John Reiser wrote:
> >>> So for example if we have this in MyAlloc.h:
> >>>
> >>> inline __attribute__((always_inline)) void* MyAlloc(size_t len)
> >>> {
> >>> return malloc(len);
> >>> }
> >>>
> >>> (obviously our inline function does a bit more than that!), then when I
> >>> get a memory loss stacktrace I see something like this:
> >>>
> >>> ==20930== 18,400 bytes in 23 blocks are possibly lost in loss record 496 of 528
> >>> ==20930== at 0x4C2C85E: malloc (vg_replace_malloc.c:292)
> >>> ==20930== by 0x6946E0: SparseArray<unsigned int, 200u>::getPtr(unsigned int)(MyAlloc.h:3)
> >>> ==20930== by 0x693250: SparseArray<unsigned int, 200u>::set(unsigned int, unsigned int) (SparseArray.h:125)
> >>> ==20930== by 0x68E5DD: ...
> >>>
> >>> Note how the function name is correct for the caller of malloc
> >>> (getPtr()), BUT the filename/linenumber is for the MyAlloc.h inlined
> >>> function rather than where the inlined function was invoked.
> >>
> >> Please post the traceback that GDB shows.
> >
> > This is what I get from GDB 7.6.1 (I don't have an example for this same
> > stack trace but you can see the format):
> >
> > #0 malloc (size=24)
> > #1 0x0000000000672337 in MyAlloc (size=24) at MyAlloc.h:3
> > #2 MsgBody::appendHeader (this=0x7f66d57feb90, cNumber=64) at MsgBody.cpp:101
> > #3 0x000000000076b77c in MsgPing::append (this=0x7f66d57feb90, val=0x7f66d8c7f700) at MsgPing.cpp:27
> > #4 ...
>
> Yes, new enough gdb versions know how to display inlined function calls.
> I tried to look into this code but did not understand it.
> addr2line can also do the same magic.
It would be super-nice if this knowledge were abstracted into a separate
library that was useful and usable by other utilities, such as valgrind.
I don't see why everyone should have to roll their own. Although, I'm
not familiar with how the Valgrind VM works so maybe it's not
sufficiently similar to allow use of the same code as gdb/binutils?
> I can send the patch on monday.
>
> I would be happy if valgrind could do the magic on its own.
Boy, me too! :-).
I don't have the time to learn enough about Valgrind to attempt a fix
myself but I'm willing to test any patches, etc. anyone wants to send
along. And I have enough programming fu to be able to, if not run, then
at least amble with a proposed patch on my own.
|
|
From: Matthias S. <zz...@ge...> - 2014-04-12 08:56:21
|
On 10.04.2014 21:16, Paul Smith wrote:
> On Thu, 2014-04-10 at 11:15 -0700, John Reiser wrote:
>>> So for example if we have this in MyAlloc.h:
>>>
>>> inline __attribute__((always_inline)) void* MyAlloc(size_t len)
>>> {
>>> return malloc(len);
>>> }
>>>
>>> (obviously our inline function does a bit more than that!), then when I
>>> get a memory loss stacktrace I see something like this:
>>>
>>> ==20930== 18,400 bytes in 23 blocks are possibly lost in loss record 496 of 528
>>> ==20930== at 0x4C2C85E: malloc (vg_replace_malloc.c:292)
>>> ==20930== by 0x6946E0: SparseArray<unsigned int, 200u>::getPtr(unsigned int)(MyAlloc.h:3)
>>> ==20930== by 0x693250: SparseArray<unsigned int, 200u>::set(unsigned int, unsigned int) (SparseArray.h:125)
>>> ==20930== by 0x68E5DD: ...
>>>
>>> Note how the function name is correct for the caller of malloc
>>> (getPtr()), BUT the filename/linenumber is for the MyAlloc.h inlined
>>> function rather than where the inlined function was invoked.
>>
>> Please post the traceback that GDB shows.
>
> This is what I get from GDB 7.6.1 (I don't have an example for this same
> stack trace but you can see the format):
>
> #0 malloc (size=24)
> #1 0x0000000000672337 in MyAlloc (size=24) at MyAlloc.h:3
> #2 MsgBody::appendHeader (this=0x7f66d57feb90, cNumber=64) at MsgBody.cpp:101
> #3 0x000000000076b77c in MsgPing::append (this=0x7f66d57feb90, val=0x7f66d8c7f700) at MsgPing.cpp:27
> #4 ...
>
> where MsgBody.cpp:101 is an invocation of the MyAlloc() inline function.
> It seems that GDB basically shows two lines at the same instruction
> location, and doesn't prefix the second one with the location.
>
Hi!
Yes, new enough gdb versions know how to display inlined function calls.
I tried to look into this code but did not understand it.
addr2line can also do the same magic.
Until now I have solved this by using addr2line. To resolve call stacks
printed by valgrind you need to know the offset between the load address
and the values in the original library (e.g. as printed by nm).
This offset can be guessed by comparing the addresses of some functions
in valgrind stacks to what nm prints (taking offset+size into account and
assuming the last 3 hex digits stay the same).
That means choosing a short function immediately gives the unambiguous
offset.
Alternatively, I wrote a patch: Valgrind now writes these load offsets into
the output xml file.
Then I have a postprocessing script that basically subtracts the load
offset from each single address and calls addr2line.
But this is another very slow step (so I only do it on demand).
I can send the patch on Monday.
I would be happy if valgrind could do the magic on its own.
Regards
Matthias
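
A minimal sketch of the kind of post-processing step described above, assuming the load offset of the library is already known (for example from the patched XML output or from the nm comparison). The function names are hypothetical and this is not Matthias's actual script; only the addr2line options -e, -f, -C and -i are taken from binutils.

```python
import subprocess


def to_file_address(runtime_addr, load_offset):
    """Translate a run-time address from a Valgrind stack into the
    address used inside the unrelocated library file."""
    if runtime_addr < load_offset:
        raise ValueError("address lies below the load offset")
    return runtime_addr - load_offset


def addr2line_command(binary, runtime_addrs, load_offset):
    """Build the addr2line command line for a batch of run-time addresses.

    -f prints function names, -C demangles C++ names and -i also prints
    the chain of inlined callers for each address.
    """
    cmd = ["addr2line", "-e", binary, "-f", "-C", "-i"]
    cmd += [hex(to_file_address(a, load_offset)) for a in runtime_addrs]
    return cmd


def resolve(binary, runtime_addrs, load_offset):
    """Run addr2line (must be installed) and return its raw text output."""
    cmd = addr2line_command(binary, runtime_addrs, load_offset)
    return subprocess.run(cmd, capture_output=True, text=True,
                          check=True).stdout
```

Passing all addresses of one library in a single addr2line invocation, rather than one process per address, is one way to reduce the cost of this step.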
|
|
From: Paul S. <pa...@ma...> - 2014-04-10 19:38:11
|
On Thu, 2014-04-10 at 11:15 -0700, John Reiser wrote:
> > So for example if we have this in MyAlloc.h:
> >
> > inline __attribute__((always_inline)) void* MyAlloc(size_t len)
> > {
> > return malloc(len);
> > }
> >
> > (obviously our inline function does a bit more than that!), then when I
> > get a memory loss stacktrace I see something like this:
> >
> > ==20930== 18,400 bytes in 23 blocks are possibly lost in loss record 496 of 528
> > ==20930== at 0x4C2C85E: malloc (vg_replace_malloc.c:292)
> > ==20930== by 0x6946E0: SparseArray<unsigned int, 200u>::getPtr(unsigned int)(MyAlloc.h:3)
> > ==20930== by 0x693250: SparseArray<unsigned int, 200u>::set(unsigned int, unsigned int) (SparseArray.h:125)
> > ==20930== by 0x68E5DD: ...
> >
> > Note how the function name is correct for the caller of malloc
> > (getPtr()), BUT the filename/linenumber is for the MyAlloc.h inlined
> > function rather than where the inlined function was invoked.
>
> Please post the traceback that GDB shows.
This is what I get from GDB 7.6.1 (I don't have an example for this same
stack trace but you can see the format):
#0 malloc (size=24)
#1 0x0000000000672337 in MyAlloc (size=24) at MyAlloc.h:3
#2 MsgBody::appendHeader (this=0x7f66d57feb90, cNumber=64) at MsgBody.cpp:101
#3 0x000000000076b77c in MsgPing::append (this=0x7f66d57feb90, val=0x7f66d8c7f700) at MsgPing.cpp:27
#4 ...
where MsgBody.cpp:101 is an invocation of the MyAlloc() inline function.
It seems that GDB basically shows two lines at the same instruction
location, and doesn't prefix the second one with the location.
> If you want
> ==20930== 18,400 bytes in 23 blocks are possibly lost in loss record 496 of 528
> ==20930== at 0x4C2C85E: malloc (vg_replace_malloc.c:292)
> ==20930== by 0x6946E0: MyAlloc (MyAlloc.h:3)
> ==20930== by 0x6946E0: SparseArray<unsigned int, 200u>::getPtr(unsigned int) (SparseArray.h:234)
> ==20930== by 0x693250: SparseArray<unsigned int, 200u>::set(unsigned int, unsigned int) (SparseArray.h:125)
> ==20930== by 0x68E5DD: ...
> then there may be a problem with having the same return address for MyAlloc.h:3 and SparseArray.h:234
> when only one actual subroutine CALL instruction is involved.
If I could only get one frame and I had to choose whether I got the
filename/linenumber info from the caller or the inline function, I would
choose the caller, myself.
In my situation the inline function is small and calls malloc() exactly
once so I can easily infer what happens and if I didn't have that info I
would probably not even notice. But, I can imagine other situations
where the inline function was more complex and contained multiple
invocations of malloc itself, then not having the inline
filename/linenumber as well could be a bummer.
But I still think, overall, the caller is more helpful most of the
time :-).
Cheers!
|
|
From: David C. <dc...@gm...> - 2014-04-10 19:35:03
|
Hi,
We're seeing some crash behaviour in system routines such as
gethostbyname. This only manifests itself on our newer machines - our old
32bit machines do not have this behaviour.
Any idea what can cause it or how to fix it?
We are not supplying any special options to memcheck or valgrind.
Tech details below.
Thanks,
David.
valgrind version: valgrind-3.9.0
uname -a:
Linux hostname 2.6.18-194.3.1.el5 #1 SMP Sun May 2 04:17:42 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
cat /proc/cpuinfo:
processor : 22
vendor_id : GenuineIntel
cpu family : 6
model : 44
model name : Intel(R) Xeon(R) CPU X5680 @ 3.33GHz
stepping : 2
cpu MHz : 3333.597
cache size : 12288 KB
physical id : 1
siblings : 12
core id : 9
cpu cores : 6
apicid : 51
fpu : yes
fpu_exception : yes
cpuid level : 11
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx pdpe1gb rdtscp lm constant_tsc nonstop_tsc arat pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips : 6666.80
clflush size : 64
address sizes : 40 bits physical, 48 bits virtual
power management: [8]
valgrind log:
starts with...
==29802== Thread 2:
==29802== Invalid read of size 4
==29802== at 0x3860209336: __libc_res_nsend (in /lib64/libresolv-2.5.so)
==29802== by 0x37FFFFFFFF: ???
==29802== by 0x3860207E95: __libc_res_nquerydomain (in /lib64/libresolv-2.5.so)
==29802== by 0x38602081BF: __libc_res_nsearch (in /lib64/libresolv-2.5.so)
==29802== by 0x3238C96E: _nss_dns_gethostbyname3_r (in /lib64/libnss_dns-2.5.so)
==29802== by 0x3238CB5D: _nss_dns_gethostbyname_r (in /lib64/libnss_dns-2.5.so)
==29802== by 0x2B97CA8F: ???
==29802== by 0x26B66D8D: ???
==29802== by 0x2B97CC1F: ???
==29802== by 0x2B97CC37: ???
==29802== by 0x385DE03A87: ??? (in /lib64/libc-2.5.so)
==29802== Address 0x2b97b37c is on thread 2's stack
lots of that repeated, then finally:
==29802== Process terminating with default action of signal 11 (SIGSEGV)
==29802== Bad permissions for mapped region at address 0x3800000000
==29802== at 0x3800000000: ???
==29802== by 0x3860207E95: __libc_res_nquerydomain (in /lib64/libresolv-2.5.so)
==29802== by 0x38602081BF: __libc_res_nsearch (in /lib64/libresolv-2.5.so)
==29802== by 0x3238C96E: _nss_dns_gethostbyname3_r (in /lib64/libnss_dns-2.5.so)
==29802== by 0x3238CB5D: _nss_dns_gethostbyname_r (in /lib64/libnss_dns-2.5.so)
==29802== by 0x2B97CA8F: ???
==29802== by 0x26B66D8D: ???
==29802== by 0x2B97CC1F: ???
==29802== by 0x2B97CC37: ???
==29802== by 0x385DE03A87: ??? (in /lib64/libc-2.5.so)
|
|
From: John R. <jr...@Bi...> - 2014-04-10 18:15:16
|
> So for example if we have this in MyAlloc.h:
>
> inline __attribute__((always_inline)) void* MyAlloc(size_t len)
> {
> return malloc(len);
> }
>
> (obviously our inline function does a bit more than that!), then when I
> get a memory loss stacktrace I see something like this:
>
> ==20930== 18,400 bytes in 23 blocks are possibly lost in loss record 496 of 528
> ==20930== at 0x4C2C85E: malloc (vg_replace_malloc.c:292)
> ==20930== by 0x6946E0: SparseArray<unsigned int, 200u>::getPtr(unsigned int)(MyAlloc.h:3)
> ==20930== by 0x693250: SparseArray<unsigned int, 200u>::set(unsigned int, unsigned int) (SparseArray.h:125)
> ==20930== by 0x68E5DD: ...
>
> Note how the function name is correct for the caller of malloc
> (getPtr()), BUT the filename/linenumber is for the MyAlloc.h inlined
> function rather than where the inlined function was invoked. This
> example happens to C++ templates but the same thing happens for
> straightforward function calls.
>
> This is very painful because that function might allocate memory in many
> different places and I have no idea which one valgrind is talking about.
>
> If use GDB and set a breakpoint on malloc, then the stacktrace shown by
> GDB is correct: it shows a separate stack frame for the inlined
> allocation function and then the "right" stack frame for the caller.
Please post the traceback that GDB shows. You have done an excellent job
at presenting your environment (thank you for the software versions!)
and the actual output that you see from memcheck. However I'm unsure
of exactly the output that you desire.
If you want
==20930== 18,400 bytes in 23 blocks are possibly lost in loss record 496 of 528
==20930== at 0x4C2C85E: malloc (vg_replace_malloc.c:292)
==20930== by 0x6946E0: MyAlloc (MyAlloc.h:3)
==20930== by 0x6946E0: SparseArray<unsigned int, 200u>::getPtr(unsigned int) (SparseArray.h:234)
==20930== by 0x693250: SparseArray<unsigned int, 200u>::set(unsigned int, unsigned int) (SparseArray.h:125)
==20930== by 0x68E5DD: ...
then there may be a problem with having the same return address for MyAlloc.h:3 and SparseArray.h:234
when only one actual subroutine CALL instruction is involved.
|
|
From: Paul S. <pa...@ma...> - 2014-04-10 17:26:22
|
Hi all; I'm working on GNU/Linux with GCC 4.8.2, binutils 2.23.2, and
Valgrind 1.9.0. I'm building my code (for this test) with -g and no
optimization.
For various reasons we are using an inlined allocation function that
itself calls malloc(). In order to ensure it works properly for all our
situations (including building as both static and shared libraries: we
don't want to export the inlined function as a weak symbol) we are
marking it with "inline __attribute__((always_inline))".
This is working fine, but there's one issue we've noticed so far: when
Valgrind prints the stack traceback in its output, it's always using the
filename and line number of the _inlined_ function and not the calling
function. The _name_ of the calling function is correct though.
So for example if we have this in MyAlloc.h:
    inline __attribute__((always_inline)) void* MyAlloc(size_t len)
    {
        return malloc(len);
    }
(obviously our inline function does a bit more than that!), then when I
get a memory loss stacktrace I see something like this:
==20930== 18,400 bytes in 23 blocks are possibly lost in loss record 496 of 528
==20930== at 0x4C2C85E: malloc (vg_replace_malloc.c:292)
==20930== by 0x6946E0: SparseArray<unsigned int, 200u>::getPtr(unsigned int)(MyAlloc.h:3)
==20930== by 0x693250: SparseArray<unsigned int, 200u>::set(unsigned int, unsigned int) (SparseArray.h:125)
==20930== by 0x68E5DD: ...
Note how the function name is correct for the caller of malloc
(getPtr()), BUT the filename/linenumber is for the MyAlloc.h inlined
function rather than where the inlined function was invoked. This
example happens to use C++ templates but the same thing happens for
straightforward function calls.
This is very painful because that function might allocate memory in many
different places and I have no idea which one valgrind is talking about.
If I use GDB and set a breakpoint on malloc, then the stacktrace shown by
GDB is correct: it shows a separate stack frame for the inlined
allocation function and then the "right" stack frame for the caller.
Any thoughts?
|
|
From: Bob K. <Bo...@ri...> - 2014-04-08 16:10:54
|
On 4/7/14, 4:45 PM, "Julian Seward" <js...@ac...> wrote:
>In theory you might be able to use addr2line, but you'd need to do some
>arithmetic -- basically, subtract the library actual load address --
>before you can give those addresses to addr2line.
Please excuse my ignorance - by "actual load address" do you mean
something that would only be determined at run time like the base address
from /proc/12345/maps? Or do I just need to do some arithmetic with the
original binary, the debug symbols, and the map file?
>
>The usual problem with running V on embedded targets is (1) that it
>takes forever to move the debuginfo objects onto the target, and (2)
>doing so uses up all the flash storage on the target. If your target
>can communicate with the build host using TCP/IP, you might like to try
>running V's debuginfo server on the host (auxprogs/valgrind-di-server.c)
>and asking V on the target to use that, using --debuginfo-server=.
>
>It's pretty crude (needs work) but it does just about work.
Great, I'll take a look into this as well.
>
>J
>
|
|
From: John R. <jr...@Bi...> - 2014-04-08 04:57:58
|
Hi janjust,
> I found that perftools-lite do not support static
> linking (kinda odd).
> There are no perf.a files to be linked statically with valgrind; however,
> perftools should work with static linking.
>
> The perftools process is:
> build your application and retain object files, then use:
> $pat_build -O apa <executable>
> This gives a new executable to run.
>
> However, I tried instrumenting the nulltool but get:
> ERROR: Missing required ELF section '.note.link' from the program
> ../none-amd64-linux
It would be good to know whether that complaint is from valgrind, or from
perftools, or from the app itself. And "../none-amd64-linux" is a peculiar
name for a program; instead that usually designates an execution
environment by naming three relevant pieces. In this case "amd64" is the
hardware, "linux" is the operating system, and "none" is some aspect that
I don't know.
>
> I presume this is somehow stripped from none-amd64-linux?
>
> Could this line in the makefile have anything to do whit this?
I suppose it could, but the connection is not obvious. Vary the
command-line parameters by removing each one individually (keeping the
others), and see what complaints you get. Probably the individual results
won't run, but we're after hints from any warnings or error messages that
might arise.
>
> # -Wl,--build-id=none is needed when linking tools with a linker that only
> # knows -Ttext and not -Ttext-segment. Without this flag newer ld versions
> # (2.20 and later) create a .note.gnu.build-id at the default text segment
> # address, which of course means the resulting executable
> # is unusable. So we have to tell ld not to generate that, with
> # --build-id=none unless the linker supports -Ttext-segment.
> TOOL_LDFLAGS_COMMON_LINUX = \
>     -static -nodefaultlibs -nostartfiles -u _start
Run "readelf --sections" on every binary file that contributes [including
by transitive closure] to the "final executable", and see where all the
".note.link" sections are. See if any of them disappear before the end.
|
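
The readelf check suggested above could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, and it assumes the usual GNU readelf section-header layout in which each section line starts with an index in square brackets followed by the section name.

```python
import re
import subprocess

# Matches lines such as "  [ 2] .note.link   NOTE ..." in readelf output.
SECTION_RE = re.compile(r"^\s*\[\s*(\d+)\]\s+(\S+)")


def section_names(readelf_output):
    """Extract section names from the text printed by 'readelf --sections'."""
    names = []
    for line in readelf_output.splitlines():
        m = SECTION_RE.match(line)
        # Index 0 is the unnamed NULL section; its "name" column is empty,
        # so the regex would otherwise pick up the type column instead.
        if m and int(m.group(1)) != 0:
            names.append(m.group(2))
    return names


def has_section(path, name):
    """Run readelf on one object, archive member or executable
    (readelf must be installed) and report whether 'name' is present."""
    out = subprocess.run(["readelf", "--sections", "--wide", path],
                         capture_output=True, text=True, check=True).stdout
    return name in section_names(out)
```

Calling has_section(path, ".note.link") on each input object and on the linked tool would show at which step the section goes missing.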
|
From: janjust <tja...@un...> - 2014-04-08 03:37:16
|
Hi John,
Thanks for the reply.
I found that perftools-lite does not support static linking (kinda odd).
There are no perf.a files to be linked statically with valgrind; however,
perftools should work with static linking.

The perftools process is:
build your application and retain object files, then use:
$pat_build -O apa <executable>
This gives a new executable to run.

However, I tried instrumenting the nulltool but get:
ERROR: Missing required ELF section '.note.link' from the program
../none-amd64-linux

I presume this is somehow stripped from none-amd64-linux?

Could this line in the makefile have anything to do with this?

# -Wl,--build-id=none is needed when linking tools with a linker that only
# knows -Ttext and not -Ttext-segment. Without this flag newer ld versions
# (2.20 and later) create a .note.gnu.build-id at the default text segment
# address, which of course means the resulting executable
# is unusable. So we have to tell ld not to generate that, with
# --build-id=none unless the linker supports -Ttext-segment.
TOOL_LDFLAGS_COMMON_LINUX = \
    -static -nodefaultlibs -nostartfiles -u _start

Any ideas?
--
View this message in context: http://valgrind.10908.n7.nabble.com/building-valgrind-perftools-lite-linking-error-tp49214p49230.html
Sent from the Valgrind - Users mailing list archive at Nabble.com.
|
|
From: Emilio C. <er...@gm...> - 2014-04-08 01:20:47
|
>
> Dear Sir,
>
> Thank you for replying but aprof i guess is like a
> profiler, what i am looking for is a code or tool that takes any c or c++
> program and on execution gives me the output in terms of asymptotic
> notation.
>
> For example : test.c
> for (i = 0; i < N; i++) {
> sequence of statements
> }
>
> output: O(N)
>
>
> Regards,
> Zinat Shaikh
>
Hi Zinat,
> Thank you for replying but aprof i guess is like a profiler, what i am
> looking for is a code or tool that takes any c or c++ program and on
> execution gives me the output in terms of asymptotic notation.
>
I am not aware of any tool capable of doing this *BUT* aprof does in "some
way" exactly this: for each function/routine of your code, it gives you:
- input sizes: e.g., x1=5, x2=15, x3=20, x4=100, xN=1000, ...
- running times (of the routine) on these input sizes: e.g., y1=10, y2=30,
y3=40, y4=200, yN=2000
Then, you can plot each point (x_i, y_i) and analyze the trend. The trend
will give you a hint about the asymptotic cost of your function, e.g. O(n).
There are several ways of analyzing the trend:
- if you use aprof-plot (a Java GUI which we developed for analyzing
aprof's reports), you can apply the "guess ratio rule". *See our tutorial
for more info* <https://code.google.com/p/aprof/wiki/Tutorial1>.
- You can easily export from aprof-plot the points related to a specific
routine and perform curve fitting (e.g., using gnuplot, matlab, python
scipy, etc). For instance, you can test for a function like a + b*x^c and
the curve fitting library will give you the parameters a, b and c.
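
As a rough illustration of the curve-fitting idea, the sketch below estimates the exponent c with an ordinary least-squares line on log-log axes. Note the simplification: it fits y = b*x^c only, i.e. it assumes the constant term a is negligible, and it uses plain Python instead of a fitting library. The function name is hypothetical and the data are the example points quoted above, not real aprof output.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares fit of y = b * x**c on log-log axes; returns (b, c)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((u - mean_x) ** 2 for u in lx)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(lx, ly))
    c = sxy / sxx                        # slope = estimated exponent
    b = math.exp(mean_y - c * mean_x)    # intercept = log of the constant
    return b, c


# Example points from the message: input sizes and running times.
xs = [5, 15, 20, 100, 1000]
ys = [10, 30, 40, 200, 2000]
b, c = fit_power_law(xs, ys)
print(f"y ~ {b:.2f} * x^{c:.2f}")  # prints: y ~ 2.00 * x^1.00
```

An exponent close to 1 is consistent with O(n) on the measured inputs, close to 2 with O(n^2), and so on; it says nothing about inputs that were not exercised.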
aprof provides only a hint because it is a dynamic analysis tool and it
can only analyze the actual behaviour during your experiment. If you want a
more "general" and "sound" answer, you can't use any dynamic analysis tool.
Maybe a static analysis tool can do better, but for this goal it is very
hard/expensive (you need to analyze all possible execution paths inside
your code and for each one analyze all possible inputs).
Emilio
|
|
From: John R. <jr...@Bi...> - 2014-04-08 00:17:02
|
On 04/07/2014 02:47 PM, Vijay Viswanathan wrote:
>
> I ran myprogram using :
>
> G_SLICE=always-malloc valgrind -v --run-libc-freeres=no --vgdb=no --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=50 --workaround-gcc296-bugs=yes --log-file=vgdump
>
> I got this output :
>
> mypgrogram:64: can't map '/etc/ld.so.cache'
>
>
> Any idea whats goin on?
There would be more ideas, and sooner, if you told us which hardware
architecture, and which libc, and which OS, and the version number of each
piece of software. It really DOES matter!
Also, you should run under strace, or "valgrind --trace-syscalls=yes", to
learn exactly why ld.so.cache could not be mapped.
|
|
From: Vijay V. <vij...@gm...> - 2014-04-07 21:47:55
|
I ran myprogram using:

G_SLICE=always-malloc valgrind -v --run-libc-freeres=no --vgdb=no --tool=memcheck --leak-check=full --leak-resolution=high --num-callers=50 --workaround-gcc296-bugs=yes --log-file=vgdump

I got this output:

mypgrogram:64: can't map '/etc/ld.so.cache'

Any idea what's going on?
Thx.
|
|
From: Julian S. <js...@ac...> - 2014-04-07 21:45:40
|
On 04/07/2014 10:26 PM, Bob Kuo wrote:
> Is it possible to run valgrind *without* the debug symbols available
> and then take the valgrind log output, the .map, and the debug symbols
> to get a full stack trace? That would simplify our testing greatly.
In theory you might be able to use addr2line, but you'd need to do some
arithmetic -- basically, subtract the library actual load address --
before you can give those addresses to addr2line.
The usual problem with running V on embedded targets is (1) that it
takes forever to move the debuginfo objects onto the target, and (2)
doing so uses up all the flash storage on the target. If your target
can communicate with the build host using TCP/IP, you might like to try
running V's debuginfo server on the host (auxprogs/valgrind-di-server.c)
and asking V on the target to use that, using --debuginfo-server=.
It's pretty crude (needs work) but it does just about work.
J
|
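
The "subtract the load address" arithmetic can be sketched as below, assuming a copy of the target process's /proc/<pid>/maps is available and that the library's first loadable segment starts at virtual address 0 in the file (usual for shared objects, not for prelinked libraries or non-PIE executables). The function name, the library name and all addresses in the sample are made up for illustration.

```python
def load_base(maps_text, path_suffix):
    """Return the start address of the mapping of the given file that has
    file offset 0, taken from the text of /proc/<pid>/maps."""
    bases = []
    for line in maps_text.splitlines():
        # Fields: address range, permissions, offset, device, inode, path.
        fields = line.split(None, 5)
        if len(fields) < 6 or not fields[5].strip().endswith(path_suffix):
            continue
        start = int(fields[0].split("-")[0], 16)
        offset = int(fields[2], 16)
        if offset == 0:
            bases.append(start)
    if not bases:
        raise LookupError("no mapping found for " + path_suffix)
    return min(bases)


# Hypothetical excerpt of a maps file.
SAMPLE_MAPS = """\
00400000-0040b000 r-xp 00000000 08:01 131 /usr/bin/myapp
7f5a10000000-7f5a10015000 r-xp 00000000 08:01 262 /usr/lib/libdemo.so
7f5a10015000-7f5a10215000 ---p 00015000 08:01 262 /usr/lib/libdemo.so
"""

base = load_base(SAMPLE_MAPS, "libdemo.so")
# Address as it might appear in a Valgrind stack, converted for addr2line.
print(hex(0x7f5a10001a2c - base))
```

The resulting file-relative address is what would be handed to addr2line together with the unstripped copy of the library on the build host.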
|
From: Bob K. <Bo...@ri...> - 2014-04-07 20:26:42
|
Hello valgrind-users,
We are currently using valgrind at work by putting the debug symbols onto
the appliance at test time and running our process under valgrind. The log
output contains the correct stack trace as expected.
Setting up the appliance for this kind of testing is non-trivial,
especially getting the debug symbols in place.
Is it possible to run valgrind *without* the debug symbols available and
then take the valgrind log output, the .map, and the debug symbols to get
a full stack trace? That would simplify our testing greatly.
Thanks,
Bob
|
|
From: John R. <jr...@Bi...> - 2014-04-07 03:59:13
|
> I'm trying to profile valgrind using perftools-lite and when compiling
> the tools I get an undefined reference error.
> VEX and coregrind build, but during linking it errs (error is below).
>
> I think it has to do with static linking, the build command in memcheck was:
>
> ../coregrind/link_tool_exe_linux 0x38000000 cc -Wno-long-long -gdwarf-3
> -gstrict-dwarf -Wwrite-strings -fno-stack-protector -o
> memcheck-amd64-linux -m64 -O2 -g -Wall -Wmissing-prototypes -Wshadow
> -Wpointer-arith -Wstrict-prototypes -Wmissing-declarations
> -Wno-format-zero-length -fno-strict-aliasing -fno-builtin
> -fno-omit-frame-pointer -O2 -nodefaultlibs -nostartfiles -u _start
> -Wl,--build-id=none -m64 memcheck_amd64_linux-mc_leakcheck.o
> memcheck_amd64_linux-mc_malloc_wrappers.o memcheck_amd64_linux-mc_main.o
> memcheck_amd64_linux-mc_translate.o memcheck_amd64_linux-mc_machine.o
> memcheck_amd64_linux-mc_errors.o ../coregrind/libcoregrind-amd64-linux.a
> ../VEX/libvex-amd64-linux.a -lgcc
>
> Does anyone have any suggestions on how to resolve this?
Run the link_tool_exe_linux under strace:
    strace -f -e trace=execve,file -o strace.out ../coregrind/link_tool_exe_linux ...
in order to find out how many levels of execve() are involved,
and exactly which files are referenced when.
Many of the undefined symbols, such as _Unwind_Resume, are defined in
libgcc_s.so.1 or libgcc_eh.so.1. So if you don't see those filenames,
or the *.a archive library analogs, then that's the problem.
Add them explicitly to the command line.
|
|
From: janjust <tja...@un...> - 2014-04-07 02:29:36
|
Hi, I'm trying to profile valgrind using perftools-lite and when compiling the tools I get an undefined reference error. VEX and coregrind build, but during linking it errs (error is below). I think it has do with static linking, the build command in memcheck was: ../coregrind/link_tool_exe_linux 0x38000000 cc -Wno-long-long -gdwarf-3 -gstrict-dwarf -Wwrite-strings -fno-stack-protector -o memcheck-amd64-linux -m64 -O2 -g -Wall -Wmissing-prototypes -Wshadow -Wpointer-arith -Wstrict-prototypes -Wmissing-declarations -Wno-format-zero-length -fno-strict-aliasing -fno-builtin -fno-omit-frame-pointer -O2 -nodefaultlibs -nostartfiles -u _start -Wl,--build-id=none -m64 memcheck_amd64_linux-mc_leakcheck.o memcheck_amd64_linux-mc_malloc_wrappers.o memcheck_amd64_linux-mc_main.o memcheck_amd64_linux-mc_translate.o memcheck_amd64_linux-mc_machine.o memcheck_amd64_linux-mc_errors.o ../coregrind/libcoregrind-amd64-linux.a ../VEX/libvex-amd64-linux.a -lgcc Does anyone have any suggestions on how to resolve this? 
-----
../coregrind/libcoregrind-amd64-linux.a(libcoregrind_amd64_linux_a-m_main.o):/ccs/home/janjust/janjust_proj/valgrind-trunk/coregrind/m_main.c:2684: first defined here
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libpthread.a(pthread_once.o): In function `clear_once_control':
/usr/src/packages/BUILD/glibc-2.11.3/nptl/../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_once.S:163: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libpthread.a(pthread_once.o):(.eh_frame+0x13): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libpthread.a(unwind.o): In function `__pthread_unwind':
/usr/src/packages/BUILD/glibc-2.11.3/nptl/unwind.c:130: undefined reference to `_Unwind_ForcedUnwind'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libpthread.a(unwind.o): In function `unwind_stop':
/usr/src/packages/BUILD/glibc-2.11.3/nptl/unwind.c:61: undefined reference to `_Unwind_GetCFA'
/usr/src/packages/BUILD/glibc-2.11.3/nptl/unwind.c:72: undefined reference to `_Unwind_GetCFA'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iofclose.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iofclose.o):(.eh_frame+0x21b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iofflush.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iofflush.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iofwrite.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iofwrite.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(wfileops.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/../libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(wfileops.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(fileops.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(fileops.o): In function `_IO_new_file_fopen':
/usr/src/packages/BUILD/glibc-2.11.3/libio/fileops.c:409: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(fileops.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(syslog.o): In function `__libc_cleanup_routine':
/usr/src/packages/BUILD/glibc-2.11.3/misc/../nptl/sysdeps/pthread/bits/libc-lock.h:432: undefined reference to `_Unwind_Resume'
/usr/src/packages/BUILD/glibc-2.11.3/misc/../nptl/sysdeps/pthread/bits/libc-lock.h:432: undefined reference to `_Unwind_Resume'
/usr/src/packages/BUILD/glibc-2.11.3/misc/../nptl/sysdeps/pthread/bits/libc-lock.h:432: undefined reference to `_Unwind_Resume'
/usr/src/packages/BUILD/glibc-2.11.3/misc/../nptl/sysdeps/pthread/bits/libc-lock.h:432: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(syslog.o):(.eh_frame+0x21b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(backtrace.o): In function `__backtrace':
/usr/src/packages/BUILD/glibc-2.11.3/debug/../sysdeps/x86_64/../ia64/backtrace.c:110: undefined reference to `_Unwind_Backtrace'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(backtrace.o): In function `backtrace_helper':
/usr/src/packages/BUILD/glibc-2.11.3/debug/../sysdeps/x86_64/../ia64/backtrace.c:80: undefined reference to `_Unwind_GetIP'
/usr/src/packages/BUILD/glibc-2.11.3/debug/../sysdeps/x86_64/../ia64/backtrace.c:83: undefined reference to `_Unwind_GetCFA'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(vfprintf_chk.o): In function `_IO_acquire_lock_clear_flags2_fct':
/usr/src/packages/BUILD/glibc-2.11.3/debug/../libio/libioP.h:979: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(vfprintf_chk.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iofputs.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iofputs.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(ioftell.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(ioftell.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iogetdelim.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(iogetdelim.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(ioseekoff.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/../libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(ioseekoff.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(writev.o): In function `ifree':
/usr/src/packages/BUILD/glibc-2.11.3/misc/../sysdeps/posix/writev.c:32: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(writev.o):(.eh_frame+0x13): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(fseek.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(fseek.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(ftello.o): In function `_IO_acquire_lock_fct':
/usr/src/packages/BUILD/glibc-2.11.3/libio/../libio/libioP.h:969: undefined reference to `_Unwind_Resume'
/opt/cray/xe-sysroot/4.2.34/usr/lib64/libc.a(ftello.o):(.eh_frame+0x14b): undefined reference to `__gcc_personality_v0'
/usr/bin/ld: link errors found, deleting executable `memcheck-amd64-linux'
collect2: error: ld returned 1 exit status

--
View this message in context: http://valgrind.10908.n7.nabble.com/building-valgrind-perftools-lite-linking-error-tp49214.html
Sent from the Valgrind - Users mailing list archive at Nabble.com.
|
|
From: Vasily G. <vas...@gm...> - 2014-04-05 17:45:32
|
Dear Zinat,

You can try to look at http://code.google.com/p/aprof/

Vasily

On Sat, Apr 5, 2014 at 5:25 PM, Zinat Shaikh <zin...@gm...> wrote:
> Dear All,
> I am looking out for an opensource tool or a source code
> that could find time and space complexity of a program and give its output in
> terms of asymptotic notation. I got one tool called OpenPAT but for it you
> have to register.
>
> Regards,
> Zinat Shaikh

--
Best Regards,
Vasily
|
|
From: Emilio C. <er...@gm...> - 2014-04-05 13:41:03
|
Hi,

You can try aprof: https://code.google.com/p/aprof/

It's a new tool based on Valgrind, but it's not included in the standard Valgrind package/repository. Please check the wiki for a manual and a brief tutorial explaining the main goals of this profiler and how to use it.

I would be happy to hear about any experience, difficulties, questions, or suggestions about aprof :)

Emilio
|
|
From: Zinat S. <zin...@gm...> - 2014-04-05 13:25:57
|
Dear All,
I am looking for an open-source tool or source code
that can find the time and space complexity of a program and give its output
in terms of asymptotic notation. I found one tool called OpenPAT, but you
have to register to use it.
Regards,
Zinat Shaikh
|
|
From: Philippe W. <phi...@sk...> - 2014-04-03 22:43:02
|
On Thu, 2014-04-03 at 15:33 -0700, janjust wrote:
> Philippe,
> This worked! Thank you so much for your help.
>
> So your hypothesis is correct. The huge_pages (at least on my system) have
> an alignment issue if MAP_FIXED is used, and no alignment issue if it's not
> used.
>
> The syswrap-generic.c seems to always use MAP_FIXED, is that a valgrind
> requirement?

That is how Valgrind manages memory today. I think it could work without it (cf. your patch), or at least use MAP_FIXED only as a first preferred way to do an mmap.

> At any rate, I filed a bug report, and attached a potential patch (which
> worked for me), but I'm not sure if I did this correctly. All I did is added
> another "refinement" fallback to do a 3rd mmap() ignoring the MAP_FIXED.
>
> This could be done better though but maybe giving aspacemanager a hint to
> give me a hugepage_size aligned address if it fails the second time.
>
> Here is the bug report and the patch should be attached (again the patch is
> brutally simple).
>
> https://bugs.kde.org/show_bug.cgi?id=333051

Ok, thanks for the feedback, the bug and the patch. The mmap and aspacemgr code is a touchy area, so for sure this patch will have to be carefully looked at. But a bug with a patch (and even better, with a test case) is more likely to attract attention :).

> Also I have another issue but it seems to work, for now…
>
> I get a warning with:
> ==21390== Warning: noted but unhandled ioctl 0x7801 with no size/direction
> hints
>
> Any ideas what that is?

Valgrind must have some little code for each syscall so that it "understands" what the syscall is doing (e.g. what memory it is reading and writing). There are many varieties of ioctl, and Valgrind does not understand them all. An ioctl that is not understood causes such messages (and could then cause false positives or false negatives, e.g. in memcheck).

I think you will find more info in README_MISSING_SYSCALL_OR_IOCTL

Philippe
|
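As a rough illustration of what "no size/direction hints" refers to, the request number from the warning can be split into the fields that the generic Linux ioctl encoding packs into it. This is only a sketch: `decode_ioctl` and `struct ioctl_fields` are hypothetical names, it assumes the `_IOC_*` macros from `<linux/ioctl.h>`, and the exact bit layout differs on a few architectures.

```c
#include <assert.h>
#include <linux/ioctl.h>   /* _IOC_DIR, _IOC_TYPE, _IOC_NR, _IOC_SIZE */

/* Fields packed into a Linux ioctl request number (hypothetical helper type). */
struct ioctl_fields {
    unsigned dir;   /* _IOC_NONE, _IOC_READ, _IOC_WRITE, or both */
    unsigned type;  /* driver "magic" byte */
    unsigned nr;    /* command number within that driver */
    unsigned size;  /* size in bytes of the argument, 0 if not encoded */
};

/* Split a request number into its encoded fields. */
struct ioctl_fields decode_ioctl(unsigned long request)
{
    struct ioctl_fields f;

    /* Request numbers are 32-bit values. */
    assert(request <= 0xFFFFFFFFul);

    f.dir  = _IOC_DIR(request);
    f.type = _IOC_TYPE(request);
    f.nr   = _IOC_NR(request);
    f.size = _IOC_SIZE(request);
    return f;
}
```

On x86-64, `decode_ioctl(0x7801)` yields type 0x78, number 1, size 0 and no direction bits, so the request number itself carries no information about how much memory the ioctl reads or writes.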
|
From: janjust <tja...@un...> - 2014-04-03 22:33:09
|
Philippe,

This worked! Thank you so much for your help.

So your hypothesis is correct. The huge_pages (at least on my system) have an alignment issue if MAP_FIXED is used, and no alignment issue if it's not used.

The syswrap-generic.c seems to always use MAP_FIXED, is that a valgrind requirement?

At any rate, I filed a bug report, and attached a potential patch (which worked for me), but I'm not sure if I did this correctly. All I did was add another "refinement" fallback that does a 3rd mmap() ignoring MAP_FIXED.

This could be done better, though, maybe by giving aspacemanager a hint to give me a hugepage_size-aligned address if it fails the second time.

Here is the bug report and the patch should be attached (again the patch is brutally simple).

https://bugs.kde.org/show_bug.cgi?id=333051

Also I have another issue but it seems to work, for now…

I get a warning with:
==21390== Warning: noted but unhandled ioctl 0x7801 with no size/direction hints

Any ideas what that is?

--
View this message in context: http://valgrind.10908.n7.nabble.com/mpich-unable-to-munmap-hugepages-tp49150p49181.html
Sent from the Valgrind - Users mailing list archive at Nabble.com.
|
|
From: Philippe W. <phi...@sk...> - 2014-04-03 17:15:29
|
On Thu, 2014-04-03 at 07:21 -0700, janjust wrote:
> Philippe,
> Thanks a lot for helping with this.
>
> I ran the code as you suggested.
> aprun is a job submission system for our cluster machines.
>
> Attached is a file with all the output in order:
> code, native, strace native, valgrind, strace -f valgrind
>
> If you look at the strace -f valgrind output, at the bottom you'll see a
> mmap fail with EINVAL return code.

Ok, I think I have a hypothesis (the below is pure guess work):

I guess that the file on the hugetlbfs is special (as it is on this specially mounted "huge page" file system). This file provides huge pages, which must (probably) respect some constraints such as: it must be mapped at a multiple of the huge page size (1M ? 4M ? or whatever) and/or it must be in a specific part of the address space and/or ...

Valgrind's address space manager does not understand the notion of huge pages. What valgrind does is: it maintains a list of unused "address space zones". To do an mmap, valgrind decides at which address the mmap will be done, and then asks the kernel for a fixed mapping at this address. If this fixed mapping address is incompatible with the huge page constraints, the kernel makes the mmap call fail.

In the strace extracts below, you see that under valgrind, the mmap call uses a first argument other than NULL, and has an added MAP_FIXED flag:

strace native:
    mmap(NULL, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED, 3, 0) = 0x2aaaaac00000
strace valgrind:
    mmap(0x4801000, 4194304, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_FIXED, 3, 0) = -1 EINVAL (Invalid argument)

The logic for all this is in syswrap-generic.c around line 2146 (SVN trunk). What we might do is add yet another refinement: when the mmap fails, try again, but without any MAP_FIXED arg, and afterwards verify that the address decided by the kernel is ok.

In other words, in the syscall handling for the client, the idea is to introduce a kludge similar to what is done in aspacemgr-linux.c around line 2460 and following. This does not look too difficult to do (and I could even test this on the gcc110 compile farm system, which has a huge fs file system :).

How to confirm the hypothesis ? I suggest you run the small test program natively several times. If the map is always at the same address, modify the small program to pass this address as first argument (and maybe add MAP_FIXED). If afterwards the small program succeeds both natively and under valgrind, it looks like the hypothesis is somewhat confirmed.

If the above hypothesis looks correct and/or you can confirm it, then I suggest you file a bug in bugzilla (and do not hesitate to try to prepare the patch described above :).

Philippe
|
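The alignment part of the hypothesis can be checked with a little arithmetic on the two addresses from the strace extracts. The following is a sketch only: `is_aligned` and `align_up` are hypothetical helpers, and the huge page size is an assumption, since the thread does not state it (2 MiB is a common value on x86-64 Linux).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if `addr` is a multiple of `alignment` (which must be a power of two). */
bool is_aligned(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr & (alignment - 1)) == 0;
}

/* Round `addr` up to the next multiple of `alignment` (a power of two). */
uint64_t align_up(uint64_t addr, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (addr + alignment - 1) & ~(alignment - 1);
}
```

With an assumed 2 MiB huge page, `is_aligned(0x4801000, 2 << 20)` is false while `is_aligned(0x2aaaaac00000, 2 << 20)` is true, and `align_up(0x4801000, 2 << 20)` gives 0x4A00000. The address Valgrind chose is only aligned to 4 KiB, which is consistent with the kernel rejecting the fixed mapping.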
|
From: David G. <da...@ve...> - 2014-04-03 14:40:49
|
That's quite understood, thank you. I was only a little worried that my bug report hadn't gone to the right location, or had been filed at the wrong time and been lost, a mistake I've made before.

If it's been acknowledged then I'm happy to wait as long as it takes to fix. Thank you very much, and keep up the good work :)

On Thu, Apr 3, 2014 at 2:07 AM, Vasily Golubev <vas...@gm...> wrote:
> Hello, David.
>
> Sorry for inconvenience, but I simply haven't enough time to look in
> this problem deeply.
> FYI: My first investigation showed that it is necessary to do some
> more efforts in VEX code to support these instructions.
>
> Vasily
>
> On Thu, Apr 3, 2014 at 12:23 AM, David Goldsmith <da...@ve...> wrote:
> > Hi Vasily,
> > I created a bug report (329963) back in January to address this shortcoming
> > but I notice two months later that it hasn't been looked at.
> > Just curious if it has been passed over or just very low priority?
> > Thanks
> > -- David
> >
> > On Thu, Jan 9, 2014 at 5:34 PM,
> > <val...@li...> wrote:
> >>
> >> Mr. Goldsmith,
> >>
> >> Yes, now Valgrind doesn't know VCVTB and VCVTT instructions on ARM.
> >> So, please, create bug in bug-tracker. For reproducing it is better to
> >> use "__asm__ __volatile__ (...)".
> >> I'll try to go deeper in VEX library and if it is successfully -
> >> produce patch and attach it to the bug. In any case, for me it is not
> >> an issue for 1-2 hours, unfortunately.
> >>
> >> Vasily
>
> --
> Best Regards,
> Vasily
|
|
From: janjust <tja...@un...> - 2014-04-03 14:21:25
|
Philippe,

Thanks a lot for helping with this.

I ran the code as you suggested. aprun is a job submission system for our cluster machines.

Attached is a file with all the output in order: code, native, strace native, valgrind, strace -f valgrind

If you look at the strace -f valgrind output, at the bottom you'll see an mmap fail with an EINVAL return code.

hugepage_test.txt <http://valgrind.10908.n7.nabble.com/file/n49175/hugepage_test.txt>

--
View this message in context: http://valgrind.10908.n7.nabble.com/mpich-unable-to-munmap-hugepages-tp49150p49175.html
Sent from the Valgrind - Users mailing list archive at Nabble.com.
|