From: John R. <jr...@Bi...> - 2013-07-21 22:57:36
|
> I have a binary for valgrind on the Pi, and the ldd output is:
>
> /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so (0x401dc000)
> libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0x400d3000)
> libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x401e5000)
> /lib/ld-linux-armhf.so.3 (0x40011000)
>
> Is there an easy way I can tweak the build files to statically link these libraries? Are there any other tools the valgrind executable depends on that I also need to build to make it work?

The top-level Makefile for valgrind looks somewhat ordinary. So you could try adding "-static -static-libgcc" to CFLAGS. (Or add "-static" to LDFLAGS, or add "-Wl,-static" to CFLAGS.)

Then look at the final link command: invoke 'make' and observe the commands that are displayed on stderr. In the worst case, run under "strace -f -o strace.out -e trace=execve make ..." and then inspect strace.out to find the actual process command line. Also look at the output from ldd, to see how well those attempts worked.

You might also look one level down at memcheck/Makefile, but CFLAGS etc. from the top-level Makefile should propagate to the individual tools such as memcheck.
|
|
From: Arthur L. <lam...@gm...> - 2013-07-19 12:33:23
|
Hi,

My dev environment is:
- uClibc-0.9.30
- OpenWrt backfire version
- Valgrind 3.8.1
- ARCH: MIPS32
- Linux 2.6.28.8

I am trying to use valgrind to find memory leaks and memory usage errors in my program. I have two issues with valgrind on my target.

First, when I run any binary without any particular option, I get an error. Here, for example, I use a basic helloworld which just mallocs and prints a char*:

___________________________
root@OpenWrt:~# valgrind /opt/data/helloworld
==794== Memcheck, a memory error detector
==794== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==794== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==794== Command: /opt/data/helloworld
==794==
==794==
==794== Process terminating with default action of signal 11 (SIGSEGV)
==794==  Access not within mapped region at address 0x5554
==794==    at 0x4004444: ??? (in /lib/ld-uClibc-0.9.30.so)
==794==    by 0x4000ACC: _start (in /lib/ld-uClibc-0.9.30.so)
==794==  If you believe this happened as a result of a stack
==794==  overflow in your program's main thread (unlikely but
==794==  possible), you can try to increase the size of the
==794==  main thread stack using the --main-stacksize= flag.
==794==  The main thread stack size used in this run was 8388608.

valgrind: m_scheduler/scheduler.c:923 (run_thread_for_a_while): Assertion 'two_words[0] == 0 && two_words[1] == 0' failed.
==794==    at 0x3803F39C: ??? (in /usr/lib/valgrind/memcheck-mips32-linux)
==794==    by 0x3803F2C8: ??? (in /usr/lib/valgrind/memcheck-mips32-linux)

sched status:
  running_tid=1

Thread 1: status = VgTs_Runnable
==794==    at 0x4004444: ??? (in /lib/ld-uClibc-0.9.30.so)
==794==    by 0x4000ACC: _start (in /lib/ld-uClibc-0.9.30.so)

Note: see also the FAQ in the source distribution.
It contains workarounds to several common problems.
In particular, if Valgrind aborted or crashed after
identifying problems in your program, there's a good chance
that fixing those problems will prevent Valgrind aborting or
crashing, especially if it happened in m_mallocfree.c.

If that doesn't help, please report this bug to: www.valgrind.org

In the bug report, send all the above text, the valgrind
version, and what OS and version you are using. Thanks.
_____________________________________

Another bug report discusses this issue, but with glibc: https://bugs.kde.org/show_bug.cgi?id=307141. I am not sure, but from reading that report, the issue was fixed by using an up-to-date revision of gcc. Do I have the same problem here?

Secondly, my main program works fine without valgrind. When I run it under valgrind, I get a segfault after a few seconds, which may be normal. I get a vgcore file, but I am not able to use it in gdb or to understand the meaning of valgrind's output. I have already used core files generated by my binary without valgrind (after running the ulimit -c unlimited command), but I cannot use the core file produced by valgrind:

root@OpenWrt:~# valgrind --track-origins=yes --num-callers=32 --leak-check=full --workaround-gcc296-bugs=yes XXXXX
==795== Memcheck, a memory error detector
==795== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==795== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==795== Command: XXXXX
==795==
==795== Conditional jump or move depends on uninitialised value(s)
==795==    at 0x40044B0: ??? (in /lib/ld-uClibc-0.9.30.so)
==795==    by 0x4000ACC: _start (in /lib/ld-uClibc-0.9.30.so)
==795==  Uninitialised value was created by a stack allocation
==795==    at 0x4004410: ??? (in /lib/ld-uClibc-0.9.30.so)
==795==
==795== Warning: client switching stacks? SP change: 0x7eff01c8 --> 0x3c2
==795==          to suppress, use: --max-stackframe=2130640390 or greater
==795== Invalid write of size 4
==795==    at 0x4E7ACD0: ??? (in /lib/libuClibc-0.9.30.so)
==795==    by 0x4E7CE2C: re_search_2 (in /lib/libuClibc-0.9.30.so)
==795==  Address 0x7eff00dc is on thread 1's stack
==795==
==795== Invalid write of size 4
==795==    at 0x4E7ACD4: ??? (in /lib/libuClibc-0.9.30.so)
==795==    by 0x4E7CE2C: re_search_2 (in /lib/libuClibc-0.9.30.so)
==795==  Address 0x7eff011c is on thread 1's stack
==795==
==795== Invalid write of size 4
==795==    at 0x4E7ACD8: ??? (in /lib/libuClibc-0.9.30.so)
==795==    by 0x4E7CE2C: re_search_2 (in /lib/libuClibc-0.9.30.so)
==795==  Address 0x7eff015c is on thread 1's stack
==795==
==795== Invalid write of size 4
==795==    at 0x4E7ACDC: ??? (in /lib/libuClibc-0.9.30.so)
==795==    by 0x4E7CE2C: re_search_2 (in /lib/libuClibc-0.9.30.so)
==795==  Address 0x7eff019c is on thread 1's stack
==795==
==795== Invalid read of size 4
==795==    at 0x4E7ACF0: ??? (in /lib/libuClibc-0.9.30.so)
==795==    by 0x4E7CE2C: re_search_2 (in /lib/libuClibc-0.9.30.so)
==795==  Address 0x7eff001c is on thread 1's stack
==795==
==795== Invalid write of size 4
==795==    at 0x4E7AD04: ??? (in /lib/libuClibc-0.9.30.so)
==795==    by 0x4E7CE2C: re_search_2 (in /lib/libuClibc-0.9.30.so)
==795==  Address 0x7eff001c is on thread 1's stack
==795==
==795== Can't extend stack to 0xfffffdc8 during signal delivery for thread 1:
==795==   no stack segment
==795==
==795== Process terminating with default action of signal 11 (SIGSEGV)
==795==  Access not within mapped region at address 0xFFFFFDC8
==795==    at 0x4831160: memcpy (in /usr/lib/valgrind/vgpreload_memcheck-mips32-linux.so)
==795==  If you believe this happened as a result of a stack
==795==  overflow in your program's main thread (unlikely but
==795==  possible), you can try to increase the size of the
==795==  main thread stack using the --main-stacksize= flag.
==795==  The main thread stack size used in this run was 2088960.
==795==
==795== Process terminating with default action of signal 11 (SIGSEGV)
==795==  General Protection Fault
==795==    at 0x4816740: _vgnU_freeres (in /usr/lib/valgrind/vgpreload_core-mips32-linux.so)
==795==
==795== HEAP SUMMARY:
==795==     in use at exit: 163,456 bytes in 1,398 blocks
==795==   total heap usage: 5,688 allocs, 4,290 frees, 597,875 bytes allocated
==795==
[...]
==795==
==795== LEAK SUMMARY:
==795==    definitely lost: 0 bytes in 0 blocks
==795==    indirectly lost: 0 bytes in 0 blocks
==795==      possibly lost: 39,375 bytes in 530 blocks
==795==    still reachable: 124,081 bytes in 868 blocks
==795==         suppressed: 0 bytes in 0 blocks
==795== Reachable blocks (those to which a pointer was found) are not shown.
==795== To see them, rerun with: --leak-check=full --show-reachable=yes
==795==
==795== For counts of detected and suppressed errors, rerun with: -v
==795== ERROR SUMMARY: 85 errors from 15 contexts (suppressed: 0 from 0)
Segmentation fault

Thanks,
- Arthur LAMBERT
|
|
From: Tom H. <to...@co...> - 2013-07-18 21:22:33
|
On 18/07/13 21:59, Davis Ford wrote:

> I have a binary for valgrind on the Pi, and the ldd output is:
>
> /usr/lib/arm-linux-gnueabihf/libcofi_rpi.so (0x401dc000)
> libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0x400d3000)
> libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x401e5000)
> /lib/ld-linux-armhf.so.3 (0x40011000)

That's just the launcher which selects which tool binary to use.

The real programs that do the work are the tools in the library directory and are already statically linked.

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/
|
|
From: Davis F. <dav...@gm...> - 2013-07-18 21:00:07
|
Hi,

I'm interested in the same request as this user: http://thread.gmane.org/gmane.comp.debugging.valgrind/9322/focus=9323

I have a piece of hardware running Linux that does support shared objects, but the OS is heavily modified, and it lacks several shared objects in standard places. It also lacks a compiler along with a lot of other useful tools. Building on the target is not an option, so I'd have to cross-compile.

However, it so happens that this target is binary compatible with the Raspberry Pi, and I am able to build executables on the Pi with -static and run them on this other target. If I compile without -static, it almost always fails because the .so files are missing or in non-standard places. At boot the target unpacks most of the filesystem to a read-only ramdisk. Instead of trying to hack around all this, the easiest thing to do is just build with -static.

I have a binary for valgrind on the Pi, and the ldd output is:

/usr/lib/arm-linux-gnueabihf/libcofi_rpi.so (0x401dc000)
libgcc_s.so.1 => /lib/arm-linux-gnueabihf/libgcc_s.so.1 (0x400d3000)
libc.so.6 => /lib/arm-linux-gnueabihf/libc.so.6 (0x401e5000)
/lib/ld-linux-armhf.so.3 (0x40011000)

Is there an easy way I can tweak the build files to statically link these libraries? Are there any other tools the valgrind executable depends on that I also need to build to make it work?

Regards,
Davis
|
|
From: Masha N. (mnaret) <mn...@ci...> - 2013-07-17 09:10:11
|
Hello,
Attaching the reproducer - please compile it with gcc -g tmpthread.c -o tmpthread -lpthread
And then run: tmpthread argsFile
Also attached argsFile and valgrind log.
There are two "Invalid write" errors in the log, but only the second one seems to be the same as in the original program.
Some more explanation:
In the original program I noticed that valgrind reported the problem when the stack, while growing, grew from an area with permissions 'rwx' into an area with permissions 'rw'.
The difference in permissions arises because each time a thread stack is created, its lower and upper parts are protected with mprotect(NONE), and when the stack is no longer in use the protection is set back to READ | WRITE
(exactly as in the reproducer).
However, when running the program with valgrind, for some reason all the memory is initially set to 'rwx'.
This is weird, since normally the memory for the stack shouldn't have the 'x' permission.
When running the program without valgrind, there's no memory with 'rwx', only 'rw'.
If I modify the original program to also restore the EXEC permission, valgrind doesn't report any error; however, this doesn't seem like the correct thing to do.
The same happens with the reproducer.
If you modify the code of the function restoring permissions to add the PROT_EXEC permission:
static void ReleaseThreadStack(int stacksize, void** stackbase)
{
    int ret = 0;
    void* ptStack = NULL;
    //remove protection from the guard area below the stack
    ptStack = (char*)*stackbase - STACK_GUARD_SIZE;
    ret = mprotect(ptStack, STACK_GUARD_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC);
    //remove protection from the guard area above the stack
    ptStack = (char*)*stackbase + stacksize;
    ret = mprotect(ptStack, STACK_GUARD_SIZE, PROT_READ | PROT_WRITE | PROT_EXEC);
    return;
}
the second "invalid write" problem will disappear!
If you need any further logs, please let me know and I'll send you valgrind debugger outputs.
Thank you very much for your support,
Masha.
-----Original Message-----
From: Philippe Waroquiers [mailto:phi...@sk...]
Sent: Wednesday, June 12, 2013 10:44 PM
To: Masha Naret (mnaret)
Cc: val...@li...
Subject: Re: [Valgrind-users] Valgrind shows "Invalid write os size 4" for memory allocated for the stack
On Mon, 2013-06-10 at 01:23 -0700, mnaret wrote:
> Hello,
> Recently I'm getting lots of "invalid read/invalid write" valgrind
> errors which point at memory allocated for the stack. However the
> code doesn't crash and finishes running successfully.
> I'm trying to understand where the error comes from - and will be
> grateful for any help with this issue.
Do you have a small (compilable) reproducer ?
Philippe
|
|
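The guard-area scheme described in the message above (protect the areas just below and just above a thread stack with PROT_NONE, then restore PROT_READ | PROT_WRITE on release) might look roughly like the following. This is only a sketch under stated assumptions: it is not the attached tmpthread.c reproducer, the names alloc_guarded_stack and release_guarded_stack are hypothetical, the guard size is assumed to be one page, and it makes no claim about what valgrind reports for it.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `usable` bytes (rounded up to whole pages) with
 * one inaccessible guard page below and one above the usable area.
 * Returns a pointer to the usable area, or NULL on failure; the size of
 * the whole mapping is stored in *total_out. */
static void *alloc_guarded_stack(size_t usable, size_t *total_out)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t body = (usable + page - 1) / page * page;
    size_t total = body + 2 * page;

    char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Lower and upper guard pages: any access faults. */
    if (mprotect(base, page, PROT_NONE) != 0 ||
        mprotect(base + page + body, page, PROT_NONE) != 0) {
        munmap(base, total);
        return NULL;
    }
    *total_out = total;
    return base + page;
}

/* Hypothetical helper: restore read/write access (deliberately without
 * PROT_EXEC) on both guard pages, then unmap the whole region. */
static int release_guarded_stack(void *stack, size_t total)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *base = (char *)stack - page;
    size_t body = total - 2 * page;

    if (mprotect(base, page, PROT_READ | PROT_WRITE) != 0)
        return -1;
    if (mprotect(base + page + body, page, PROT_READ | PROT_WRITE) != 0)
        return -1;
    return munmap(base, total);
}
```

Unlike the function quoted in the message, this sketch checks every mprotect() return value, which makes it easier to tell a failed permission change apart from a tool report.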
From: vijay n. <vi...@gm...> - 2013-07-16 12:05:43
|
Yes, malloc_usable_size is stored just before and after the malloc chunks. "buffer-overflow" can overwrite malloc private data and valgrind should point this out. |
|
From: Xiaopi <liu...@gm...> - 2013-07-16 12:03:33
|
Thank you for the reply.

It looks like there is an API for this; I found it by googling after your suggestion. But I still don't quite understand. Do you mean that if we write more bytes than the malloc usable size, the ptr is corrupted?

Thanks,

2013-7-16,18:32,vijay nag <vi...@gm...>

> On Tue, Jul 16, 2013 at 3:53 PM, Xiaopi Liu <liu...@gm...> wrote:
>> Dear all,
>>
>>
>> I have see this error/crash in an existing large code base.
>> Just cannot identify the exact crash point, even using valgrind.
>> Can someone help point out what exactly this crash/error means?
>>
>> Does it necessarily related to malloc/free operation?
>> By this I mean, if only new/delete op is used, will I still see the same
>> error?
>> Any general background on the possible cause for this?
>>
>>
>> *** glibc detected *** .: free(): invalid next size (fast)
>>
>> --
>>
>>
>> Best wishes!
>> Sincerely yours, LIU
>>
>> _______________________________________________
>> Valgrind-users mailing list
>> Val...@li...
>> https://lists.sourceforge.net/lists/listinfo/valgrind-users
> It could either be corruption or double free error. Mostly the
> malloc_usable_size is corrupted here.
> You can get the malloc_usable_size the following way.
>
> malloc_usable_size(ptr)
> {
>     char * p = (char*)ptr;
>     size = *(p -4);
>     if (size & 2) {
>         blockSize = (size & ~3) - 2*4;
>     } else {
>         blockSize = (size & ~3) -4;
>     }
> }
|
|
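As a rough illustration of the question above: on glibc, the real malloc_usable_size() declared in <malloc.h> reports how many bytes of a block the program may use, so there is no need to decode the chunk header by hand as in the quoted pseudocode. Writing beyond the block can overwrite the allocator's bookkeeping for a neighbouring chunk, which is one common way to end up with "free(): invalid next size". The helper names usable_size_for and copy_bounded below are hypothetical, and the sketch assumes a glibc system.

```c
#include <malloc.h>   /* malloc_usable_size(): glibc extension */
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate `requested` bytes and report how many bytes
 * the allocator actually makes usable for that block (0 on failure). */
static size_t usable_size_for(size_t requested)
{
    void *p = malloc(requested);
    if (p == NULL)
        return 0;
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}

/* Hypothetical helper: copy a string into a buffer of dst_size bytes without
 * ever writing past its end; returns the number of characters copied. */
static size_t copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return 0;
    size_t n = strlen(src);
    if (n >= dst_size)
        n = dst_size - 1;   /* truncate instead of overflowing */
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}
```

Code should still stay within the size it asked for, not the usable size; running the failing program under memcheck normally reports the overflowing write itself as an "Invalid write" at the point where it happens, well before free() notices the damage.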
From: Xiaopi L. <liu...@gm...> - 2013-07-16 10:23:21
|
Dear all,

I have seen this error/crash in an existing large code base. I just cannot identify the exact crash point, even using valgrind. Can someone help point out what exactly this crash/error means?

Is it necessarily related to malloc/free operations? By this I mean, if only new/delete is used, will I still see the same error? Is there any general background on the possible causes for this?

*** glibc detected *** .: free(): invalid next size (fast)

--
Best wishes!
Sincerely yours, LIU
|
|
From: RCY <re...@ya...> - 2013-07-13 19:48:53
|
On Sat, Jul 13, 2013 at 1:40 PM, Christoph Schwarz <chr...@gm...> wrote:
> On 13/07/2013 10:13 AM, RCY wrote:
>> When I try to compile the svn version on , I get the error:
>>
> [...]
>> m_debuginfo/readstabs.c:57:39: fatal error: a.out.h: No such file or directory
>> compilation terminated.
>> make[3]: *** [libcoregrind_x86_linux_a-readstabs.o] Error 1
>> make[3]: Leaving directory `/home/rc/Downloads/valgrind/coregrind'
>>
>>
>> Any suggestions to fix the error are appreciated.
>
> I had the same problem with Debian Wheezy. My workaround was:
> cd /usr/include
> ln -s linux/a.out.h .
>
> (provided that /usr/include/linux/a.out.h is present)
>
> Apparently new Debian release changed the location of a.out.h, which
> should be determined by configure.
>
> As Ubuntu builds on Debian there is a chance that it is the same problem.
>
> cheers,
> Chris
>

Thanks, I included the full path to a.out.h in readstabs.c and the compilation completed successfully.
|
|
From: Christoph S. <chr...@gm...> - 2013-07-13 17:40:52
|
On 13/07/2013 10:13 AM, RCY wrote:
> When I try to compile the svn version on , I get the error:
>
[...]
> m_debuginfo/readstabs.c:57:39: fatal error: a.out.h: No such file or directory
> compilation terminated.
> make[3]: *** [libcoregrind_x86_linux_a-readstabs.o] Error 1
> make[3]: Leaving directory `/home/rc/Downloads/valgrind/coregrind'
>
>
> Any suggestions to fix the error are appreciated.

I had the same problem with Debian Wheezy. My workaround was:
cd /usr/include
ln -s linux/a.out.h .

(provided that /usr/include/linux/a.out.h is present)

Apparently new Debian release changed the location of a.out.h, which should be determined by configure.

As Ubuntu builds on Debian there is a chance that it is the same problem.

cheers,
Chris
|
|
From: RCY <re...@ya...> - 2013-07-13 17:13:37
|
When I try to compile the svn version on , I get the error:

gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../include -I../VEX/pub -DVGA_x86=1 -DVGO_linux=1 -DVGP_x86_linux=1 -DVGPV_x86_linux_vanilla=1 -I../coregrind -DVG_LIBDIR="\"/usr/local/lib/valgrind"\" -DVG_PLATFORM="\"x86-linux\"" -m32 -O2 -g -Wall -Wmissing-prototypes -Wshadow -Wpointer-arith -Wstrict-prototypes -Wmissing-declarations -Wno-format-zero-length -fno-strict-aliasing -fno-builtin -fomit-frame-pointer -DENABLE_LINUX_TICKET_LOCK -Wno-long-long -Wwrite-strings -fno-stack-protector -MT libcoregrind_x86_linux_a-readstabs.o -MD -MP -MF .deps/libcoregrind_x86_linux_a-readstabs.Tpo -c -o libcoregrind_x86_linux_a-readstabs.o `test -f 'm_debuginfo/readstabs.c' || echo './'`m_debuginfo/readstabs.c
m_debuginfo/readstabs.c:57:39: fatal error: a.out.h: No such file or directory
compilation terminated.
make[3]: *** [libcoregrind_x86_linux_a-readstabs.o] Error 1
make[3]: Leaving directory `/home/rc/Downloads/valgrind/coregrind'

Any suggestions to fix the error are appreciated.

uname -a:
Linux Asus1 3.8.0-19-generic #30-Ubuntu SMP Wed May 1 16:35:23 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
|
|
From: John R. <jr...@bi...> - 2013-07-10 17:01:16
|
> # http://udrepper.livejournal.com/11429.html
> export MALLOC_PERTURB_=$(($RANDOM % 255 + 1))
> echo 1>&2 MALLOC_PERTURB_=$MALLOC_PERTURB_ " # $HOME/.bash_profile"
> This will cause all bytes in newly malloc()ed areas to be set to the
> same random byte. [Or, specify a constant such as
> export MALLOC_PERTURB_=0xF5

The current implementation within glibc-2.17 uses
    (0xff & (0xff ^ atoi(getenv("MALLOC_PERTURB_"))))
so you suffer the vagaries of atoi() and a bit-wise complement.
|
|
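The expression quoted in the message above can be evaluated for a few settings to see what "the vagaries of atoi() and a bit-wise complement" mean in practice. This is only a sketch of that one expression as quoted, not glibc code, and perturb_fill_byte is a hypothetical name.

```c
#include <stdlib.h>

/* Hypothetical helper: evaluate the expression quoted in the message,
 * 0xff & (0xff ^ atoi(setting)), for a given MALLOC_PERTURB_ string. */
static unsigned perturb_fill_byte(const char *setting)
{
    /* atoi() parses decimal digits only and stops at the first other
     * character, so "0xF5" is read as 0, not as 245. */
    return 0xffu & (0xffu ^ (unsigned)atoi(setting));
}
```

Under this formula, a decimal setting of 10 yields the byte 0xf5, a setting of 245 yields 0x0a, and the string "0xF5" yields 0xff because atoi() stops at the 'x'.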
From: John R. <jr...@bi...> - 2013-07-09 23:41:40
|
> This example also shows that the memcheck error descriptions should
> be taken with a grain of salt i.e. the error may be occurring elsewhere, and
> not where the error description is suggesting. Is this the nature of the beast,
> or can memcheck be more accurate in locating the errors?

Memcheck has decided not to complain unless the uninit affects control flow, file output, or an index expression for array access. This is motivated by an attempt to reduce "false positive" complaints: those that "do not affect the output", and thus "the user does not care about them." If there are too many false positive complaints, then users quickly ignore *ALL* complaints (that is, the user discards memcheck as not useful.)

Also, alignment and padding in 'struct's often result in uninit "holes" that are copied indiscriminately, and often the end programmer cannot do anything about these because those holes were specified as part of some inner ABI. Also, gcc tends to "overfetch" scalar char and short operands (fetches 32 bits regardless of actual length being 16 or 8), and the overage often is uninit.

*UNFORTUNATELY*, memcheck's policy of complaining "as late as possible" means that serious users often experience a really tough debugging chore because the source of the error happens very much earlier than the complaint. It would be much nicer if memcheck had the *OPTION* to complain "as soon as possible": namely, as soon as any uninit bits are fetched from memory. The serious user will either fix the holes in alignment and padding, or write suppressions, or otherwise ignore what is not interesting at the moment, in order to eradicate the *source* of the uninit-of-the-moment.

But the implementers of memcheck are not yet convinced.
|
|
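The padding "holes" mentioned above can be made concrete with a small struct. This is a sketch only: struct record and both helper names are hypothetical, and the exact number of padding bytes depends on the ABI (on common x86 ABIs it is 3 here).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: on common ABIs the compiler inserts padding after
 * `tag` so that `value` is suitably aligned. */
struct record {
    char tag;
    int  value;
};

/* Number of bytes in the struct that belong to no member (the "holes"). */
static size_t record_hole_bytes(void)
{
    return sizeof(struct record) - sizeof(char) - sizeof(int);
}

/* Zero the whole object first, padding included, then set the members.
 * Strictly, the C standard leaves padding unspecified after a member store,
 * but in practice compilers do not usually touch it, so whole-struct copies
 * made afterwards typically carry defined bytes only. */
static void record_init(struct record *r, char tag, int value)
{
    memset(r, 0, sizeof *r);
    r->tag = tag;
    r->value = value;
}
```

Assigning only the members and then copying the struct with memcpy() or by value would copy the hole bytes in whatever state they happen to be, which is the indiscriminate copying the message refers to.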
From: Norman G. <no...@te...> - 2013-07-09 22:42:56
|
On 07/09/2013 01:16 PM, John Reiser wrote:
>> What I did do is binary search on:
>> ----- tsvd3.cpp
>> // Eliminate spurious valgrind uninitialized errors
>> #if 1
>> for( int iii=38; iii<lwork; ++iii ) work[iii]=123.456;
>> #endif
>> -----
>> I see no complaints when starting the loop at iii=1,2,4,8,16,32;
>> then errors at 64,48,40; no complaint at 36; errors at 38;
>> no complaint at 37. Hmmm...
> First, whenever you are faced with murky malloc, then you should enlist help.
> The glibc library provides some debugging aids which are quite inexpensive.
> They are so cheap that I use them all the time, for all processes.
> Put this in $HOME/.bash_profile, or feed it directly to your shell, etc.:
> # http://udrepper.livejournal.com/11429.html
> export MALLOC_PERTURB_=$(($RANDOM % 255 + 1))
> echo 1>&2 MALLOC_PERTURB_=$MALLOC_PERTURB_ " # $HOME/.bash_profile"
> This will cause all bytes in newly malloc()ed areas to be set to the
> same random byte. [Or, specify a constant such as
> export MALLOC_PERTURB_=0xF5
> When running under valgrind, then the low-level interception of malloc()
> and the careful watching by memcheck will supersede MALLOC_PERTURB_.]
>
> Continuing after the binary search, I tried:
> ----- tsvd3.cpp
> // Eliminate spurious valgrind uninitialized errors
> #if 1
> for( int iii=38; iii<lwork; ++iii ) work[iii]=123.456;
> for( int iii= 1; iii<= 36; ++iii ) work[iii]=123.456;
> #endif
> -----
> which leaves only work[37] uninit. Running this under valgrind
> generates complaints from memcheck; the first is:
> -----
> lwork_q= 108
> lwork= 108
> ==23901== Conditional jump or move depends on uninitialised value(s)
> ==23901== at 0x5498486: dnrm2_ (/usr/src/debug/lapack-3.4.2/BLAS/SRC/dnrm2.f:94)
> ==23901== by 0x4E27E27: dlarfg_ (in /usr/lib64/atlas/liblapack.so.3.0)
> ==23901== by 0x4DACD89: dgelq2_ (in /usr/lib64/atlas/liblapack.so.3.0)
> ==23901== by 0x4DAD457: dgelqf_ (in /usr/lib64/atlas/liblapack.so.3.0)
> ==23901== by 0x4DBA96B: dgesdd_ (in /usr/lib64/atlas/liblapack.so.3.0)
> ==23901== by 0x4018F7: main (/bigdata/home/jreiser/valgrind-fortran/tsvd3.cpp:62)
> ==23901== Uninitialised value was created by a heap allocation
> ==23901== at 0x4A07C84: operator new[](unsigned long) (/builddir/build/BUILD/valgrind-3.8.1/coregrind/m_replacemalloc/vg_replace_malloc.c:363)
> ==23901== by 0x40180F: main (/bigdata/home/jreiser/valgrind-fortran/tsvd3.cpp:52)
> -----
> Now we know that exactly one 8-byte 'double' uninit at work[37] will trigger the complaints.
> This aligned 8-byte region is small enough that we can take advantage of debugging hardware
> in x86 chips.
>
> So now I run directly under gdb (without valgrind), put a breakpoint just after
> the code which leaves work[37] uninit, and plant a hardware 'read' watchpoint on &work[37]:
> (gdb) b tsvd3.cpp:58
> (gdb) run
> Breakpoint 2, main () at tsvd3.cpp:62
> (gdb) p &work[37]
> $1 = (double *) 0x606178
> (gdb) rwatch *(double *)0x606178
> Hardware read watchpoint 3: *(double *)0x606178
> (gdb) continue
>
>
> Lo and behold, work[37] is fetched and used. That is, there is a real error:
> Hardware read watchpoint 3: *(double *)0x606178
>
> Value = -1.6882786079646144e+260
> 0x000000000040154a in scal_generic<int, double, double> (n=0x3,
> alpha=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:17
> 17 y[iY] *= alpha;
> (gdb) x/12i $pc-0x18
> 0x401532 <scal_generic<int, double, double>(int, double const&, double*, int)+54>: mov -0x8(%rbp),%eax
> 0x401535 <scal_generic<int, double, double>(int, double const&, double*, int)+57>: cltq
> 0x401537 <scal_generic<int, double, double>(int, double const&, double*, int)+59>: lea 0x0(,%rax,8),%rcx
> 0x40153f <scal_generic<int, double, double>(int, double const&, double*, int)+67>: mov -0x28(%rbp),%rax
> 0x401543 <scal_generic<int, double, double>(int, double const&, double*, int)+71>: add %rcx,%rax
> 0x401546 <scal_generic<int, double, double>(int, double const&, double*, int)+74>: movsd (%rax),%xmm1 ### the fetch of uninit
> => 0x40154a <scal_generic<int, double, double>(int, double const&, double*, int)+78>: mov -0x20(%rbp),%rax
> 0x40154e <scal_generic<int, double, double>(int, double const&, double*, int)+82>: movsd (%rax),%xmm0
> 0x401552 <scal_generic<int, double, double>(int, double const&, double*, int)+86>: mulsd %xmm1,%xmm0 ### the use of uninit
> 0x401556 <scal_generic<int, double, double>(int, double const&, double*, int)+90>: movsd %xmm0,(%rdx)
> 0x40155a <scal_generic<int, double, double>(int, double const&, double*, int)+94>: addl $0x1,-0x4(%rbp)
> 0x40155e <scal_generic<int, double, double>(int, double const&, double*, int)+98>: mov -0x18(%rbp),%eax
>
> (gdb) p $rax
> $2 = 0x606178 ### yes, it is &work[37]
> (gdb) x/2xw $rax ### and those bytes are uninit
> 0x606178: 0xf5f5f5f5 0xf5f5f5f5 ### The pattern for uninit set by MALLOC_PERTURB_
> (gdb) bt
> #0 0x000000000040154a in scal_generic<int, double, double> (n=0x3,
> alpha=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:17
> #1 0x00000000004011f8 in gemv_generic<int, double, double, double, double, double> (order=RowMajor, transA=Trans, conjX=NoTrans, m=0x8, n=0x3,
> alpha=@0x7ffff7ca6bc8: 1, A=0x7fffffffde48, ldA=0x4, x=0x7fffffffde40,
> incX=0x4, beta=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:108
> #2 0x0000000000400e05 in gemv_generic<int, double, double, double, double, double> (order=ColMajor, transA=Trans, conjX=NoTrans, m=0x3, n=0x8,
> alpha=@0x7ffff7ca6bc8: 1, A=0x7fffffffde48, ldA=0x4, x=0x7fffffffde40,
> incX=0x4, beta=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:58
> #3 0x0000000000400d4e in gemv<int, double, double, double, double, double> (
> order=ColMajor, trans=NoTrans, m=0x3, n=0x8, alpha=@0x7ffff7ca6bc8: 1,
> A=0x7fffffffde48, ldA=0x4, x=0x7fffffffde40, incX=0x4,
> beta=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:156
> #4 0x0000000000400cc8 in dgemv_ (TRANS=0x7ffff7ca6be8 "No transpose",
> M=0x7fffffffd938, N=0x7fffffffd93c, ALPHA=0x7ffff7ca6bc8,
> _A=0x7fffffffde48, LDA=0x7fffffffdf58, X=0x7fffffffde40,
> INCX=0x7fffffffdf58, BETA=0x7ffff7ca6be0, Y=0x606170, INCY=0x7ffff7ca6bdc)
> at gemv2.cpp:204
> #5 0x00007ffff797f3fb in dlarf_ () from /usr/lib64/atlas/liblapack.so.3
> #6 0x00007ffff7906e1f in dgelq2_ () from /usr/lib64/atlas/liblapack.so.3
> #7 0x00007ffff7907458 in dgelqf_ () from /usr/lib64/atlas/liblapack.so.3
> #8 0x00007ffff791496c in dgesdd_ () from /usr/lib64/atlas/liblapack.so.3
> #9 0x00000000004018f8 in main () at tsvd3.cpp:62
>
> So there is the [a] real error. Apologize to memcheck, and fix your bug.
>

Thank you for the detailed analysis and explanation of techniques!

This example also shows that the memcheck error descriptions should be taken with a grain of salt, i.e. the error may be occurring elsewhere, and not where the error description is suggesting. Is this the nature of the beast, or can memcheck be more accurate in locating the errors?

PS work[0] was also left uninitialized :-)
|
|
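The loops quoted in this thread start at index 1, which is how work[0] ended up uninitialised as noted in the PS above. A pre-fill that covers the whole workspace might look like the sketch below; fill_workspace is a hypothetical name, and the 108-element size and 123.456 fill value in the usage note are simply the figures that appear in the thread.

```c
#include <stddef.h>

/* Hypothetical helper: give every element of a workspace array a defined
 * value, starting at index 0 so that no element is skipped. */
static void fill_workspace(double *work, size_t lwork, double value)
{
    for (size_t i = 0; i < lwork; ++i)
        work[i] = value;
}
```

For the case in the thread this would be called as fill_workspace(work, 108, 123.456). A pre-fill like this only silences the reports; where reading the element is itself the bug, as in the analysis quoted above, it is the reading code that needs fixing.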
From: John R. <jr...@bi...> - 2013-07-09 20:15:50
|
> What I did do is binary search on:
> ----- tsvd3.cpp
> // Elliminate spurious valgrind uninitialized errors
> #if 1
> for( int iii=38; iii<lwork; ++iii ) work[iii]=123.456;
> #endif
> -----
> I see no complaints when starting the loop at iii=1,2,4,8,16,32;
> then errors at 64,48,40; no complaint at 36; errors at 38;
> no complaint at 37. Hmmm...

First, whenever you are faced with murky malloc, then you should enlist help.
The glibc library provides some debugging aids which are quite inexpensive.
They are so cheap that I use them all the time, for all processes.
Put this in $HOME/.bash_profile, or feed it directly to your shell, etc.:

# http://udrepper.livejournal.com/11429.html
export MALLOC_PERTURB_=$(($RANDOM % 255 + 1))
echo 1>&2 MALLOC_PERTURB_=$MALLOC_PERTURB_ " # $HOME/.bash_profile"

This will cause all bytes in newly malloc()ed areas to be set to the same random byte.
[Or, specify a constant such as
export MALLOC_PERTURB_=0xF5
When running under valgrind, then the low-level interception of malloc() and the careful watching by memcheck will supersede MALLOC_PERTURB_.]

Continuing after the binary search, I tried:
----- tsvd3.cpp
// Elliminate spurious valgrind uninitialized errors
#if 1
for( int iii=38; iii<lwork; ++iii ) work[iii]=123.456;
for( int iii= 1; iii<= 36; ++iii ) work[iii]=123.456;
#endif
-----
which leaves only work[37] uninit.

Running this under valgrind generates complaints from memcheck; the first is:
-----
lwork_q= 108 lwork= 108
==23901== Conditional jump or move depends on uninitialised value(s)
==23901==    at 0x5498486: dnrm2_ (/usr/src/debug/lapack-3.4.2/BLAS/SRC/dnrm2.f:94)
==23901==    by 0x4E27E27: dlarfg_ (in /usr/lib64/atlas/liblapack.so.3.0)
==23901==    by 0x4DACD89: dgelq2_ (in /usr/lib64/atlas/liblapack.so.3.0)
==23901==    by 0x4DAD457: dgelqf_ (in /usr/lib64/atlas/liblapack.so.3.0)
==23901==    by 0x4DBA96B: dgesdd_ (in /usr/lib64/atlas/liblapack.so.3.0)
==23901==    by 0x4018F7: main (/bigdata/home/jreiser/valgrind-fortran/tsvd3.cpp:62)
==23901== Uninitialised value was created by a heap allocation
==23901==    at 0x4A07C84: operator new[](unsigned long) (/builddir/build/BUILD/valgrind-3.8.1/coregrind/m_replacemalloc/vg_replace_malloc.c:363)
==23901==    by 0x40180F: main (/bigdata/home/jreiser/valgrind-fortran/tsvd3.cpp:52)
-----

Now we know that exactly one 8-byte 'double' uninit at work[37] will trigger the complaints. This aligned 8-byte region is small enough that we can take advantage of debugging hardware in x86 chips. So now I run directly under gdb (without valgrind), put a breakpoint just after the code which leaves work[37] uninit, and plant a hardware 'read' watchpoint on &work[37]:

(gdb) b tsvd3.cpp:58
(gdb) run
Breakpoint 2, main () at tsvd3.cpp:62
(gdb) p &work[37]
$1 = (double *) 0x606178
(gdb) rwatch *(double *)0x606178
Hardware read watchpoint 3: *(double *)0x606178
(gdb) continue

Lo and behold, work[37] is fetched and used. That is, there is a real error:

Hardware read watchpoint 3: *(double *)0x606178

Value = -1.6882786079646144e+260
0x000000000040154a in scal_generic<int, double, double> (n=0x3, alpha=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:17
17 y[iY] *= alpha;
(gdb) x/12i $pc-0x18
   0x401532 <scal_generic<int, double, double>(int, double const&, double*, int)+54>: mov -0x8(%rbp),%eax
   0x401535 <scal_generic<int, double, double>(int, double const&, double*, int)+57>: cltq
   0x401537 <scal_generic<int, double, double>(int, double const&, double*, int)+59>: lea 0x0(,%rax,8),%rcx
   0x40153f <scal_generic<int, double, double>(int, double const&, double*, int)+67>: mov -0x28(%rbp),%rax
   0x401543 <scal_generic<int, double, double>(int, double const&, double*, int)+71>: add %rcx,%rax
   0x401546 <scal_generic<int, double, double>(int, double const&, double*, int)+74>: movsd (%rax),%xmm1   ### the fetch of uninit
=> 0x40154a <scal_generic<int, double, double>(int, double const&, double*, int)+78>: mov -0x20(%rbp),%rax
   0x40154e <scal_generic<int, double, double>(int, double const&, double*, int)+82>: movsd (%rax),%xmm0
   0x401552 <scal_generic<int, double, double>(int, double const&, double*, int)+86>: mulsd %xmm1,%xmm0   ### the use of uninit
   0x401556 <scal_generic<int, double, double>(int, double const&, double*, int)+90>: movsd %xmm0,(%rdx)
   0x40155a <scal_generic<int, double, double>(int, double const&, double*, int)+94>: addl $0x1,-0x4(%rbp)
   0x40155e <scal_generic<int, double, double>(int, double const&, double*, int)+98>: mov -0x18(%rbp),%eax
(gdb) p $rax
$2 = 0x606178   ### yes, it is &work[37]
(gdb) x/2xw $rax   ### and those bytes are uninit
0x606178: 0xf5f5f5f5 0xf5f5f5f5   ### The pattern for uninit set by MALLOC_PERTURB_
(gdb) bt
#0 0x000000000040154a in scal_generic<int, double, double> (n=0x3, alpha=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:17
#1 0x00000000004011f8 in gemv_generic<int, double, double, double, double, double> (order=RowMajor, transA=Trans, conjX=NoTrans, m=0x8, n=0x3, alpha=@0x7ffff7ca6bc8: 1, A=0x7fffffffde48, ldA=0x4, x=0x7fffffffde40, incX=0x4, beta=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:108
#2 0x0000000000400e05 in gemv_generic<int, double, double, double, double, double> (order=ColMajor, transA=Trans, conjX=NoTrans, m=0x3, n=0x8, alpha=@0x7ffff7ca6bc8: 1, A=0x7fffffffde48, ldA=0x4, x=0x7fffffffde40, incX=0x4, beta=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:58
#3 0x0000000000400d4e in gemv<int, double, double, double, double, double> ( order=ColMajor, trans=NoTrans, m=0x3, n=0x8, alpha=@0x7ffff7ca6bc8: 1, A=0x7fffffffde48, ldA=0x4, x=0x7fffffffde40, incX=0x4, beta=@0x7ffff7ca6be0: 0, y=0x606170, incY=0x1) at gemv2.cpp:156
#4 0x0000000000400cc8 in dgemv_ (TRANS=0x7ffff7ca6be8 "No transpose", M=0x7fffffffd938, N=0x7fffffffd93c, ALPHA=0x7ffff7ca6bc8, _A=0x7fffffffde48, LDA=0x7fffffffdf58, X=0x7fffffffde40, INCX=0x7fffffffdf58, BETA=0x7ffff7ca6be0, Y=0x606170, INCY=0x7ffff7ca6bdc) at gemv2.cpp:204
#5 0x00007ffff797f3fb in dlarf_ () from /usr/lib64/atlas/liblapack.so.3
#6 0x00007ffff7906e1f in dgelq2_ () from /usr/lib64/atlas/liblapack.so.3
#7 0x00007ffff7907458 in dgelqf_ () from /usr/lib64/atlas/liblapack.so.3
#8 0x00007ffff791496c in dgesdd_ () from /usr/lib64/atlas/liblapack.so.3
#9 0x00000000004018f8 in main () at tsvd3.cpp:62

So there is the [a] real error. Apologize to memcheck, and fix your bug. |
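
The fetch caught by the watchpoint is "y[iY] *= alpha" with alpha equal to 0: the scaling loop loads y even though the result cannot depend on it. The reference BLAS documentation for DGEMV states that when BETA is supplied as zero, Y need not be set on input, which would explain why dlarf_ passes an unset part of the work array as Y. Below is a minimal sketch of one possible fix. The name scal_sketch and its signature are hypothetical stand-ins modelled on the backtrace, not the FLENS source, and a positive stride is assumed.

```cpp
#include <cassert>

// Hypothetical stand-in for the scal_generic routine seen in the backtrace.
// Scales n elements of y (stride incY, assumed positive) by alpha.
template <typename T>
void scal_sketch(int n, const T& alpha, T* y, int incY)
{
    assert(n >= 0 && incY > 0);

    if (alpha == T(0)) {
        // Store zeros without loading y, so that an uninitialised
        // input vector is never read.
        for (int i = 0, iY = 0; i < n; ++i, iY += incY) {
            y[iY] = T(0);
        }
        return;
    }

    // General case: this loads y[iY], so y must be initialised here.
    for (int i = 0, iY = 0; i < n; ++i, iY += incY) {
        y[iY] *= alpha;
    }
}
```

Note the design choice: storing zero is not the same as multiplying by zero when y holds NaN or Inf, but it matches the documented BLAS behaviour for a zero BETA, and it removes the read that memcheck (correctly) objects to.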
|
From: John R. <jr...@bi...> - 2013-07-09 05:24:33
|
> I found that the mxn submatrix was always initialized at all its
> entries. I did this several times, until "gdb continue" resulted
> in the program completing (because of looping, the breakpoint
> re-occurred several times).
>
> This is why I am puzzled as to why valgrind is reporting uninitialised value(s).

Thank you for the test case. After installing lapack and blas (Fedora 17), I see the behavior you describe. I did not get the corresponding debuginfo versions, so I have not seen inside iladlr.

What I did do is binary search on:
----- tsvd3.cpp
// Elliminate spurious valgrind uninitialized errors
#if 1
for( int iii=38; iii<lwork; ++iii ) work[iii]=123.456;
#endif
-----
I see no complaints when starting the loop at iii=1,2,4,8,16,32; then errors at 64,48,40; no complaint at 36; errors at 38; no complaint at 37. Hmmm...

At this point it's at least somewhat reasonable to file a bug report and include the test case. The rejoinder will be that memcheck finds a real error, and the code "Eliminate ... uninit errors" is the proof. I'm not so sure.

I did notice that all the complaints on x86_64 are after a 'ucomisd' instruction where the low 64 bits of the register have a double floating point value, with the high 64 bits being zero. When the other operand to ucomisd also is a register, then that register often has all bits zero (128 zero bits). In most ways the operating width is 64 bits, yet the register has 128 bits.

-- |
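
The search over the loop's starting index is an ordinary bisection. Here is a minimal sketch, assuming the outcome is monotone in the start index (no complaint at or below some index, complaints above it). The function name and the `complains` callback are hypothetical; in practice each probe means editing the start index, rebuilding, and running under valgrind by hand.

```cpp
#include <cassert>
#include <functional>

// Returns the smallest start index in (lo, hi] for which complains(start)
// is true. Requires complains(lo) == false and complains(hi) == true.
// The array element that matters is then index (result - 1).
int first_complaining_start(int lo, int hi,
                            const std::function<bool(int)>& complains)
{
    assert(lo < hi);
    // Invariant: complains(lo) is false and complains(hi) is true.
    while (hi - lo > 1) {
        const int mid = lo + (hi - lo) / 2;
        if (complains(mid)) {
            hi = mid;   // complaints already appear at mid
        } else {
            lo = mid;   // still clean at mid
        }
    }
    return hi;
}
```

Starting from the clean probe at 32 and the failing probe at 64, this visits 48, 40, 36, 38 and 37, the same sequence as above.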
|
From: Norman G. <no...@te...> - 2013-07-08 19:01:30
|
On 07/07/2013 01:18 PM, John Reiser wrote:
>> gcc tsvd3.o -o tsvd3 -l:/libHigh.so -l:/libblasB.a -l:/libblas.so -l:/liblapack.so
>>
>> I am using -l: rather than -l because I was using debug versions
>> of libraries in different locations, but I was not able to get the
>> debug-info packages to work in the intended seamless way.
>>
>> Removing either the High or blasB library ( or both) from the link
>> line results in no errors being reported by valgrind.
>> This raises the following questions:
>>
>> -- The High library is not being referenced by the tsvd3 executable, so
>> why does the map file show that the blasB library is being used to
>> resolve references in the High library?
> Consult the output from "readelf --dynamic tsvd3".
>
> Naming a .so shared library on the command line generates a [DT_]NEEDED entry
> in the Dynamic table of the executable, so that shared library is *required*
> at execution time. The .so is not optional, even if the .so satisfies no undefined
> references at static link time. At static link time (during the running of /bin/ld),
> any undefined symbols of the .so get added to the set which ld will try to satisfy.
> Thus ld will search for any symbols which are undefined in libHigh.so,
> and if some .o in libblasB.a defines such a symbol, then ld will load that .o
> from libblasB.a into tsvd3.
>
>> -- In the above link line, the lapack library comes at the end. I thought
>> that this would cause the lapack library not to see the blas symbols
>> earlier in the link line i.e. the linker should be complaining about
>> unresolved symbols. Why is the linker not so complaining?
> The search procedure for symbols which are undefined in shared library
> liblapack.so is an amalgamation of the [DT_]NEEDED entries within liblapack.so,
> the module tsvd3 (including its [DT_]NEEDED entries), any .a archive libraries
> that are specified on the command line at static link time, any -Rpath specifications,
> and the value of environment variable LD_LIBRARY_PATH. There are different
> rules between static link time and dynamic run time. Read the documentation
> carefully (info ld, man ld, etc.)
>
>> -- Why is the linker not complaining about doubly defined symbols
>> in the blasB.a and blas.so libraries?
> "Doubly defined" pertains only to within one module (main program or shared library.)
> Separate modules can have definitions of the same symbol without complaint.
> Which symbol gets used depends on the search path, and the search path
> depends on scope rules at both static link time (/bin/ld) and run time
> (dynamic link time, ld-linux.so)
>
>> Of course, how does all this trick valgrind into flagging uninitialized
>> memory locations? The output of the svd3 program is correct with
>> both versions of the tsvd3 executable.
> Construct the smallest example which causes trouble. Post it here.

Thank you for the helpful explanations. Attached are two C++ source files that illustrate the problem I have described. The build of the tsvd3 executable is:

gcc -g -c gemv2.cpp
gcc -g -c tsvd3.cpp
gcc -g gemv2.o tsvd3.o -llapack -lblas -lstdc++ -o tsvd3 -Wl,-y,dgemv_

NOTES:
1) Rather than linking exactly as above, I used debug versions of lapack and blas so I could explore with gdb.
2) The output from the link tells about dgemv_ for this executable:

gemv2.o: definition of dgemv_
/usr/lib/gcc/i686-redhat-linux/4.7.2/../../../liblapack.so: reference to dgemv_
/usr/lib/gcc/i686-redhat-linux/4.7.2/../../../libblas.so: reference to dgemv_

The gemv2.cpp is a slash and contract of (some of) the very well written and organized source code of the FLENS numerical linear algebra package. My apologies to FLENS for having to distort their handiwork for this purpose :-).

Initially, I had thought that fortran was initializing the memory, since the routine is called dgemv_, but it turns out I was using a FLENS option that implements this routine in C++, as in the attached gemv2.cpp. So, the fortran lapack and blas libraries are calling the C++ routine dgemv_. Might this be confusing to valgrind??

The running of tsvd3 gives the values:

VS= 6.25801 4.91733 3.72567 2.98619

(I think these are not the correct singular values, but this is not relevant to the valgrind issue)

I ran valgrind:

valgrind --track-origins=yes --fullpath-after= ./tsvd3

The first reported error is

==18263== Conditional jump or move depends on uninitialised value(s)
==18263==    at 0x433B99F: iladlr_ (/usr/src/debug/lapack-3.4.2/SRC/iladlr.f:107)
==18263==    by 0x4287CEE: dlarf_ (/usr/src/debug/lapack-3.4.2/SRC/dlarf.f:187)
==18263==    by 0x4202FC9: dgelq2_ (/usr/src/debug/lapack-3.4.2/SRC/dgelq2.f:184)
==18263==    by 0x4203624: dgelqf_ (/usr/src/debug/lapack-3.4.2/SRC/dgelqf.f:262)
==18263==    by 0x4216632: dgesdd_ (/usr/src/debug/lapack-3.4.2/SRC/dgesdd.f:1055)
==18263==    by 0x8049559: main (/home/norm17b/dev/src/ssrd/test/lor/tsvd3.cpp:61)
==18263== Uninitialised value was created by a heap allocation
==18263==    at 0x4008449: operator new[](unsigned int) (/builddir/build/BUILD/valgrind-3.8.1/coregrind/m_replacemalloc/vg_replace_malloc.c:357)
==18263==    by 0x80494BA: main (/home/norm17b/dev/src/ssrd/test/lor/tsvd3.cpp:52)

The line in main() that allocates the memory is

double* work = new double[ lwork ];

where lwork= 300

This is a typical work array in the style of lapack.

Next, I stepped into iladlr through the above chain, and set a breakpoint at iladlr.f:107

ELSE IF( A(M, 1).NE.ZERO .OR. A(M, N).NE.ZERO ) THEN
ILADLR = M

Examining memory,

p m
p n
p lda
x/Kfg &a

where K = lda * n

I found that the mxn submatrix was always initialized at all its entries. I did this several times, until "gdb continue" resulted in the program completing (because of looping, the breakpoint re-occurred several times).

This is why I am puzzled as to why valgrind is reporting uninitialised value(s). |
|
From: Philippe W. <phi...@sk...> - 2013-07-08 15:34:43
|
On Thu, 2013-07-04 at 12:35 -0700, Norman Goldstein wrote:
> I am using valgrind-3.8.1 on Fedora 18, 32 bit x86, and
> am linking fortran libraries to my main C++ code.
> It seems that when malloc'd values end up being initialized
> in fortran code, that this is still tagged by valgrind as
> uninitialized data. I have worked around (avoided) this problem
> by initializing new memory to 0. I have seen related posts on the
> net, but am not sure if this is already flagged as a bug, or if I
> am not using valgrind correctly with fortran.

If you know at what place(s) the fortran code is supposed to initialise the memory, you could use gdb/vgdb, put a break at this place(s), and then verify that the "v bits" of the "source memory" used to initialise the target memory are effectively what is expected (i.e. that the fortran code is copying "initialised" values to this memory). This might point at some locations that fortran assumes are initialised but for which valgrind did not track such initialisation.

Using --track-origins=yes might also help to pinpoint a possible error.

Philippe |
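
The same check can also be made from inside the program with a memcheck client request, as an alternative to inspecting the v bits interactively. A minimal sketch follows, assuming the valgrind development header valgrind/memcheck.h may or may not be installed; the helper name first_undefined_byte and the HAVE_MEMCHECK_H guard are hypothetical. VALGRIND_CHECK_MEM_IS_DEFINED is the documented memcheck macro: it reports an error and returns the address of the first offending byte, and always returns 0 when the program is not running under valgrind.

```cpp
#include <cstddef>

// Use the memcheck client-request header only if it is available.
#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define HAVE_MEMCHECK_H 1
#  endif
#endif

// Hypothetical helper: returns 0 if every byte in [p, p+len) is defined
// (or if memcheck is not in use); otherwise the address of the first
// byte that memcheck considers undefined or unaddressable.
inline unsigned long first_undefined_byte(const void* p, std::size_t len)
{
#ifdef HAVE_MEMCHECK_H
    return (unsigned long)VALGRIND_CHECK_MEM_IS_DEFINED(p, len);
#else
    (void)p;
    (void)len;
    return 0;   // header not installed: nothing can be checked
#endif
}
```

For instance, calling first_undefined_byte(work, lwork * sizeof(double)) just before the call into lapack would make memcheck report, at that point, which part of the work array is still undefined.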
|
From: John R. <jr...@bi...> - 2013-07-07 20:17:20
|
> gcc tsvd3.o -o tsvd3 -l:/libHigh.so -l:/libblasB.a -l:/libblas.so -l:/liblapack.so
>
> I am using -l: rather than -l because I was using debug versions
> of libraries in different locations, but I was not able to get the
> debug-info packages to work in the intended seamless way.
>
> Removing either the High or blasB library ( or both) from the link
> line results in no errors being reported by valgrind.
> This raises the following questions:
>
> -- The High library is not being referenced by the tsvd3 executable, so
> why does the map file show that the blasB library is being used to
> resolve references in the High library?

Consult the output from "readelf --dynamic tsvd3".

Naming a .so shared library on the command line generates a [DT_]NEEDED entry in the Dynamic table of the executable, so that shared library is *required* at execution time. The .so is not optional, even if the .so satisfies no undefined references at static link time. At static link time (during the running of /bin/ld), any undefined symbols of the .so get added to the set which ld will try to satisfy. Thus ld will search for any symbols which are undefined in libHigh.so, and if some .o in libblasB.a defines such a symbol, then ld will load that .o from libblasB.a into tsvd3.

> -- In the above link line, the lapack library comes at the end. I thought
> that this would cause the lapack library not to see the blas symbols
> earlier in the link line i.e. the linker should be complaining about
> unresolved symbols. Why is the linker not so complaining?

The search procedure for symbols which are undefined in shared library liblapack.so is an amalgamation of the [DT_]NEEDED entries within liblapack.so, the module tsvd3 (including its [DT_]NEEDED entries), any .a archive libraries that are specified on the command line at static link time, any -Rpath specifications, and the value of environment variable LD_LIBRARY_PATH. There are different rules between static link time and dynamic run time. Read the documentation carefully (info ld, man ld, etc.)

> -- Why is the linker not complaining about doubly defined symbols
> in the blasB.a and blas.so libraries?

"Doubly defined" pertains only to within one module (main program or shared library.) Separate modules can have definitions of the same symbol without complaint. Which symbol gets used depends on the search path, and the search path depends on scope rules at both static link time (/bin/ld) and run time (dynamic link time, ld-linux.so)

> Of course, how does all this trick valgrind into flagging uninitialized
> memory locations? The output of the svd3 program is correct with
> both versions of the tsvd3 executable.

Construct the smallest example which causes trouble. Post it here. |
|
From: Norman G. <no...@te...> - 2013-07-07 18:17:03
|
On 07/04/2013 12:35 PM, Norman Goldstein wrote:
> I am using valgrind-3.8.1 on Fedora 18, 32 bit x86, and
> am linking fortran libraries to my main C++ code.
> It seems that when malloc'd values end up being initialized
> in fortran code, that this is still tagged by valgrind as
> uninitialized data. I have worked around (avoided) this problem
> by initializing new memory to 0. I have seen related posts on the
> net, but am not sure if this is already flagged as a bug, or if I
> am not using valgrind correctly with fortran.

Here is an update to the problem I am experiencing. While preparing a simplified build & source code example, I found that my link line was causing the problem, but I do not understand why. Here is a stylized link line that results in an executable with the described problem.

gcc tsvd3.o -o tsvd3 -l:/libHigh.so -l:/libblasB.a -l:/libblas.so -l:/liblapack.so

I am using -l: rather than -l because I was using debug versions of libraries in different locations, but I was not able to get the debug-info packages to work in the intended seamless way.

Removing either the High or blasB library (or both) from the link line results in no errors being reported by valgrind. This raises the following questions:

-- The High library is not being referenced by the tsvd3 executable, so why does the map file show that the blasB library is being used to resolve references in the High library?

-- In the above link line, the lapack library comes at the end. I thought that this would cause the lapack library not to see the blas symbols earlier in the link line i.e. the linker should be complaining about unresolved symbols. Why is the linker not so complaining?

-- Why is the linker not complaining about doubly defined symbols in the blasB.a and blas.so libraries?

Of course, how does all this trick valgrind into flagging uninitialized memory locations? The output of the svd3 program is correct with both versions of the tsvd3 executable. |
|
From: Tom H. <to...@co...> - 2013-07-06 23:32:45
|
On 06/07/13 19:34, Sebastian Feld wrote:
> On Sun, Jun 23, 2013 at 10:09 PM, Sebastian Feld
> <seb...@gm...> wrote:
>> How good or bad does valgrind support fortran 95 applications? Are
>> there any catches or known bugs? is there a usage example for memcheck
>> and exp-sgcheck?
>
> Sounds like Fortran is no longer supported, or at least no one cares
> about it anymore (judging by the fact that the last couple of
> Fortran-related emails in the valgrind lists went unanswered and no
> Fortran tests exist).

Well, valgrind works at the machine code level, so it doesn't really care what language the code was originally written in.

What do you mean by a "usage example" anyway? Using it on a Fortran program is no different to using it on a C program - you run it and see what errors are reported and then deal with them.

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/ |
|
From: Sebastian F. <seb...@gm...> - 2013-07-06 18:34:12
|
On Sun, Jun 23, 2013 at 10:09 PM, Sebastian Feld <seb...@gm...> wrote:
> How good or bad does valgrind support fortran 95 applications? Are
> there any catches or known bugs? is there a usage example for memcheck
> and exp-sgcheck?

Sounds like Fortran is no longer supported, or at least no one cares about it anymore (judging by the fact that the last couple of Fortran-related emails in the valgrind lists went unanswered and no Fortran tests exist).

Sebastian |
|
From: Norman G. <no...@te...> - 2013-07-04 19:36:06
|
I am using valgrind-3.8.1 on Fedora 18, 32 bit x86, and am linking fortran libraries to my main C++ code. It seems that when malloc'd values end up being initialized in fortran code, that this is still tagged by valgrind as uninitialized data. I have worked around (avoided) this problem by initializing new memory to 0. I have seen related posts on the net, but am not sure if this is already flagged as a bug, or if I am not using valgrind correctly with fortran. |
|
From: Daniel S. <da...@co...> - 2013-07-04 18:50:26
|
On Thu, 2013-07-04 at 11:07 -0700, John Reiser wrote:
> > alloca(4096);
> > __yell();
> >
> > (gdb) monitor get_vbits 0xffeffeed0 256
> > ________ ________ ________ ________ ________ ________ ________ ________
> > ________ ________ ________ ________ ________ ________ ________ ________
> > ________ ________ ________ ________ ________ ________ ________ ________
> > ________ ________ ________ ________ ________ ________ ________ ________
> > ________ ________ ________ ________ ffffffff ffffffff ffffffff ffffffff
> > ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> > ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> > ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff ffffffff
> > Address 0xFFEFFEED0 len 256 has 144 bytes unaddressable
> >
> > Any ideas? It seems to depend on:
> >
> > - Some (small) number of threads being spawned.
> > - A > page-sized alloca().
> > - Reasonably sized memset on top.
> > - It's always the main thread which suffers.
>
> Thank you for the small, reproducible testcase!
>
> It's a bug in valgrind, so please file a bug report, and include the testcase.
> See the "Bug Reports" entry in the left column of the main page
> http://www.valgrind.org/ .

Okay, thank you. I made https://bugs.kde.org/show_bug.cgi?id=321960

> Meanwhile the trivial workaround is to memset every result of alloca.

:D

> Oh, and "#include <string.h>"

Ah, sorry for that. A previous version only triggered while we had made __yell a variadic function. Yeah, the above one had bad headers.

Thanks,
Daniel |