|
From: Pierre-Luc P. <pie...@ho...> - 2012-10-02 16:27:18
|
Hello,

I am new at using valgrind. I first tried it on a relatively small program, and everything worked OK.

When I try it with a much larger program (19MB), I get a SIGSEGV before main() is actually called. It looks like the program crashes while calling the constructor functions. This is a 32-bit program run on a CentOS 6.2 64-bit system. The system has 8 GB of memory. The address that is being used when the crash occurs is: "Access not within mapped region at address 0xFECB7114", which is near 4 GB, but should be OK on an 8 GB system.

Any help is appreciated.

Here is the output from valgrind:

[root@xms plp]# valgrind --leak-check=full test1 -cv
==11125== Memcheck, a memory error detector
==11125== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==11125== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==11125== Command: test1 -cv
==11125==
==11125== Invalid write of size 4
==11125==    at 0x8055129: CStdStr<wchar_t>::CStdStr(wchar_t const*) (XString.h:1777)
==11125==    by 0x8188CED: __static_initialization_and_destruction_0(int, int) (Base64.cpp:29)
==11125==    by 0x8188D2F: global constructors keyed to Base64.cpp (Base64.cpp:121)
==11125==    by 0x81B96DC: ??? (in /usr/bin/test1)
==11125==    by 0x8053427: ??? (in /usr/bin/test1)
==11125==    by 0x81B95F8: __libc_csu_init (in /usr/bin/test1)
==11125==    by 0x483FC83: (below main) (in /lib/libc-2.12.so)
==11125==  Address 0xfecb7114 is not stack'd, malloc'd or (recently) free'd
==11125==
==11125==
==11125== Process terminating with default action of signal 11 (SIGSEGV)
==11125==  Access not within mapped region at address 0xFECB7114
==11125==    at 0x8055129: CStdStr<wchar_t>::CStdStr(wchar_t const*) (XString.h:1777)
==11125==    by 0x8188CED: __static_initialization_and_destruction_0(int, int) (Base64.cpp:29)
==11125==    by 0x8188D2F: global constructors keyed to Base64.cpp (Base64.cpp:121)
==11125==    by 0x81B96DC: ??? (in /usr/bin/test1)
==11125==    by 0x8053427: ??? (in /usr/bin/test1)
==11125==    by 0x81B95F8: __libc_csu_init (in /usr/bin/test1)
==11125==    by 0x483FC83: (below main) (in /lib/libc-2.12.so)
==11125==  If you believe this happened as a result of a stack
==11125==  overflow in your program's main thread (unlikely but
==11125==  possible), you can try to increase the size of the
==11125==  main thread stack using the --main-stacksize= flag.
==11125==  The main thread stack size used in this run was 10485760.
==11125==
==11125== HEAP SUMMARY:
==11125==    in use at exit: 172 bytes in 2 blocks
==11125==    total heap usage: 4 allocs, 2 frees, 644 bytes allocated
==11125==
==11125== 160 bytes in 1 blocks are possibly lost in loss record 2 of 2
==11125==    at 0x4025F0D: calloc (vg_replace_malloc.c:593)
==11125==    by 0x4011D19: _dl_allocate_tls (in /lib/ld-2.12.so)
==11125==    by 0x451E328: pthread_create@@GLIBC_2.1 (in /lib/libpthread-2.12.so)
==11125==    by 0x81B4174: XBeginThread(ttThreadHandle**, void* (*)(void*), bool, void*, int, bool) (XThreads.cpp:289)
==11125==    by 0x81B6972: CXMultipleTimer::Run() (XMultipleTimer.cpp:65)
==11125==    by 0x81B6701: CXMultipleTimer::CXMultipleTimer() (XMultipleTimer.cpp:31)
==11125==    by 0x81B3E6A: __static_initialization_and_destruction_0(int, int) (XSocket.cpp:25)
==11125==    by 0x81B3EAC: global constructors keyed to XSocket.cpp (XSocket.cpp:1331)
==11125==    by 0x81B96DC: ??? (in /usr/bin/test1)
==11125==    by 0x8053427: ??? (in /usr/bin/test1)
==11125==    by 0x81B95F8: __libc_csu_init (in /usr/bin/test1)
==11125==    by 0x483FC83: (below main) (in /lib/libc-2.12.so)
==11125==
==11125== LEAK SUMMARY:
==11125==    definitely lost: 0 bytes in 0 blocks
==11125==    indirectly lost: 0 bytes in 0 blocks
==11125==    possibly lost: 160 bytes in 1 blocks
==11125==    still reachable: 12 bytes in 1 blocks
==11125==    suppressed: 0 bytes in 0 blocks
==11125== Reachable blocks (those to which a pointer was found) are not shown.
==11125== To see them, rerun with: --leak-check=full --show-reachable=yes
==11125==
==11125== For counts of detected and suppressed errors, rerun with: -v
==11125== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 85 from 8)

Thanks,
Pierre |
|
From: David F. <fa...@kd...> - 2012-10-02 16:36:04
|
On Tuesday 02 October 2012 12:27:12 Pierre-Luc Provencal wrote:
> ==11125== by 0x8188CED: __static_initialization_and_destruction_0(int, int) (Base64.cpp:29)
> ==11125== by 0x8188D2F: global constructors keyed to Base64.cpp (Base64.cpp:121)

Greetings from Provence :-)

What does the code at these two lines say?

Unless valgrind is wrong, something funky is being done by that code, so it might help to see what it's doing.

--
David Faure, fa...@kd..., http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5 |
|
From: Pierre-Luc P. <pie...@ho...> - 2012-10-02 18:07:34
|
Well,
This is just initialization for a static member from the class CBase64:
==11125== by 0x8188CED: __static_initialization_and_destruction_0(int, int) (Base64.cpp:29):
27
28 CXString CBase64::m_sBase64Alphabet =
29 "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
30
The other line is just the end of the file.
==11125== by 0x8188D2F: global constructors keyed to Base64.cpp (Base64.cpp:121)
My guess is that the constructor code resides at the end of the code for the current file, but this is just my guess. ;-)
Pierre
> From: fa...@kd...
> To: val...@li...
> CC: pie...@ho...
> Subject: Re: [Valgrind-users] Large application SIGSEGV when run in valgrind
> Date: Tue, 2 Oct 2012 18:38:07 +0000
>
> On Tuesday 02 October 2012 12:27:12 Pierre-Luc Provencal wrote:
> > ==11125== by 0x8188CED: __static_initialization_and_destruction_0(int, int) (Base64.cpp:29)
> > ==11125== by 0x8188D2F: global constructors keyed to Base64.cpp (Base64.cpp:121)
>
> Greetings from Provence :-)
>
> What does the code at these two lines, say?
>
> Unless valgrind is wrong, something funky is being done by that code, so it might help to see what it's doing.
>
> --
> David Faure, fa...@kd..., http://www.davidfaure.fr
> Working on KDE, in particular KDE Frameworks 5
>
|
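To illustrate the kind of initialization described in the message above, here is a minimal sketch of a static class member whose constructor runs before main(). The names `TrackedString` and `Codec` are hypothetical stand-ins for CXString and CBase64, whose real definitions are not shown in this thread; the sketch only demonstrates the mechanism and makes no claim about the cause of the crash.

```cpp
#include <string>

// Hypothetical stand-in for CXString: counts how many times its
// converting constructor has run.
struct TrackedString {
    static int constructed;   // constant-initialized to 0 before any dynamic init
    std::string value;
    TrackedString(const char* s) : value(s) { ++constructed; }
};
int TrackedString::constructed = 0;

// Hypothetical stand-in for CBase64.
struct Codec {
    static TrackedString alphabet;
};

// Dynamic initialization of a static member: the compiler emits the
// constructor call into a per-file initialization routine that the
// startup code runs before main() is entered.
TrackedString Codec::alphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
```

By the time main() starts, `TrackedString::constructed` is already 1. With GCC, the generated routine is what appears in the backtrace as `__static_initialization_and_destruction_0`, so a frame attributed to the last line of the file would fit the guess made above.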
|
From: John R. <jr...@bi...> - 2012-10-02 19:47:23
|
On 10/02/2012 09:27 AM, Pierre-Luc Provencal wrote:
> Hello,
>
> I am new at using valgrind. I first tried it on a relatively small program, and everything worked ok.
>
> When I try it with a much larger program (19MB), I get a SIGSEGV before the main() is actually called. It looks like the program crashes while calling in the constructor functions. This is a 32-bit program run on a CentOs 6.2 64-bit system. The system has 8 GB memory. The address that is being used
> when the crash occurs is: Access not within mapped region at address 0xFECB7114. Which is near 4 GB, but should be ok on a 8 GB system.

What is large is not necessarily the 19MB program, but the use of almost 4GiB of address space. 0xFECB7114 is within 20MB or so of the architectural limit on a 32-bit address.

In addition to the architectural limit, the software which provides a 32-bit environment on a 64-bit machine imposes an even lower limit on a 32-bit address. There is a large gain in speed if some high 32-bit addresses can be reserved for use by the operating system and libc, instead of making the entire 4GiB range available to the 32-bit "client" (memcheck, including the program being checked). The reserved amount is at least 256KiB, but might be enough to cover 20MiB, which would tend to explain some of the problems you see.

In any case, please run "cat /proc/<PID>/maps" (where <PID> is the numerical process ID) and show us what the mappings look like for addresses 0xFE000000 and above, when the program hits the memcheck error (or shortly before).

-- |
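The "within 20MB or so" figure above can be checked with a few lines of arithmetic. This is a small sketch; the helper names `bytes_below_limit` and `to_mib` are made up for illustration and come from no library.

```cpp
#include <cstdint>

// Size of a 32-bit address space: 2^32 bytes (4 GiB).
constexpr std::uint64_t kAddressSpace32 = std::uint64_t{1} << 32;

// Number of bytes between addr and the 4 GiB architectural limit.
constexpr std::uint64_t bytes_below_limit(std::uint64_t addr) {
    return kAddressSpace32 - addr;
}

// Convert a byte count to MiB (1 MiB = 1024 * 1024 bytes).
constexpr double to_mib(std::uint64_t bytes) {
    return static_cast<double>(bytes) / (1024.0 * 1024.0);
}

// Faulting address from the first report in this thread.
static_assert(bytes_below_limit(0xFECB7114ULL) == 20221676ULL,
              "0xFECB7114 lies 20221676 bytes below 4 GiB");
```

`to_mib(bytes_below_limit(0xFECB7114ULL))` comes out at roughly 19.3 MiB, which agrees with the estimate in the message.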
|
From: Philippe W. <phi...@sk...> - 2012-10-02 20:55:26
|
On Tue, 2012-10-02 at 12:48 -0700, John Reiser wrote:
> In any case, please run "cat /proc/<PID>/maps" (where <PID> is the numerical
> process ID) and show us what the mappings look like for addresses 0xFE000000
> and above, when the program hits the memcheck error (or shortly before.)

The easiest way to obtain this info when the segv is triggered is to start with --vgdb-error=1, then attach with gdb/vgdb when the error is reported.

In gdb, you can then use:
  monitor v.info memory aspacemgr    # this will show the set of mappings as seen by Valgrind
  shell cat /proc/<pid>/maps         # this will show the mappings as seen by the kernel

Philippe |
|
From: John R. <jr...@bi...> - 2012-10-02 22:36:51
|
> Using valgrind with gdb, the faulty address is now 0xFEABA104:
>
> ==29254== Process terminating with default action of signal 11 (SIGSEGV)
> ==29254== Access not within mapped region at address 0xFEABA104
> ==29254== at 0x8055129: CStdStr<wchar_t>::CStdStr(wchar_t const*) (XString.h:1777)
> ==29254== by 0x8188CED: __static_initialization_and_destruction_0(int, int) (Base64.cpp:29)
> ==29254== by 0x8188D2F: global constructors keyed to Base64.cpp (Base64.cpp:121)
> ==29254== by 0x81B96DC: ??? (in /usr/bin/test1)
> ==29254== by 0x8053427: ??? (in /usr/bin/test1)
> ==29254== by 0x81B95F8: __libc_csu_init (in /usr/bin/test1)
> ==29254== by 0x483FC83: (below main) (in /lib/libc-2.12.so)
>
> The address 0xFEABA104 does not seem to be mapped:
>
> # cat /proc/29254/maps
> 04000000-0401e000 r-xp 00000000 08:01 347089 /lib/ld-2.12.so   ## shared libraries
[snip]
> 08048000-0828d000 r-xp 00000000 08:01 266751 /usr/bin/test1   ## the program being checked
> 0828d000-08292000 rw-p 00245000 08:01 266751 /usr/bin/test1
> 08292000-082aa000 rw-p 00000000 00:00 0
> 082aa000-082ab000 rwxp 00000000 00:00 0
> 38000000-382ab000 r-xp 00001000 08:01 355740 /usr/local/lib/valgrind/memcheck-x86-linux   ## the "main" program
> 382ab000-382ad000 rw-p 002ab000 08:01 355740 /usr/local/lib/valgrind/memcheck-x86-linux
> 382ad000-38d43000 rw-p 00000000 00:00 0

### This gap from 0x38...... to 0x81...... is interesting.
### A bad access at 0xFEABA104 is beginning to look like
### bad code (sign error, etc.) in static inits or CStdStr<wchar_t>.

> 81d60000-82785000 rwxp 00000000 00:00 0
[snip]
> 866fa000-867fa000 rwxp 00000000 00:00 0
> 867fa000-867fc000 ---p 00000000 00:00 0   # is this the "hole" between two thread stacks?
> feabb000-feabe000 rwxp 00000000 00:00 0
> ffaab000-ffac0000 rw-p 00000000 00:00 0 [stack]

## So the reserved area is about (0x100000000 - 0xffac0000) ==> 5.5MB
## which looks like 4MiB plus some deliberate randomization
## as a defense against malware.

-- |
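The "is this address mapped?" check done by eye above can also be sketched in code. The following assumes the usual `start-end perms ...` layout of a /proc/<pid>/maps line; `Mapping`, `parse_maps_line`, `parse_maps`, `is_mapped` and `sample_lines` are hypothetical helper names, and the sample data is copied from the last four mappings quoted above. It only reproduces the lookup and does not diagnose the fault.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

// One mapped region, covering the half-open range [start, end).
struct Mapping {
    std::uint64_t start;
    std::uint64_t end;
};

// Parse the leading "start-end" hex range of one maps line.
// Returns false if the line does not begin with such a range.
bool parse_maps_line(const std::string& line, Mapping& out) {
    unsigned long long start = 0, end = 0;
    if (std::sscanf(line.c_str(), "%llx-%llx", &start, &end) != 2 || start >= end) {
        return false;
    }
    out.start = start;
    out.end = end;
    return true;
}

// Parse every line that carries a valid range; other lines are skipped.
std::vector<Mapping> parse_maps(const std::vector<std::string>& lines) {
    std::vector<Mapping> maps;
    for (const std::string& line : lines) {
        Mapping m{};
        if (parse_maps_line(line, m)) maps.push_back(m);
    }
    return maps;
}

// True if addr falls inside any of the parsed regions.
bool is_mapped(const std::vector<Mapping>& maps, std::uint64_t addr) {
    for (const Mapping& m : maps) {
        if (addr >= m.start && addr < m.end) return true;
    }
    return false;
}

// The last four mappings quoted in the message above.
std::vector<std::string> sample_lines() {
    return {
        "866fa000-867fa000 rwxp 00000000 00:00 0",
        "867fa000-867fc000 ---p 00000000 00:00 0",
        "feabb000-feabe000 rwxp 00000000 00:00 0",
        "ffaab000-ffac0000 rw-p 00000000 00:00 0 [stack]",
    };
}
```

With this data, `is_mapped(parse_maps(sample_lines()), 0xFEABA104)` is false: the address sits 0xEFC (3836) bytes below the start of the `feabb000-feabe000` region. The reserved area works out the same way: 0x100000000 - 0xffac0000 = 0x540000, which is 5505024 bytes, or about 5.5 MB.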
|
From: Pierre-Luc P. <pie...@ho...> - 2012-10-02 23:14:41
|
The thing is, I could move on further by not making this one variable static, but then I got another error for another static variable somewhere else. It really looks like static variable initialization is buggy.

Thanks,
Pierre |