|
From: Kalaivani R <kal...@gm...> - 2013-01-09 15:43:24
|
Hi,
I installed valgrind 3.8.1 and when I tried to run with a C++ executable it
always terminates with SIGSEGV for a C++ static variable initialization.
Is there any fix available for this issue in valgrind?
We could not proceed further. We tried to check for patches but we could
not succeed.
We faced the same problem with the earlier version of valgrind as well
(3.1.1)
*glibc version:* ldd (GNU libc) 2.3.4
*gcc version:* gcc (GCC) 3.4.6 20060404 (Red Hat 3.4.6-11)
We have checked the code base.
It starts from a class's static variable definition in one of our internal
CPP files.
From there it goes to "global constructors keyed to" and then in turn to memset.
We are always getting valgrind core immediately when we start the valgrind
for our image.
Below is the backtrace of the valgrind core. The static variable
initialization is the one that triggers the segmentation fault.
Is it possible to run valgrind to completion on my binary without it
terminating partway through due to signals such as SIGSEGV?
Do we have any such options to configure valgrind or is there any patch or
workaround for this issue?
Thanks,
Kalai
#0 0x04009a34 in _vgr20210ZU_libcZdsoZa_memset (s=0x0, c=0, n=22150000) at
mc_replace_strmem.c:1007
1007 MEMSET(VG_Z_LIBC_SONAME, memset)
(gdb) bt
#0 0x04009a34 in _vgr20210ZU_libcZdsoZa_memset (s=0x0, c=0, n=22150000) at
mc_replace_strmem.c:1007
#1 0x040c7d0b in Heap (this=0x8060e00) at
/vobs/eps_fw/New/./src/Heap.cpp:61
#2 0x040cb41f in operator new (size=24) at
/vobs/eps_fw/New/./src/NewDelete.cpp:23
#3 0x040ed2ec in __static_initialization_and_destruction_0
(__initialize_p=1, __priority=65535)
at /vobs/eps_fw/Telnet/./src/TelnetClient.cpp:31
#4 0x040ed3a3 in global constructors keyed to
_ZN12TelnetClient27numberOpenClientConnectionsE ()
at /vobs/eps_fw/Telnet/./src/TelnetClient.cpp:275
#5 0x040f0885 in __do_global_ctors_aux () from /root/enb/lib/libTelnet.so
#6 0x040ebbe5 in _init () from /root/enb/lib/libTelnet.so
#7 0x0044d898 in _dl_init_internal () from /lib/ld-linux.so.2
#8 0x004417ff in _dl_start_user () from /lib/ld-linux.so.2
Current language: auto; currently c
VALGRIND ERROR :
================
==19332== Process terminating with default action of signal 11 (SIGSEGV)
==19332== Access not within mapped region at address 0x0
==19332== at 0x4009A34: memset (mc_replace_strmem.c:1007)
==19332== by 0x40C7D0A: Heap::Heap() (Heap.cpp:61)
==19332== by 0x40CB41E: operator new(unsigned int) (NewDelete.cpp:23)
==19332== by 0x4644055: __static_initialization_and_destruction_0(int,
int) (LteEgtpCli.cpp:30)
==19332== by 0x46440B2: global constructors keyed to
SendEgtpuMsg::egtpRxEgtpRB (LteEgtpCli.cpp:994)
==19332== by 0x464C9C0: ??? (in /root/enb/lib/liblteegtp.so)
==19332== by 0x4641C3C: ??? (in /root/enb/lib/liblteegtp.so)
==19332== by 0x25C897: _dl_init (in /lib/ld-2.3.4.so)
==19332== by 0x2507FE: ??? (in /lib/ld-2.3.4.so)
==19332== If you believe this happened as a result of a stack
==19332== overflow in your program's main thread (unlikely but
==19332== possible), you can try to increase the size of the
==19332== main thread stack using the --main-stacksize= flag.
==19332== The main thread stack size used in this run was 10485760.
==19332==
==19332== HEAP SUMMARY:
==19332== in use at exit: 0 bytes in 0 blocks
==19332== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==19332==
==19332== All heap blocks were freed -- no leaks are possible
==19332==
==19332== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 86 from 8)
==19332==
==19332== 1 errors in context 1 of 1:
==19332== Invalid write of size 4
==19332== at 0x4009A34: memset (mc_replace_strmem.c:1007)
==19332== by 0x40C7D0A: Heap::Heap() (Heap.cpp:61)
==19332== by 0x40CB41E: operator new(unsigned int) (NewDelete.cpp:23)
==19332== by 0x4644055: __static_initialization_and_destruction_0(int,
int) (LteEgtpCli.cpp:30)
==19332== by 0x46440B2: global constructors keyed to
SendEgtpuMsg::egtpRxEgtpRB (LteEgtpCli.cpp:994)
==19332== by 0x464C9C0: ??? (in /root/enb/lib/liblteegtp.so)
==19332== by 0x4641C3C: ??? (in /root/enb/lib/liblteegtp.so)
==19332== by 0x25C897: _dl_init (in /lib/ld-2.3.4.so)
==19332== by 0x2507FE: ??? (in /lib/ld-2.3.4.so)
==19332== Address 0x0 is not stack'd, malloc'd or (recently) free'd
==19332==
Thanks,
Kalai
|
|
From: Rich C. <rc...@wi...> - 2013-01-09 21:36:01
|
On Wed, 9 Jan 2013 21:13:16 +0530
Kalaivani R <kal...@gm...> wrote:
> Hi,
> I installed valgrind 3.8.1 and when I tried to run with a C++ executable it
> always terminates with SIGSEGV for a C++ static variable initialization.
[ ... ]
> Thanks,
> Kalai
>
> #0 0x04009a34 in _vgr20210ZU_libcZdsoZa_memset (s=0x0, c=0, n=22150000) at
> mc_replace_strmem.c:1007
> 1007 MEMSET(VG_Z_LIBC_SONAME, memset)
> (gdb) bt
> #0 0x04009a34 in _vgr20210ZU_libcZdsoZa_memset (s=0x0, c=0, n=22150000) at
> mc_replace_strmem.c:1007
> #1 0x040c7d0b in Heap (this=0x8060e00) at
> /vobs/eps_fw/New/./src/Heap.cpp:61
What is Heap.cpp doing at line 61?
Where did it get the pointer of memory allocated that it's passing
to memset() ?
> #2 0x040cb41f in operator new (size=24) at
> /vobs/eps_fw/New/./src/NewDelete.cpp:23
> #3 0x040ed2ec in __static_initialization_and_destruction_0
> (__initialize_p=1, __priority=65535)
> at /vobs/eps_fw/Telnet/./src/TelnetClient.cpp:31
> #4 0x040ed3a3 in global constructors keyed to
> _ZN12TelnetClient27numberOpenClientConnectionsE ()
> at /vobs/eps_fw/Telnet/./src/TelnetClient.cpp:275
> #5 0x040f0885 in __do_global_ctors_aux () from /root/enb/lib/libTelnet.so
> #6 0x040ebbe5 in _init () from /root/enb/lib/libTelnet.so
> #7 0x0044d898 in _dl_init_internal () from /lib/ld-linux.so.2
> #8 0x004417ff in _dl_start_user () from /lib/ld-linux.so.2
> Current language: auto; currently c
Rich
|
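Rich's question about where the pointer comes from matters because the backtrace shows memset() being called with s=0x0, so whatever supplied the pool handed back a null pointer. The following is only a sketch of a guarded allocation, assuming the pool comes from malloc(); the real Heap.cpp is not shown in the thread, and make_zeroed_pool and failing_alloc are hypothetical names used for illustration.

```cpp
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for whatever Heap.cpp:61 does (assumption: the
// real code is not in the thread). Returns a zero-filled pool of n bytes,
// or NULL if the allocator could not supply the memory.
void* make_zeroed_pool(std::size_t n,
                       void* (*alloc)(std::size_t) = std::malloc) {
    errno = 0;
    void* pool = alloc(n);
    if (pool == NULL) {
        // Report the failure instead of handing a null pointer to memset().
        std::fprintf(stderr, "pool allocation of %lu bytes failed: %s\n",
                     static_cast<unsigned long>(n), std::strerror(errno));
        return NULL;
    }
    std::memset(pool, 0, n);
    return pool;
}

// Allocator that always fails; it exists only to exercise the error path.
void* failing_alloc(std::size_t) {
    errno = ENOMEM;
    return NULL;
}
```

Logging the returned pointer and errno at this point, in both a native run and a run under valgrind, would show directly whether the allocation itself behaves differently.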
|
From: Kalaivani R <kal...@gm...> - 2013-01-10 10:56:32
|
Heap.cpp allocates 22 MB of memory from the heap. When I run the image
without valgrind, the allocation succeeds, but when we run it under
valgrind, the allocation fails.
We understand that valgrind increases memory consumption, but the machine
on which we are running valgrind has sufficient memory, and the allocation
still fails.
Is there any memory threshold / hard limit on heap consumption while using
valgrind? If yes, is it possible to customize that limit using any options?
Is there any option to make valgrind proceed further and just report all
the issues in the binary?
And are there any default customizations that we need to do for valgrind to
run? Kindly let us know.
Thanks in advance.
-Kalai
On Thu, Jan 10, 2013 at 3:05 AM, Rich Coe <rc...@wi...> wrote:
> On Wed, 9 Jan 2013 21:13:16 +0530
> Kalaivani R <kal...@gm...> wrote:
> > Hi,
> > I installed valgrind 3.8.1 and when I tried to run with a C++ executable it
> > always terminates with SIGSEGV for a C++ static variable initialization.
> [ ... ]
> > Thanks,
> > Kalai
> >
> > #0 0x04009a34 in _vgr20210ZU_libcZdsoZa_memset (s=0x0, c=0, n=22150000) at
> > mc_replace_strmem.c:1007
> > 1007 MEMSET(VG_Z_LIBC_SONAME, memset)
> > (gdb) bt
> > #0 0x04009a34 in _vgr20210ZU_libcZdsoZa_memset (s=0x0, c=0, n=22150000) at
> > mc_replace_strmem.c:1007
> > #1 0x040c7d0b in Heap (this=0x8060e00) at
> > /vobs/eps_fw/New/./src/Heap.cpp:61
>
> What is Heap.cpp doing at line 61?
> Where did it get the pointer of memory allocated that it's passing
> to memset() ?
>
> > #2 0x040cb41f in operator new (size=24) at
> > /vobs/eps_fw/New/./src/NewDelete.cpp:23
> > #3 0x040ed2ec in __static_initialization_and_destruction_0
> > (__initialize_p=1, __priority=65535)
> > at /vobs/eps_fw/Telnet/./src/TelnetClient.cpp:31
> > #4 0x040ed3a3 in global constructors keyed to
> > _ZN12TelnetClient27numberOpenClientConnectionsE ()
> > at /vobs/eps_fw/Telnet/./src/TelnetClient.cpp:275
> > #5 0x040f0885 in __do_global_ctors_aux () from /root/enb/lib/libTelnet.so
> > #6 0x040ebbe5 in _init () from /root/enb/lib/libTelnet.so
> > #7 0x0044d898 in _dl_init_internal () from /lib/ld-linux.so.2
> > #8 0x004417ff in _dl_start_user () from /lib/ld-linux.so.2
> > Current language: auto; currently c
>
> Rich
>
--
Cheers,
Kalai
|
|
From: Philippe W. <phi...@sk...> - 2013-01-10 19:09:37
|
On Thu, 2013-01-10 at 16:26 +0530, Kalaivani R wrote:
> Heap.cpp allocates memory of 22 MB from HEAP.
22 MB looks very small and should work without problem
(assuming there is no artificial low ulimit for example).
Now, if this is a typo, and it is rather 22 GB, then this is too big
and does not work on an unpatched Valgrind.
It might work on a Valgrind patched to support more memory.
See http://sourceforge.net/mailarchive/message.php?msg_id=30299697
for a patch to evaluate.
Philippe
|
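The ulimit possibility mentioned above can be checked from inside the process as well as from the shell. The sketch below uses the POSIX getrlimit() call to report the soft and hard values of a resource limit; format_limit and describe_limit are illustrative names, not part of valgrind or of the program discussed in this thread.

```cpp
#include <sys/resource.h>
#include <sstream>
#include <string>

// Render one limit value; RLIM_INFINITY means no limit is set.
std::string format_limit(rlim_t value) {
    if (value == RLIM_INFINITY) return "unlimited";
    std::ostringstream os;
    os << static_cast<unsigned long long>(value);
    return os.str();
}

// Build a line such as "RLIMIT_AS soft=<n> hard=<n>" for the given
// resource, or return an empty string if getrlimit() fails.
std::string describe_limit(const char* name, int resource) {
    struct rlimit rl;
    if (getrlimit(resource, &rl) != 0) return std::string();
    return std::string(name) + " soft=" + format_limit(rl.rlim_cur) +
           " hard=" + format_limit(rl.rlim_max);
}
```

Printing describe_limit("RLIMIT_AS", RLIMIT_AS) and describe_limit("RLIMIT_DATA", RLIMIT_DATA) early in start-up, in the same shell that launches valgrind, would show whether a low limit is in effect there.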
|
From: Philippe W. <phi...@sk...> - 2013-01-11 18:32:35
|
On Fri, 2013-01-11 at 12:56 +0530, Muthumeenal Natarajan wrote:
> Pls find attached the Mem stats output from valgrind, attached both
> the versions (one without the patch and one with the patch which helps
> to increase the memory from
> http://sourceforge.net/mailarchive/message.php?msg_id=30299697)
The stats show that the amount of memory is not the problem:
all arenas are quite small.
So, something else is going wrong.
Here are a few ideas that could help to understand what is going on:
* try with other tools to see if the problem disappears, e.g. with
  --tool=none --tool=massif --tool=helgrind
* use two GDBs to debug what is going on inside NewDelete.cpp
  and/or LteMacDefs.cpp:12, comparing a native run with a run
  under Valgrind (cf. Valgrind gdbsrv).
* run Valgrind with -v -v -v -d -d -d to output plenty of Valgrind traces,
  in case this sheds some light.
Philippe
|
|
From: Rich C. <rc...@wi...> - 2013-01-17 13:44:43
|
I would take the file Heap.cpp and construct a small test case, main.cpp
that initializes Heap exactly like your failing program.
Cut out everything that is not relevant to the problem.
Get this new code to fail with valgrind.
Then send the source code to the list and the steps to reproduce the problem
so we can see what you see.
Rich
On Fri, 11 Jan 2013 12:56:53 +0530
Muthumeenal Natarajan <mna...@ai...> wrote:
> Hi,
>
> Pls find attached the Mem stats output from valgrind, attached both the versions (one without the patch and one with the patch which helps to increase the memory from http://sourceforge.net/mailarchive/message.php?msg_id=30299697)
>
> Memory stats look the same in both the cases and it's not about increasing the memory as we allocate very low (22MB) which terminates the process while running with valgrind.
>
> Can you pls redirect us if you have already faced similar issues or is there any settings which can bypass this cause we may not be able to reduce this size, as we need them for our project.
>
> Thanks,
> Meenal
>
> From: Kalaivani R [mailto:kal...@gm...]
> Sent: Friday, January 11, 2013 11:05 AM
> To: Muthumeenal Natarajan
> Subject: Fwd: [Valgrind-developers] Need help : to prevent valgrind from terminating due to SIGSEGV
>
>
> ---------- Forwarded message ----------
> From: Philippe Waroquiers <phi...@sk...<mailto:phi...@sk...>>
> Date: Fri, Jan 11, 2013 at 1:02 AM
> Subject: Re: [Valgrind-developers] Need help : to prevent valgrind from terminating due to SIGSEGV
> To: Kalaivani R <kal...@gm...<mailto:kal...@gm...>>
>
> On Fri, 2013-01-11 at 00:47 +0530, Kalaivani R wrote:
>
> >
> > Today we tried to reduce the amount of memory allocated from HEAP for
> > our binary by reducing the allocations and then the valgrind seems to
> > work fine with the same executable..
> So, this seems to point at using a lot of memory.
>
> Which version of Valgrind are you using ?
> Better use the last released version (3.8.1) as there was
> some memory use improvements in recent versions.
> >
> > Is there a way to figure out how much memory valgrind is consuming
> > while running with our exe?
> To see what is going on, restart your program giving the flags
> --stats=yes --profile-heap=yes to Valgrind.
> With this, Valgrind will report various detailed statistics about
> memory usage (for the client process, and for the Valgrind internals).
> The memory of Valgrind is divided in "arenas".
> Post the last statistics for each arena.
> Typically, one arena stat looks like:
> -------- Arena "core": 1048576/1048576 max/curr mmap'd, 0/0 unsplit/split sb unmmap'd, 112800/112800 max/curr on_loan 4 rzB --------
> 16 in 1: stacks.rs.1
> 40 in 1: gdbserved_watches
> 72 in 1: main.mpclo.3
> 1,008 in 71: errormgr.sLTy.1
> 2,592 in 81: errormgr.losf.1
> 2,880 in 81: errormgr.losf.2
> 3,216 in 81: errormgr.losf.4
> 4,440 in 175: errormgr.sLTy.2
> 33,000 in 6: gdbsrv
> 65,536 in 1: di.syswrap-x86.azxG.1
>
> >
> > Is this patch required if more heap memory is consumed?
> It should (or could?) help. To be confirmed based on the above stats.
>
> Philippe
>
>
>
>
>
> --
> Cheers,
> Kalai
--
Rich Coe rc...@wi...
|
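As a rough starting point for the small test case Rich describes, the sketch below mimics the shape of the reported backtrace: a static data member whose initializer needs a heap object, which in turn allocates and zeroes 22150000 bytes (the size shown in the backtrace). Everything else is an assumption: TestHeap, test_heap and Client are hypothetical names, and malloc() merely stands in for whatever the real Heap.cpp uses to obtain its memory.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for the real Heap class (its source is not in
// the thread). Allocates one pool and zeroes it if the allocation worked.
class TestHeap {
public:
    explicit TestHeap(std::size_t bytes)
        : pool_(std::malloc(bytes)), bytes_(bytes) {
        if (pool_ != NULL) {
            std::memset(pool_, 0, bytes_);
        } else {
            std::fprintf(stderr, "TestHeap: allocation of %lu bytes failed\n",
                         static_cast<unsigned long>(bytes_));
        }
    }
    ~TestHeap() { std::free(pool_); }
    bool ok() const { return pool_ != NULL; }

private:
    TestHeap(const TestHeap&);             // not copyable
    TestHeap& operator=(const TestHeap&);  // not assignable
    void* pool_;
    std::size_t bytes_;
};

// Function-local static: the heap is built on first use, here while
// static initializers are still running, before main() starts.
TestHeap& test_heap() {
    static TestHeap heap(22150000);  // n from the reported memset() call
    return heap;
}

// Mirrors a class with a static data member whose initializer needs
// the heap, as in the "global constructors keyed to" frames.
struct Client {
    static bool heapReady;
};
bool Client::heapReady = test_heap().ok();  // runs during static init
```

A main.cpp that only prints Client::heapReady could then be run natively and under valgrind. If the two runs agree, the stand-in allocation is not the part that differs, and the real allocation path from Heap.cpp would need to be substituted in.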
|
From: Philippe W. <phi...@sk...> - 2013-01-17 20:19:04
|
On Thu, 2013-01-17 at 20:02 +0530, Muthumeenal Natarajan wrote:
> Hi Rich,
>
> Sure, Thanks for your help.
> Pls do let us know..if this is something to do with C++ (when using static variables,
> if we remove the static variables from the class, it gets postponed to another class
> down the line which has static variable...) If so, is there a work around or patch ?
> We can't remove these static variables from these classes..
> so pls help us to run valgrind with our code.
>
> Also if this is the problem with C++, you must have already got such error reports.
> .we remember seeing some blogs with patches for the same, we couldn't download the
> same...so pls let us know if there're patches in official valgrind site which can be used.
The previous logs you have captured have quite clearly shown that
the problem is *not* a lack of memory.
The advice of Rich to create a small set of sources allowing others to
reproduce the problem is a good way to work.
If that is not possible (closed sources) or too difficult, I suggest
you then follow my previous advice:
Have two GDBs, one debugging a native (working) execution;
another GDB debugging the execution under Valgrind.
Of course, do not let it run till it crashes.
Instead, put breakpoints before the crash
(for example at NewDelete.cpp:64 or at LtePdcpConstants.cpp:7)
and then use next/step/... in parallel in both GDBs.
Then at some point in time, the behaviour will diverge.
That might give some idea of what is going wrong.
Philippe
|