From: Masha N. (mnaret) <mn...@ci...> - 2013-06-13 06:40:31
|
Not yet.
Following the first answer in the forum, I tried several code modifications around mprotect:
1. When releasing the memory, in addition to setting mprotect to PROT_READ and PROT_WRITE, also set PROT_EXEC.
2. When allocating the memory, in addition to calling mprotect with PROT_NONE for the upper and lower parts of the allocated stack, call mprotect with PROT_WRITE and PROT_READ for the valid part of the stack.
Each of the above changes solved the problem on its own.
Regarding the small reproducer:
The problem only appears after the tests bring the process up and down several times, and only after the pages where the problem appears were allocated for some other stack, released, and then allocated again for the current stack.
I'm not sure how to reproduce this behavior in a small executable; I'll try now.
Thank you for your reply,
Masha.
-----Original Message-----
From: Philippe Waroquiers [mailto:phi...@sk...]
Sent: Wednesday, June 12, 2013 10:44 PM
To: Masha Naret (mnaret)
Cc: val...@li...
Subject: Re: [Valgrind-users] Valgrind shows "Invalid write os size 4" for memory allocated for the stack
On Mon, 2013-06-10 at 01:23 -0700, mnaret wrote:
> Hello,
> Recently I'm getting lot's of "invalid read/invalid write" valgrind
> errors which point out at memory allocated for the stack. However the
> code doesn't crush and finish running successfully.
> I'm trying to understand where the error comes from - and will be
> grateful fo any help wih this issue.
Do you have a small (compilable) reproducer ?
Philippe
|
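For readers following the thread, the stack layout and the second workaround from the message above can be sketched roughly as below. This is only a sketch under stated assumptions: Linux, where mprotect on memory obtained from posix_memalign is accepted, a page size that divides 16K, and the 16K/32K/16K sizes taken from the original report. The names guarded_stack_alloc and guarded_stack_free are hypothetical and are not taken from the poster's code.

```c
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sizes from the original report: 16K guard, 32K usable, 16K guard. */
enum {
    GUARD_SIZE  = 16 * 1024,
    USABLE_SIZE = 32 * 1024,
    BLOCK_SIZE  = 2 * GUARD_SIZE + USABLE_SIZE
};

/* Hypothetical helper: returns the start of the usable 32K region,
 * or NULL on failure. */
static void *guarded_stack_alloc(void)
{
    void *block = NULL;
    long page = sysconf(_SC_PAGESIZE);

    /* mprotect needs page-aligned ranges, so the guards must be
     * a whole number of pages. */
    if (page <= 0 || GUARD_SIZE % page != 0)
        return NULL;
    if (posix_memalign(&block, (size_t)page, BLOCK_SIZE) != 0)
        return NULL;

    char *base = block;
    if (mprotect(base, GUARD_SIZE, PROT_NONE) != 0 ||
        mprotect(base + GUARD_SIZE + USABLE_SIZE, GUARD_SIZE, PROT_NONE) != 0 ||
        /* Workaround 2 from the message: explicitly mark the valid
         * part readable and writable as well. */
        mprotect(base + GUARD_SIZE, USABLE_SIZE, PROT_READ | PROT_WRITE) != 0) {
        /* Restore access before handing the block back to the allocator. */
        mprotect(base, BLOCK_SIZE, PROT_READ | PROT_WRITE);
        free(block);
        return NULL;
    }
    return base + GUARD_SIZE;
}

/* Hypothetical helper: undo the protection and release the block. */
static int guarded_stack_free(void *usable)
{
    char *base = (char *)usable - GUARD_SIZE;

    if (mprotect(base, BLOCK_SIZE, PROT_READ | PROT_WRITE) != 0)
        return -1;
    free(base);
    return 0;
}
```

The sketch implements only the second workaround. The first one (adding PROT_EXEC when releasing) would change the flags passed in guarded_stack_free. It makes no claim about why either change silences Memcheck.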
|
From: David C. <dcc...@ac...> - 2013-06-13 06:27:16
|
On 6/12/2013 10:29 PM, Sanjay Kumar (sanjaku5) wrote:
> Hi Philippe,
> Below is part of Make file where I have done the Linking:
>
> sslLIB=$(OUT_DIR)/libsyfer_ssl.a
>
> # pull in dependency info for *existing* .o files
> *********************************
> -include $(commonOBJS:.o=.d)
> -include $(sslOBJS:.o=.d)
> -include $(cliCOMMONOBJS:.o=.d)
> -include $(cliCLIOBJS:.o=.d)
> -include $(cliSYFERSERVEROBJS:.o=.d)
> -include $(cliSYFERCARDSERVEROBJS:.o=.d)
> -include $(cliETHSERVERSERVEROBJS:.o=.d)
>
> all: $(OUT_DIR)/$(OUTNAME) $(OUT_DIR)/cli
>
> $(OUT_DIR)/cli : $(cliCOMMONOBJS) $(cliCLIOBJS)
> @echo "LL $(LOPTS) $@" $(REDIRECT)
> $(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliCLIOBJS) -o cli -lpthread
>
> $(OUT_DIR)/$(OUTNAME): $(VERSION_FILE) $(commonOBJS) $(sslLIB) $(cliCOMMONOBJS) $(cliSYFERSERVEROBJS)
> @echo "LL $(LOPTS) $@" $(REDIRECT)
> $(SILENT)$(LL) $(LOPTS) $(commonOBJS) $(cliCOMMONOBJS) $(cliSYFERSERVEROBJS) ./libs/librohc.a $(LIB_GLIB_DIR)/libglib-2.0.a ./libs/libsnutils.a ./libs/libcommon.a ./libs/libasnbase.a ./libs/libs1apgen.a ./libs/libsctp.a ./libs/libpcap.a $(LIB_PATH) -lssp -lsyfer_ssl -lcrypto -leventbridge -ldl -o $(OUTNAME)
Without knowing the exact link options, there is no way for anyone on
this list to determine what is going wrong. Please paste the link
command as printed during the build process, perhaps with proprietary
object file names stripped out. We don't know what, for example,
"$(LOPTS)" expands to, and even "$(LL)" could have some linker flags in
it. These are your Makefile's definitions; they are not universal.
>
> $(OUT_DIR)/syfercard : $(cliCOMMONOBJS) $(cliSYFERCARDSERVEROBJS)
> @echo "LL $(LOPTS) $@" $(REDIRECT)
> $(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliSYFERCARDSERVEROBJS) -o syfercard
>
> $(OUT_DIR)/ethserver : $(cliCOMMONOBJS) $(cliETHSERVERSERVEROBJS)
> @echo "LL $(LOPTS) $@" $(REDIRECT)
> $(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliETHSERVERSERVEROBJS) -o ethserver -lpthread
>
> *********************************
>
> Could you please comments on this.
> What modification I need to do here ?
>
> Thanks,
> Sanjay
>
> -----Original Message-----
> From: Philippe Waroquiers [mailto:phi...@sk...]
> Sent: Thursday, June 13, 2013 1:38 AM
> To: David Chapman
> Cc: Sanjay Kumar (sanjaku5); val...@li...
> Subject: Re: [Valgrind-users] Why doesn't valgrind detect a memory leak for my application
>
> Sanjay,
>>> --11597-- object doesn't have a dynamic symbol table
> With the above
> and the below stacktrace, it looks like this application is statically linked (or at least, the malloc lib is statically linked).
> valgrind can find leaks if malloc lib is statically linked (you need to use --soname-synonyms=.... to indicate malloc is statically linked).
> However, a completely statically linked application causes other problems.
> You should have at least one (even dummy) dynamically linked lib to have the dynamic loader be invoked in your process otherwise valgrind cannot "LD_PRELOAD" some of its own .so.
>
> Philippe
>
>>> ==11597== Conditional jump or move depends on uninitialised value(s)
>>>
>>> ==11597== at 0x8A838B5: __register_atfork
>>> (in /usr/local/flare/flare)
>>>
>>> ==11597== by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
>>>
>>> ==11597== by 0x8A6C265: malloc_hook_ini
>>> (in /usr/local/flare/flare)
>>>
>>> ==11597== by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
>>>
>>> ==11597== by 0x8AA893B: _dl_init_paths
>>> (in /usr/local/flare/flare)
>>>
>>> ==11597== by 0x8A8BDBB: _dl_non_dynamic_init
>>> (in /usr/local/flare/flare)
>>>
>>> ==11597== by 0x8A8CA75: __libc_init_first
>>> (in /usr/local/flare/flare)
>>>
>>> ==11597== by 0x8A3E460: (below main) (in /usr/local/flare/flare)
>
>
>
>
--
David Chapman dcc...@ac...
Chapman Consulting -- San Jose, CA
Software Development Done Right.
www.chapman-consulting-sj.com
|
|
From: Sanjay K. (sanjaku5) <san...@ci...> - 2013-06-13 05:29:21
|
Hi Philippe,
Below is the part of the Makefile where I do the linking:
sslLIB=$(OUT_DIR)/libsyfer_ssl.a
# pull in dependency info for *existing* .o files
*********************************
-include $(commonOBJS:.o=.d)
-include $(sslOBJS:.o=.d)
-include $(cliCOMMONOBJS:.o=.d)
-include $(cliCLIOBJS:.o=.d)
-include $(cliSYFERSERVEROBJS:.o=.d)
-include $(cliSYFERCARDSERVEROBJS:.o=.d)
-include $(cliETHSERVERSERVEROBJS:.o=.d)
all: $(OUT_DIR)/$(OUTNAME) $(OUT_DIR)/cli
$(OUT_DIR)/cli : $(cliCOMMONOBJS) $(cliCLIOBJS)
@echo "LL $(LOPTS) $@" $(REDIRECT)
$(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliCLIOBJS) -o cli -lpthread
$(OUT_DIR)/$(OUTNAME): $(VERSION_FILE) $(commonOBJS) $(sslLIB) $(cliCOMMONOBJS) $(cliSYFERSERVEROBJS)
@echo "LL $(LOPTS) $@" $(REDIRECT)
$(SILENT)$(LL) $(LOPTS) $(commonOBJS) $(cliCOMMONOBJS) $(cliSYFERSERVEROBJS) ./libs/librohc.a $(LIB_GLIB_DIR)/libglib-2.0.a ./libs/libsnutils.a ./libs/libcommon.a ./libs/libasnbase.a ./libs/libs1apgen.a ./libs/libsctp.a ./libs/libpcap.a $(LIB_PATH) -lssp -lsyfer_ssl -lcrypto -leventbridge -ldl -o $(OUTNAME)
$(OUT_DIR)/syfercard : $(cliCOMMONOBJS) $(cliSYFERCARDSERVEROBJS)
@echo "LL $(LOPTS) $@" $(REDIRECT)
$(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliSYFERCARDSERVEROBJS) -o syfercard
$(OUT_DIR)/ethserver : $(cliCOMMONOBJS) $(cliETHSERVERSERVEROBJS)
@echo "LL $(LOPTS) $@" $(REDIRECT)
$(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliETHSERVERSERVEROBJS) -o ethserver -lpthread
*********************************
Could you please comment on this?
What modification do I need to make here?
Thanks,
Sanjay
-----Original Message-----
From: Philippe Waroquiers [mailto:phi...@sk...]
Sent: Thursday, June 13, 2013 1:38 AM
To: David Chapman
Cc: Sanjay Kumar (sanjaku5); val...@li...
Subject: Re: [Valgrind-users] Why doesn't valgrind detect a memory leak for my application
Sanjay,
> > --11597-- object doesn't have a dynamic symbol table
With the above
and the below stacktrace, it looks like this application is statically linked (or at least, the malloc lib is statically linked).
valgrind can find leaks if malloc lib is statically linked (you need to use --soname-synonyms=.... to indicate malloc is statically linked).
However, a completely statically linked application causes other problems.
You should have at least one (even dummy) dynamically linked lib to have the dynamic loader be invoked in your process otherwise valgrind cannot "LD_PRELOAD" some of its own .so.
Philippe
> > ==11597== Conditional jump or move depends on uninitialised value(s)
> >
> > ==11597== at 0x8A838B5: __register_atfork
> > (in /usr/local/flare/flare)
> >
> > ==11597== by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
> >
> > ==11597== by 0x8A6C265: malloc_hook_ini
> > (in /usr/local/flare/flare)
> >
> > ==11597== by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
> >
> > ==11597== by 0x8AA893B: _dl_init_paths
> > (in /usr/local/flare/flare)
> >
> > ==11597== by 0x8A8BDBB: _dl_non_dynamic_init
> > (in /usr/local/flare/flare)
> >
> > ==11597== by 0x8A8CA75: __libc_init_first
> > (in /usr/local/flare/flare)
> >
> > ==11597== by 0x8A3E460: (below main) (in /usr/local/flare/flare)
|
|
From: Philippe W. <phi...@sk...> - 2013-06-12 20:07:15
|
Sanjay,
> > --11597-- object doesn't have a dynamic symbol table
With the above and the below stacktrace, it looks like this application is statically linked (or at least, the malloc lib is statically linked).
valgrind can find leaks if malloc lib is statically linked (you need to use --soname-synonyms=.... to indicate malloc is statically linked).
However, a completely statically linked application causes other problems.
You should have at least one (even dummy) dynamically linked lib to have the dynamic loader be invoked in your process otherwise valgrind cannot "LD_PRELOAD" some of its own .so.
Philippe
> > ==11597== Conditional jump or move depends on uninitialised value(s)
> > ==11597== at 0x8A838B5: __register_atfork (in /usr/local/flare/flare)
> > ==11597== by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
> > ==11597== by 0x8A6C265: malloc_hook_ini (in /usr/local/flare/flare)
> > ==11597== by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
> > ==11597== by 0x8AA893B: _dl_init_paths (in /usr/local/flare/flare)
> > ==11597== by 0x8A8BDBB: _dl_non_dynamic_init (in /usr/local/flare/flare)
> > ==11597== by 0x8A8CA75: __libc_init_first (in /usr/local/flare/flare)
> > ==11597== by 0x8A3E460: (below main) (in /usr/local/flare/flare)
|
|
From: Philippe W. <phi...@sk...> - 2013-06-12 19:43:41
|
On Mon, 2013-06-10 at 01:23 -0700, mnaret wrote:
> Hello,
> Recently I'm getting lot's of "invalid read/invalid write" valgrind errors
> which point out at memory allocated for the stack. However the code doesn't
> crush and finish running successfully.
> I'm trying to understand where the error comes from - and will be grateful
> fo any help wih this issue.
Do you have a small (compilable) reproducer ?
Philippe
|
|
From: David C. <dcc...@ac...> - 2013-06-12 16:10:26
|
Please reply to the group, not just me, and please don't top-post. My
reply is at the bottom.
On 6/12/2013 1:35 AM, Sanjay Kumar (sanjaku5) wrote:
>
> Hi David,
>
> Below is code which I added to create the leak in my application :
>
> /*******************/
>
> char *mleak = NULL;
>
> static int mcnt = 0;
>
> mleak = (char *)malloc(10000);
>
> if(NULL == mleak)
>
> printf("\nmleak is NULL\n");
>
> strcpy(mleak, "aaaaaaaaaaaaaaaaaaaaaaa");
>
> printf("\nmleak called:%d mleak:%s \n", mcnt++, mleak);
>
> /*******************/
>
> Below is summary of report:
>
> ==11597== Memcheck, a memory error detector
>
> ==11597== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
>
> ==11597== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright
> info
>
> ==11597== Command: ./flare -f syfer.conf
>
> ==11597==
>
> --11597-- Valgrind options:
>
> --11597-- -v
>
> --11597-- --tool=memcheck
>
> --11597-- --leak-check=full
>
> --11597-- --leak-resolution=high
>
> --11597-- Contents of /proc/version:
>
> --11597-- Linux version 2.6.38-staros-v3-40087-deb-32 (root@releng7)
> (gcc version 3.3.5 (Debian 1:3.3.5-13)) #1 SMP PREEMPT Sat Oct 1
> 02:50:26 EDT 2011
>
> --11597-- Arch and hwcaps: X86, x86-sse1-sse2
>
> --11597-- Page sizes: currently 4096, max supported 4096
>
> --11597-- Valgrind library directory: /usr/local/lib/valgrind
>
> --11597-- Reading syms from /usr/local/flare/flare
>
> --11597-- object doesn't have a dynamic symbol table
>
> --11597-- warning: DiCfSI 0x0 .. 0x0 outside mapped rw segments (NONE)
>
> --11597-- warning: DiCfSI 0x1 .. 0x2 outside mapped rw segments (NONE)
>
> --11597-- warning: DiCfSI 0x3 .. 0x8 outside mapped rw segments (NONE)
>
> --11597-- warning: DiCfSI 0x9 .. 0x437 outside mapped rw segments (NONE)
>
> --11597-- warning: DiCfSI 0x0 .. 0x0 outside mapped rw segments (NONE)
>
> --11597-- warning: DiCfSI 0x1 .. 0x2 outside mapped rw segments (NONE)
>
> --11597-- warning: DiCfSI 0x3 .. 0x8 outside mapped rw segments (NONE)
>
> --11597-- warning: DiCfSI 0x9 .. 0x3a6 outside mapped rw segments (NONE)
>
> --11597-- warning: DiCfSI 0x0 .. 0x0 outside mapped rw segments (NONE)
>
> --11597-- warning: DiCfSI 0x1 .. 0x2 outside mapped rw segments (NONE)
>
> --11597-- Reading syms from /usr/local/lib/valgrind/memcheck-x86-linux
>
> --11597-- object doesn't have a dynamic symbol table
>
> --11597-- Scheduler: using generic scheduler lock implementation.
>
> --11597-- Reading suppressions file: /usr/local/lib/valgrind/default.supp
>
> ==11597== embedded gdbserver: reading from
> /tmp/vgdb-pipe-from-vgdb-to-11597-by-root-on-???
>
> ==11597== embedded gdbserver: writing to
> /tmp/vgdb-pipe-to-vgdb-from-11597-by-root-on-???
>
> ==11597== embedded gdbserver: shared mem
> /tmp/vgdb-pipe-shared-mem-vgdb-11597-by-root-on-???
>
> ==11597==
>
> ==11597==
>
> ==11597== TO CONTROL THIS PROCESS USING vgdb (which you probably
>
> ==11597== don't want to do, unless you know exactly what you're doing,
>
> ==11597== or are doing some strange experiment):
>
> ==11597== /usr/local/lib/valgrind/../../bin/vgdb --pid=11597 ...command...
>
> ==11597==
>
> ==11597== TO DEBUG THIS PROCESS USING GDB: start GDB like this
>
> ==11597== /path/to/gdb ./flare
>
> ==11597== and then give GDB the following command
>
> ==11597== target remote | /usr/local/lib/valgrind/../../bin/vgdb
> --pid=11597
>
> ==11597== --pid is optional if only one valgrind process is running
>
> ==11597==
>
> ==11597== Conditional jump or move depends on uninitialised value(s)
>
> ==11597== at 0x8A838B5: __register_atfork (in /usr/local/flare/flare)
>
> ==11597== by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
>
> ==11597== by 0x8A6C265: malloc_hook_ini (in /usr/local/flare/flare)
>
> ==11597== by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
>
> ==11597== by 0x8AA893B: _dl_init_paths (in /usr/local/flare/flare)
>
> ==11597== by 0x8A8BDBB: _dl_non_dynamic_init (in
> /usr/local/flare/flare)
>
> ==11597== by 0x8A8CA75: __libc_init_first (in /usr/local/flare/flare)
>
> ==11597== by 0x8A3E460: (below main) (in /usr/local/flare/flare)
>
> ==11597==
>
> ==11597== Conditional jump or move depends on uninitialised value(s)
>
> ==11597== at 0x8A83926: __register_atfork (in /usr/local/flare/flare)
>
> ==11597== by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
>
> ==11597== by 0x8A6C265: malloc_hook_ini (in /usr/local/flare/flare)
>
> ==11597== by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
>
> ==11597== by 0x8AA893B: _dl_init_paths (in /usr/local/flare/flare)
>
> ==11597== by 0x8A8BDBB: _dl_non_dynamic_init (in
> /usr/local/flare/flare)
>
> ==11597== by 0x8A8CA75: __libc_init_first (in /usr/local/flare/flare)
>
> ==11597== by 0x8A3E460: (below main) (in /usr/local/flare/flare)
>
> ......
>
> ...........
>
> .............
>
> ==11597==
>
> ==11597== 481512 errors in context 998 of 1000:
>
> ==11597== Invalid read of size 4
>
> ==11597== at 0x8A6FBCF: memcpy (in /usr/local/flare/flare)
>
> ==11597== by 0x80D9F80:
> configuration::cfgparams::read_pattern(std::string, std::string,
> fsm_t*) (stl_iterator.h:704)
>
> ==11597== by 0x8086EE2:
> configuration::cfg::read_pattern(configuration::configfile*,
> std::string, bool, unsigned short) (cfgdata.cpp:1684)
>
> ==11597== by 0x8088215: configuration::cfg::read_pattern_file()
> (cfgdata.cpp:780)
>
> ==11597== by 0x809935B: configuration::cfg::read_config_file()
> (cfgdata.cpp:623)
>
> ==11597== by 0x811576D: main (main.cpp:3930)
>
> ==11597== Address 0xbef88714 is on thread 1's stack
>
> ==11597==
>
> ==11597==
>
> ==11597== 481512 errors in context 999 of 1000:
>
> ==11597== Invalid read of size 4
>
> ==11597== at 0x8A6FBCF: memcpy (in /usr/local/flare/flare)
>
> ==11597== by 0x80D9CD4:
> configuration::cfgparams::read_pattern(std::string, std::string,
> fsm_t*) (stl_iterator.h:704)
>
> ==11597== by 0x8086EE2:
> configuration::cfg::read_pattern(configuration::configfile*,
> std::string, bool, unsigned short) (cfgdata.cpp:1684)
>
> ==11597== by 0x8088215: configuration::cfg::read_pattern_file()
> (cfgdata.cpp:780)
>
> ==11597== by 0x809935B: configuration::cfg::read_config_file()
> (cfgdata.cpp:623)
>
> ==11597== by 0x811576D: main (main.cpp:3930)
>
> ==11597== Address 0xbefc3388 is on thread 1's stack
>
> ==11597==
>
> ==11597==
>
> ==11597== 662079 errors in context 1000 of 1000:
>
> ==11597== Invalid read of size 4
>
> ==11597== at 0x8A6FBCF: memcpy (in /usr/local/flare/flare)
>
> ==11597== by 0x80DC54B: void
> std::__uninitialized_fill_n_aux<__gnu_cxx::__normal_iterator<flare_stack*,
> std::vector<flare_stack, std::allocator<flare_stack> > >, unsigned
> int, flare_stack>(__gnu_cxx::__normal_iterator<flare_stack*,
> std::vector<flare_stack, std::allocator<flare_stack> > >, unsigned
> int, flare_stack const&, __false_type) (stl_construct.h:81)
>
> ==11597== by 0x80DEAE5: std::vector<flare_stack,
> std::allocator<flare_stack>
> >::_M_fill_insert(__gnu_cxx::__normal_iterator<flare_stack*,
> std::vector<flare_stack, std::allocator<flare_stack> > >, unsigned
> int, flare_stack const&) (vector.tcc:365)
>
> ==11597== by 0x80DB8C4:
> configuration::cfgparams::read_pattern(std::string, std::string,
> fsm_t*) (stl_vector.h:658)
>
> ==11597== by 0x8086EE2:
> configuration::cfg::read_pattern(configuration::configfile*,
> std::string, bool, unsigned short) (cfgdata.cpp:1684)
>
> ==11597== by 0x8088215: configuration::cfg::read_pattern_file()
> (cfgdata.cpp:780)
>
> ==11597== by 0x809935B: configuration::cfg::read_config_file()
> (cfgdata.cpp:623)
>
> ==11597== by 0x811576D: main (main.cpp:3930)
>
> ==11597== Address 0xbedb2374 is on thread 1's stack
>
> ==11597==
>
> ==11597== ERROR SUMMARY: 3704242 errors from 1000 contexts
> (suppressed: 0 from 0)
>
> *When the same code I run as test program valgrind able to detect the
> leak.*
>
> *But NOT when it is part of my application.***
>
> Thanks,
>
> Sanjay
>
I'll leave the details of why leaks would not be reported to the
Valgrind development team. But it seems to me that you have plenty to
work on already: 3,704,242 errors of various kinds. If these are
cleaned up, what happens then?
I suppose it is possible that Valgrind doesn't notice leaks among all of
the other reports, but IMO that is a much smaller problem than reading
out-of-bounds memory and executing instructions based on uninitialized
data. My goal is always zero errors from Valgrind, even if the program
seems to work fine without any changes. Fix the errors that Valgrind
reports now and let us know if there are still problems with leak reporting.
--
David Chapman dcc...@ac...
Chapman Consulting -- San Jose, CA
Software Development Done Right.
www.chapman-consulting-sj.com
|
|
From: David C. <dcc...@ac...> - 2013-06-12 07:38:25
|
On 6/12/2013 12:01 AM, Sanjay Kumar (sanjaku5) wrote:
>
> Hi,
>
> I run my program using valgrind to detect the memory leaks
>
> valgrind -v --tool=memcheck --leak-check=full ./binary -f conf.file
>
>
> But it doesn't show leaks, even I create one leak of 10000 bytes in
> the code.
>
> Any wild guess ??
>
>
What happens if you create a simple program that does nothing but
allocate 10000 bytes and then exits? I presume that the program you
describe above cannot be posted on the Internet; try to create a test
case as small as possible.
And of course it would be good to know what version of Valgrind you are
using, and on what platform.
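A test case along the lines suggested above might look roughly like the following. It is a minimal sketch, not the poster's code: the function name leak_bytes is hypothetical, and unlike the snippet quoted later in the thread it returns early when malloc fails instead of copying into a NULL pointer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: allocates n bytes, writes a short string into
 * them and then deliberately drops the only pointer to the block.
 * Returns 0 on success, -1 if n is 0 or malloc fails. */
static int leak_bytes(size_t n)
{
    if (n == 0)
        return -1;

    char *mleak = malloc(n);
    if (mleak == NULL) {
        fprintf(stderr, "malloc(%zu) failed\n", n);
        return -1;
    }

    /* snprintf never writes more than n bytes, including the NUL. */
    snprintf(mleak, n, "%s", "aaaaaaaaaaaaaaaaaaaaaaa");
    printf("leaked %zu bytes holding \"%s\"\n", n, mleak);

    return 0; /* mleak goes out of scope here without free() */
}
```

Calling leak_bytes(10000) from a main that does nothing else, building the result as a normal dynamically linked executable, and running it with the same valgrind options as the full application gives a baseline to compare against.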
--
David Chapman dcc...@ac...
Chapman Consulting -- San Jose, CA
Software Development Done Right.
www.chapman-consulting-sj.com
|
|
From: Sanjay K. (sanjaku5) <san...@ci...> - 2013-06-12 07:01:21
|
Hi,
I run my program under valgrind to detect memory leaks:
valgrind -v --tool=memcheck --leak-check=full ./binary -f conf.file
But it doesn't show any leaks, even though I deliberately created a leak of 10000 bytes in the code.
Any wild guesses?
Thanks,
Sanjay
|
|
From: John R. <jr...@bi...> - 2013-06-10 14:12:53
|
> (gdb) monitor v.info last_error
> ==10259== Invalid write of size 4
> ==10259== at 0x28686C: vsnprintf (in /lib/libc-2.12.so)
> ==10259== Address 0x4b43040 is 45,120 bytes inside a block of size 65,536
> alloc'd
> ==10259== at 0x4005DB9: memalign (vg_replace_malloc.c:727)
> ==10259== by 0x4005E68: posix_memalign (vg_replace_malloc.c:876)
>
>
> -The memory for the stack is allocated using memalign and then the upper and
> lower parts of it are protected using mprotect, so the stack looks like
> this: 16k protected with mprotect, 32K valid for usage, another 16K
> protected. The problem happens only in the valid area.
There is an inherent conflict between memcheck and mprotect.
On one side, it is desirable that after mprotect which removes all access privileges then memcheck's accounting bits should reflect the current state, including no access.
On the other side, it is desirable that after a following mprotect which restores the previous access privileges to the same region, then memcheck's accounting bits should reflect the state *AS IF* those two mprotect() never had occurred.
That is, it would be nice if memcheck remembered which bytes were defined and undefined, even though the region "disappeared behind the mprotect NO-ACCESS curtain" for a while.
I don't know the exact situation now, but some time in the past it was impossible to have both properties because the accounting bits couldn't handle it.
--
|
|
From: mnaret <mn...@ci...> - 2013-06-10 08:23:37
|
Hello,
Recently I've been getting lots of "invalid read/invalid write" valgrind errors which point at memory allocated for the stack. However, the code doesn't crash and finishes running successfully.
I'm trying to understand where the error comes from, and will be grateful for any help with this issue.
Here's what I can see using vgdb:
(gdb) monitor v.info last_error
==10259== Invalid write of size 4
==10259== at 0x28686C: vsnprintf (in /lib/libc-2.12.so)
==10259== Address 0x4b43040 is 45,120 bytes inside a block of size 65,536 alloc'd
==10259== at 0x4005DB9: memalign (vg_replace_malloc.c:727)
==10259== by 0x4005E68: posix_memalign (vg_replace_malloc.c:876)
-The memory for the stack is allocated using memalign and then the upper and lower parts of it are protected using mprotect, so the stack looks like this: 16K protected with mprotect, 32K valid for usage, another 16K protected. The problem happens only in the valid area.
-The problem always seems to happen at the end of the first page of the valid memory. For example, the stack above starts at address 0x4b44000 and goes down; the first inaccessible address is 0x4b4304b.
-The address pointed out as inaccessible is above the stack pointer:
(gdb) info registers
eax 0x100 256
ecx 0x0 0
edx 0x0 0
ebx 0x3acff4 3854324
esp 0x4b42f44 0x4b42f44
ebp 0x4b4304c 0x4b4304c
esi 0x4b43b70 78920560
edi 0x4b431d4 78918100
eip 0x28686c 0x28686c <vsnprintf+12>
-Even though addressability is not supposed to be affected by mprotect, when I comment out the calls to mprotect the problem doesn't happen any more.
-There's no specific line in the code that is causing the problem. It seems that the problem always happens at the end of the first page of the stack. I tried checking the memory at the point of allocation, and the reported address is valid there. It also seems valid when the thread starts running and becomes invalid only when the stack depth gets near 4K.
Considering that there's no crash and the program runs normally and correctly in spite of those errors, I don't understand why valgrind complains.
Any help with the cause of the problem or with further evaluation will be highly appreciated; after spending a few days on this I've run out of ideas.
Thank you,
Masha.
--
View this message in context: http://valgrind.10908.n7.nabble.com/Valgrind-shows-Invalid-write-os-size-4-for-memory-allocated-for-the-stack-tp45597.html
Sent from the Valgrind - Users mailing list archive at Nabble.com.
|
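The addresses in the report above can be cross-checked with a little arithmetic. The sketch below assumes only the 16K/32K/16K layout and the numbers printed by Memcheck; the helper names (block_base, usable_top, depth_below_top) are made up for illustration.

```c
#include <stdint.h>

/* Layout from the report: 16K guard, 32K usable, 16K guard. */
enum { GUARD_BYTES = 16 * 1024, USABLE_BYTES = 32 * 1024 };

/* Start of the memalign'd block, given an address inside it and the
 * offset Memcheck printed ("N bytes inside a block"). */
static uintptr_t block_base(uintptr_t addr, uintptr_t offset_in_block)
{
    return addr - offset_in_block;
}

/* One past the highest usable byte: where a downward-growing stack starts. */
static uintptr_t usable_top(uintptr_t base)
{
    return base + GUARD_BYTES + USABLE_BYTES;
}

/* How far below the top of the usable region an address lies. */
static uintptr_t depth_below_top(uintptr_t base, uintptr_t addr)
{
    return usable_top(base) - addr;
}
```

With the reported values, block_base(0x4b43040, 45120) is 0x4b38000, usable_top of that base is 0x4b44000 (the stack start quoted in the message), and the faulting address lies 4032 bytes below that top, which is consistent with the observation that the errors appear near the end of the first 4K page of the stack.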
|
From: John R. <jr...@bi...> - 2013-06-05 19:09:44
|
>> is it certain, based on what you've seen, that I actually am using the debug
>> package? I installed it, which apparently is all that is required to cause
>> it to get used when compiled for debugging?
> There is a chance of pathname mixup: installing into the "wrong" directory.
>
> The way to tell is to run valgrind under strace:
>
> $ strace -f -o strace.out -e trace=file ~/local/bin/valgrind --leak-check=yes ./clock_gettime CLOCK_MONOTONIC
>
> and afterwards look in strace.out for any open() on the debug symbol file(s).
>
Also try
$ ~/local/bin/valgrind -d -d -d -v -v -v --leak-check=yes ./clock_gettime CLOCK_MONOTONIC
and look for debug symbol loading such as:
-----
--25057:1:main Load initial debug info
--25057-- Reading syms from /usr/bin/date
--25057-- svma 0x0000401ad0, avma 0x0000401ad0
--25057-- Considering /usr/lib/debug/.build-id/08/884e015589393715fa4c5d3e9a4ab0d1541e99.debug ..
--25057-- .. build-id is valid
-----
--
|
|
From: John R. <jr...@bi...> - 2013-06-05 17:57:23
|
> is it certain, based on what you've seen, that I actually am using the debug
> package? I installed it, which apparently is all that is required to cause
> it to get used when compiled for debugging?
There is a chance of pathname mixup: installing into the "wrong" directory.
The way to tell is to run valgrind under strace:
$ strace -f -o strace.out -e trace=file ~/local/bin/valgrind --leak-check=yes ./clock_gettime CLOCK_MONOTONIC
and afterwards look in strace.out for any open() on the debug symbol file(s).
--
|
|
From: John R. <jr...@bi...> - 2013-06-05 15:45:09
|
On 06/04/2013, Britton Kerin wrote about "Beagleboard white" ARM using Angstrom Linux:
>>> root@bboneumh2:~/software# ~/local/bin/valgrind --leak-check=yes
>>> ./clock_gettime CLOCK_MONOTONIC
>>> ==28151== Memcheck, a memory error detector
>>> ==28151== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
>>> ==28151== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
>>> ==28151== Command: ./clock_gettime CLOCK_MONOTONIC
>>> ==28151==
>>> ==28151== Conditional jump or move depends on uninitialised value(s)
>>> ==28151== at 0x400B308: ??? (in /lib/ld-2.12.2.so)
>>> ==28151==
> I'll attach the output of 'objdump -d -S /lib/ld-2.12.2.so' as well. In
> case I'm misunderstanding things.
The addresses such as
==28151== at 0x400B308: ??? (in /lib/ld-2.12.2.so)
are "in /lib/ld-2.12.2.so", so the code at that location is what matters.
Looking at those addresses (0x4000000 less because of runtime placement
in the process versus no prelinking in the file):
$$$ b1b4: e0828001 add r8, r2, r1
b1b8: e5970008 ldr r0, [r7, #8]
b1bc: e3500000 cmp r0, #0
%%% b1c0: 0a00011f beq b644 <_dl_rtld_di_serinfo+0x2fbc>
b1c4: e1520008 cmp r2, r8
%%% b1c8: 2a00000d bcs b204 <_dl_rtld_di_serinfo+0x2b7c>
b2bc: e30a0aab movw r0, #43691 ; 0xaaab
b2c0: e595c004 ldr ip, [r5, #4]
b2c4: e34a0aaa movt r0, #43690 ; 0xaaaa; 0xaaaaaaab = (2<<32)/3
b2c8: e0815190 umull r5, r1, r0, r1 ### (r1=hi, r5=lo) = r0 * r1; 32x32==>64
b2cc: e1a011a1 lsr r1, r1, #3 ### r1 = original_r1 / 12;
b2d0: e151000c cmp r1, ip
b2d4: 21a0100c movcs r1, ip ### minimum
b2d8: e0811081 add r1, r1, r1, lsl #1 ### r1 = r1 + (r1<<1); // * 3
b2dc: e1a05101 lsl r5, r1, #2
b2e0: e51b7070 ldr r7, [fp, #-112] ; 0x70
$$$ b2e4: e0825005 add r5, r2, r5
b2e8: e08f1007 add r1, pc, r7
b2ec: e2811e51 add r1, r1, #1296 ; 0x510
b2f0: e2811008 add r1, r1, #8
b2f4: e1540001 cmp r4, r1
b2f8: 0a00000a beq b328 <_dl_rtld_di_serinfo+0x2ca0>
b2fc: e35a0000 cmp sl, #0
b300: 0a000239 beq bbec <_dl_rtld_di_serinfo+0x3564>
b304: e1520005 cmp r2, r5
%%% b308: 2a000006 bcs b328 <_dl_rtld_di_serinfo+0x2ca0>
b30c: e5931008 ldr r1, [r3, #8]
b310: e5932000 ldr r2, [r3]
b314: e283300c add r3, r3, #12
b318: e1550003 cmp r5, r3
b31c: e081100a add r1, r1, sl
b320: e782100a str r1, [r2, sl]
b324: 8afffff8 bhi b30c <_dl_rtld_di_serinfo+0x2c84>
b328: e59430e4 ldr r3, [r4, #228] ; 0xe4
b32c: e3530000 cmp r3, #0
b330: 0a000362 beq c0c0 <_dl_rtld_di_serinfo+0x3a38>
b334: e51b0064 ldr r0, [fp, #-100] ; 0x64
b338: e5933004 ldr r3, [r3, #4]
b33c: e1500005 cmp r0, r5
b340: e50b3068 str r3, [fp, #-104] ; 0x68
%%% b344: 9a000088 bls b56c <_dl_rtld_di_serinfo+0x2ee4>
c0c0: e51bc064 ldr ip, [fp, #-100] ; 0x64
c0c4: e15c0005 cmp ip, r5
%%% c0c8: 9afffd27 bls b56c <_dl_rtld_di_serinfo+0x2ee4>
we see that the complaints are about comparisons such as
b1c4: e1520008 cmp r2, r8
b304: e1520005 cmp r2, r5
where one of the operands was an addend for the other:
$$$ b1b4: e0828001 add r8, r2, r1 ### r8 = r2 + r1;
$$$ b2e4: e0825005 add r5, r2, r5 ### r5 += r2;
This looks like [I saw this a decade ago]:
----- glibc-2.12/elf/do-rel.h
auto inline void __attribute__ ((always_inline))
elf_dynamic_do_rel (struct link_map *map,
ElfW(Addr) reladdr, ElfW(Addr) relsize,
int lazy)
{
const ElfW(Rel) *r = (const void *) reladdr;
const ElfW(Rel) *end = (const void *) (reladdr + relsize);
ElfW(Addr) l_addr = map->l_addr;
#if (!defined DO_RELA || !defined ELF_MACHINE_PLT_REL) && !defined RTLD_BOOTSTRAP
/* We never bind lazily during ld.so bootstrap. Unfortunately gcc is
not clever enough to see through all the function calls to realize
that. */
if (lazy)
{
/* Doing lazy PLT relocations; they need very little info. */
for (; r < end; ++r)
elf_machine_lazy_rel (map, l_addr, r);
}
-----
----- glibc-2.12/elf/dynamic-link.h
# define _ELF_DYNAMIC_DO_RELOC(RELOC, reloc, map, do_lazy, test_rel) \
do { \
struct { ElfW(Addr) start, size; int lazy; } ranges[2]; \
ranges[0].lazy = 0; \
ranges[0].size = ranges[1].size = 0; \
ranges[0].start = 0; \
...
{ \
int ranges_index; \
for (ranges_index = 0; ranges_index < 2; ++ranges_index) \
elf_dynamic_do_##reloc ((map), \
ranges[ranges_index].start, \
ranges[ranges_index].size, \
ranges[ranges_index].lazy); \
} \
-----
Note that ranges[1].start is never initialized in dynamic-link.h,
while elf_dynamic_do_rel() computes the sum
const ElfW(Rel) *r = (const void *) reladdr;
const ElfW(Rel) *end = (const void *) (reladdr + relsize);
and then compares
for (; r < end; ++r)
which relies on
(uninit + 0) == the_same_uninit
Thus glibc steps outside the C standard, because the value of an uninitialized
object is indeterminate and a comparison that depends on it has no defined result.
Memcheck complains correctly; glibc stubbornly
HAS REFUSED (!!!!!) the trivial fix "ranges[1].start = 0;" which costs *1* CPU cycle.
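A minimal sketch of that fix, assuming a simplified stand-in for the macro: struct range, count_entries() and walk_ranges_fixed() are hypothetical names, not glibc code. With every field of both ranges set before use, the "r < end" comparison only ever sees defined values, so memcheck should have nothing to report for this loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the anonymous struct in _ELF_DYNAMIC_DO_RELOC. */
struct range { uintptr_t start, size; int lazy; };

/* Counts the entries that a "for (; r < end; ++r)" loop would visit. */
size_t count_entries(uintptr_t reladdr, uintptr_t relsize, size_t entsize)
{
    uintptr_t r = reladdr;
    uintptr_t end = reladdr + relsize;
    size_t n = 0;

    assert(entsize > 0);
    for (; r < end; r += entsize)
        ++n;
    return n;
}

/* Same shape as the macro, but with ranges[1].start initialized. */
size_t walk_ranges_fixed(size_t entsize)
{
    struct range ranges[2];
    size_t visited = 0;
    int i;

    ranges[0].lazy = 0;
    ranges[0].size = ranges[1].size = 0;
    ranges[0].start = 0;
    ranges[1].start = 0;   /* the one-line fix proposed in the text */
    ranges[1].lazy = 0;    /* assumption: set only to keep this sketch fully defined */

    for (i = 0; i < 2; ++i)
        visited += count_entries(ranges[i].start, ranges[i].size, entsize);
    return visited;
}
```

Both ranges are empty here, so the walk visits no entries; the point is only that "end" is computed from a defined start.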
Glibc uses this code on all platforms, so why doesn't memcheck complain everywhere?
Actually memcheck does notice, but the default suppressions hide the complaint.
Why don't the default suppressions work here?
File valgrind/glibc-2.X.supp contains various suppressions such as:
{
dl-hack3-cond-0
Memcheck:Cond
fun:_dl_start
fun:_start
}
{
dl-hack3-cond-1
Memcheck:Cond
obj:*/lib*/ld-2.12*.so*
obj:*/lib*/ld-2.12*.so*
obj:*/lib*/ld-2.12*.so*
}
which suppress "Cond" complaints ("Conditional jump or move depends on uninitialised value(s)")
in the context of a particular traceback (_dl_start called from _start) or a pattern
of any three nested routines within ld-2.12*.so*.
THE PROBLEM with Angstrom ld-2.12.2.so is that there are so few symbols
that the traceback is only the *one* routine
==28151== at 0x400B308: ??? (in /lib/ld-2.12.2.so)
whose name memcheck does not know. [The only name that objdump discovers is
_dl_rtld_di_serinfo, which is highly suspect.] Thus the known suppressions
do not match the traceback.
Therefore Angstrom should provide better debugging info for developers.
[And glibc should fix its outright error.]
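An aside on the annotated listing: the movw/movt/umull/lsr sequence at b2bc..b2cc is the usual way to divide by a constant without a divide instruction. A small sketch in C, assuming 32-bit unsigned operands; div12() and div12_self_check() are illustrative names, not anything from glibc:

```c
#include <assert.h>
#include <stdint.h>

/* Divide by 12 the way the listing does: multiply by 0xaaaaaaab
   (2^33/3 rounded up), keep the high word, then shift right by 3. */
uint32_t div12(uint32_t x)
{
    uint64_t product = (uint64_t)x * 0xAAAAAAABu;  /* umull: 32x32 ==> 64 */
    uint32_t hi = (uint32_t)(product >> 32);       /* high word (r1 after the umull) */
    return hi >> 3;                                /* lsr r1, r1, #3 */
}

/* Spot-check a few values against the ordinary division operator. */
void div12_self_check(void)
{
    static const uint32_t samples[] = { 0u, 1u, 11u, 12u, 13u, 1296u, 0xFFFFFFFFu };
    unsigned i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(div12(samples[i]) == samples[i] / 12u);
}
```

Taking the high word is a shift by 32 and the lsr adds 3 more, so the net effect is (x * 0xaaaaaaab) >> 35.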
--
|
|
From: John R. <jr...@bi...> - 2013-06-03 20:20:14
|
[[Inconsistently quoted, for brevity.]]
> The actual source of memcheck's complaint is coregrind/m_redir.c:
> void VG_(redir_initialise) ( void )
> <<snip>>
> # elif defined(VGP_arm_linux)
> /* If we're using memcheck, use these intercepts right from
> the start, otherwise ld.so makes a lot of noise. */
> if (0==VG_(strcmp)("Memcheck", VG_(details).name)) {
> add_hardwired_spec(
> "ld-linux.so.3", "strlen",
> (Addr)&VG_(arm_linux_REDIR_FOR_strlen),
> complain_about_stripped_glibc_ldso
> );
> //add_hardwired_spec(
> // "ld-linux.so.3", "index",
> // (Addr)&VG_(arm_linux_REDIR_FOR_index),
> // NULL
> //);
> add_hardwired_spec(
> "ld-linux.so.3", "memcpy",
> (Addr)&VG_(arm_linux_REDIR_FOR_memcpy),
> complain_about_stripped_glibc_ldso
> );
> }
> I finally got around to trying this. I used 'trunk' in the svn command,
> which I hope is what you meant by top-of-trunk. It had both the strlen
> and memcpy sections that you mention above uncommented, so I tried
> commenting out the memcpy section. I then got an error just like the one
> I originally described, except it referred to strlen. So I tried again
> with that section commented out as well. Then the program ran, but I
> got some scary messages:
>
> root@bboneumh2:~/software# ~/local/bin/valgrind --leak-check=yes
> ./clock_gettime CLOCK_MONOTONIC
> ==28151== Memcheck, a memory error detector
> ==28151== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> ==28151== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
> ==28151== Command: ./clock_gettime CLOCK_MONOTONIC
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151== at 0x400B308: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151== at 0x400B344: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151== at 0x400C0C8: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151== at 0x400B1C0: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151== at 0x400B1C8: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> 234081.669967
> ==28151==
> ==28151== HEAP SUMMARY:
> ==28151== in use at exit: 0 bytes in 0 blocks
> ==28151== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
> ==28151==
> ==28151== All heap blocks were freed -- no leaks are possible
> ==28151==
> ==28151== For counts of detected and suppressed errors, rerun with: -v
> ==28151== Use --track-origins=yes to see where uninitialised values come from
> ==28151== ERROR SUMMARY: 17 errors from 5 contexts (suppressed: 0 from 0)
>
> I haven't seen anything like those messages before. Are they likely
> related to the commented-out sections?
Very probably these are due to calls on strlen() or memcpy()
which have been expanded inline because of aggressive optimization
("#include <string.h>" compiled with -O2 or -O3.)
You can check this by examining the generated assembly code
which surrounds the addresses mentioned (0x400B308, 0x400B344, 0x400C0C8, etc.,
which are runtime addresses that are 0x4000000 higher than the addresses
in the filesystem copy of ld-2.12.2.so [or 0x0 higher in case ld-linux
has been prelinked.]) It should look like strlen or memcpy.
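The address arithmetic described above can be sketched as follows, assuming the 0x4000000 load offset seen in this thread (0 for a prelinked ld-linux). runtime_to_file_addr() is a hypothetical helper, not part of valgrind or binutils:

```c
#include <assert.h>
#include <stdint.h>

/* Convert an address from a memcheck report into the address that
   "objdump -d" shows for the on-disk file.  load_offset is where the
   object was placed at runtime relative to its link-time addresses. */
uint32_t runtime_to_file_addr(uint32_t runtime_addr, uint32_t load_offset)
{
    assert(runtime_addr >= load_offset);
    return runtime_addr - load_offset;
}
```

For example, 0x400B308 with an offset of 0x4000000 gives b308, which is the address to look for in the disassembly.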
Assuming that it does look like strlen or memcpy, then it is safe to construct
and use one or more suppression rules in order to hide these complaints
in the future. (We know [assume] that it is safe because other platforms
have hit this problem before, and the code is "the same". In any event,
you as a "mere" user of ld-linux cannot do anything about that code, so don't
listen to those complaints.) Invoke valgrind using the additional argument
"--gen-suppressions=yes" and see the documentation about adding the output
to the default suppressions for your platform.
The other choice is to ask the supplier of your glibc for a version of ld-linux
that is compiled without -O2 and without -O3. [ld-linux is a part of glibc]
Your supplier of ld-linux is acting unfriendly to developers by not making
such a version readily available. This problem has been known generally
for several years by embedded developers on many platforms. You should
raise your voice and complain somewhat forcefully to your supplier of ld-linux.
[Of course, this being open source, in theory you could re-compile ld-linux
yourself. But it is somewhat difficult, cumbersome, and long.]
--
|
|
From: Julian S. <js...@ac...> - 2013-05-30 22:31:25
|
On 05/30/2013 09:52 PM, Dieter, William R wrote:
> I followed the instructions in README.android, setting HWKIND to
> generic, and making to following changes to get valgrind to build:
Cross-compiling doesn't work well from MacOS hosts. Those instructions
work better on Linux hosts.
> and included the --smc-check=all in the VGPARAMS, because it sounded
> like it would be required for an x86 build.
Yes.
> m_debuginfo/readelf.c:577 (get_elf_symbol_info): Assertion 'in_rx' failed.
This happens when reading debug info (line numbers etc) from some .so,
or the main executable. These tend to be hard to diagnose.
First, find out which object is causing the problem, by rerunning with -v.
Then, ideally send me the object. If you can't do that, get a diagnostic
dump of the debuginfo reader by adding the flags
--trace-symtab=yes --trace-symtab-patt=<whatever>.
These generate a huge amount of output, so I suggest you play around
with them first on a simple program (eg, ls) on the host, so you can see
how to use --trace-symtab-patt= to get output for just the object in
question.
Really also you should file a bug report, since bugs reported by email
tend to fall through the cracks.
J
|
|
From: Alan M. <ala...@jp...> - 2013-05-30 21:54:02
|
On 5/30/2013 11:22 AM, Alan Mazer wrote:
> On 5/30/13 10:16 AM, Dan Kegel wrote:
>> On Thu, May 30, 2013 at 9:15 AM, Alan Mazer
>> <ala...@jp...> wrote:
>>> This type of error is occurring in a variety of places, always
>>> "definitely lost" memory at a function call.
>>>
>>> Anyone have any idea what I'm missing? I've tried multiple compilers
>>> and a bazillion different options (on compiler and valgrind). I'm
>>> stumped.
>> Have you looked at what the compiler is generating, or single-stepped
>> through that call?
>
> I just looked at the output from -S but nothing looks odd. It very
> quickly jumps into strcmp calls that are near the top of the called
> routine.
>
> I've been running this under OS X but I just tried it on a Linux
> machine and there got more information:
>
> ==15252== 8,233 bytes in 1 blocks are definitely lost in loss record 4
> of 4
> ==15252== at 0x4A07152: operator new[](unsigned long)
> (vg_replace_malloc.c:363)
> ==15252== by 0x40E1A0: read_conf(char const*, char const*, char
> const*, char const*, unsigned char, SL_List<char*>*, char**)
> (parsing.cpp:295)
> ==15252== by 0x41799A: main (main.cpp:404)
>
> On Linux valgrind gives me a line number within the routine (line 295)
> that has a legitimate memory leak.
>
> So I guess I just need to run Valgrind on my Linux machine and avoid
> the Mac?
It looks like the problem is that valgrind can't locate the "new" usages
within the routines. Looking at this more I can see that the reported
memory leaks are all valid, and the tracebacks are perfect except for
the locations within the "new"-using routines. That part of the
traceback is omitted in every message. Compiling with the optimizer
enabled heightens the effect, so I thought I might have some
optimizations accidentally enabled but it doesn't look like it. It's
something weird about g++ on OS X...
|
|
From: Dieter, W. R <wil...@in...> - 2013-05-30 19:52:24
|
I am trying to build valgrind to help debug a native Android
application. The host I am compiling on is a Mac running Mac OS
10.8.3. The target is an internal prototype x86 tablet running
Android 4.0.4. I am using Android NDK r8e.
I started with the release version of Valgrind 3.8.1. When I ran into
the premature exit described at the end of this note, I switched to
the svn version.
I followed the instructions in README.android, setting HWKIND to
generic, and making the following changes to get valgrind to build:
1) In the environment variable definitions for the build tools,
substituted "darwin-x86_64" for "linux-x86" in the path to each of
the tools.
2) Added:
export RANLIB=$NDKROOT/toolchains/x86-4.4.3/prebuilt/linux-x86/bin/i686-android-linux-ranlib
to get the right ranlib executable
3) The target cpu/kernel detection logic assumes it is building for
the host CPU. The --target and --host options cover most of the
issues, but the configure script tries to run "uname -r" to get the
kernel version.
The logic in configure.in that matches kernel versions treats 2.6.*
and 3.0.* the same way, so if you are building on a relatively
recent Linux system it will probably work fine. Mac OS is
returning an OS version of 12.3.0, which is unrelated to the
Android kernel version.
I hardcoded configure.in to use version "3.0.8" to match my actual
device, though maybe calling 'adb shell uname -r' would make more
sense for android targets.
4) The types uint32_t and uint64_t are referenced in the system elf.h,
and not defined by default on my system, so I added "#include
<stdint.h>" prior to each "#include <elf.h>"
(coregrind/m_main.c:2987, coregrind/m_coredump/coredump-elf.c:57,
coregrind/m_debuginfo/readelf.c:57,
coregrind/m_initimg/initimg-linux.c:60, coregrind/m_ume/elf.c:53,
coregrind/launcher-linux.c:47)
When I run "/data/local/Inst/bin/valgrind ls", ls runs without any
errors, and I get the expected output:
==32681== Memcheck, a memory error detector
==32681== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==32681== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright
info
==32681== Command: ls
==32681==
[... ls output deleted ... ]
==32681==
==32681== HEAP SUMMARY:
==32681== in use at exit: 1,024 bytes in 1 blocks
==32681== total heap usage: 41 allocs, 40 frees, 5,967 bytes allocated
==32681==
==32681== LEAK SUMMARY:
==32681== definitely lost: 0 bytes in 0 blocks
==32681== indirectly lost: 0 bytes in 0 blocks
==32681== possibly lost: 0 bytes in 0 blocks
==32681== still reachable: 1,024 bytes in 1 blocks
==32681== suppressed: 0 bytes in 0 blocks
==32681== Rerun with --leak-check=full to see details of leaked memory
==32681==
==32681== For counts of detected and suppressed errors, rerun with: -v
==32681== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I then added a wrapper, as described by jseward's blog post
(http://blog.mozilla.org/jseward/2011/09/27/valgrind-on-android-current-status/),
and included the --smc-check=all in the VGPARAMS, because it sounded
like it would be required for an x86 build. The whole
/data/local/start_valgrind_myprog file looks like this:
#!/system/bin/sh
VGPARAMS='--error-limit=no --smc-check=all'
export TMPDIR=/data/data/com.intel.central
exec /data/local/Inst/bin/valgrind $VGPARAMS $*
When I start my application with:
am start -a android.intent.action.MAIN -n com.intel.central/.MainActivity
I see the following in logcat:
I//data/local/start_valgrind_myprog(32640): ==32641== Using
Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
I//data/local/start_valgrind_myprog(32640): ==32641== Command:
/system/bin/app_process /system/bin --application
--nice-name=com.intel.central com.android.internal.os.WrapperInit
32 17 android.app.ActivityThread
I//data/local/start_valgrind_myprog(32640): ==32641==
I//data/local/start_valgrind_myprog(32640): valgrind:
m_debuginfo/readelf.c:577 (get_elf_symbol_info): Assertion 'in_rx' failed.
I//data/local/start_valgrind_myprog(32640): ==32641== at 0x38033455:
report_and_quit (m_libcassert.c:260)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x38033851:
vgPlain_assert_fail (m_libcassert.c:340)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x3806D80E:
read_elf_symtab__normal (readelf.c:577)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x380705F3:
vgModuleLocal_read_elf_debug_info (readelf.c:2655)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x3806A449:
vgPlain_di_notify_mmap (debuginfo.c:629)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x38097510:
vgModuleLocal_generic_PRE_sys_mmap (syswrap-generic.c:2087)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x380C9A0B:
vgSysWrap_x86_linux_sys_mmap2_before (syswrap-x86-linux.c:1247)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x3808C830:
vgPlain_client_syscall (syswrap-main.c:1522)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x38089C12:
vgPlain_scheduler (scheduler.c:1066)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x380C1188:
run_a_thread_NORETURN (syswrap-linux.c:103)
I//data/local/start_valgrind_myprog(32640): sched status:
I//data/local/start_valgrind_myprog(32640): running_tid=1
I//data/local/start_valgrind_myprog(32640): Thread 1: status =
VgTs_Runnable
I//data/local/start_valgrind_myprog(32640): ==32641== at 0xB000F261:
__dl___mmap2 (in /system/bin/linker)
I//data/local/start_valgrind_myprog(32640): Note: see also the FAQ in the
source distribution.
I//data/local/start_valgrind_myprog(32640): It contains workarounds to
several common problems.
I//data/local/start_valgrind_myprog(32640): In particular, if Valgrind
aborted or crashed after
I//data/local/start_valgrind_myprog(32640): identifying problems in your
program, there's a good chance
I//data/local/start_valgrind_myprog(32640): that fixing those problems
will prevent Valgrind aborting or
I//data/local/start_valgrind_myprog(32640): crashing, especially if it
happened in m_mallocfree.c.
I//data/local/start_valgrind_myprog(32640): If that doesn't help, please
report this bug to: www.valgrind.org
I//data/local/start_valgrind_myprog(32640): In the bug report, send all
the above text, the valgrind
I//data/local/start_valgrind_myprog(32640): version, and what OS and
version you are using. Thanks.
The wrapper script appears to be launching the application, but it
looks like valgrind is exiting immediately with an assertion failure.
Any ideas what could be going wrong or how to fix it?
Thanks,
Bill.
|
|
From: Alan M. <al...@ju...> - 2013-05-30 18:51:20
|
On 5/30/13 9:43 AM, David Chapman wrote:
> On 5/30/2013 9:15 AM, Alan Mazer wrote:
>> I've been using valgrind for many years and am finally stumped. After
>> not having used it recently, I'm now getting "definitely lost" memory
>> and the location of the "operator new" usage is a function call.
>>
>> For example...
>>
>> ==35995== 2,089 bytes in 1 blocks are definitely lost in loss record 9
>> of 10
>> ==35995== at 0x10008FC16: malloc (vg_replace_malloc.c:274)
>> ==35995== by 0x1000A126C: operator new(unsigned long) (in
>> /usr/local/lib/libstdc++.6.dylib)
>> ==35995== by 0x10001984C: main (main.cpp:405)
>>
>> where main.cpp:405 is...
>>
>> read_conf(conffile, ".conf;.test;.prot", preface,
>> trigger_statement,
>> skip_preprocessor, m4dirs, &text);
>>
>> Everything being passed is a pointer.
>>
>> If I don't call the routine, the error does go away.
>>
>> This type of error is occurring in a variety of places, always
>> "definitely lost" memory at a function call.
>>
>> Anyone have any idea what I'm missing? I've tried multiple compilers
>> and a bazillion different options (on compiler and valgrind). I'm
>> stumped.
>>
>> -- Alan
>>
>
> What is the signature of the function being called? You say that
> pointers are being passed in, but what does the function expect? For
> example, is the compiler automatically creating an object for the
> string constant using a default constructor? (I realize the loss
> record is longer than the string constant, but I don't know what the
> function expects.)
Sure. Good point.
extern int read_conf(const char *filename, const char *extensions,
const char *preface, const char *postscript, u_char skip_preprocessor,
SL_List<char *> *m4dirs, char **conf_text_p);
>
> Also, is the function in a .so file that has been stripped so that
> Valgrind cannot follow its stack trace?
No, there are no .so files here. Just a bunch of object files linked
together, all compiled with -g and -Wall, linked with -lm. I've tried
with g++ 4.5.1 and 4.7.0. They both give the same result.
>
> And of course, what version of Valgrind are you using?
3.8.1.
-- Alan
|
|
From: Alan M. <al...@ju...> - 2013-05-30 18:23:28
|
On 5/30/13 10:16 AM, Dan Kegel wrote:
> On Thu, May 30, 2013 at 9:15 AM, Alan Mazer <ala...@jp...> wrote:
>> This type of error is occurring in a variety of places, always
>> "definitely lost" memory at a function call.
>>
>> Anyone have any idea what I'm missing? I've tried multiple compilers
>> and a bazillion different options (on compiler and valgrind). I'm stumped.
> Have you looked at what the compiler is generating, or single-stepped
> through that call?
I just looked at the output from -S but nothing looks odd. It very
quickly jumps into strcmp calls that are near the top of the called
routine.
I've been running this under OS X but I just tried it on a Linux machine
and there got more information:
==15252== 8,233 bytes in 1 blocks are definitely lost in loss record 4 of 4
==15252== at 0x4A07152: operator new[](unsigned long) (vg_replace_malloc.c:363)
==15252== by 0x40E1A0: read_conf(char const*, char const*, char const*, char const*, unsigned char, SL_List<char*>*, char**) (parsing.cpp:295)
==15252== by 0x41799A: main (main.cpp:404)
On Linux valgrind gives me a line number within the routine (line 295)
that has a legitimate memory leak.
So I guess I just need to run Valgrind on my Linux machine and avoid the
Mac?
-- Alan
|
|
From: David C. <dcc...@ac...> - 2013-05-30 17:36:51
|
On 5/30/2013 9:15 AM, Alan Mazer wrote:
> I've been using valgrind for many years and am finally stumped. After
> not having used it recently, I'm now getting "definitely lost" memory
> and the location of the "operator new" usage is a function call.
>
> For example...
>
> ==35995== 2,089 bytes in 1 blocks are definitely lost in loss record 9
> of 10
> ==35995== at 0x10008FC16: malloc (vg_replace_malloc.c:274)
> ==35995== by 0x1000A126C: operator new(unsigned long) (in
> /usr/local/lib/libstdc++.6.dylib)
> ==35995== by 0x10001984C: main (main.cpp:405)
>
> where main.cpp:405 is...
>
> read_conf(conffile, ".conf;.test;.prot", preface, trigger_statement,
> skip_preprocessor, m4dirs, &text);
>
> Everything being passed is a pointer.
>
> If I don't call the routine, the error does go away.
>
> This type of error is occurring in a variety of places, always
> "definitely lost" memory at a function call.
>
> Anyone have any idea what I'm missing? I've tried multiple compilers
> and a bazillion different options (on compiler and valgrind). I'm stumped.
>
> -- Alan
>
What is the signature of the function being called? You say that
pointers are being passed in, but what does the function expect? For
example, is the compiler automatically creating an object for the string
constant using a default constructor? (I realize the loss record is
longer than the string constant, but I don't know what the function
expects.)
Also, is the function in a .so file that has been stripped so that
Valgrind cannot follow its stack trace?
And of course, what version of Valgrind are you using?
--
David Chapman dcc...@ac...
Chapman Consulting -- San Jose, CA
Software Development Done Right.
www.chapman-consulting-sj.com
|
|
From: Dan K. <da...@ke...> - 2013-05-30 17:17:06
|
On Thu, May 30, 2013 at 9:15 AM, Alan Mazer <ala...@jp...> wrote:
> This type of error is occurring in a variety of places, always
> "definitely lost" memory at a function call.
>
> Anyone have any idea what I'm missing? I've tried multiple compilers
> and a bazillion different options (on compiler and valgrind). I'm stumped.
Have you looked at what the compiler is generating, or single-stepped
through that call?
|
|
From: Alan M. <ala...@jp...> - 2013-05-30 16:15:40
|
I've been using valgrind for many years and am finally stumped. After
not having used it recently, I'm now getting "definitely lost" memory
and the location of the "operator new" usage is a function call.
For example...
==35995== 2,089 bytes in 1 blocks are definitely lost in loss record 9
of 10
==35995== at 0x10008FC16: malloc (vg_replace_malloc.c:274)
==35995== by 0x1000A126C: operator new(unsigned long) (in
/usr/local/lib/libstdc++.6.dylib)
==35995== by 0x10001984C: main (main.cpp:405)
where main.cpp:405 is...
read_conf(conffile, ".conf;.test;.prot", preface, trigger_statement,
skip_preprocessor, m4dirs, &text);
Everything being passed is a pointer.
If I don't call the routine, the error does go away.
This type of error is occurring in a variety of places, always
"definitely lost" memory at a function call.
Anyone have any idea what I'm missing? I've tried multiple compilers
and a bazillion different options (on compiler and valgrind). I'm stumped.
-- Alan
|
|
From: John R. <jr...@bi...> - 2013-05-27 05:06:10
|
On 05/23/2013 10:14 AM, Britton Kerin wrote:
> On Mon, May 20, 2013 at 7:53 AM, John Reiser <jr...@bi...> wrote:
>>> I'm on a beaglebone white with vanilla Angstrom (linux 3.2)
>>> distribution. Valgrind fails like this:
>>
>>> ==13719== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
>>
>>> valgrind: A must-be-redirected function
>>> valgrind: whose name matches the pattern: memcpy
>>> valgrind: in an object with soname matching: ld-linux.so.3
>>> valgrind: was not found whilst processing
>>> valgrind: symbols from the object with soname: ld-linux.so.3
>>
>>> I've installed libc6-dbg as it says. Same problem.
>>
>> Which other Linux distribution is Angstrom derived from, or is most similar to?
>
> Well, it's based in part on OpenZaurus, which was Debian-based. And it seems
> pretty Debian-like in the way its package manager opkg works. But that's
> about all I know; I'm just using it because it ships on the beaglebone.
OK. It's Debian. The specific version of Debian might matter,
but perhaps in this case that info is no longer so significant.
>
>> Try running the previous version 3.7 of valgrind.
>
> This version doesn't seem to be available on the release archive page,
> where can I get it?
In general, try the Internet Archive. But in this case the source code
to valgrind is better anyway; see below.
>
>> Also post the output from:
>> readelf --symbols ld-linux.so.3 | grep mem
>
> root@bboneumh:/# readelf --symbols /lib/ld-linux.so.3 | grep mem
> 12: 000150a8 332 FUNC WEAK DEFAULT 10 __libc_memalign@@GLIBC_2.4
> root@bboneumh:/#
The actual source of memcheck's complaint is coregrind/m_redir.c:
void VG_(redir_initialise) ( void )
<<snip>>
# elif defined(VGP_arm_linux)
/* If we're using memcheck, use these intercepts right from
the start, otherwise ld.so makes a lot of noise. */
if (0==VG_(strcmp)("Memcheck", VG_(details).name)) {
add_hardwired_spec(
"ld-linux.so.3", "strlen",
(Addr)&VG_(arm_linux_REDIR_FOR_strlen),
complain_about_stripped_glibc_ldso
);
//add_hardwired_spec(
// "ld-linux.so.3", "index",
// (Addr)&VG_(arm_linux_REDIR_FOR_index),
// NULL
//);
add_hardwired_spec(
"ld-linux.so.3", "memcpy",
(Addr)&VG_(arm_linux_REDIR_FOR_memcpy),
complain_about_stripped_glibc_ldso
);
}
So the deal is: memcheck for ARM Linux [and only memcheck, because of
the strcmp test on VG_(details).name] will demand that ld-linux.so.3
has a subroutine named "memcpy" [and likewise "strlen"]
which can be re-directed to an internal valgrind routine
that "does the same thing" in a way that valgrind considers
to be easier to understand and emulate or instrument.
Perhaps your version of ld-linux.so.3 does not have such a 'memcpy'.
The "readelf --symbols" shows that ld-linux.so.3 does not have this
itself, and you say that installing the debug package still does not
have a symbol for 'memcpy'. So perhaps ld-linux.so.3 no longer uses
_subroutine_ memcpy; it was replaced by memmove, or a macro which was
expanded inline, or by an open-coded loop. Therefore, try commenting-out
the code for memcpy, just like the code for 'index' is commented out:
//add_hardwired_spec(
// "ld-linux.so.3", "memcpy",
// (Addr)&VG_(arm_linux_REDIR_FOR_memcpy),
// complain_about_stripped_glibc_ldso
//);
Visit the main web page http://www.valgrind.org/ ,
follow the "Code Repository" link, use 'svn' source code manager
to checkout some version (try for top-of-trunk first, else 3.8.1).
Modify VG_(redir_initialise) as indicated. Build, install, test, use.
--
|
|
From: Britton K. <bri...@gm...> - 2013-05-23 17:14:25
|
On Mon, May 20, 2013 at 7:53 AM, John Reiser <jr...@bi...> wrote:
>> I'm on a beaglebone white with vanilla Angstrom (linux 3.2)
>> distribution. Valgrind fails like this:
>
>> ==13719== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
>
>> valgrind: A must-be-redirected function
>> valgrind: whose name matches the pattern: memcpy
>> valgrind: in an object with soname matching: ld-linux.so.3
>> valgrind: was not found whilst processing
>> valgrind: symbols from the object with soname: ld-linux.so.3
>
>> I've installed libc6-dbg as it says. Same problem.
>
> Which other Linux distribution is Angstrom derived from, or is most similar to?
Well, it's based in part on OpenZaurus, which was Debian-based. And it seems
pretty Debian-like in the way its package manager opkg works. But that's
about all I know; I'm just using it because it ships on the beaglebone.
> Try running the previous version 3.7 of valgrind.
This version doesn't seem to be available on the release archive page,
where can I get it?
> Also post the output from:
> readelf --symbols ld-linux.so.3 | grep mem
root@bboneumh:/# readelf --symbols /lib/ld-linux.so.3 | grep mem
12: 000150a8 332 FUNC WEAK DEFAULT 10 __libc_memalign@@GLIBC_2.4
root@bboneumh:/#
Note that I had to use /lib/ld-linux.so.3 as the argument to readelf;
the command as you suggested it gave this:
root@bboneumh2:~# readelf --symbols ld-linux.so.3 | grep mem
readelf: Error: 'ld-linux.so.3': No such file
root@bboneumh2:~#
Britton
|
|
From: Phil L. <plo...@sa...> - 2013-05-22 15:52:38
|
We discussed this internally, and think that the pthread_mutex_unlock()
call will provide the memory barrier to force synchronization.
"Yes, pthread_mutex_unlock is a memory barrier (it would be quite
useless otherwise). Chapter and verse:
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11
The problem with helgrind is that it will have a very hard time proving
that the second thread cannot access the shared memory until after the
pthread_mutex_unlock() has occurred. Some kind of helgrind annotation
for this worker queue case would probably be the easiest way out."
-----Original Message-----
From: David Faure [mailto:fa...@kd...]
Sent: Thursday, May 16, 2013 2:34 PM
To: val...@li...
Cc: Phil Longstaff
Subject: Re: [Valgrind-users] Helgrind data race question
On Tuesday 14 May 2013 20:18:44 Phil Longstaff wrote:
> int* my_ptr = new int;
> *my_ptr = 10;
> pthread_mutex_lock(&lock);
> shared_ptr = my_ptr;
> pthread_mutex_unlock(&lock);
>
> Thread 2:
> pthread_mutex_lock(&lock);
> int* my_ptr = shared_ptr;
> pthread_mutex_unlock(&lock);
> ... = *my_ptr;
You're reading a region of memory outside mutex protection, and that
region of memory was written to, outside mutex protection. That's the
basic definition of a data race.
Getting the address of that region of memory within the mutex doesn't
change that.
You see it as non-racy because "how could *my_ptr ever be something else
than 10" ... but if you think about a multi-processor system, the write
of the value 10 might not get propagated to the cache of the other
processor where the read happens, since the system had no reason to
perform that synchronisation.
--
David Faure, fa...@kd..., http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5
|
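For reference, the hand-off pattern under discussion can be written out as a small C program. This is only a sketch under stated assumptions: it uses malloc instead of new, the consumer polls under the mutex until the pointer has been published (the original post does not say how thread 2 knows the pointer is ready), and producer(), consumer() and run_handoff() are illustrative names. It makes no claim about what helgrind reports for it. Build with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static int *shared_ptr = NULL;

/* Thread 1: fill in the object first, then publish its address under the mutex. */
static void *producer(void *arg)
{
    int *my_ptr = malloc(sizeof *my_ptr);

    (void)arg;
    assert(my_ptr != NULL);
    *my_ptr = 10;                  /* written before the lock is taken */
    pthread_mutex_lock(&lock);
    shared_ptr = my_ptr;           /* published under the mutex */
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Thread 2: fetch the address under the mutex, dereference it afterwards. */
static void *consumer(void *arg)
{
    int *result = arg;
    int *my_ptr = NULL;

    while (my_ptr == NULL) {       /* assumption: poll until published */
        pthread_mutex_lock(&lock);
        my_ptr = shared_ptr;
        pthread_mutex_unlock(&lock);
    }
    *result = *my_ptr;             /* read outside the mutex */
    return NULL;
}

/* Runs both threads once and returns the value the consumer read. */
int run_handoff(void)
{
    pthread_t t1, t2;
    int result = 0;
    int rc;

    rc = pthread_create(&t2, NULL, consumer, &result);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, producer, NULL);
    assert(rc == 0);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    free(shared_ptr);
    shared_ptr = NULL;
    return result;
}
```

Under the POSIX memory-synchronization rule cited above, the consumer can only see a non-NULL pointer after locking the mutex that the producer unlocked, so the earlier store of 10 is visible to it.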