valgrind-users Mailing List for Valgrind, an open-source memory debugger (Page 129)

Brought to you by: njn, sewardj, wielaard

valgrind-users — General discussion list for Valgrind users


Re: [Valgrind-users] Valgrind shows "Invalid write os size 4" for memory allocated for the stack

From: Masha N. (mnaret) <mn...@ci...> - 2013-06-13 06:40:31

Not yet.
Following the first answer in the forum, I tried several code modifications around mprotect:
1. When releasing the memory, in addition to setting the protection to PROT_READ and PROT_WRITE, also set PROT_EXEC.
2. When allocating the memory, in addition to calling mprotect with PROT_NONE for the upper and lower parts of the allocated stack, call mprotect with PROT_WRITE and PROT_READ for the valid part of the stack.

Each of the above changes solved the problem on its own.

Regarding the small reproducer: the problem only appears after the tests bring the process up and down several times, and only after the pages where the problem appears were allocated for some other stack, released, and then allocated again for the current stack.
I'm not sure how to reproduce this behavior in a small executable; I'll try now.

Thank you for your reply,
Masha.
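
For readers following the thread, the second workaround above can be sketched roughly as follows. This is a minimal sketch and not Masha's actual code: the function names and the page-based layout (4 guard pages, 8 usable pages, 4 guard pages, which equals 16K / 32K / 16K with 4096-byte pages) are assumptions made for illustration. It also relies on Linux accepting mprotect on memory obtained from posix_memalign, which POSIX does not guarantee.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical layout, counted in pages: guard, usable area, guard. */
#define GUARD_PAGES  4
#define USABLE_PAGES 8

static size_t page_bytes(void)
{
    long page = sysconf(_SC_PAGESIZE);
    return page > 0 ? (size_t)page : 0;
}

size_t stack_guard_bytes(void)  { return GUARD_PAGES * page_bytes(); }
size_t stack_usable_bytes(void) { return USABLE_PAGES * page_bytes(); }
size_t stack_total_bytes(void)  { return 2 * stack_guard_bytes() + stack_usable_bytes(); }

/* Allocates the whole block and returns its base, or NULL on failure. */
void *guarded_stack_alloc(void)
{
    void *base = NULL;
    size_t guard = stack_guard_bytes();
    size_t usable = stack_usable_bytes();

    if (page_bytes() == 0)
        return NULL;
    if (posix_memalign(&base, page_bytes(), stack_total_bytes()) != 0)
        return NULL;

    char *p = base;
    if (mprotect(p, guard, PROT_NONE) != 0 ||
        mprotect(p + guard + usable, guard, PROT_NONE) != 0 ||
        /* Workaround 2: also state the usable part's permissions explicitly. */
        mprotect(p + guard, usable, PROT_READ | PROT_WRITE) != 0) {
        /* Undo any partial protection before returning the block to malloc. */
        mprotect(p, stack_total_bytes(), PROT_READ | PROT_WRITE);
        free(base);
        return NULL;
    }
    return base;
}

/* Start of the usable middle region of a block from guarded_stack_alloc(). */
char *guarded_stack_usable(void *base)
{
    return (char *)base + stack_guard_bytes();
}

/* Makes the whole block accessible again, then releases it. */
void guarded_stack_free(void *base)
{
    if (base == NULL)
        return;
    mprotect(base, stack_total_bytes(), PROT_READ | PROT_WRITE);
    free(base);
}
```

The first workaround would correspond to adding PROT_EXEC to the flags used in guarded_stack_free(); it is left out here because the thread does not explain why it helps.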

-----Original Message-----
From: Philippe Waroquiers [mailto:phi...@sk...] 
Sent: Wednesday, June 12, 2013 10:44 PM
To: Masha Naret (mnaret)
Cc: val...@li...
Subject: Re: [Valgrind-users] Valgrind shows "Invalid write os size 4" for memory allocated for the stack

On Mon, 2013-06-10 at 01:23 -0700, mnaret wrote:
> Hello,
> Recently I've been getting lots of "invalid read/invalid write" valgrind 
> errors which point at memory allocated for the stack. However, the 
> code doesn't crash and finishes running successfully.
> I'm trying to understand where the errors come from, and would be 
> grateful for any help with this issue.
Do you have a small (compilable) reproducer ?
Philippe

Re: [Valgrind-users] Why doesn't valgrind detect a memory leak for my application

From: David C. <dcc...@ac...> - 2013-06-13 06:27:16

On 6/12/2013 10:29 PM, Sanjay Kumar (sanjaku5) wrote:
> Hi Philippe,
> Below is the part of the Makefile where I do the linking:
>
> sslLIB=$(OUT_DIR)/libsyfer_ssl.a
>
> # pull in dependency info for *existing* .o files
> *********************************
> -include $(commonOBJS:.o=.d)
> -include $(sslOBJS:.o=.d)
> -include $(cliCOMMONOBJS:.o=.d)
> -include $(cliCLIOBJS:.o=.d)
> -include $(cliSYFERSERVEROBJS:.o=.d)
> -include $(cliSYFERCARDSERVEROBJS:.o=.d)
> -include $(cliETHSERVERSERVEROBJS:.o=.d)
>
> all: $(OUT_DIR)/$(OUTNAME) $(OUT_DIR)/cli
>
> $(OUT_DIR)/cli : $(cliCOMMONOBJS) $(cliCLIOBJS)
>      @echo "LL $(LOPTS) $@" $(REDIRECT)
>      $(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliCLIOBJS)  -o cli -lpthread
>
> $(OUT_DIR)/$(OUTNAME): $(VERSION_FILE) $(commonOBJS) $(sslLIB) $(cliCOMMONOBJS) $(cliSYFERSERVEROBJS)
>      @echo "LL $(LOPTS) $@" $(REDIRECT)
>      $(SILENT)$(LL) $(LOPTS) $(commonOBJS) $(cliCOMMONOBJS) $(cliSYFERSERVEROBJS) ./libs/librohc.a $(LIB_GLIB_DIR)/libglib-2.0.a ./libs/libsnutils.a ./libs/libcommon.a ./libs/libasnbase.a ./libs/libs1apgen.a ./libs/libsctp.a ./libs/libpcap.a $(LIB_PATH) -lssp -lsyfer_ssl -lcrypto -leventbridge -ldl -o $(OUTNAME)

Without knowing the exact link options, there is no way for anyone on 
this list to determine what is going wrong.  Please paste the link 
command as printed during the build process, perhaps with proprietary 
object file names stripped out.  We don't know what, for example, 
"$(LOPTS)" expands to, and even "$(LL)" could have some linker flags in 
it.  These are your Makefile's definitions; they are not universal.


>
> $(OUT_DIR)/syfercard : $(cliCOMMONOBJS) $(cliSYFERCARDSERVEROBJS)
>      @echo "LL $(LOPTS) $@" $(REDIRECT)
>      $(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliSYFERCARDSERVEROBJS)  -o syfercard
>
> $(OUT_DIR)/ethserver : $(cliCOMMONOBJS) $(cliETHSERVERSERVEROBJS)
>      @echo "LL $(LOPTS) $@" $(REDIRECT)
>      $(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliETHSERVERSERVEROBJS)  -o ethserver -lpthread
>
> *********************************
>
> Could you please comment on this?
> What modification do I need to make here?
>
> Thanks,
> Sanjay
>
> -----Original Message-----
> From: Philippe Waroquiers [mailto:phi...@sk...]
> Sent: Thursday, June 13, 2013 1:38 AM
> To: David Chapman
> Cc: Sanjay Kumar (sanjaku5); val...@li...
> Subject: Re: [Valgrind-users] Why doesn't valgrind detect a memory leak for my application
>
> Sanjay,
>>> --11597--    object doesn't have a dynamic symbol table
> With the above
> and the below stacktrace, it looks like this application is statically linked (or at least, the malloc lib is statically linked).
> valgrind can find leaks if malloc lib is statically linked (you need to use --soname-synonyms=.... to indicate malloc is statically linked).
> However, a completely statically linked application causes other problems.
> You should have at least one (even dummy) dynamically linked lib to have the dynamic loader be invoked in your process otherwise valgrind cannot "LD_PRELOAD" some of its own .so.
>
> Philippe
>
>>> ==11597== Conditional jump or move depends on uninitialised value(s)
>>> ==11597==    at 0x8A838B5: __register_atfork (in /usr/local/flare/flare)
>>> ==11597==    by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
>>> ==11597==    by 0x8A6C265: malloc_hook_ini (in /usr/local/flare/flare)
>>> ==11597==    by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
>>> ==11597==    by 0x8AA893B: _dl_init_paths (in /usr/local/flare/flare)
>>> ==11597==    by 0x8A8BDBB: _dl_non_dynamic_init (in /usr/local/flare/flare)
>>> ==11597==    by 0x8A8CA75: __libc_init_first (in /usr/local/flare/flare)
>>> ==11597==    by 0x8A3E460: (below main) (in /usr/local/flare/flare)
>
>
>
>


-- 
     David Chapman      dcc...@ac...
     Chapman Consulting -- San Jose, CA
     Software Development Done Right.
     www.chapman-consulting-sj.com

Re: [Valgrind-users] Why doesn't valgrind detect a memory leak for my application

From: Sanjay K. (sanjaku5) <san...@ci...> - 2013-06-13 05:29:21

Hi Philippe,
Below is the part of the Makefile where I do the linking:

sslLIB=$(OUT_DIR)/libsyfer_ssl.a

# pull in dependency info for *existing* .o files
*********************************
-include $(commonOBJS:.o=.d)
-include $(sslOBJS:.o=.d)
-include $(cliCOMMONOBJS:.o=.d)
-include $(cliCLIOBJS:.o=.d)
-include $(cliSYFERSERVEROBJS:.o=.d)
-include $(cliSYFERCARDSERVEROBJS:.o=.d)
-include $(cliETHSERVERSERVEROBJS:.o=.d)

all: $(OUT_DIR)/$(OUTNAME) $(OUT_DIR)/cli

$(OUT_DIR)/cli : $(cliCOMMONOBJS) $(cliCLIOBJS)
    @echo "LL $(LOPTS) $@" $(REDIRECT)
    $(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliCLIOBJS)  -o cli -lpthread

$(OUT_DIR)/$(OUTNAME): $(VERSION_FILE) $(commonOBJS) $(sslLIB) $(cliCOMMONOBJS) $(cliSYFERSERVEROBJS)
    @echo "LL $(LOPTS) $@" $(REDIRECT)
    $(SILENT)$(LL) $(LOPTS) $(commonOBJS) $(cliCOMMONOBJS) $(cliSYFERSERVEROBJS) ./libs/librohc.a $(LIB_GLIB_DIR)/libglib-2.0.a ./libs/libsnutils.a ./libs/libcommon.a ./libs/libasnbase.a ./libs/libs1apgen.a ./libs/libsctp.a ./libs/libpcap.a $(LIB_PATH) -lssp -lsyfer_ssl -lcrypto -leventbridge -ldl -o $(OUTNAME)

$(OUT_DIR)/syfercard : $(cliCOMMONOBJS) $(cliSYFERCARDSERVEROBJS)
    @echo "LL $(LOPTS) $@" $(REDIRECT)
    $(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliSYFERCARDSERVEROBJS)  -o syfercard

$(OUT_DIR)/ethserver : $(cliCOMMONOBJS) $(cliETHSERVERSERVEROBJS)
    @echo "LL $(LOPTS) $@" $(REDIRECT)
    $(SILENT)$(LL) $(LOPTS) $(cliCOMMONOBJS) $(cliETHSERVERSERVEROBJS)  -o ethserver -lpthread

*********************************

Could you please comment on this?
What modification do I need to make here?

Thanks,
Sanjay

-----Original Message-----
From: Philippe Waroquiers [mailto:phi...@sk...] 
Sent: Thursday, June 13, 2013 1:38 AM
To: David Chapman
Cc: Sanjay Kumar (sanjaku5); val...@li...
Subject: Re: [Valgrind-users] Why doesn't valgrind detect a memory leak for my application

Sanjay,
> > --11597--    object doesn't have a dynamic symbol table
With the above
and the below stacktrace, it looks like this application is statically linked (or at least, the malloc lib is statically linked).
valgrind can find leaks if malloc lib is statically linked (you need to use --soname-synonyms=.... to indicate malloc is statically linked).
However, a completely statically linked application causes other problems.
You should have at least one (even dummy) dynamically linked lib to have the dynamic loader be invoked in your process otherwise valgrind cannot "LD_PRELOAD" some of its own .so.

Philippe

> > ==11597== Conditional jump or move depends on uninitialised value(s)
> > ==11597==    at 0x8A838B5: __register_atfork (in /usr/local/flare/flare)
> > ==11597==    by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
> > ==11597==    by 0x8A6C265: malloc_hook_ini (in /usr/local/flare/flare)
> > ==11597==    by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
> > ==11597==    by 0x8AA893B: _dl_init_paths (in /usr/local/flare/flare)
> > ==11597==    by 0x8A8BDBB: _dl_non_dynamic_init (in /usr/local/flare/flare)
> > ==11597==    by 0x8A8CA75: __libc_init_first (in /usr/local/flare/flare)
> > ==11597==    by 0x8A3E460: (below main) (in /usr/local/flare/flare)

Re: [Valgrind-users] Why doesn't valgrind detect a memory leak for my application

From: Philippe W. <phi...@sk...> - 2013-06-12 20:07:15

Sanjay,
> > --11597--    object doesn't have a dynamic symbol table
With the above
and the below stacktrace, it looks like this application is statically linked
(or at least, the malloc lib is statically linked).
valgrind can find leaks if malloc lib is statically linked
(you need to use --soname-synonyms=.... to indicate malloc is statically linked).
However, a completely statically linked application causes other problems.
You should have at least one (even dummy) dynamically linked lib to
have the dynamic loader be invoked in your process otherwise
valgrind cannot "LD_PRELOAD" some of its own .so.

Philippe

> > ==11597== Conditional jump or move depends on uninitialised value(s)
> > ==11597==    at 0x8A838B5: __register_atfork (in /usr/local/flare/flare)
> > ==11597==    by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
> > ==11597==    by 0x8A6C265: malloc_hook_ini (in /usr/local/flare/flare)
> > ==11597==    by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
> > ==11597==    by 0x8AA893B: _dl_init_paths (in /usr/local/flare/flare)
> > ==11597==    by 0x8A8BDBB: _dl_non_dynamic_init (in /usr/local/flare/flare)
> > ==11597==    by 0x8A8CA75: __libc_init_first (in /usr/local/flare/flare)
> > ==11597==    by 0x8A3E460: (below main) (in /usr/local/flare/flare)
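
Philippe's point about the dynamic loader can be checked from inside a process. The sketch below is an illustration only and not part of Valgrind: it assumes Linux with glibc 2.16 or later (for getauxval), and the function name is hypothetical. It walks the program headers of the main executable and looks for a PT_INTERP entry, which is what asks the kernel to start the dynamic loader; a fully static executable normally has none.

```c
#include <elf.h>
#include <link.h>      /* ElfW() picks the 32-bit or 64-bit ELF types */
#include <stddef.h>
#include <sys/auxv.h>  /* getauxval() */

/* Returns 1 if the running executable requests a program interpreter
 * (dynamic loader), 0 if it does not, and -1 if the program headers
 * could not be located through the auxiliary vector. */
int has_program_interpreter(void)
{
    const ElfW(Phdr) *phdr = (const ElfW(Phdr) *)getauxval(AT_PHDR);
    unsigned long count = getauxval(AT_PHNUM);

    if (phdr == NULL || count == 0)
        return -1;

    for (unsigned long i = 0; i < count; i++) {
        if (phdr[i].p_type == PT_INTERP)
            return 1;
    }
    return 0;
}
```

A result of 0 would be consistent with the "object doesn't have a dynamic symbol table" message and the _dl_non_dynamic_init frame in the trace above.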

Re: [Valgrind-users] Valgrind shows "Invalid write os size 4" for memory allocated for the stack

From: Philippe W. <phi...@sk...> - 2013-06-12 19:43:41

On Mon, 2013-06-10 at 01:23 -0700, mnaret wrote:
> Hello,
> Recently I've been getting lots of "invalid read/invalid write" valgrind errors
> which point at memory allocated for the stack. However, the code doesn't
> crash and finishes running successfully.
> I'm trying to understand where the errors come from, and would be grateful
> for any help with this issue.
Do you have a small (compilable) reproducer ?
Philippe

Re: [Valgrind-users] Why doesn't valgrind detect a memory leak for my application

From: David C. <dcc...@ac...> - 2013-06-12 16:10:26

Please reply to the group, not just me, and please don't top-post.  My 
reply is at the bottom.

On 6/12/2013 1:35 AM, Sanjay Kumar (sanjaku5) wrote:
>
> Hi David,
>
> Below is code which I added to create the leak in my application:
>
> /*******************/
>     char *mleak = NULL;
>     static int mcnt = 0;
>
>     mleak = (char *)malloc(10000);
>     if(NULL == mleak)
>         printf("\nmleak is  NULL\n");
>     strcpy(mleak, "aaaaaaaaaaaaaaaaaaaaaaa");
>     printf("\nmleak called:%d mleak:%s \n", mcnt++, mleak);
> /*******************/
>
> Below is summary of report:
>
> ==11597== Memcheck, a memory error detector
> ==11597== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> ==11597== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
> ==11597== Command: ./flare -f syfer.conf
> ==11597==
> --11597-- Valgrind options:
> --11597--    -v
> --11597-- --tool=memcheck
> --11597-- --leak-check=full
> --11597-- --leak-resolution=high
> --11597-- Contents of /proc/version:
> --11597-- Linux version 2.6.38-staros-v3-40087-deb-32 (root@releng7) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #1 SMP PREEMPT Sat Oct 1 02:50:26 EDT 2011
> --11597-- Arch and hwcaps: X86, x86-sse1-sse2
> --11597-- Page sizes: currently 4096, max supported 4096
> --11597-- Valgrind library directory: /usr/local/lib/valgrind
> --11597-- Reading syms from /usr/local/flare/flare
> --11597-- object doesn't have a dynamic symbol table
> --11597-- warning: DiCfSI 0x0 .. 0x0 outside mapped rw segments (NONE)
> --11597-- warning: DiCfSI 0x1 .. 0x2 outside mapped rw segments (NONE)
> --11597-- warning: DiCfSI 0x3 .. 0x8 outside mapped rw segments (NONE)
> --11597-- warning: DiCfSI 0x9 .. 0x437 outside mapped rw segments (NONE)
> --11597-- warning: DiCfSI 0x0 .. 0x0 outside mapped rw segments (NONE)
> --11597-- warning: DiCfSI 0x1 .. 0x2 outside mapped rw segments (NONE)
> --11597-- warning: DiCfSI 0x3 .. 0x8 outside mapped rw segments (NONE)
> --11597-- warning: DiCfSI 0x9 .. 0x3a6 outside mapped rw segments (NONE)
> --11597-- warning: DiCfSI 0x0 .. 0x0 outside mapped rw segments (NONE)
> --11597-- warning: DiCfSI 0x1 .. 0x2 outside mapped rw segments (NONE)
> --11597-- Reading syms from /usr/local/lib/valgrind/memcheck-x86-linux
> --11597-- object doesn't have a dynamic symbol table
> --11597-- Scheduler: using generic scheduler lock implementation.
> --11597-- Reading suppressions file: /usr/local/lib/valgrind/default.supp
> ==11597== embedded gdbserver: reading from /tmp/vgdb-pipe-from-vgdb-to-11597-by-root-on-???
> ==11597== embedded gdbserver: writing to /tmp/vgdb-pipe-to-vgdb-from-11597-by-root-on-???
> ==11597== embedded gdbserver: shared mem /tmp/vgdb-pipe-shared-mem-vgdb-11597-by-root-on-???
> ==11597==
> ==11597==
> ==11597== TO CONTROL THIS PROCESS USING vgdb (which you probably
> ==11597== don't want to do, unless you know exactly what you're doing,
> ==11597== or are doing some strange experiment):
> ==11597== /usr/local/lib/valgrind/../../bin/vgdb --pid=11597 ...command...
> ==11597==
> ==11597== TO DEBUG THIS PROCESS USING GDB: start GDB like this
> ==11597== /path/to/gdb ./flare
> ==11597== and then give GDB the following command
> ==11597== target remote | /usr/local/lib/valgrind/../../bin/vgdb --pid=11597
> ==11597== --pid is optional if only one valgrind process is running
> ==11597==
> ==11597== Conditional jump or move depends on uninitialised value(s)
> ==11597==    at 0x8A838B5: __register_atfork (in /usr/local/flare/flare)
> ==11597==    by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
> ==11597==    by 0x8A6C265: malloc_hook_ini (in /usr/local/flare/flare)
> ==11597==    by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
> ==11597==    by 0x8AA893B: _dl_init_paths (in /usr/local/flare/flare)
> ==11597==    by 0x8A8BDBB: _dl_non_dynamic_init (in /usr/local/flare/flare)
> ==11597==    by 0x8A8CA75: __libc_init_first (in /usr/local/flare/flare)
> ==11597==    by 0x8A3E460: (below main) (in /usr/local/flare/flare)
> ==11597==
> ==11597== Conditional jump or move depends on uninitialised value(s)
> ==11597==    at 0x8A83926: __register_atfork (in /usr/local/flare/flare)
> ==11597==    by 0x8A68074: ptmalloc_init (in /usr/local/flare/flare)
> ==11597==    by 0x8A6C265: malloc_hook_ini (in /usr/local/flare/flare)
> ==11597==    by 0x8A6BDCE: malloc (in /usr/local/flare/flare)
> ==11597==    by 0x8AA893B: _dl_init_paths (in /usr/local/flare/flare)
> ==11597==    by 0x8A8BDBB: _dl_non_dynamic_init (in /usr/local/flare/flare)
> ==11597==    by 0x8A8CA75: __libc_init_first (in /usr/local/flare/flare)
> ==11597==    by 0x8A3E460: (below main) (in /usr/local/flare/flare)
>
> ......
> ...........
> .............
>
> ==11597==
> ==11597== 481512 errors in context 998 of 1000:
> ==11597== Invalid read of size 4
> ==11597==    at 0x8A6FBCF: memcpy (in /usr/local/flare/flare)
> ==11597==    by 0x80D9F80: configuration::cfgparams::read_pattern(std::string, std::string, fsm_t*) (stl_iterator.h:704)
> ==11597==    by 0x8086EE2: configuration::cfg::read_pattern(configuration::configfile*, std::string, bool, unsigned short) (cfgdata.cpp:1684)
> ==11597==    by 0x8088215: configuration::cfg::read_pattern_file() (cfgdata.cpp:780)
> ==11597==    by 0x809935B: configuration::cfg::read_config_file() (cfgdata.cpp:623)
> ==11597==    by 0x811576D: main (main.cpp:3930)
> ==11597== Address 0xbef88714 is on thread 1's stack
> ==11597==
> ==11597==
> ==11597== 481512 errors in context 999 of 1000:
> ==11597== Invalid read of size 4
> ==11597==    at 0x8A6FBCF: memcpy (in /usr/local/flare/flare)
> ==11597==    by 0x80D9CD4: configuration::cfgparams::read_pattern(std::string, std::string, fsm_t*) (stl_iterator.h:704)
> ==11597==    by 0x8086EE2: configuration::cfg::read_pattern(configuration::configfile*, std::string, bool, unsigned short) (cfgdata.cpp:1684)
> ==11597==    by 0x8088215: configuration::cfg::read_pattern_file() (cfgdata.cpp:780)
> ==11597==    by 0x809935B: configuration::cfg::read_config_file() (cfgdata.cpp:623)
> ==11597==    by 0x811576D: main (main.cpp:3930)
> ==11597== Address 0xbefc3388 is on thread 1's stack
> ==11597==
> ==11597==
> ==11597== 662079 errors in context 1000 of 1000:
> ==11597== Invalid read of size 4
> ==11597==    at 0x8A6FBCF: memcpy (in /usr/local/flare/flare)
> ==11597==    by 0x80DC54B: void std::__uninitialized_fill_n_aux<__gnu_cxx::__normal_iterator<flare_stack*, std::vector<flare_stack, std::allocator<flare_stack> > >, unsigned int, flare_stack>(__gnu_cxx::__normal_iterator<flare_stack*, std::vector<flare_stack, std::allocator<flare_stack> > >, unsigned int, flare_stack const&, __false_type) (stl_construct.h:81)
> ==11597==    by 0x80DEAE5: std::vector<flare_stack, std::allocator<flare_stack> >::_M_fill_insert(__gnu_cxx::__normal_iterator<flare_stack*, std::vector<flare_stack, std::allocator<flare_stack> > >, unsigned int, flare_stack const&) (vector.tcc:365)
> ==11597==    by 0x80DB8C4: configuration::cfgparams::read_pattern(std::string, std::string, fsm_t*) (stl_vector.h:658)
> ==11597==    by 0x8086EE2: configuration::cfg::read_pattern(configuration::configfile*, std::string, bool, unsigned short) (cfgdata.cpp:1684)
> ==11597==    by 0x8088215: configuration::cfg::read_pattern_file() (cfgdata.cpp:780)
> ==11597==    by 0x809935B: configuration::cfg::read_config_file() (cfgdata.cpp:623)
> ==11597==    by 0x811576D: main (main.cpp:3930)
> ==11597== Address 0xbedb2374 is on thread 1's stack
> ==11597==
> ==11597== ERROR SUMMARY: 3704242 errors from 1000 contexts (suppressed: 0 from 0)
>
> When I run the same code as a standalone test program, valgrind is able
> to detect the leak, but NOT when it is part of my application.
>
> Thanks,
>
> Sanjay
>

I'll leave the details of why leaks would not be reported to the 
Valgrind development team.  But it seems to me that you have plenty to 
work on already:  3,704,242 errors of various kinds.  If these are 
cleaned up, what happens then?

I suppose it is possible that Valgrind doesn't notice leaks among all of 
the other reports, but IMO that is a much smaller problem than reading 
out-of-bounds memory and executing instructions based on uninitialized 
data.  My goal is always zero errors from Valgrind, even if the program 
seems to work fine without any changes.  Fix the errors that Valgrind 
reports now and let us know if there are still problems with leak reporting.

-- 
     David Chapman      dcc...@ac...
     Chapman Consulting -- San Jose, CA
     Software Development Done Right.
     www.chapman-consulting-sj.com

Re: [Valgrind-users] Why doesn't valgrind detect a memory leak for my application

From: David C. <dcc...@ac...> - 2013-06-12 07:38:25

On 6/12/2013 12:01 AM, Sanjay Kumar (sanjaku5) wrote:
>
> Hi,
>
>      I run my program using valgrind to detect the memory leaks
>
> valgrind -v --tool=memcheck --leak-check=full  ./binary -f conf.file
>
>
> But it doesn't show any leaks, even though I created one leak of 10000 bytes in 
> the code.
>
> Any wild guesses?
>
>

What happens if you create a simple program that does nothing but 
allocate 10000 bytes and then exits?  I presume that the program you 
describe above cannot be posted on the Internet; try to create a test 
case as small as possible.
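
A test case of the kind David describes could look like the sketch below. It is an assumption-laden illustration rather than Sanjay's code: the function name leak_block and the LEAK_DEMO_MAIN macro are hypothetical, and the main function is only compiled when that macro is defined so the leaking function can also be called from other test code.

```c
#include <stdio.h>
#include <stdlib.h>

/* Allocates `size` bytes, uses them, and deliberately never frees them.
 * Returns 0 on success, -1 if size is 0 or the allocation failed. */
int leak_block(size_t size)
{
    if (size == 0)
        return -1;

    char *mleak = malloc(size);
    if (mleak == NULL) {
        fprintf(stderr, "malloc(%zu) failed\n", size);
        return -1;
    }

    /* Touch the memory so the allocation is really used. */
    snprintf(mleak, size, "%s", "aaaaaaaaaaaaaaaaaaaaaaa");
    printf("leaked %zu bytes containing \"%s\"\n", size, mleak);

    /* The only pointer to the block goes out of scope here. */
    return 0;
}

#ifdef LEAK_DEMO_MAIN
/* Build:  cc -g -O0 -DLEAK_DEMO_MAIN leak_demo.c -o leak_demo
 * Run:    valgrind --leak-check=full ./leak_demo                */
int main(void)
{
    return leak_block(10000) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
#endif
```

Building the same file once as a normal dynamically linked executable and once with the application's own link options would be one way to see whether the link step is what hides the leak.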

And of course it would be good to know what version of Valgrind you are 
using, and on what platform.

-- 
     David Chapman      dcc...@ac...
     Chapman Consulting -- San Jose, CA
     Software Development Done Right.
     www.chapman-consulting-sj.com

[Valgrind-users] Why doesn't valgrind detect a memory leak for my application

From: Sanjay K. (sanjaku5) <san...@ci...> - 2013-06-12 07:01:21

Hi,
     I run my program using valgrind to detect the memory leaks
valgrind -v --tool=memcheck --leak-check=full  ./binary -f conf.file


But it doesn't show any leaks, even though I created one leak of 10000 bytes in the code.

Any wild guesses?

Thanks,
Sanjay

Re: [Valgrind-users] Valgrind shows "Invalid write os size 4" for memory allocated for the stack

From: John R. <jr...@bi...> - 2013-06-10 14:12:53

> (gdb) monitor v.info last_error
> ==10259== Invalid write of size 4
> ==10259==    at 0x28686C: vsnprintf (in /lib/libc-2.12.so)
> ==10259==  Address 0x4b43040 is 45,120 bytes inside a block of size 65,536
> alloc'd
> ==10259==    at 0x4005DB9: memalign (vg_replace_malloc.c:727)
> ==10259==    by 0x4005E68: posix_memalign (vg_replace_malloc.c:876)
> 
> 
> -The memory for the stack is allocated using memalign and then the upper and
> lower parts of it are protected using mprotect, so the stack looks like
> this: 16k protected with mprotect, 32K valid for usage, another 16K
> protected. The problem happens only in the valid area.

There is an inherent conflict between memcheck and mprotect.
On one side, it is desirable that after an mprotect which removes all access privileges,
memcheck's accounting bits should reflect the current state, including no access.
On the other side, it is desirable that after a following mprotect which restores the
previous access privileges to the same region, memcheck's accounting bits
should reflect the state *AS IF* those two mprotect() calls had never occurred.
That is, it would be nice if memcheck remembered which bytes were defined and
undefined, even though the region "disappeared behind the mprotect NO-ACCESS curtain"
for a while.

I don't know the exact situation now, but some time in the past it was impossible
to have both properties because the accounting bits couldn't handle it.
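
The kernel side of this round trip is easy to observe, and the sketch below does so under stated assumptions: Linux, an anonymous private mapping, and a hypothetical function name. It shows only that the page contents survive a PROT_NONE interval. Whether memcheck's definedness and addressability bits survive it as well is exactly the open question John raises, and this code does not answer it.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Writes a pattern into a page, removes all access, restores access,
 * and checks the pattern. Returns 1 if the contents survived, 0 if
 * they did not, and -1 if a system call failed. */
int mprotect_roundtrip_preserves_contents(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Only the first half is written; the rest stays as mapped. */
    memset(p, 0x5a, len / 2);

    int result = -1;
    if (mprotect(p, len, PROT_NONE) == 0 &&
        mprotect(p, len, PROT_READ | PROT_WRITE) == 0) {
        result = (p[0] == 0x5a && p[len / 2 - 1] == 0x5a) ? 1 : 0;
    }

    munmap(p, len);
    return result;
}
```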


[Valgrind-users] Valgrind shows "Invalid write os size 4" for memory allocated for the stack

From: mnaret <mn...@ci...> - 2013-06-10 08:23:37

Hello,
Recently I've been getting lots of "invalid read/invalid write" valgrind errors
which point at memory allocated for the stack. However, the code doesn't
crash and finishes running successfully.
I'm trying to understand where the errors come from, and would be grateful
for any help with this issue.
Here's what I can see using vgdb:

(gdb) monitor v.info last_error
==10259== Invalid write of size 4
==10259==    at 0x28686C: vsnprintf (in /lib/libc-2.12.so)
==10259==  Address 0x4b43040 is 45,120 bytes inside a block of size 65,536 alloc'd
==10259==    at 0x4005DB9: memalign (vg_replace_malloc.c:727)
==10259==    by 0x4005E68: posix_memalign (vg_replace_malloc.c:876)

-The memory for the stack is allocated using memalign and then the upper and
lower parts of it are protected using mprotect, so the stack looks like
this: 16K protected with mprotect, 32K valid for use, another 16K
protected. The problem happens only in the valid area.

-The problem always seems to happen at the end of the first page of the
valid memory. For example, the stack above starts at the address 0x4b44000
and grows down; the first inaccessible address is 0x4b4304b.

-The address pointed out as inaccessible is above the stack pointer:
(gdb) info registers
eax 0x100 256
ecx 0x0 0
edx 0x0 0
ebx 0x3acff4 3854324
esp 0x4b42f44 0x4b42f44
ebp 0x4b4304c 0x4b4304c
esi 0x4b43b70 78920560
edi 0x4b431d4 78918100
eip 0x28686c 0x28686c <vsnprintf+12>

-Even though addressability is not supposed to be affected by mprotect, when
I comment out the calls to mprotect the problem doesn't happen any more.

-There's no specific line in the code that is causing the problem. It seems
that the problem always happens at the end of the first page of the stack. I
tried checking the memory at the point of allocation, and the reported
address is valid. It also seems valid when the thread starts running and
becomes invalid only when the stack depth reaches nearly 4K.
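
The numbers in the report above are consistent with the layout Masha describes, and the arithmetic can be written down as a small sketch. The helper names are hypothetical; the sizes (16K guard, 32K usable, 16K guard) and the addresses come from the message itself. The block base works out to 0x4b43040 - 45,120 = 0x4b38000, so the usable region is 0x4b3c000 to 0x4b44000, and the reported address lies 4,032 bytes below the top of the stack, inside the topmost usable page.

```c
#include <stdint.h>

#define GUARD_BYTES  (16u * 1024u)
#define USABLE_BYTES (32u * 1024u)

enum stack_region {
    REGION_OUTSIDE,
    REGION_LOWER_GUARD,
    REGION_USABLE,
    REGION_UPPER_GUARD
};

/* Base of the allocated block, from an address and its offset in the block. */
uintptr_t block_base(uintptr_t addr, uintptr_t offset_in_block)
{
    return addr - offset_in_block;
}

/* Which part of the guard / usable / guard layout an address falls into. */
enum stack_region classify(uintptr_t base, uintptr_t addr)
{
    if (addr < base)
        return REGION_OUTSIDE;
    uintptr_t off = addr - base;
    if (off < GUARD_BYTES)
        return REGION_LOWER_GUARD;
    if (off < GUARD_BYTES + USABLE_BYTES)
        return REGION_USABLE;
    if (off < 2u * GUARD_BYTES + USABLE_BYTES)
        return REGION_UPPER_GUARD;
    return REGION_OUTSIDE;
}

/* Distance from the top of the usable region (where a downward-growing
 * stack starts) to the given address. */
uintptr_t bytes_below_stack_top(uintptr_t base, uintptr_t addr)
{
    return base + GUARD_BYTES + USABLE_BYTES - addr;
}
```

With the register dump above, esp = 0x4b42f44 is 4,284 bytes below the stack top, which matches the observation that the errors start once the stack has grown to about 4K.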

Considering that there's no crash in the program and it runs normally
and correctly in spite of those errors, I don't understand the reason
for valgrind's complaints.

Any help with the cause of the problem or with further investigation will be
highly appreciated, as after spending a few days on this I've run out of
ideas.

Thank you,
Masha.


Re: [Valgrind-users] mandatory function redirection error

From: John R. <jr...@bi...> - 2013-06-05 19:09:44

>> is it certain, based on what you've seen, that I actually am using the debug
>> package?  I installed it, which apparently is all that is required to cause
>> it to get used when compiled for debugging?

> There is a chance of pathname mixup: installing into the "wrong" directory.
> 
> The way to tell is to run valgrind under strace:
> 
>   $ strace -f -o strace.out -e trace=file ~/local/bin/valgrind --leak-check=yes ./clock_gettime CLOCK_MONOTONIC
> 
> and afterwards look in strace.out for any open() on the debug symbol file(s).
> 

Also try
   $ ~/local/bin/valgrind -d -d -d -v -v -v --leak-check=yes ./clock_gettime CLOCK_MONOTONIC

and look for debug symbol loading such as:
-----
--25057:1:main     Load initial debug info
--25057-- Reading syms from /usr/bin/date
--25057--    svma 0x0000401ad0, avma 0x0000401ad0
--25057--   Considering /usr/lib/debug/.build-id/08/884e015589393715fa4c5d3e9a4ab0d1541e99.debug ..
--25057--   .. build-id is valid
-----
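
The "Considering" line in the sample output follows the common build-id convention for separate debug files: the first two hex digits of the build-id name a directory and the remaining digits name the file. The sketch below reconstructs that path. The function name is hypothetical, and the /usr/lib/debug prefix is taken from the log line above rather than from any guarantee about a particular distribution.

```c
#include <stdio.h>
#include <string.h>

/* Writes the conventional separate-debug-file path for a hex build-id
 * into `out`. Returns 0 on success, -1 if the build-id is too short or
 * the buffer is too small. */
int build_id_debug_path(const char *build_id_hex, char *out, size_t out_size)
{
    if (strlen(build_id_hex) < 3)
        return -1;

    /* %.2s prints the first two hex digits; the rest follows the slash. */
    int n = snprintf(out, out_size, "/usr/lib/debug/.build-id/%.2s/%s.debug",
                     build_id_hex, build_id_hex + 2);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return 0;
}
```

If the expected path never appears in the -d -d -d output, the debug package may have been installed somewhere Valgrind does not look.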


Re: [Valgrind-users] mandatory function redirection error

From: John R. <jr...@bi...> - 2013-06-05 17:57:23

> is it certain, based on what you've seen, that I actually am using the debug
> package?  I installed it, which apparently is all that is required to cause
> it to get used when compiled for debugging?

There is a chance of pathname mixup: installing into the "wrong" directory.

The way to tell is to run valgrind under strace:

  $ strace -f -o strace.out -e trace=file ~/local/bin/valgrind --leak-check=yes ./clock_gettime CLOCK_MONOTONIC

and afterwards look in strace.out for any open() on the debug symbol file(s).
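
Scanning strace.out by hand works, but the check can also be scripted. The helper below is a rough sketch with a hypothetical name: it assumes each line of strace.out holds one system call in the usual `name(args) = result` form, and it simply tests whether a line is an open or openat call whose text contains a given substring such as "debug".

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `line` looks like an open()/openat() call whose text
 * contains `needle`, and 0 otherwise. */
int line_opens_path_containing(const char *line, const char *needle)
{
    const char *call = strstr(line, "open(");
    if (call == NULL)
        call = strstr(line, "openat(");
    if (call == NULL)
        return 0;
    /* Search only from the call name onwards, past any pid prefix. */
    return strstr(call, needle) != NULL;
}
```

Each line read from strace.out with fgets can be passed through this function; no matching line at all would suggest that Valgrind never tried the debug symbol file.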


Re: [Valgrind-users] mandatory function redirection error

From: John R. <jr...@bi...> - 2013-06-05 15:45:09

On 06/04/2013, Britton Kerin wrote about "Beagleboard white" ARM using Angstrom Linux:

>>> root@bboneumh2:~/software# ~/local/bin/valgrind --leak-check=yes
>>> ./clock_gettime CLOCK_MONOTONIC
>>> ==28151== Memcheck, a memory error detector
>>> ==28151== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
>>> ==28151== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
>>> ==28151== Command: ./clock_gettime CLOCK_MONOTONIC
>>> ==28151==
>>> ==28151== Conditional jump or move depends on uninitialised value(s)
>>> ==28151==    at 0x400B308: ??? (in /lib/ld-2.12.2.so)
>>> ==28151==

> I'll attach the output of 'objdump -d -S /lib/ld-2.12.2.so' as well.  In
> case I'm misunderstanding things.

The addresses such as
   ==28151==    at 0x400B308: ??? (in /lib/ld-2.12.2.so)
are "in /lib/ld-2.12.2.so", so the code at that location is what matters.

Looking at those addresses (0x4000000 less because of runtime placement
in the process versus no prelinking in the file):

$$$ b1b4:       e0828001        add     r8, r2, r1
    b1b8:       e5970008        ldr     r0, [r7, #8]
    b1bc:       e3500000        cmp     r0, #0
%%% b1c0:       0a00011f        beq     b644 <_dl_rtld_di_serinfo+0x2fbc>
    b1c4:       e1520008        cmp     r2, r8
%%% b1c8:       2a00000d        bcs     b204 <_dl_rtld_di_serinfo+0x2b7c>

    b2bc:       e30a0aab        movw    r0, #43691      ; 0xaaab
    b2c0:       e595c004        ldr     ip, [r5, #4]
    b2c4:       e34a0aaa        movt    r0, #43690      ; 0xaaaa;   0xaaaaaaab = (2<<32)/3, rounded up
    b2c8:       e0815190        umull   r5, r1, r0, r1   ### (r1=hi, r5=lo) = r0 * r1;  32x32==>64
    b2cc:       e1a011a1        lsr     r1, r1, #3       ### r1 = original_r1 / 12;
    b2d0:       e151000c        cmp     r1, ip
    b2d4:       21a0100c        movcs   r1, ip   ### minimum
    b2d8:       e0811081        add     r1, r1, r1, lsl #1   ### r1 = r1 + (r1<<1);  //  * 3
    b2dc:       e1a05101        lsl     r5, r1, #2
    b2e0:       e51b7070        ldr     r7, [fp, #-112] ; 0x70
$$$ b2e4:       e0825005        add     r5, r2, r5
    b2e8:       e08f1007        add     r1, pc, r7
    b2ec:       e2811e51        add     r1, r1, #1296   ; 0x510
    b2f0:       e2811008        add     r1, r1, #8
    b2f4:       e1540001        cmp     r4, r1
    b2f8:       0a00000a        beq     b328 <_dl_rtld_di_serinfo+0x2ca0>
    b2fc:       e35a0000        cmp     sl, #0
    b300:       0a000239        beq     bbec <_dl_rtld_di_serinfo+0x3564>
    b304:       e1520005        cmp     r2, r5
%%% b308:       2a000006        bcs     b328 <_dl_rtld_di_serinfo+0x2ca0>
    b30c:       e5931008        ldr     r1, [r3, #8]
    b310:       e5932000        ldr     r2, [r3]
    b314:       e283300c        add     r3, r3, #12
    b318:       e1550003        cmp     r5, r3
    b31c:       e081100a        add     r1, r1, sl
    b320:       e782100a        str     r1, [r2, sl]
    b324:       8afffff8        bhi     b30c <_dl_rtld_di_serinfo+0x2c84>
    b328:       e59430e4        ldr     r3, [r4, #228]  ; 0xe4
    b32c:       e3530000        cmp     r3, #0
    b330:       0a000362        beq     c0c0 <_dl_rtld_di_serinfo+0x3a38>
    b334:       e51b0064        ldr     r0, [fp, #-100] ; 0x64
    b338:       e5933004        ldr     r3, [r3, #4]
    b33c:       e1500005        cmp     r0, r5
    b340:       e50b3068        str     r3, [fp, #-104] ; 0x68
%%% b344:       9a000088        bls     b56c <_dl_rtld_di_serinfo+0x2ee4>

    c0c0:       e51bc064        ldr     ip, [fp, #-100] ; 0x64
    c0c4:       e15c0005        cmp     ip, r5
%%% c0c8:       9afffd27        bls     b56c <_dl_rtld_di_serinfo+0x2ee4>

we see that the complaints are about comparisons such as
    b1c4:       e1520008        cmp     r2, r8
    b304:       e1520005        cmp     r2, r5
where one of the operands was an addend for the other:
$$$ b1b4:       e0828001        add     r8, r2, r1   ### r8  = r2 + r1;
$$$ b2e4:       e0825005        add     r5, r2, r5   ### r5 += r2;
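
As an aside, the movw/movt/umull/lsr sequence at b2bc..b2cc in the listing
is the usual compiler idiom for unsigned division by a constant: multiply by
a scaled reciprocal and keep the high bits of the product.  A minimal C
sketch of the same arithmetic follows; the names div12 and check_div12 are
illustrative only and do not come from glibc.

```c
#include <assert.h>
#include <stdint.h>

/* Unsigned divide by 12 without a divide instruction: multiply by
 * 0xAAAAAAAB (2**33 / 3, rounded up) and shift the 64-bit product
 * right by 35 in total (33 for the /3, 2 more for the /4). */
uint32_t div12(uint32_t x)
{
    uint64_t product = (uint64_t)x * UINT64_C(0xAAAAAAAB);  /* umull   */
    uint32_t hi = (uint32_t)(product >> 32);                /* RdHi    */
    return hi >> 3;                                         /* lsr #3  */
}

/* Sanity check against the compiler's own division. */
void check_div12(uint32_t x)
{
    assert(div12(x) == x / 12);
}
```

Here the high word of the product plays the role of r1 after the umull, and
the final shift corresponds to "lsr r1, r1, #3".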


This looks like [I saw this a decade ago]:
----- glibc-2.12/elf/do-rel.h
auto inline void __attribute__ ((always_inline))
elf_dynamic_do_rel (struct link_map *map,
                    ElfW(Addr) reladdr, ElfW(Addr) relsize,
                    int lazy)
{
  const ElfW(Rel) *r = (const void *) reladdr;
  const ElfW(Rel) *end = (const void *) (reladdr + relsize);
  ElfW(Addr) l_addr = map->l_addr;

#if (!defined DO_RELA || !defined ELF_MACHINE_PLT_REL) && !defined RTLD_BOOTSTRAP
  /* We never bind lazily during ld.so bootstrap.  Unfortunately gcc is
     not clever enough to see through all the function calls to realize
     that.  */
  if (lazy)
    {
      /* Doing lazy PLT relocations; they need very little info.  */
      for (; r < end; ++r)
        elf_machine_lazy_rel (map, l_addr, r);
    }
-----

----- glibc-2.12/elf/dynamic-link.h
#  define _ELF_DYNAMIC_DO_RELOC(RELOC, reloc, map, do_lazy, test_rel) \
  do {                                                                        \
    struct { ElfW(Addr) start, size; int lazy; } ranges[2];                   \
    ranges[0].lazy = 0;                                                       \
    ranges[0].size = ranges[1].size = 0;                                      \
    ranges[0].start = 0;                                                      \
...
      {                                                                       \
        int ranges_index;                                                     \
        for (ranges_index = 0; ranges_index < 2; ++ranges_index)              \
          elf_dynamic_do_##reloc ((map),                                      \
                                  ranges[ranges_index].start,                 \
                                  ranges[ranges_index].size,                  \
                                  ranges[ranges_index].lazy);                 \
      }                                                                       \
-----

Note that ranges[1].start never is initialized in dynamic-link.h,
while elf_dynamic_do_rel() computes the sum
  const ElfW(Rel) *r = (const void *) reladdr;
  const ElfW(Rel) *end = (const void *) (reladdr + relsize);
and then compares
      for (; r < end; ++r)
which relies on
   (uninit + 0) == the_same_uninit
Thus glibc violates the C standard, because an operation on an uninitialized
value is totally undefined.  Memcheck complains correctly; glibc stubbornly
HAS REFUSED (!!!!!) the trivial fix "ranges[1].start = 0;" which costs *1* CPU cycle.
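
The shape of that loop can be sketched in a few lines of C.  This is a
simplified illustration, not glibc code: struct rel and count_relocs are
hypothetical names, the arithmetic is done on integers rather than pointers,
and the caller is assumed to pass an initialized start address (which is
exactly what the one-line fix would guarantee).  With a size of 0 the end
equals the start, whatever the start is, so the body never runs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ElfW(Rel). */
struct rel { uintptr_t r_offset, r_info; };

/* Count the entries that a do-rel style loop would visit for the
 * range [reladdr, reladdr + relsize). */
size_t count_relocs(uintptr_t reladdr, uintptr_t relsize)
{
    uintptr_t r   = reladdr;
    uintptr_t end = reladdr + relsize;
    size_t n = 0;

    for (; r < end; r += sizeof(struct rel))  /* same test as "r < end" */
        ++n;
    return n;
}
```

For example, count_relocs(0, 0) and count_relocs(0x1000, 0) both visit no
entries; the result depends on the start only when the size is nonzero.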


Glibc uses this code on all platforms, so why doesn't memcheck complain everywhere?
Actually memcheck does notice, but the default suppressions hide the complaint.
Why don't the default suppressions work here?
File valgrind/glibc-2.X.supp contains various suppressions such as:
{
   dl-hack3-cond-0
   Memcheck:Cond
   fun:_dl_start
   fun:_start
}
{
   dl-hack3-cond-1
   Memcheck:Cond
   obj:*/lib*/ld-2.12*.so*
   obj:*/lib*/ld-2.12*.so*
   obj:*/lib*/ld-2.12*.so*
}
which suppress "Cond" complaints ("Conditional jump or move depends on uninitialised value(s)")
in the context of a particular traceback (_dl_start called from _start) or a pattern
of any three nested routines within ld-2.12*.so*.


THE PROBLEM with Angstrom ld-2.12.2.so is that there are so few symbols
that the traceback is only the *one* routine
   ==28151==    at 0x400B308: ??? (in /lib/ld-2.12.2.so)
whose name memcheck does not know.  [The only name that objdump discovers is
_dl_rtld_di_serinfo,  which is highly suspect.]  Thus the known suppressions
do not match the traceback.
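
Why a one-frame traceback defeats a three-frame suppression can be modelled
roughly as below.  This is a simplified sketch under stated assumptions, not
Valgrind's matcher: supp_matches is a hypothetical function, frames are
represented only by their object names, and fnmatch(3) stands in for
Valgrind's own glob code (which also understands "..." wildcards).

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>

/* A suppression is modelled as one glob pattern per frame, innermost
 * first.  It matches only if the traceback has at least that many
 * frames and every pattern matches its frame. */
int supp_matches(const char *const *patterns, size_t npatterns,
                 const char *const *frames, size_t nframes)
{
    if (nframes < npatterns)
        return 0;                      /* traceback too short */
    for (size_t i = 0; i < npatterns; ++i)
        if (fnmatch(patterns[i], frames[i], 0) != 0)
            return 0;                  /* this frame does not match */
    return 1;
}

/* The obj: patterns of dl-hack3-cond-1, the one-frame Angstrom traceback,
 * and a made-up three-frame traceback for comparison. */
const char *const dl_hack3_cond_1[3] = {
    "*/lib*/ld-2.12*.so*", "*/lib*/ld-2.12*.so*", "*/lib*/ld-2.12*.so*"
};
const char *const angstrom_trace[1] = { "/lib/ld-2.12.2.so" };
const char *const deeper_trace[3] = {
    "/lib/ld-2.12.2.so", "/lib/ld-2.12.2.so", "/lib/ld-2.12.2.so"
};
```

In this model the Angstrom traceback fails against all three patterns of
dl-hack3-cond-1, although its single frame would satisfy the first pattern
on its own.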

Therefore Angstrom should provide better debugging info for developers.
[And glibc should fix its outright error.]

--

Re: [Valgrind-users] mandatory function redirection error

From: John R. <jr...@bi...> - 2013-06-03 20:20:14

[[Inconsistently quoted, for brevity.]]

> The actual source of memcheck's complaint is coregrind/m_redir.c:
> void VG_(redir_initialise) ( void )
>    <<snip>>
> #  elif defined(VGP_arm_linux)
>    /* If we're using memcheck, use these intercepts right from
>       the start, otherwise ld.so makes a lot of noise. */
>    if (0==VG_(strcmp)("Memcheck", VG_(details).name)) {
>       add_hardwired_spec(
>          "ld-linux.so.3", "strlen",
>          (Addr)&VG_(arm_linux_REDIR_FOR_strlen),
>          complain_about_stripped_glibc_ldso
>       );
>       //add_hardwired_spec(
>       //   "ld-linux.so.3", "index",
>       //   (Addr)&VG_(arm_linux_REDIR_FOR_index),
>       //   NULL
>       //);
>       add_hardwired_spec(
>          "ld-linux.so.3", "memcpy",
>          (Addr)&VG_(arm_linux_REDIR_FOR_memcpy),
>          complain_about_stripped_glibc_ldso
>       );
>    }


> I finally got around to trying this.  I used 'trunk' in the svn command,
> which I hope is what you meant by top-of-trunk.  It had both the strlen
> and memcpy sections that you mention above uncommented, so I tried
> commenting out the memcpy section.  I then got an error just like the one
> I originally described, except it referred to strlen.  So I tried again
> with that section commented out as well.  Then the program ran, but I
> got some scary messages:
>
> root@bboneumh2:~/software# ~/local/bin/valgrind --leak-check=yes
> ./clock_gettime CLOCK_MONOTONIC
> ==28151== Memcheck, a memory error detector
> ==28151== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
> ==28151== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
> ==28151== Command: ./clock_gettime CLOCK_MONOTONIC
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151==    at 0x400B308: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151==    at 0x400B344: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151==    at 0x400C0C8: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151==    at 0x400B1C0: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> ==28151== Conditional jump or move depends on uninitialised value(s)
> ==28151==    at 0x400B1C8: ??? (in /lib/ld-2.12.2.so)
> ==28151==
> 234081.669967
> ==28151==
> ==28151== HEAP SUMMARY:
> ==28151==     in use at exit: 0 bytes in 0 blocks
> ==28151==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
> ==28151==
> ==28151== All heap blocks were freed -- no leaks are possible
> ==28151==
> ==28151== For counts of detected and suppressed errors, rerun with: -v
> ==28151== Use --track-origins=yes to see where uninitialised values come from
> ==28151== ERROR SUMMARY: 17 errors from 5 contexts (suppressed: 0 from 0)
> 
> I haven't seen anything like those messages before.  Are they likely
> related to the commented-out sections?

Very probably these are due to calls on strlen() or memcpy()
which have been expanded inline because of aggressive optimization
("#include <string.h>" compiled with -O2 or -O3.)
You can check this by examining the generated assembly code
which surrounds the addresses mentioned (0x400B308, 0x400B344, 0x400C0C8, etc.,
which are runtime addresses that are 0x4000000 higher than the addresses
in the filesystem copy of ld-2.12.2.so [or 0x0 higher in case ld-linux
has been prelinked.])  It should look like strlen or memcpy.
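
The address conversion is plain subtraction.  A minimal sketch, assuming the
0x4000000 load bias seen in this report (the macro and function names are
illustrative; the bias would be 0 for a prelinked object):

```c
#include <assert.h>
#include <stdint.h>

/* Load bias observed in this thread for a non-prelinked ld-2.12.2.so. */
#define LD_SO_LOAD_BIAS UINT32_C(0x4000000)

/* Convert a runtime address from memcheck's report into the address
 * that a disassembly of the file on disk would show. */
uint32_t file_address(uint32_t runtime_addr, uint32_t load_bias)
{
    assert(runtime_addr >= load_bias);
    return runtime_addr - load_bias;
}
```

For instance, file_address(0x400B308, LD_SO_LOAD_BIAS) gives 0xB308, which is
the address to look for in the disassembly of the filesystem copy.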

Assuming that it does look like strlen or memcpy, then it is safe to construct
and use one or more suppression rules in order to hide these complaints
in the future.  (We know [assume] that it is safe because other platforms
have hit this problem before, and the code is "the same".  In any event,
you as a "mere" user of ld-linux cannot do anything about that code, so don't
listen to those complaints.)  Invoke valgrind using the additional argument
"--gen-suppressions=yes" and see the documentation about adding the output
to the default suppressions for your platform.

The other choice is to ask the supplier of your glibc for a version of ld-linux
that is compiled without -O2 and without -O3.  [ld-linux is a part of glibc]
Your supplier of ld-linux is acting unfriendly to developers by not making
such a version readily available.  This problem has been known generally
for several years by embedded developers on many platforms.  You should
raise your voice and complain somewhat forcefully to your supplier of ld-linux.
[Of course, this being open source, in theory you could re-compile ld-linux
yourself.  But it is somewhat difficult, cumbersome, and long.]

--

Re: [Valgrind-users] valgrind exits with "Assertion 'in_rx' failed" on Android 4.0.4 target

From: Julian S. <js...@ac...> - 2013-05-30 22:31:25

On 05/30/2013 09:52 PM, Dieter, William R wrote:

> I followed the instructions in README.android, setting HWKIND to
> generic, and making the following changes to get valgrind to build:

Cross-compiling doesn't work well from MacOS hosts.  Those instructions
work better on Linux hosts.

> and included the --smc-check=all in the VGPARAMS, because it sounded
> like it would be required for an x86 build.

Yes.

> m_debuginfo/readelf.c:577 (get_elf_symbol_info): Assertion 'in_rx' failed.

This happens when reading debug info (line numbers etc) from some .so, or
the main executable.  These tend to be hard to diagnose.  First, find out
which object is causing the problem, by rerunning with -v.  Then, ideally
send me the object.  If you can't do that, get a diagnostic dump of the
debuginfo reader by adding the flags --trace-symtab=yes
--trace-symtab-patt=<whatever>.  These generate a huge amount of output, so
I suggest you play around with them first on a simple program (eg, ls) on
the host, so you can see how to use --trace-symtab-patt= to get output for
just the object in question.

Really also you should file a bug report, since bugs reported by email
tend to fall through the cracks.

J

Re: [Valgrind-users] phantom "new" at call leads to "definitely lost"

From: Alan M. <ala...@jp...> - 2013-05-30 21:54:02

On 5/30/2013 11:22 AM, Alan Mazer wrote:
> On 5/30/13 10:16 AM, Dan Kegel wrote:
>> On Thu, May 30, 2013 at 9:15 AM, Alan Mazer 
>> <ala...@jp...> wrote:
>>> This type of error is occurring in a variety of places, always
>>> "definitely lost" memory at a function call.
>>>
>>> Anyone have any idea what I'm missing?  I've tried multiple compilers
>>> and a bazillion different options (on compiler and valgrind). I'm 
>>> stumped.
>> Have you looked at what the compiler is generating, or single-stepped
>> through that call?
>
> I just looked at the output from -S but nothing looks odd.  It very 
> quickly jumps into strcmp calls that are near the top of the called 
> routine.
>
> I've been running this under OS X but I just tried it on a Linux 
> machine and there got more information:
>
> ==15252== 8,233 bytes in 1 blocks are definitely lost in loss record 4 
> of 4
> ==15252==    at 0x4A07152: operator new[](unsigned long) 
> (vg_replace_malloc.c:363)
> ==15252==    by 0x40E1A0: read_conf(char const*, char const*, char 
> const*, char const*, unsigned char, SL_List<char*>*, char**) 
> (parsing.cpp:295)
> ==15252==    by 0x41799A: main (main.cpp:404)
>
> On Linux valgrind gives me a line number within the routine (line 295) 
> that has a legitimate memory leak.
>
> So I guess I just need to run Valgrind on my Linux machine and avoid 
> the Mac?

It looks like the problem is that valgrind can't locate the "new" usages 
within the routines.  Looking at this more I can see that the reported 
memory leaks are all valid, and the tracebacks are perfect except for 
the locations within the "new"-using routines.  That part of the 
traceback is omitted in every message.  Compiling with the optimizer 
enabled heightens the effect, so I thought I might have some 
optimizations accidentally enabled but it doesn't look like it.  It's 
something weird about g++ on OS X...

[Valgrind-users] valgrind exits with "Assertion 'in_rx' failed" on Android 4.0.4 target

From: Dieter, W. R <wil...@in...> - 2013-05-30 19:52:24

I am trying to build valgrind to help debug a native Android
application.  The host I am compiling on is a Mac running Mac OS
10.8.3.  The target is an internal prototype x86 tablet running
Android 4.0.4.  I am using Android NDK r8e.

I started with the release version of Valgrind 3.8.1.  When I ran into
the premature exit described at the end of this note, I switched to
the svn version.

I followed the instructions in README.android, setting HWKIND to
generic, and making the following changes to get valgrind to build:

1) In the environment variable definitions for the build tools,
   substituted "darwin-x86_64" for "linux-x86" in the path to each of
   the tools.

2) Added:

   export RANLIB=$NDKROOT/toolchains/x86-4.4.3/prebuilt/linux-x86/bin/i686-android-linux-ranlib

   to get the right ranlib executable

3) The target cpu/kernel detection logic assumes it is building for
   the host CPU.  The --target and --host options cover most of the
   issues, but the configure script tries to run "uname -r" to get the
   kernel version.

   The logic in configure.in that matches kernel versions treats 2.6.*
   and 3.0.* the same way, so if you are building on a relatively
   recent Linux system it will probably work fine.  Mac OS is
   returning an OS version of 12.3.0, which is unrelated to the
   Android kernel version.

   I hardcoded configure.in to use version "3.0.8" to match my actual
   device, though maybe calling 'adb shell uname -r' would make more
   sense for android targets.

4) The types uint32_t and uint64_t are referenced in the system elf.h,
   and not defined by default on my system, so I added "#include
   <stdint.h>" prior to each "#include <elf.h>"
   (coregrind/m_main.c:2987, coregrind/m_coredump/coredump-elf.c:57,
   coregrind/m_debuginfo/readelf.c:57,
   coregrind/m_initimg/initimg-linux.c:60, coregrind/m_ume/elf.c:53,
   coregrind/launcher-linux.c:47)
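
To illustrate item 3: the check on the kernel release string can be sketched
as follows.  This is only a model of the behaviour described above, not the
text of configure.in (which is shell); kernel_release_supported is a
hypothetical name, and the accepted set (2.6.* and 3.0.*) simply follows the
description in item 3.

```c
#include <assert.h>
#include <stdio.h>

/* Return 1 if a "uname -r" style release string looks like a 2.6.x or
 * 3.0.x Linux kernel, 0 otherwise. */
int kernel_release_supported(const char *release)
{
    int major = 0, minor = 0;

    assert(release != NULL);
    if (sscanf(release, "%d.%d", &major, &minor) != 2)
        return 0;                       /* not of the form N.N... */
    return (major == 2 && minor == 6) || (major == 3 && minor == 0);
}
```

Under this model the device's "3.0.8" is accepted, while the "12.3.0"
reported by Mac OS is rejected, which is why the host's uname output cannot
be used when cross-compiling.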

When I run "/data/local/Inst/bin/valgrind ls", ls runs without any
errors, and I get the expected output:


==32681== Memcheck, a memory error detector
==32681== Copyright (C) 2002-2012, and GNU GPL'd, by Julian Seward et al.
==32681== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
==32681== Command: ls
==32681==
        [... ls output deleted ... ]
==32681==
==32681== HEAP SUMMARY:
==32681==     in use at exit: 1,024 bytes in 1 blocks
==32681==   total heap usage: 41 allocs, 40 frees, 5,967 bytes allocated
==32681==
==32681== LEAK SUMMARY:
==32681==    definitely lost: 0 bytes in 0 blocks
==32681==    indirectly lost: 0 bytes in 0 blocks
==32681==      possibly lost: 0 bytes in 0 blocks
==32681==    still reachable: 1,024 bytes in 1 blocks
==32681==         suppressed: 0 bytes in 0 blocks
==32681== Rerun with --leak-check=full to see details of leaked memory
==32681==
==32681== For counts of detected and suppressed errors, rerun with: -v
==32681== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

I then added a wrapper, as described by jseward's blog post
(http://blog.mozilla.org/jseward/2011/09/27/valgrind-on-android-current-status/),
and included the --smc-check=all in the VGPARAMS, because it sounded
like it would be required for an x86 build.  The whole
/data/local/start_valgrind_myprog file looks like this:

    #!/system/bin/sh
    VGPARAMS='--error-limit=no --smc-check=all'
    export TMPDIR=/data/data/com.intel.central
    exec /data/local/Inst/bin/valgrind $VGPARAMS $*

When I start my application with:

    am start -a android.intent.action.MAIN -n com.intel.central/.MainActivity

I see the following in logcat:

I//data/local/start_valgrind_myprog(32640): ==32641== Using Valgrind-3.9.0.SVN and LibVEX; rerun with -h for copyright info
I//data/local/start_valgrind_myprog(32640): ==32641== Command: /system/bin/app_process /system/bin --application --nice-name=com.intel.central com.android.internal.os.WrapperInit 32 17 android.app.ActivityThread
I//data/local/start_valgrind_myprog(32640): ==32641==
I//data/local/start_valgrind_myprog(32640): valgrind: m_debuginfo/readelf.c:577 (get_elf_symbol_info): Assertion 'in_rx' failed.
I//data/local/start_valgrind_myprog(32640): ==32641== at 0x38033455: report_and_quit (m_libcassert.c:260)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x38033851: vgPlain_assert_fail (m_libcassert.c:340)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x3806D80E: read_elf_symtab__normal (readelf.c:577)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x380705F3: vgModuleLocal_read_elf_debug_info (readelf.c:2655)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x3806A449: vgPlain_di_notify_mmap (debuginfo.c:629)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x38097510: vgModuleLocal_generic_PRE_sys_mmap (syswrap-generic.c:2087)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x380C9A0B: vgSysWrap_x86_linux_sys_mmap2_before (syswrap-x86-linux.c:1247)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x3808C830: vgPlain_client_syscall (syswrap-main.c:1522)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x38089C12: vgPlain_scheduler (scheduler.c:1066)
I//data/local/start_valgrind_myprog(32640): ==32641== by 0x380C1188: run_a_thread_NORETURN (syswrap-linux.c:103)
I//data/local/start_valgrind_myprog(32640): sched status:
I//data/local/start_valgrind_myprog(32640): running_tid=1
I//data/local/start_valgrind_myprog(32640): Thread 1: status = VgTs_Runnable
I//data/local/start_valgrind_myprog(32640): ==32641== at 0xB000F261: __dl___mmap2 (in /system/bin/linker)
I//data/local/start_valgrind_myprog(32640): Note: see also the FAQ in the source distribution.
I//data/local/start_valgrind_myprog(32640): It contains workarounds to several common problems.
I//data/local/start_valgrind_myprog(32640): In particular, if Valgrind aborted or crashed after
I//data/local/start_valgrind_myprog(32640): identifying problems in your program, there's a good chance
I//data/local/start_valgrind_myprog(32640): that fixing those problems will prevent Valgrind aborting or
I//data/local/start_valgrind_myprog(32640): crashing, especially if it happened in m_mallocfree.c.
I//data/local/start_valgrind_myprog(32640): If that doesn't help, please report this bug to: www.valgrind.org
I//data/local/start_valgrind_myprog(32640): In the bug report, send all the above text, the valgrind
I//data/local/start_valgrind_myprog(32640): version, and what OS and version you are using. Thanks.

The wrapper script appears to be launching the application, but it
looks like valgrind is exiting immediately with an assertion failure.
Any ideas what could be going wrong or how to fix it?

Thanks,
Bill.

Re: [Valgrind-users] phantom "new" at call leads to "definitely lost"

From: Alan M. <al...@ju...> - 2013-05-30 18:51:20

On 5/30/13 9:43 AM, David Chapman wrote:
> On 5/30/2013 9:15 AM, Alan Mazer wrote:
>> I've been using valgrind for many years and am finally stumped. After
>> not having used it recently, I'm now getting "definitely lost" memory
>> and the location of the "operator new" usage is a function call.
>>
>> For example...
>>
>> ==35995== 2,089 bytes in 1 blocks are definitely lost in loss record 9
>> of 10
>> ==35995==    at 0x10008FC16: malloc (vg_replace_malloc.c:274)
>> ==35995==    by 0x1000A126C: operator new(unsigned long) (in
>> /usr/local/lib/libstdc++.6.dylib)
>> ==35995==    by 0x10001984C: main (main.cpp:405)
>>
>> where main.cpp:405 is...
>>
>>       read_conf(conffile, ".conf;.test;.prot", preface, 
>> trigger_statement,
>>               skip_preprocessor, m4dirs, &text);
>>
>> Everything being passed is a pointer.
>>
>> If I don't call the routine, the error does go away.
>>
>> This type of error is occurring in a variety of places, always
>> "definitely lost" memory at a function call.
>>
>> Anyone have any idea what I'm missing?  I've tried multiple compilers
>> and a bazillion different options (on compiler and valgrind). I'm 
>> stumped.
>>
>> -- Alan
>>
>
> What is the signature of the function being called?  You say that 
> pointers are being passed in, but what does the function expect? For 
> example, is the compiler automatically creating an object for the 
> string constant using a default constructor?  (I realize the loss 
> record is longer than the string constant, but I don't know what the 
> function expects.)

Sure.  Good point.

extern int read_conf(const char *filename, const char *extensions,
     const char *preface, const char *postscript, u_char skip_preprocessor,
     SL_List<char *> *m4dirs, char **conf_text_p);


>
> Also, is the function in a .so file that has been stripped so that 
> Valgrind cannot follow its stack trace?

No, there are no .so files here.  Just a bunch of object files linked 
together, all compiled with -g and -Wall, linked with -lm. I've tried 
with g++ 4.5.1 and 4.7.0.  They both give the same result.


>
> And of course, what version of Valgrind are you using?

3.8.1.

-- Alan

Re: [Valgrind-users] phantom "new" at call leads to "definitely lost"

From: Alan M. <al...@ju...> - 2013-05-30 18:23:28

On 5/30/13 10:16 AM, Dan Kegel wrote:
> On Thu, May 30, 2013 at 9:15 AM, Alan Mazer <ala...@jp...> wrote:
>> This type of error is occurring in a variety of places, always
>> "definitely lost" memory at a function call.
>>
>> Anyone have any idea what I'm missing?  I've tried multiple compilers
>> and a bazillion different options (on compiler and valgrind).  I'm stumped.
> Have you looked at what the compiler is generating, or single-stepped
> through that call?

I just looked at the output from -S but nothing looks odd.  It very 
quickly jumps into strcmp calls that are near the top of the called routine.

I've been running this under OS X but I just tried it on a Linux machine 
and there got more information:

==15252== 8,233 bytes in 1 blocks are definitely lost in loss record 4 of 4
==15252==    at 0x4A07152: operator new[](unsigned long) 
(vg_replace_malloc.c:363)
==15252==    by 0x40E1A0: read_conf(char const*, char const*, char 
const*, char const*, unsigned char, SL_List<char*>*, char**) 
(parsing.cpp:295)
==15252==    by 0x41799A: main (main.cpp:404)

On Linux valgrind gives me a line number within the routine (line 295) 
that has a legitimate memory leak.

So I guess I just need to run Valgrind on my Linux machine and avoid the 
Mac?

-- Alan

Re: [Valgrind-users] phantom "new" at call leads to "definitely lost"

From: David C. <dcc...@ac...> - 2013-05-30 17:36:51

On 5/30/2013 9:15 AM, Alan Mazer wrote:
> I've been using valgrind for many years and am finally stumped. After
> not having used it recently, I'm now getting "definitely lost" memory
> and the location of the "operator new" usage is a function call.
>
> For example...
>
> ==35995== 2,089 bytes in 1 blocks are definitely lost in loss record 9
> of 10
> ==35995==    at 0x10008FC16: malloc (vg_replace_malloc.c:274)
> ==35995==    by 0x1000A126C: operator new(unsigned long) (in
> /usr/local/lib/libstdc++.6.dylib)
> ==35995==    by 0x10001984C: main (main.cpp:405)
>
> where main.cpp:405 is...
>
>       read_conf(conffile, ".conf;.test;.prot", preface, trigger_statement,
>               skip_preprocessor, m4dirs, &text);
>
> Everything being passed is a pointer.
>
> If I don't call the routine, the error does go away.
>
> This type of error is occurring in a variety of places, always
> "definitely lost" memory at a function call.
>
> Anyone have any idea what I'm missing?  I've tried multiple compilers
> and a bazillion different options (on compiler and valgrind).  I'm stumped.
>
> -- Alan
>

What is the signature of the function being called?  You say that 
pointers are being passed in, but what does the function expect? For 
example, is the compiler automatically creating an object for the string 
constant using a default constructor?  (I realize the loss record is 
longer than the string constant, but I don't know what the function 
expects.)

Also, is the function in a .so file that has been stripped so that 
Valgrind cannot follow its stack trace?

And of course, what version of Valgrind are you using?

-- 
     David Chapman      dcc...@ac...
     Chapman Consulting -- San Jose, CA
     Software Development Done Right.
     www.chapman-consulting-sj.com

Re: [Valgrind-users] phantom "new" at call leads to "definitely lost"

From: Dan K. <da...@ke...> - 2013-05-30 17:17:06

On Thu, May 30, 2013 at 9:15 AM, Alan Mazer <ala...@jp...> wrote:
> This type of error is occurring in a variety of places, always
> "definitely lost" memory at a function call.
>
> Anyone have any idea what I'm missing?  I've tried multiple compilers
> and a bazillion different options (on compiler and valgrind).  I'm stumped.

Have you looked at what the compiler is generating, or single-stepped
through that call?

[Valgrind-users] phantom "new" at call leads to "definitely lost"

From: Alan M. <ala...@jp...> - 2013-05-30 16:15:40

I've been using valgrind for many years and am finally stumped. After 
not having used it recently, I'm now getting "definitely lost" memory 
and the location of the "operator new" usage is a function call.

For example...

==35995== 2,089 bytes in 1 blocks are definitely lost in loss record 9 
of 10
==35995==    at 0x10008FC16: malloc (vg_replace_malloc.c:274)
==35995==    by 0x1000A126C: operator new(unsigned long) (in 
/usr/local/lib/libstdc++.6.dylib)
==35995==    by 0x10001984C: main (main.cpp:405)

where main.cpp:405 is...

     read_conf(conffile, ".conf;.test;.prot", preface, trigger_statement,
             skip_preprocessor, m4dirs, &text);

Everything being passed is a pointer.

If I don't call the routine, the error does go away.

This type of error is occurring in a variety of places, always 
"definitely lost" memory at a function call.

Anyone have any idea what I'm missing?  I've tried multiple compilers 
and a bazillion different options (on compiler and valgrind).  I'm stumped.

-- Alan

Re: [Valgrind-users] mandatory function redirection error

From: John R. <jr...@bi...> - 2013-05-27 05:06:10

On 05/23/2013 10:14 AM, Britton Kerin wrote:
> On Mon, May 20, 2013 at 7:53 AM, John Reiser <jr...@bi...> wrote:
>>> I'm on a beaglebone white with vanilla Angstrom (linux 3.2)
>>> distribution.  Valgrind fails like this:
>>
>>>    ==13719== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
>>
>>>    valgrind:  A must-be-redirected function
>>>    valgrind:  whose name matches the pattern:      memcpy
>>>    valgrind:  in an object with soname matching:   ld-linux.so.3
>>>    valgrind:  was not found whilst processing
>>>    valgrind:  symbols from the object with soname: ld-linux.so.3
>>
>>> I've installed libc6-dbg as it says.  Same problem.
>>
>> Which other Linux distribution is Angstrom derived from, or is most similar to?
> 
> Well, it's based in part on OpenZaurus, which was debian-based.  And it seems
> pretty debian-like in the way its package manager opkg works.  But that's
> about all I know, I'm just using it because it ships on the beaglebone.

OK.  It's Debian.  The specific version of Debian might matter,
but perhaps in this case that info is no longer so significant.

> 
>> Try running the previous version 3.7 of valgrind.
> 
> This version doesn't seem to be available on the release archive page,
> where can I get it?

In general, try the Internet Archive.  But in this case the source code
to valgrind is better anyway; see below.

> 
>> Also post the output from:
>>   readelf --symbols ld-linux.so.3  |  grep mem
> 
>   root@bboneumh:/# readelf  --symbols /lib/ld-linux.so.3 | grep mem
>       12: 000150a8   332 FUNC    WEAK   DEFAULT   10 __libc_memalign@@GLIBC_2.4
>   root@bboneumh:/#

The actual source of memcheck's complaint is coregrind/m_redir.c:
void VG_(redir_initialise) ( void )
   <<snip>>
#  elif defined(VGP_arm_linux)
   /* If we're using memcheck, use these intercepts right from
      the start, otherwise ld.so makes a lot of noise. */
   if (0==VG_(strcmp)("Memcheck", VG_(details).name)) {
      add_hardwired_spec(
         "ld-linux.so.3", "strlen",
         (Addr)&VG_(arm_linux_REDIR_FOR_strlen),
         complain_about_stripped_glibc_ldso
      );
      //add_hardwired_spec(
      //   "ld-linux.so.3", "index",
      //   (Addr)&VG_(arm_linux_REDIR_FOR_index),
      //   NULL
      //);
      add_hardwired_spec(
         "ld-linux.so.3", "memcpy",
         (Addr)&VG_(arm_linux_REDIR_FOR_memcpy),
         complain_about_stripped_glibc_ldso
      );
   }

So the deal is: memcheck for ARM Linux [the strcmp test above restricts
these intercepts to memcheck] will demand that ld-linux.so.3 has a
subroutine named "memcpy"
which can be re-directed to an internal valgrind routine
that "does the same thing" in a way that valgrind considers
to be easier to understand and emulate or instrument.
Perhaps your version of ld-linux.so.3 does not have such a 'memcpy'.
The "readelf --symbols" shows that ld-linux.so.3 does not have this
itself, and you say that installing the debug package still does not
have a symbol for 'memcpy'.  So perhaps ld-linux.so.3 no longer uses
_subroutine_ memcpy; it was replaced by memmove, or a macro which was
expanded inline, or by an open-coded loop.  Therefore, try commenting-out
the code for memcpy, just like the code for 'index' is commented out:
      //add_hardwired_spec(
      //   "ld-linux.so.3", "memcpy",
      //   (Addr)&VG_(arm_linux_REDIR_FOR_memcpy),
      //   complain_about_stripped_glibc_ldso
      //);
Visit the main web page  http://www.valgrind.org/ ,
follow the "Code Repository" link, and use the 'svn' source code manager
to check out some version (try top-of-trunk first, else 3.8.1).
Modify VG_(redir_initialise) as indicated.  Build, install, test, use.
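
As a rough illustration of why the missing symbol is fatal, here is a toy
model in C of a table of must-be-redirected functions checked against an
object's symbol list.  This is not valgrind's code: ToySpec, has_symbol and
count_unsatisfied are hypothetical names invented for this sketch, and the
"mandatory" flag only stands in for the fourth argument seen in the
add_hardwired_spec calls above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one hardwired redirect spec. */
typedef struct {
    const char *soname;     /* object the function must live in */
    const char *symbol;     /* function name to intercept */
    int         mandatory;  /* nonzero: complain if the symbol is absent */
} ToySpec;

/* Return 1 if `name` appears in the symbol list, else 0. */
static int has_symbol(const char *const *symbols, size_t nsyms,
                      const char *name)
{
    for (size_t i = 0; i < nsyms; i++)
        if (strcmp(symbols[i], name) == 0)
            return 1;
    return 0;
}

/* Count the mandatory specs for `soname` that the symbol list does not
   satisfy, printing a complaint to stderr for each one. */
static int count_unsatisfied(const ToySpec *specs, size_t nspecs,
                             const char *soname,
                             const char *const *symbols, size_t nsyms)
{
    int missing = 0;
    for (size_t i = 0; i < nspecs; i++) {
        if (strcmp(specs[i].soname, soname) != 0)
            continue;               /* spec is for a different object */
        if (!specs[i].mandatory)
            continue;               /* optional intercept: no complaint */
        if (!has_symbol(symbols, nsyms, specs[i].symbol)) {
            fprintf(stderr, "must-be-redirected function %s not found in %s\n",
                    specs[i].symbol, soname);
            missing++;
        }
    }
    return missing;
}
```

Given mandatory specs for "strlen" and "memcpy" in "ld-linux.so.3", and a
symbol list holding only "strlen" (assumed present, for illustration) and
"__libc_memalign" (the one entry readelf reported), count_unsatisfied
returns 1, for memcpy.  Commenting out the memcpy spec corresponds to
removing that entry from the table, so nothing is left to complain about.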

--

Re: [Valgrind-users] mandatory function redirection error

From: Britton K. <bri...@gm...> - 2013-05-23 17:14:25

On Mon, May 20, 2013 at 7:53 AM, John Reiser <jr...@bi...> wrote:
>> I'm on a beaglebone white with vanilla Angstrom (linux 3.2)
>> distribution.  Valgrind fails like this:
>
>>    ==13719== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
>
>>    valgrind:  A must-be-redirected function
>>    valgrind:  whose name matches the pattern:      memcpy
>>    valgrind:  in an object with soname matching:   ld-linux.so.3
>>    valgrind:  was not found whilst processing
>>    valgrind:  symbols from the object with soname: ld-linux.so.3
>
>> I've installed libc6-dbg as it says.  Same problem.
>
> Which other Linux distribution is Angstrom derived from, or is most similar to?

Well, it's based in part on OpenZaurus, which was Debian-based.  And it seems
pretty Debian-like in the way its package manager opkg works.  But that's
about all I know; I'm just using it because it ships on the beaglebone.

> Try running the previous version 3.7 of valgrind.

This version doesn't seem to be available on the release archive page.
Where can I get it?

> Also post the output from:
>   readelf --symbols ld-linux.so.3  |  grep mem

  root@bboneumh:/# readelf  --symbols /lib/ld-linux.so.3 | grep mem
      12: 000150a8   332 FUNC    WEAK   DEFAULT   10 __libc_memalign@@GLIBC_2.4
  root@bboneumh:/#

Note that I had to use /lib/ld-linux.so.3 as the argument to readelf.
The command as you suggested it gave this:

  root@bboneumh2:~# readelf --symbols ld-linux.so.3  |  grep mem
  readelf: Error: 'ld-linux.so.3': No such file
  root@bboneumh2:~#

Britton

Re: [Valgrind-users] Helgrind data race question

From: Phil L. <plo...@sa...> - 2013-05-22 15:52:38

We discussed this internally, and think that the pthread_mutex_unlock() call will provide the memory barrier to force synchronization.

" Yes, pthread_mutex_unlock is a memory barrier (it would be quite useless otherwise).  Chapter and verse:

http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11

The problem with helgrind is that it will have a very hard time proving that the second thread cannot access the shared memory until after the pthread_mutex_unlock() has occurred.  Some kind of helgrind annotation for this worker queue case would probably be the easiest way out."
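
One possible form of such an annotation, sketched in C rather than the
C++ of the original snippet (malloc instead of new).  It assumes the
ANNOTATE_HAPPENS_BEFORE / ANNOTATE_HAPPENS_AFTER macros from
valgrind/helgrind.h and falls back to no-ops when that header is not
installed.  The condition variable, the run_handoff helper and the choice
of &shared_ptr as the annotation tag are additions for this sketch, not
part of the code under discussion; whether the annotations are needed at
all depends on the real program.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Use the real Helgrind macros when the header is available. */
#if defined(__has_include)
#  if __has_include(<valgrind/helgrind.h>)
#    include <valgrind/helgrind.h>
#  endif
#endif
#ifndef ANNOTATE_HAPPENS_BEFORE
#  define ANNOTATE_HAPPENS_BEFORE(obj) ((void)(obj))  /* no-op fallback */
#  define ANNOTATE_HAPPENS_AFTER(obj)  ((void)(obj))  /* no-op fallback */
#endif

static pthread_mutex_t lock  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  ready = PTHREAD_COND_INITIALIZER;
static int *shared_ptr = NULL;

/* Thread 1: fill in the object, then publish its address under the lock. */
static void *producer(void *arg)
{
    (void)arg;
    int *my_ptr = malloc(sizeof *my_ptr);
    if (my_ptr == NULL)
        abort();
    *my_ptr = 10;                          /* written outside the lock */
    ANNOTATE_HAPPENS_BEFORE(&shared_ptr);  /* everything above is "sent" */
    pthread_mutex_lock(&lock);
    shared_ptr = my_ptr;
    pthread_cond_signal(&ready);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Thread 2: fetch the address under the lock, read the object outside it. */
static void *consumer(void *arg)
{
    int *result = arg;
    pthread_mutex_lock(&lock);
    while (shared_ptr == NULL)             /* wait until published */
        pthread_cond_wait(&ready, &lock);
    int *my_ptr = shared_ptr;
    pthread_mutex_unlock(&lock);
    ANNOTATE_HAPPENS_AFTER(&shared_ptr);   /* pairs with the BEFORE above */
    *result = *my_ptr;                     /* read outside the lock */
    return NULL;
}

/* Run one handoff and return the value the consumer read (-1 on error). */
int run_handoff(void)
{
    pthread_t prod, cons;
    int result = -1;

    shared_ptr = NULL;
    if (pthread_create(&prod, NULL, producer, NULL) != 0)
        return -1;
    if (pthread_create(&cons, NULL, consumer, &result) != 0) {
        pthread_join(prod, NULL);
        free(shared_ptr);
        shared_ptr = NULL;
        return -1;
    }
    pthread_join(prod, NULL);
    pthread_join(cons, NULL);
    assert(shared_ptr != NULL);            /* the handoff completed */
    free(shared_ptr);
    shared_ptr = NULL;
    return result;
}
```

Build with -pthread and call run_handoff() from main; to see what Helgrind
makes of it, run the resulting binary under valgrind --tool=helgrind.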

-----Original Message-----
From: David Faure [mailto:fa...@kd...] 
Sent: Thursday, May 16, 2013 2:34 PM
To: val...@li...
Cc: Phil Longstaff
Subject: Re: [Valgrind-users] Helgrind data race question

On Tuesday 14 May 2013 20:18:44 Phil Longstaff wrote:
> int* my_ptr = new int;
> *my_ptr = 10;
> pthread_mutex_lock(&lock);
> shared_ptr = my_ptr;
> pthread_mutex_unlock(&lock);
> 
> Thread 2:
> pthread_mutex_lock(&lock);
> int* my_ptr = shared_ptr;
> pthread_mutex_unlock(&lock);
> ... = *my_ptr;

You're reading a region of memory outside mutex protection, and that region of memory was written to outside mutex protection. That's the basic definition of a data race.

Getting the address of that region of memory within the mutex doesn't change that.

You see it as non-racy because "how could *my_ptr ever be anything other than 10" ... but on a multi-processor system, the write of the value 10 might not have been propagated to the cache of the other processor where the read happens, since the system had no reason to perform that synchronisation.

-- 
David Faure, fa...@kd..., http://www.davidfaure.fr
Working on KDE, in particular KDE Frameworks 5

