|
From: Vojtech F. <voj...@se...> - 2008-03-26 16:04:53
|
Hi,

I am trying to use helgrind in a project that uses STLPort (a well-known and widely used C++ STL library implementation, http://www.stlport.org/). Helgrind did find some problems in our code, but there are also some strange errors, usually connected with STLPort.

My configuration:
OpenSuse 10.2
gcc 3.3.3
Valgrind 3.3.0
STLPort 4.? (I do not know the exact version, it is nowhere to be found)

STLPort is configured to run without the compiled library and with the "newalloc" allocator. That is why I think there should be no problem with pooled memory reuse. I would expect problems with the default "nodealloc" allocator. STLport uses pthreads, however I do not know exactly which parts of them.

Our test suite is configured as a queue from which 3 (or any number of) threads read the configuration for each job and do the work. But the threads do not share memory. (They share only shared libraries.) There were some problems because the tests were run from Python, such as pthread_join not being called when joining Python threads, or objects being destroyed from the garbage collector thread, but we have fixed them.

The reported errors look like this (I had to change <> to [] not to be accused of top posting :) ):

Possible data race during write of size 4 at 0xB964010
   at 0x6FF878B: _STL::pair[...]::pair(_STL::pair[...] const&) (_pair.h:54)
   by 0x6FF8777: void _STL::_Construct[_STL::pair[...]*], _STL::pair[...] ] (_STL::pair[...]*, _STL::pair[...] const&) (_construct.h:97)
   by 0x6FF722B: _STL::vector[_STL::pair[...], _STL::allocator [_STL::pair[...] ] ]::push_back(_STL::pair[...] const&) (_vector.h:333)
   ...our code...
   ...python code...
  Old state: owned exclusively by thread #3
  New state: shared-modified by threads #3, #4
  Reason: this thread, #4, holds no locks at all

Or

Possible data race during write of size 4 at 0x9FC3010
   ...our code... (destructors)
   by 0x7124761: void _STL::_Destroy[...::UndoManagerImpl::CmdHistoryItem] (...::UndoManagerImpl::CmdHistoryItem*) (_construct.h:67)
   by 0x71293E8: void _STL::__destroy_aux[...::UndoManagerImpl::CmdHistoryItem*] (...::UndoManagerImpl::CmdHistoryItem*, ...::UndoManagerImpl::CmdHistoryItem*, _STL::__false_type const&) (_construct.h:124)
   by 0x71293B6: void _STL::__destroy[...::UndoManagerImpl::CmdHistoryItem*, ...::UndoManagerImpl::CmdHistoryItem](...::UndoManagerImpl::CmdHistoryItem*, ...::UndoManagerImpl::CmdHistoryItem*, ...::UndoManagerImpl::CmdHistoryItem*) (_construct.h:134)
   by 0x7129388: void _STL::_Destroy[...](..., ...) (_construct.h:139)
   by 0x7127EE7: _STL::deque[...::UndoManagerImpl::CmdHistoryItem, _STL::allocator[...::UndoManagerImpl::CmdHistoryItem] ]::clear() (_deque.c:249)
   ...our code...
   ...python code...
  Old state: shared-readonly by threads #2, #4
  New state: shared-modified by threads #2, #4
  Reason: this thread, #4, holds no consistent locks
  Location 0x9FC3010 has never been protected by any lock

But the stack is in classes that are not shared between the worker threads. I tried to add some checks to see whether this is true. The checker reads the thread ID in its constructor and compares it with the current thread on demand in some methods and destructors. The checks didn't report any conflicting threads.

Any ideas or any experiences with helgrind and STLPort are welcome.

Thanks,
Vojtech
|
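A rough sketch of the kind of thread-ownership checker described above, for readers who want to try the same experiment. The names ThreadOwnerCheck and Job are hypothetical and not from the original post, and std::this_thread::get_id() (C++11) stands in for whatever pthread call the original checker used. Note that such a checker only catches calls made from a foreign thread through the instrumented methods; it says nothing about memory that is handed from one thread to another.

```cpp
#include <thread>

// Hypothetical helper: remembers which thread constructed it and reports
// whether the calling thread is that same thread.
class ThreadOwnerCheck {
public:
    ThreadOwnerCheck() : owner_(std::this_thread::get_id()) {}

    // True when called from the thread that constructed this object.
    bool calledFromOwner() const {
        return std::this_thread::get_id() == owner_;
    }

private:
    std::thread::id owner_;
};

// Illustrative class that embeds the checker and consults it in a method.
struct Job {
    ThreadOwnerCheck check;
    int value;

    Job() : value(0) {}

    // Refuses the write when called from a thread other than the creator.
    bool setValue(int v) {
        if (!check.calledFromOwner()) return false;
        value = v;
        return true;
    }
};
```

Called from the constructing thread, setValue() succeeds; from any other thread it returns false without touching the object.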
|
From: Bart V. A. <bar...@gm...> - 2008-03-26 16:30:38
|
On Wed, Mar 26, 2008 at 5:04 PM, Vojtech Fried <voj...@se...> wrote:
> Possible data race during write of size 4 at 0x9FC3010
> ...our code... (destructors)

Hello Vojtech,

It would help a lot if you could post a small example such that the issue you see can be reproduced. And with regard to destructors: does your class have virtual functions? Please keep in mind that in the destructor of a class with virtual functions the compiler always modifies the vtable pointer. This can trigger data race reports that are hard to track down if you are unaware of this effect.

Bart.
|
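The vtable effect mentioned above can be observed from inside the language. The sketch below uses made-up class names (Base, Derived) and shows that a virtual call made from the base-class destructor no longer reaches the derived override. Typical compilers implement this by storing the base class's vtable pointer into the object when the base destructor starts, which is a write to the object that a race detector can see; that implementation detail is an assumption about common ABIs, not something the language requires.

```cpp
#include <string>

// Records which override ran while the object was being destroyed.
static std::string g_seenInBaseDtor;

struct Base {
    // Virtual call inside the destructor: dispatches as if the object
    // were a plain Base, even when it started life as a Derived.
    virtual ~Base() { g_seenInBaseDtor = name(); }
    virtual std::string name() const { return "Base"; }
};

struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

// Destroys a Derived object and returns what ~Base observed.
std::string nameSeenDuringBaseDestruction() {
    {
        Derived d;
        // At this point d.name() would return "Derived".
    }   // ~Derived runs first, then ~Base.
    return g_seenInBaseDtor;
}
```

nameSeenDuringBaseDestruction() returns "Base", which is why a destructor of a polymorphic class counts as a write to the object even if the destructor body itself touches no members.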
|
From: Vojtech F. <voj...@se...> - 2008-03-26 16:54:28
|
> Hello Vojtech,
>
> It would help a lot if you could post a small example such that the
> issue you see can be reproduced. And with regard to destructors: does
> your class have virtual functions ? Please keep in mind that in the
> destructor of a class with virtual functions the compiler always
> modifies the vtable pointer. This can trigger data race reports that
> are hard to track down if you are unaware of this effect.
>
> Bart.

Hi,

I will try to make a small sample.

Yes, the classes have virtual destructors. But I don't think it is a problem. The problem is that the threads do not share (at least should not share) the memory they seem to share according to helgrind reports.

Vojtech.
|
|
From: Konstantin S. <kon...@gm...> - 2008-03-26 18:07:26
|
Hi Vojtech,

Yes, a minimized example will help a lot.
Also, you may try helgrind from HGDEV branch instead of using Valgrind 3.3.0.

>> Our test suite is configured as a queue from which 3 (or any number) of threads read configuration for each job and do the work.

If you use a message queue it might require annotations... See example here:
http://code.google.com/p/data-race-test/source/browse/trunk/unittest/thread_wrappers_pthread.h#254

--kcc
|
|
From: Vojtech F. <voj...@se...> - 2008-03-27 15:25:11
|
> Hi Vojtech,
>
> Yes, a minimized example will help a lot.
> Also, you may try helgrind from HGDEV branch instead of using Valgrind 3.3.0.
>
> >> Our test suite is configured as a queue from which 3 (or any
> number) of threads read configuration for each job and do the work.
> If you use a message queue it might require annotations... See example
> here: http://code.google.com/p/data-race-test/source/browse/trunk/unittest/thread_wrappers_pthread.h#254
>
> --kcc

Hi,

I am afraid I am unable to replicate the problem on a small sample.

The queue should not be a problem. It is only a python list of strings that encode a configuration for a job to be done by the worker thread.

Vojtech
|
|
From: Tom H. <to...@co...> - 2008-03-27 15:29:43
|
In message <loo...@po...>
Vojtech Fried <voj...@se...> wrote:
> The queue should not be a problem. It is only a python list of strings that
> encode a configuration for a job to be done by the worker thread.
So are you constructing a string in one thread, then passing it
through a queue to another thread where it is eventually freed?
I suspect that will cause a problem won't it as helgrind will see
accesses in different threads with no locks? Because the locks are
on the queue passing...
I saw something similar with helgrind reporting problems for a
database connection pool that was actually quite safe as a given
connection was only ever used by one thread but as the locks were
only held while connections were added to/removed from the shared
pool it looked like the connection was being used in multiple
threads without any locking.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
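The hand-off pattern Tom describes can be sketched as follows. This is a minimal illustration, not the poster's code: HandOffQueue and produceAndConsume are hypothetical names, and std::thread/std::mutex (C++11) are used instead of raw pthreads. The point is that the mutex is held only while the pointer is pushed or popped; the string's own memory is written by one thread and later modified and freed by another with no lock held. A purely lockset-based detector may flag this even though the queue orders the accesses correctly.

```cpp
#include <cstddef>
#include <mutex>
#include <queue>
#include <string>
#include <thread>

// Hypothetical hand-off queue: the lock protects only the queue itself.
class HandOffQueue {
public:
    void push(std::string* s) {
        std::lock_guard<std::mutex> lock(m_);
        q_.push(s);
    }

    // Returns nullptr when the queue is currently empty.
    std::string* tryPop() {
        std::lock_guard<std::mutex> lock(m_);
        if (q_.empty()) return nullptr;
        std::string* s = q_.front();
        q_.pop();
        return s;
    }

private:
    std::mutex m_;
    std::queue<std::string*> q_;
};

// One thread builds a string, another modifies and frees it. Neither
// access to the string's memory happens while the queue lock is held.
std::size_t produceAndConsume() {
    HandOffQueue queue;
    std::size_t length = 0;

    std::thread consumer([&queue, &length] {
        std::string* s = nullptr;
        while ((s = queue.tryPop()) == nullptr) {
            std::this_thread::yield();      // wait for the producer
        }
        s->append("!");                     // written here, no lock held
        length = s->size();
        delete s;                           // freed by the consumer
    });

    std::thread producer([&queue] {
        queue.push(new std::string("job-config"));  // built by the producer
    });

    producer.join();
    consumer.join();
    return length;
}
```

The program itself is correctly synchronized: the mutex release in push() and the acquire in tryPop() order the producer's writes before the consumer's. Whether a given tool understands that ordering depends on its algorithm, which is what the annotations mentioned earlier in the thread are for.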
|
From: Vojtech F. <voj...@se...> - 2008-03-27 15:46:32
|
Tom Hughes <tom <at> compton.nu> writes:
>
> In message <loom.20080327T151721-715 <at> post.gmane.org>
> Vojtech Fried <vojtech.fried <at> seznam.cz> wrote:
>
> > The queue should not be a problem. It is only a python list of strings that
> > encode a configuration for a job to be done by the worker thread.
>
> So are you constructing a string in one thread, then passing it
> through a queue to another thread where it is eventually freed?
>
> I suspect that will cause a problem won't it as helgrind will see
> accesses in different threads with no locks? Because the locks are
> on the queue passing...
>
> I saw something similar with helgrind reporting problems for a
> database connection pool that was actually quite safe as a given
> connection was only ever used by one thread but as the locks were
> only held while connections were added to/removed from the shared
> pool it looked like the connection was being used in multiple
> threads without any locking.
>
> Tom

The queue is a python list. Python guarantees thread safety for lists. But even if helgrind did not "see" the python locks, it would report a problem in python and not in our C++ code.

I mentioned the queue only because I wanted to explain how the threads are related: they only share shared libraries and the queue. Still helgrind reports some errors.

Just an idea: does helgrind support realloc? (Probably yes.)
And could helgrind have some problems with C++ "placement new"? (Probably not. Why should it?)

Vojtech
|
|
From: Julian S. <js...@ac...> - 2008-03-27 15:59:25
|
> I mentioned the queue only because I wanted to explain that how the threads
> are related: they only share shared libraries and the queue. Still helgrind
> reports some errors.

Which Helgrind version is this?
- 3.3.0?
- svn trunk?
- svn HGDEV branch?

> Just an idea: does helgrind support realloc? (Probably yes.)

Yes. But does it support realloc correctly? I don't know. What possible problem are you thinking of, with realloc?

> And could helgrind have some problems with C++ "placement new"?

What is "placement new"? I never heard of it (but I don't know much C++).

Is it possible you can make a small demonstration program that shows the problem?

J
|
|
From: Konstantin S. <kon...@gm...> - 2008-03-27 16:05:48
|
> > And could helgrind have some problems with C++ "placement new"?
>
> What is "placement new" ? I never heard of it (but I don't know
> much C++).

'Placement new' is constructing objects with constructors in a given piece of memory (not necessarily taken directly from new or malloc).

Vojtech, how do you use placement new?
If you use it to implement memory recycling, you will need to annotate the code with VG_USERREQ__HG_CLEAN_MEMORY.

--kcc
|
|
From: Tom H. <to...@co...> - 2008-03-27 16:06:45
|
In message <200...@ac...>
Julian Seward <js...@ac...> wrote:
>> And could helgrind have some problems with C++ "placement new"?
>
> What is "placement new" ? I never heard of it (but I don't know
> much C++).
It's a new that doesn't do any memory allocation - you give it an
address and it constructs an object in that piece of memory.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
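To make the definition above concrete, here is a small sketch of placement new together with the matching explicit destructor call. Item, constructInBuffer and storageIsRecycled are illustrative names only. The second function shows the recycling case discussed in this thread: two logically unrelated objects occupy the same address one after the other, with no call to new, malloc, delete or free in between for a tool to observe.

```cpp
#include <new>      // declares placement operator new
#include <string>

struct Item {
    std::string label;
    explicit Item(const std::string& l) : label(l) {}
};

// Constructs an Item in caller-provided storage, copies its label out,
// then destroys it. The Item object itself is never heap-allocated.
std::string constructInBuffer(const std::string& text) {
    alignas(Item) unsigned char storage[sizeof(Item)];

    Item* p = new (storage) Item(text);   // placement new: constructor only
    std::string copy = p->label;
    p->~Item();                           // explicit destructor: no deallocation
    return copy;
}

// Builds two Items one after another in the same storage and reports
// whether both lived at the same address.
bool storageIsRecycled() {
    alignas(Item) unsigned char storage[sizeof(Item)];

    Item* first = new (storage) Item("first");
    void* firstAddress = first;
    first->~Item();

    Item* second = new (storage) Item("second");
    void* secondAddress = second;
    second->~Item();

    return firstAddress == secondAddress;
}
```

If the two objects in storageIsRecycled() were used by different threads, a tool that tracks state per address would see one location touched by both threads unless it is told that the memory was reset in between.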
|
From: Vojtech F. <voj...@se...> - 2008-03-27 16:14:45
|
> > I mentioned the queue only because I wanted to explain that how the threads
> > are related: they only share shared libraries and the queue. Still helgrind
> > reports some errors.
>
> Which Helgrind version is this?
> - 3.3.0?
> - svn trunk?
> - svn HGDEV branch?

It's 3.3.0. (See the top post of this thread :) )

> > Just an idea: does helgrind support realloc? (Probably yes.)
>
> Yes. But does it support realloc correctly? I don't know. What
> possible problem are you thinking of, with realloc?

When shrinking the size, helgrind should handle the shrunk part as deleted (freed), and when expanding, the new part should be handled as new (malloced).

Vojtech
|
|
From: Vojtech F. <voj...@se...> - 2008-03-27 16:20:27
|
> 'Placement new' is constructing objects with constructors in a given
> piece of memory (not necessary taken directly from new or malloc).
> Vojtech, how do you use placement new?
> If you use it to implement memory recycling, you will need to annotate
> the code with VG_USERREQ__HG_CLEAN_MEMORY.
>
> --kcc

I have noticed it is used in STLPort. I mentioned it only because it is not a widely used C++ "idiom". But I do not see any direct way it could cause problems for helgrind.

I do not know STLPort internals well. It is hard to read.

Yes, memory recycling is my suspect, but it should be turned off because we use only the standard allocator (_STLP_USE_NEWALLOC).

It would be quite difficult for me to inject any code into STLPort.

Vojtech
|
|
From: Konstantin S. <kon...@gm...> - 2008-03-28 06:20:42
|
Vojtech,

Could you please try HGDEV branch?

svn co svn://svn.valgrind.org/valgrind/branches/HGDEV valgrind && \
cd valgrind && \
./autogen.sh && \
./configure --prefix=`pwd`/Inst && \
make && \
make install

--kcc
|
|
From: Bart V. A. <bar...@gm...> - 2008-03-28 08:54:40
|
On Thu, Mar 27, 2008 at 5:20 PM, Vojtech Fried <voj...@se...> wrote:
> > 'Placement new' is constructing objects with constructors in a given
> > piece of memory (not necessary taken directly from new or malloc).
> > Vojtech, how do you use placement new?
> > If you use it to implement memory recycling, you will need to annotate
> > the code with VG_USERREQ__HG_CLEAN_MEMORY.
>
> I have noticed it is used in STLPort. I mentioned it only because it is not
> widely used C++ "idiom". But I do not see any direct way it could cause problems
> to helgrind.
> I do not know STLPort internals well. It is hard to read.
> Yes, memory recycling is my suspect but it should be turned off because we use
> only standard allocator (_STLP_USE_NEWALLOC).
> It would be quite difficult for me to inject any code into STLPort.
Placement new should not require any additional client requests as
long as all C++ objects that have POSIX synchronization objects as
members explicitly clean up these (by calling pthread_mutex_destroy(),
pthread_cond_destroy() or pthread_rwlock_destroy()).
As a sidenote, some C++ developers use the following shortcut to
implement the assignment operator for a class with name C:
C& C::operator=(const C& RHS)
{
  if (this != &RHS)
  {
    /* Explicit destructor call: calls the destructor but does not
       deallocate memory. */
    this->~C();
    /* Placement new: calls the copy constructor but does not allocate
       memory. */
    new(this) C(RHS);
  }
  return *this;
}
Bart.
|
|
From: Vojtech F. <voj...@se...> - 2008-03-28 09:48:59
|
> Vojtech,
>
> Could you please try HGDEV branch?
>
> svn co svn://svn.valgrind.org/valgrind/branches/HGDEV valgrind && \
> cd valgrind && \
> ./autogen.sh && \
> ./configure --prefix=`pwd`/Inst && \
> make && \
> make install
>
> --kcc

I tried, and the result is ... quite different. A problem showed up with the Python allocator and memory recycling in Python. I will try to disable it and see what happens.

There are still errors that seem to be related to STLPort:

T4: Possible data race during write of size 4 at 0xB843010
   at 0x766FE0E: void _STL::_Construct<...>(...*, ... const&) (_construct.h:97)
   by 0x766C1BF: _STL::vector<..., _STL::allocator<...> >::push_back(... const&) (_vector.h:333)
   by 0x766A6C4: ...our code... (layout_buffer_impl.cxx:473)
  old state: a0083ed000000000 R #SS=1 #LS=0 S33773/T2
  new state: c0094eb000000000 W #SS=2 #LS=0 S33746/T4 S33773/T2
  Location 0xB843010 has never been protected by any lock
  Location 0xB843010 is 432 bytes inside a block of size 512 alloc'd
   at 0x4023099: operator new(unsigned) (vg_replace_malloc.c:224)
   by 0x5AD3BA9: _STL::__stl_new(unsigned) (_new.h:86)
   by 0x5AC31DD: _STL::__new_alloc::allocate(unsigned) (_alloc.h:134)
   by 0x766FFCF: _STL::allocator<...>::allocate(unsigned, void const*) (_alloc.h:355)
   by 0x766FE89: _STL::vector<..., _STL::allocator<...> >::_M_insert_overflow(...*, ... const&, _STL::__false_type const&, unsigned, bool) (_vector.h:130)
   by 0x766C1E7: _STL::vector<..., _STL::allocator<...> >::push_back(... const&) (_vector.h:337)
   by 0x766A6C4: ...our code... (layout_buffer_impl.cxx:473)

And no, there is no global data at layout_buffer_impl.cxx:473.

By the way: how shall I interpret "new state: c0094eb000000000 W #SS=2 #LS=0 S33746/T4 S33773/T2"?

And there were also some warnings:

parse_type_DIE: confused by:
 <2><149be1>: DW_TAG_structure_type
     DW_AT_abstract_ori: <149a77>

--28862-- WARNING: Serious error when reading debug info
--28862-- When reading debug info from .../libpython2.5.so.1.0:
--28862-- parse_type_DIE: confused by the above DIE
--28862-- WARNING: Serious error when reading debug info
--28862-- When reading debug info from .../libicudata.so.36.0:
--28862-- Can't make sense of .text section mapping

Vojtech
|
|
From: Julian S. <js...@ac...> - 2008-03-28 10:20:46
|
> And there were also some warning:
> parse_type_DIE: confused by:
> <2><149be1>: DW_TAG_structure_type
> DW_AT_abstract_ori: <149a77>
>
> --28862-- WARNING: Serious error when reading debug info
> --28862-- When reading debug info from .../libpython2.5.so.1.0:
> --28862-- parse_type_DIE: confused by the above DIE
> --28862-- WARNING: Serious error when reading debug info
> --28862-- When reading debug info from .../libicudata.so.36.0:
> --28862-- Can't make sense of .text section mapping

These are problems in the post-3.3.0 debuginfo reader rework. Pls can you send me bzip2'd copies of these two .so's. Thanks.

J
|
|
From: Bart V. A. <bar...@gm...> - 2008-03-28 12:11:30
|
On Fri, Mar 28, 2008 at 10:48 AM, Vojtech Fried <voj...@se...> wrote:
> There are still errors that seem to be related with STLPort
> T4: Possible data race during write of size 4 at 0xB843010
> at 0x766FE0E: void _STL::_Construct<...>(...*, ... const&) (_construct.h:97)
> by 0x766C1BF: _STL::vector<..., _STL::allocator<...> >::
> push_back(... const&) (_vector.h:333)
> by 0x766A6C4: ...our code... (layout_buffer_impl.cxx:473)

The class std::vector<> of the STL (and most likely _STL::vector<> in STLPort too) uses a custom memory allocator. This is well explained in every book about C++ in general or the STL in particular. There is a chapter in Valgrind's manual about which client requests have to be inserted in client code such that memcheck can properly recognize memory pools. But what is not clear to me is whether the approach explained in memcheck's manual only works for memcheck, or whether these client requests should also be supported by other tools like helgrind or exp-drd?

See also http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools

Bart.
|
|
From: Vojtech F. <voj...@se...> - 2008-03-28 12:29:13
|
> The class std::vector<> of the STL (and most likely _STL::vector<> in
> STLPort too) uses a custom memory allocator. This is well explained in
> every book about C++ in general or the STL in particular.

We use STLPort configured with "_STLP_USE_NEWALLOC", which means that the allocator should not use memory pools and stuff like that. At least I think so.

Vojtech
|
|
From: Bart V. A. <bar...@gm...> - 2008-03-28 12:39:31
|
On Fri, Mar 28, 2008 at 1:29 PM, Vojtech Fried <voj...@se...> wrote:
> > The class std::vector<> of the STL (and most likely _STL::vector<> in
> > STLPort too) uses a custom memory allocator. This is well explained in
> > every book about C++ in general or the STL in particular.
>
> We use STLPort configured with "_STLP_USE_NEWALLOC", which means that the
> allocator should not use memory pools and stuff like that. At least I think so.
Vectors are special: I'm not sure "_STLP_USE_NEWALLOC" catches
vector<> resizing. Consider e.g. the following code:
#include <stdio.h>
#include <vector>

struct C { int i; };
int main()
{
  std::vector<C> V;
  V.resize(2);
  V[1].i = 1;
  V.resize(1);
  V.resize(2);
  printf("V[1].i = %d\n", V[1].i);
  return 0;
}
Since std::vector<> does not reallocate the vector when resizing it
from two to one elements, memcheck doesn't complain that the value
read by the printf() call is undefined.
Bart.
|
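The buffer-retention behaviour referred to above can be checked directly. The sketch below is illustrative (ShrinkResult and shrinkVector are made-up names) and uses std::vector rather than STLPort's _STL::vector, on the assumption that both keep their buffer when shrinking. It shrinks a vector and reports whether the storage stayed in place, which is why the memory behind a "removed" element is never returned to the allocator at that point.

```cpp
#include <cstddef>
#include <vector>

struct ShrinkResult {
    std::size_t capacityBefore;
    std::size_t capacityAfter;
    bool sameStorage;
};

// Shrinks a two-element vector to one element and records what happened
// to its underlying buffer.
ShrinkResult shrinkVector() {
    std::vector<int> v(2, 7);
    const int* before = &v[0];

    ShrinkResult r;
    r.capacityBefore = v.capacity();
    v.resize(1);                        // drops the last element, keeps the buffer
    r.capacityAfter = v.capacity();
    r.sameStorage = (&v[0] == before);  // no reallocation took place
    return r;
}
```

Because no deallocation happens on shrink, a later resize or push_back reuses the same bytes without any allocator call that a tool could hook.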
|
From: Konstantin S. <kon...@gm...> - 2008-03-28 12:40:51
|
Hmmm.
Another wild guess: maybe you use vector::swap, which swaps internal pointers...
If properly synchronized, it is safe, but Helgrind-unfriendly.
(A small test will help so much!!)

Julian, when we print

  Location 0xB843010 is 432 bytes inside a block of size 512 alloc'd
   at 0x4023099: operator new(unsigned) (vg_replace_malloc.c:224)
   by 0x5AD3BA9: _STL::__stl_new(unsigned) (_new.h:86)
   ...

can we also print the thread and the segment where it happened?

--kcc
|
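What "swaps internal pointers" means in practice can be shown with a short sketch. SwapResult and swapBuffers are hypothetical names, and std::vector is used here on the assumption that STLPort's vector behaves the same way. After the swap each vector owns the buffer the other one allocated, so memory allocated and filled in one place can end up being used and freed somewhere else without a single element being copied.

```cpp
#include <vector>

struct SwapResult {
    bool aTookBuffersOfB;
    bool bTookBuffersOfA;
};

// Swaps two vectors and checks that they simply traded buffers.
SwapResult swapBuffers() {
    std::vector<int> a(4, 1);
    std::vector<int> b(8, 2);
    const int* aData = &a[0];
    const int* bData = &b[0];

    a.swap(b);   // exchanges the internal pointers; no element is copied

    SwapResult r;
    r.aTookBuffersOfB = (&a[0] == bData);
    r.bTookBuffersOfA = (&b[0] == aData);
    return r;
}
```

Both flags come back true. If the two vectors belong to different threads and the swap is guarded by a lock, the program is correct, but the buffers themselves change hands without any lock being held while their contents are accessed.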
|
From: Vojtech F. <voj...@se...> - 2008-03-28 13:25:10
|
> Vectors are special: I'm not sure "_STLP_USE_NEWALLOC" catches
> vector<> resizing. Consider e.g. the following code:
>
> struct C { int i; };
> int main()
> {
> std::vector<C> V;
> V.resize(2);
> V[1].i = 1;
> V.resize(1);
> V.resize(2);
> printf("V[1].i = %d\n", V[1].i);
> return 0;
> }
>
> Since std::vector<> does not reallocate the vector when resizing it
> from two to one elements, memcheck doesn't complain that the value
> read by the printf() call is undefined.
>
> Bart.
You are right. It is more tricky than I thought.
But if the vector is accessed only from one thread it should not be a problem
for helgrind.
Probably helgrind does not see that the memory the vector occupies is eventually
deleted or STLPort recycles memory even with "new_alloc".
I don't know. I will have to investigate it further.
Vojtech
|
|
From: Vojtech F. <voj...@se...> - 2008-03-28 13:35:04
|
> (A small test will help so much!!)

It seems to me that preparing a small test that replicates the problem is as difficult as finding the problem :)

Vojtech
|