From: Bart V. A. <bar...@gm...> - 2009-02-07 14:51:49
|
On Fri, Feb 6, 2009 at 8:19 PM, Julian Seward <js...@ac...> wrote:
> If you can suggest some criteria that allows to distinguish the case you
> consider an error, from a "safe" destruction of a barrier, that would be
> very helpful. But given that the POSIX spec is basically broken, I don't
> see how it would be possible to construct such a criteria.

How about comparing the vector clocks of the most recent barrier_wait() calls with the vector clock of the thread destroying the barrier? This should make it possible to find out whether the barrier_wait() calls and the call that destroys the barrier (a barrier_destroy() call that destroys it explicitly, or any free() call that destroys it implicitly) were ordered via a synchronization operation.

Bart. |
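The comparison proposed above can be sketched in C. This is only an illustration under stated assumptions, not DRD's or Helgrind's actual implementation: the `vclock` type, the fixed `MAX_THREADS` size, and the function names are hypothetical, and the sketch assumes the tool records one vector clock per thread at its most recent barrier_wait() call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_THREADS 4 /* hypothetical fixed thread count for this sketch */

/* One logical clock component per thread. */
typedef struct {
    unsigned clk[MAX_THREADS];
} vclock;

/* a happens-before-or-equals b iff every component of a is <= b's. */
bool vc_leq(const vclock *a, const vclock *b) {
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < MAX_THREADS; i++) {
        if (a->clk[i] > b->clk[i])
            return false;
    }
    return true;
}

/* The destroy is ordered after every recorded wait only if each wait's
 * clock is <= the destroying thread's clock at the time of the destroy. */
bool destroy_ordered_after_waits(const vclock *waits, size_t n_waits,
                                 const vclock *destroyer) {
    for (size_t i = 0; i < n_waits; i++) {
        if (!vc_leq(&waits[i], destroyer))
            return false; /* unordered: candidate for an error report */
    }
    return true;
}
```

A destroying thread that has joined (or otherwise synchronised with) every waiter would carry a clock that dominates all of the wait clocks; one that merely left the barrier first would not.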
|
From: Julian S. <js...@ac...> - 2009-02-06 19:20:37
|
> Thank you. So is the correct reading that the first block gives the
> stack trace of the call that triggered the error, and the second block
> gives the stack trace of the earlier code (whose execution is now
> complete) that freed the memory?

Yes.

J |
|
From: Julian S. <js...@ac...> - 2009-02-06 19:19:37
|
Christoph,

I understand, by reading the thread that Tom Fogal refers to ..

> http://groups.google.com/group/comp.programming.threads/browse_thread/thread/4f65535d6192aa50/a5f4bf1e3b437c4d?lnk=st&q=#a5f4bf1e3b437c4d

.. that the POSIX pthreads standard is in a way broken: you cannot know when you are the last thread to leave the barrier, and so there is no safe way to destroy the barrier without using yet another synchronisation operation to somehow guarantee that all the threads really have left the barrier.

That is a correct understanding, yes?

Now it is indeed the case that both Helgrind and DRD do report destruction of a barrier which has waiting threads. You can easily verify this using the regression test case helgrind/tests/bad_bar.c.

However, in your example, all threads are considered by Helgrind and DRD to have left the barrier before you destroy it. Hence no error is reported.

If you can suggest a criterion that distinguishes the case you consider an error from a "safe" destruction of a barrier, that would be very helpful. But given that the POSIX spec is basically broken, I don't see how it would be possible to construct such a criterion.

J

On Thursday 05 February 2009, Christoph Bartoschek wrote:
> On Thursday, 5 February 2009, tom fogal wrote:
> > I'm getting a bit off topic, but ..
> >
> > Perhaps I'm just not understanding the linked-to discussion, but given
> > this interpretation -- how could one ever delete a barrier?
> >
> > It sounds like the only safe way to destroy the barrier is if you've
> > joined every thread which could have possibly used it. Given that
> > constraint, I'm not sure how real world software could reasonably deal
> > with this.
> >
> > So is the idea essentially that we might as well forget about
> > destroying barriers? What am I missing?
>
> 1. There is the proposal to fix the standard by allowing the thread that
> gets the return value of PTHREAD_BARRIER_SERIAL_THREAD to destroy the
> barrier.
>
> 2. You can delete the barrier as soon as you know that all threads left the
> call to pthread_barrier_wait(). This can be done by other synchronisation
> primitives like another barrier, a lock, a condvar or a join.
>
> Christoph |
|
From: Tom H. <to...@co...> - 2009-02-06 19:18:34
|
Ross Boylan wrote:
> On Fri, 2009-02-06 at 19:01 +0000, Tom Hughes wrote:
>> Ross Boylan wrote:
>>> On Fri, 2009-02-06 at 19:47 +0100, Julian Seward wrote:
>>>>> invalid read. However, there is no indication of why it is invalid.
>>>> Um, it says it is invalid because you are reading freed memory:
>>> I believe those are 2 separate error reports. The addresses are
>>> different, and the indentation seems to indicate they are distinct
>>> items. Am I misreading?
>> Yes - there is an instruction at address 0x40904A which is reading
>> memory at address 0x5588760 that was previously freed.
>>
>> Tom
> Thank you. So is the correct reading that the first block gives the
> stack trace of the call that triggered the error, and the second block
> gives the stack trace of the earlier code (whose execution is now
> complete) that freed the memory?

Exactly.

Tom

--
Tom Hughes (to...@co...)
http://www.compton.nu/ |
|
From: Ross B. <ro...@bi...> - 2009-02-06 19:12:42
|
On Fri, 2009-02-06 at 19:01 +0000, Tom Hughes wrote:
> Ross Boylan wrote:
> > On Fri, 2009-02-06 at 19:47 +0100, Julian Seward wrote:
> >>> invalid read. However, there is no indication of why it is invalid.
> >> Um, it says it is invalid because you are reading freed memory:
> > I believe those are 2 separate error reports. The addresses are
> > different, and the indentation seems to indicate they are distinct
> > items. Am I misreading?
>
> Yes - there is an instruction at address 0x40904A which is reading
> memory at address 0x5588760 that was previously freed.
>
> Tom

Thank you. So is the correct reading that the first block gives the stack trace of the call that triggered the error, and the second block gives the stack trace of the earlier code (whose execution is now complete) that freed the memory?

Ross

> >>> ==27712== Invalid read of size 8
> >>> ==27712== at 0x40904A: std::valarray<int>::size() const
> >>> (valarray:772)
> >> [...]
> >>> ==27712== Address 0x5588760 is 0 bytes inside a block of size 104
> >>> free'd
> >>> ==27712== at 0x4A1B17F: operator delete(void*)
> >>> (vg_replace_malloc.c:244)
> >>> ==27712== by 0x4968F1: data() (Data_test.cc:123)
> >>> ==27712== by 0x43903E: boost::unit_test::ut_detail::unused
> >>> boost::unit_test::ut_detail::invoker<boost::unit_test::ut_detail::unused>::
> >>> invoke<void (*)()>(void (*)()&) (callback.hpp:56)
> >> [...]
> >>
> >> J |
|
From: Julian S. <js...@ac...> - 2009-02-06 19:08:09
|
On Friday 06 February 2009, Ross Boylan wrote:
> On Fri, 2009-02-06 at 19:47 +0100, Julian Seward wrote:
> > > invalid read. However, there is no indication of why it is invalid.
> >
> > Um, it says it is invalid because you are reading freed memory:
>
> I believe those are 2 separate error reports. The addresses are
> different, and the indentation seems to indicate they are distinct
> items.

What you have here is one single error report, with two stacks. It's unfortunate that Valgrind's text output is sometimes so verbose that it's hard to see the boundaries of each error. Errors are separated by a line containing only "==PID==" and nothing else, and you'll find there are no such lines in what you pasted in.

One solution is to use the Valkyrie GUI to display the errors (available at http://valgrind.org/downloads). This makes it much easier to see where the error boundaries are, and also much easier to navigate the Memcheck output for large programs.

J |
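The separator rule described above (a line containing only "==PID==") can be sketched in C as a small log splitter. This is an illustration, not part of Valgrind: the function names are hypothetical, and the sketch counts every run of non-separator lines as one block, so a start-up banner or interleaved program output would be counted as well.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the line [s, s+len) is exactly "==<digits>==" plus optional
 * trailing whitespace, i.e. the separator between error reports. */
bool is_separator(const char *s, size_t len) {
    size_t i = 2;
    if (len < 5 || s[0] != '=' || s[1] != '=')
        return false;
    if (!isdigit((unsigned char)s[i]))
        return false;
    while (i < len && isdigit((unsigned char)s[i]))
        i++;
    if (i + 1 >= len || s[i] != '=' || s[i + 1] != '=')
        return false;
    i += 2;
    while (i < len && isspace((unsigned char)s[i]))
        i++;
    return i == len;
}

/* Counts runs of non-empty, non-separator lines in a log held in memory. */
size_t count_blocks(const char *log) {
    size_t blocks = 0;
    bool in_block = false;
    assert(log != NULL);
    while (*log != '\0') {
        const char *nl = strchr(log, '\n');
        size_t len = nl ? (size_t)(nl - log) : strlen(log);
        if (is_separator(log, len)) {
            in_block = false;      /* a bare "==PID==" line ends the block */
        } else if (len > 0 && !in_block) {
            in_block = true;       /* first line of a new block */
            blocks++;
        }
        log += len + (nl ? 1 : 0);
    }
    return blocks;
}
```

Applied to the excerpt quoted in this thread, the "Invalid read" stack and the "Address ... free'd" stack fall into the same block because no bare "==PID==" line sits between them.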
|
From: Tom H. <to...@co...> - 2009-02-06 19:02:19
|
Ross Boylan wrote:
> On Fri, 2009-02-06 at 19:47 +0100, Julian Seward wrote:
>>> invalid read. However, there is no indication of why it is invalid.
>> Um, it says it is invalid because you are reading freed memory:
> I believe those are 2 separate error reports. The addresses are
> different, and the indentation seems to indicate they are distinct
> items. Am I misreading?

Yes - there is an instruction at address 0x40904A which is reading memory at address 0x5588760 that was previously freed.

Tom

>>> ==27712== Invalid read of size 8
>>> ==27712== at 0x40904A: std::valarray<int>::size() const
>>> (valarray:772)
>> [...]
>>> ==27712== Address 0x5588760 is 0 bytes inside a block of size 104
>>> free'd
>>> ==27712== at 0x4A1B17F: operator delete(void*)
>>> (vg_replace_malloc.c:244)
>>> ==27712== by 0x4968F1: data() (Data_test.cc:123)
>>> ==27712== by 0x43903E: boost::unit_test::ut_detail::unused
>>> boost::unit_test::ut_detail::invoker<boost::unit_test::ut_detail::unused>::
>>> invoke<void (*)()>(void (*)()&) (callback.hpp:56)
>> [...]
>>
>> J

--
Tom Hughes (to...@co...)
http://www.compton.nu/ |
|
From: Michael P. <md...@tr...> - 2009-02-06 19:01:20
|
Ross Boylan writes:

> ==27712==
> Running 81 test cases...
> ==27712== Invalid read of size 8
> ==27712== at 0x40904A: std::valarray<int>::size() const
> (valarray:772)
[snip]
> ==27712== Address 0x5588760 is 0 bytes inside a block of size 104
> free'd
> ==27712== at 0x4A1B17F: operator delete(void*)
> (vg_replace_malloc.c:244)

"Address [] inside a block of size [] free'd" indicates this is a read-after-free bug: The valarray<int> was delete'd before its size() method was called.

Presumably the glibc error is because you also write to some of the released memory -- memory that became part of the heap's free-list structure.

Michael Poole |
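The read-after-free pattern described above can be illustrated with a minimal C sketch. The original code is C++ and is not shown in the thread, so the `dataset` type and both function names here are hypothetical stand-ins; the buggy ordering appears only in a comment so that the block itself stays well-defined.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the heap object named in the report. */
typedef struct {
    size_t n_obs;
} dataset;

dataset *dataset_new(size_t n_obs) {
    dataset *d = malloc(sizeof *d);
    assert(d != NULL);
    d->n_obs = n_obs;
    return d;
}

/* Buggy ordering (not executed here):
 *
 *     free(d);
 *     return d->n_obs;   // read of a block that was already freed
 *
 * Memcheck would report the read as invalid and show a second stack
 * for the free(). The corrected ordering does the last read first. */
size_t last_use_then_free(dataset *d) {
    size_t n = d->n_obs; /* final read while the block is still live */
    free(d);             /* release only after the final use */
    return n;
}
```

In the report from this thread, the free is at Data_test.cc:123 and the later use at Data_test.cc:129, which matches the buggy ordering shown in the comment.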
|
From: Ross B. <ro...@bi...> - 2009-02-06 18:53:57
|
On Fri, 2009-02-06 at 19:47 +0100, Julian Seward wrote:
> > invalid read. However, there is no indication of why it is invalid.
>
> Um, it says it is invalid because you are reading freed memory:

I believe those are 2 separate error reports. The addresses are different, and the indentation seems to indicate they are distinct items. Am I misreading?

> > ==27712== Invalid read of size 8
> > ==27712== at 0x40904A: std::valarray<int>::size() const
> > (valarray:772)
> [...]
> > ==27712== Address 0x5588760 is 0 bytes inside a block of size 104
> > free'd
> > ==27712== at 0x4A1B17F: operator delete(void*)
> > (vg_replace_malloc.c:244)
> > ==27712== by 0x4968F1: data() (Data_test.cc:123)
> > ==27712== by 0x43903E: boost::unit_test::ut_detail::unused
> > boost::unit_test::ut_detail::invoker<boost::unit_test::ut_detail::unused>::
> > invoke<void (*)()>(void (*)()&) (callback.hpp:56)
> [...]
>
> J

--
Ross Boylan                                      wk:  (415) 514-8146
185 Berry St #5700                               ro...@bi...
Dept of Epidemiology and Biostatistics           fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739                     hm:  (415) 550-1062 |
|
From: Julian S. <js...@ac...> - 2009-02-06 18:47:03
|
> invalid read. However, there is no indication of why it is invalid.

Um, it says it is invalid because you are reading freed memory:

> ==27712== Invalid read of size 8
> ==27712== at 0x40904A: std::valarray<int>::size() const
> (valarray:772)
[...]
> ==27712== Address 0x5588760 is 0 bytes inside a block of size 104
> free'd
> ==27712== at 0x4A1B17F: operator delete(void*)
> (vg_replace_malloc.c:244)
> ==27712== by 0x4968F1: data() (Data_test.cc:123)
> ==27712== by 0x43903E: boost::unit_test::ut_detail::unused
> boost::unit_test::ut_detail::invoker<boost::unit_test::ut_detail::unused>::
> invoke<void (*)()>(void (*)()&) (callback.hpp:56)
[...]

J |
|
From: Ross B. <ro...@bi...> - 2009-02-06 18:34:36
|
When I run my program under valgrind the first reported error is an invalid read. However, there is no indication of why it is invalid.

The same code has worked OK under 32 bit linux (AMD Athlon) and OS-X (64 bit powerPC, though OS may have been mostly 32 bit). I am now running under 64 bit Xeon. Under gdb, the memory looks OK.

First, could this be an alignment issue? That is, can valgrind detect such problems, is the error report consistent with such a problem, and could it actually be a problem. The web seems to indicate that alignment is a performance issue only, but I'm not sure I've found the right docs.

Second, could this be some kind of race problem? I notice the multi-threaded boost code is running the tests.

Third, any other clues about what this might be, or what it definitely isn't, or what to do would be great. I am assuming it is not an "accessing memory already freed" problem, because the report does not say that.

The program was compiled -O0 -d, and is mostly in C++.

Here are the first couple of errors; there were lots more. The early errors appear not to be fatal, though a later one is.

t$ valgrind --leak-check=yes ./test1 --report_level=detailed /home/ross/mspath/src/test/inputs
==27712== Memcheck, a memory error detector.
==27712== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==27712== Using LibVEX rev 1658, a library for dynamic binary translation.
==27712== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==27712== Using valgrind-3.2.1-Debian, a dynamic binary instrumentation framework.
==27712== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==27712== For more details, rerun with: -v
==27712==
Running 81 test cases...
==27712== Invalid read of size 8
==27712==    at 0x40904A: std::valarray<int>::size() const (valarray:772)
==27712==    by 0x409064: mspath::Data::nObs() const (mspath.web:6990)
==27712==    by 0x408B03: mspath::DataIterator::advanceToNext(unsigned long) (mspath.web:9353)
==27712==    by 0x408C32: mspath::DataIterator::next() (mspath.web:9345)
==27712==    by 0x496946: data() (Data_test.cc:129)
==27712==    by 0x43903E: boost::unit_test::ut_detail::unused boost::unit_test::ut_detail::invoker<boost::unit_test::ut_detail::unused>::invoke<void (*)()>(void (*)()&) (callback.hpp:56)
==27712==    by 0x439062: boost::unit_test::ut_detail::callback0_impl_t<boost::unit_test::ut_detail::unused, void (*)()>::invoke() (callback.hpp:89)
==27712==    by 0x4B40FE0: boost::unit_test::ut_detail::callback0_impl_t<int, boost::unit_test::(anonymous namespace)::zero_return_wrapper>::invoke() (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B3343C: boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&, bool, int) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B3353F: boost::execution_monitor::execute(boost::unit_test::callback0<int> const&, bool, int) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B40DB1: boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::unit_test::test_case const&) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B360C9: boost::unit_test::framework_impl::visit(boost::unit_test::test_case const&) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==  Address 0x5588760 is 0 bytes inside a block of size 104 free'd
==27712==    at 0x4A1B17F: operator delete(void*) (vg_replace_malloc.c:244)
==27712==    by 0x4968F1: data() (Data_test.cc:123)
==27712==    by 0x43903E: boost::unit_test::ut_detail::unused boost::unit_test::ut_detail::invoker<boost::unit_test::ut_detail::unused>::invoke<void (*)()>(void (*)()&) (callback.hpp:56)
==27712==    by 0x439062: boost::unit_test::ut_detail::callback0_impl_t<boost::unit_test::ut_detail::unused, void (*)()>::invoke() (callback.hpp:89)
==27712==    by 0x4B40FE0: boost::unit_test::ut_detail::callback0_impl_t<int, boost::unit_test::(anonymous namespace)::zero_return_wrapper>::invoke() (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B3343C: boost::execution_monitor::catch_signals(boost::unit_test::callback0<int> const&, bool, int) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B3353F: boost::execution_monitor::execute(boost::unit_test::callback0<int> const&, bool, int) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B40DB1: boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::unit_test::test_case const&) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B360C9: boost::unit_test::framework_impl::visit(boost::unit_test::test_case const&) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B47F1A: boost::unit_test::traverse_test_tree(boost::unit_test::test_suite const&, boost::unit_test::test_tree_visitor&) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B352D7: boost::unit_test::framework::run(unsigned long, bool) (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
==27712==    by 0x4B40B34: main (in /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1)
/home/ross/mspath/src/test/Data_test.cc(129): error in "data": check pdi4->next() == true failed [0 != 1]
==27712==

The fatal error:

*** glibc detected *** corrupted double-linked list: 0x00000000005fd2a0 ***

Program received signal SIGABRT, Aborted.
0x00002affd559207b in *__GI_raise () from /usr/lib/debug/libc.so.6
Current language: auto; currently c
(gdb) where
#0 0x00002affd559207b in *__GI_raise () from /usr/lib/debug/libc.so.6
#1 0x00002affd559384e in *__GI_abort () from /usr/lib/debug/libc.so.6
#2 0x00002affd55c85f9 in __libc_message () from /usr/lib/debug/libc.so.6
#3 0x00002affd55cd9cc in malloc_consolidate () from /usr/lib/debug/libc.so.6
#4 0x00002affd55cf789 in _int_malloc () from /usr/lib/debug/libc.so.6
#5 0x00002affd55d116d in *__GI___libc_malloc () from /usr/lib/debug/libc.so.6
#6 0x00002affd518f93d in operator new () from /usr/lib/libstdc++.so.6
#7 0x00002affd516e5d1 in std::string::_Rep::_S_create () from /usr/lib/libstdc++.so.6
#8 0x00002affd516ef8b in std::string::_Rep::_M_clone () from /usr/lib/libstdc++.so.6
#9 0x00002affd516f895 in std::string::reserve () from /usr/lib/libstdc++.so.6
#10 0x00002affd51690a5 in std::basic_stringbuf<char, std::char_traits<char>, std::allocator<char> >::overflow () from /usr/lib/libstdc++.so.6
#11 0x00002affd516da8d in std::basic_streambuf<char, std::char_traits<char> >::xsputn () from /usr/lib/libstdc++.so.6
#12 0x00002affd5163512 in std::operator<< <std::char_traits<char> > () from /usr/lib/libstdc++.so.6
#13 0x00002affd4fbb175 in boost::unit_test::results_collector_t::test_unit_finish () from /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1
#14 0x00002affd4fb8180 in boost::unit_test::framework_impl::visit () from /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1
#15 0x00002affd4fc9f1b in boost::unit_test::traverse_test_tree () from /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1
#16 0x00002affd4fb72d8 in boost::unit_test::framework::run () from /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1
#17 0x00002affd4fc2b35 in main () from /usr/lib/libboost_unit_test_framework-gcc-mt-1_33_1.so.1.33.1
#18 0x00002affd557f4ca in __libc_start_main () from /usr/lib/debug/libc.so.6
#19 0x000000000040420a in _start () at ../sysdeps/x86_64/elf/start.S:113

--
Ross Boylan                                      wk:  (415) 514-8146
185 Berry St #5700                               ro...@bi...
Dept of Epidemiology and Biostatistics           fax: (415) 514-8150
University of California, San Francisco
San Francisco, CA 94107-1739                     hm:  (415) 550-1062 |
|
From: <wim...@ad...> - 2009-02-05 23:34:29
|
Hi all,

When I run valgrind like this:

valgrind --suppressions=/home/u19809/valgrind.supp --db-attach=yes --num-callers=20

and answer 'y' when asked whether to attach to the debugger, I get this:

...
==6488== by 0x5951D4B: Wrap_QTA_Run(Instance_S*, Instance_S*, AC_Variant_S*) (QtApp.c:231)
==6488== by 0x804B61E: main (main.c:714)
==6488==
==6488== ---- Attach to debugger ? --- [Return/N/n/Y/y/C/c] ---- y
==6488== starting debugger with cmd: /usr/bin/gdb -nw /proc/6614/fd/1014 6614
GNU gdb 6.8-debian
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "i486-linux-gnu"...
Attaching to program: /proc/6614/fd/1014, process 6614
Cannot access memory at address 0x448a0003
A program is being debugged already. Kill it? (y or n) n
Program not killed.
(gdb)

What is wrong?

Thx
W |
|
From: Vikas R. <min...@gm...> - 2009-02-05 21:09:37
|
Valgrind does not consider memory leaks to be errors. e.g.:

valgrind --leak-check=full --error-exitcode=1 ./a.out
==13671==
==13671== 38 (8 direct, 30 indirect) bytes in 1 blocks are definitely lost in loss record 1 of 2
==13671==    at 0x4A06019: operator new(unsigned long) (vg_replace_malloc.c:167)
==13671==    by 0x400A31: main (in /home/vikas/valgrind_tests/a.out)

myhost:myname valgrind_tests $ valgrind --leak-check=full --error-exitcode=1 ./a.out
==13672== Memcheck, a memory error detector.
==13672== Copyright (C) 2002-2006, and GNU GPL'd, by Julian Seward et al.
==13672== Using LibVEX rev 1658, a library for dynamic binary translation.
==13672== Copyright (C) 2004-2006, and GNU GPL'd, by OpenWorks LLP.
==13672== Using valgrind-3.2.1, a dynamic binary instrumentation framework.
==13672== Copyright (C) 2000-2006, and GNU GPL'd, by Julian Seward et al.
==13672== For more details, rerun with: -v
==13672==
==13672==
==13672== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 4 from 1)
==13672== malloc/free: in use at exit: 38 bytes in 2 blocks.
==13672== malloc/free: 3 allocs, 1 frees, 68 bytes allocated.
==13672== For counts of detected errors, rerun with: -v
==13672== searching for pointers to 2 not-freed blocks.
==13672== checked 162,952 bytes.
==13672==
==13672== 38 (8 direct, 30 indirect) bytes in 1 blocks are definitely lost in loss record 1 of 2
==13672==    at 0x4A06019: operator new(unsigned long) (vg_replace_malloc.c:167)
==13672==    by 0x400A31: main (in /home/vikas/valgrind_tests/a.out)
==13672==
==13672== LEAK SUMMARY:
==13672==    definitely lost: 8 bytes in 1 blocks.
==13672==    indirectly lost: 30 bytes in 1 blocks.
==13672==      possibly lost: 0 bytes in 0 blocks.
==13672==    still reachable: 0 bytes in 0 blocks.
==13672==         suppressed: 0 bytes in 0 blocks.
==13672== Reachable blocks (those to which a pointer was found) are not shown.
==13672== To see them, rerun with: --show-reachable=yes
myhost:myname valgrind_tests $ echo $?
0

As a consequence (or perhaps in addition), --db-attach does not cause valgrind to break into the debugger when a memory leak is encountered.

Is there a way to tell valgrind to treat memory leaks as errors? If not, is there a way to break into the debugger when a memory leak is encountered?

Thanks,
-Vikas |
|
From: Christoph B. <bar...@or...> - 2009-02-05 18:35:38
|
On Thursday, 5 February 2009, tom fogal wrote:
> I'm getting a bit off topic, but ..
>
> Perhaps I'm just not understanding the linked-to discussion, but given
> this interpretation -- how could one ever delete a barrier?
>
> It sounds like the only safe way to destroy the barrier is if you've
> joined every thread which could have possibly used it. Given that
> constraint, I'm not sure how real world software could reasonably deal
> with this.
>
> So is the idea essentially that we might as well forget about
> destroying barriers? What am I missing?

1. There is a proposal to fix the standard by allowing the thread that gets the return value PTHREAD_BARRIER_SERIAL_THREAD to destroy the barrier.

2. You can delete the barrier as soon as you know that all threads have left the call to pthread_barrier_wait(). This can be ensured with other synchronisation primitives such as another barrier, a lock, a condvar or a join.

Christoph |
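Option 2 above can be sketched with a join. The following C fragment is a minimal illustration, not a recommendation from the thread's authors: the function name `run_barrier_demo` and its return convention are assumptions made for this example. It joins the worker before destroying the barrier, so every thread is known to have left pthread_barrier_wait().

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Worker: waits at the barrier passed in through arg, then exits. */
static void *worker(void *arg) {
    pthread_barrier_t *barrier = arg;
    int rc = pthread_barrier_wait(barrier);
    assert(rc == 0 || rc == PTHREAD_BARRIER_SERIAL_THREAD);
    return NULL;
}

/* Hypothetical demo entry point: returns 0 on success, nonzero on failure. */
int run_barrier_demo(void) {
    pthread_t tid;
    int rc;
    pthread_barrier_t *barrier = malloc(sizeof *barrier);
    if (barrier == NULL)
        return -1;
    if (pthread_barrier_init(barrier, NULL, 2) != 0) {
        free(barrier);
        return -1;
    }
    if (pthread_create(&tid, NULL, worker, barrier) != 0) {
        pthread_barrier_destroy(barrier); /* no thread ever waited on it */
        free(barrier);
        return -1;
    }

    rc = pthread_barrier_wait(barrier);
    assert(rc == 0 || rc == PTHREAD_BARRIER_SERIAL_THREAD);

    /* Join first: once the worker has terminated, it has certainly
     * returned from pthread_barrier_wait(). */
    rc = pthread_join(tid, NULL);

    pthread_barrier_destroy(barrier);
    free(barrier);
    return rc;
}
```

The only change relative to the program that started this thread is the position of pthread_join(): it now comes before pthread_barrier_destroy() and free() instead of after them.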
|
From: tom f. <tf...@al...> - 2009-02-05 17:47:24
|
Christoph Bartoschek <bar...@or...> writes:
> On Thursday, 5 February 2009, Julian Seward wrote:
> > On Thursday 05 February 2009, Christoph Bartoschek wrote:
> > > Hi,
> > >
> > > today I had to learn that the attached program is incorrect. It is not
> > > allowed to destroy the barrier while not all threads have left the
> > > pthread_barrier_wait() call.
> > >
> > > Unfortunately neither DRD nor Helgrind warn about this error. Could you
> > > please improve the tools to detect such errors?
> >
> > Um, where is the bug in this program? To me it looks OK: the barrier
> > is not destroyed until after both parent and child have passed it.
> >
>
> For me this also looked ok till now. But the standard seems not to guarantee
> that this works.
>
> I have asked about this in comp.programming.threads:
>
> http://groups.google.com/group/comp.programming.threads/browse_thread/thread/4f65535d6192aa50/a5f4bf1e3b437c4d?lnk=st&q=#a5f4bf1e3b437c4d
>
> My explanation is that the threads need still access to the barrier after
> being woken up from the wait.
>
> When the last thread reaches the barrier all waiting threads are woken up but
> they are not yet finished with pthread_barrier_wait(). When the first thread
> leaving pthread_barrier_wait() destroys the barrier, then the other threads
> cannot perform their final tasks in pthread_barrier_wait().

I'm getting a bit off topic, but ..

Perhaps I'm just not understanding the linked-to discussion, but given this interpretation -- how could one ever delete a barrier?

It sounds like the only safe way to destroy the barrier is if you've joined every thread which could have possibly used it. Given that constraint, I'm not sure how real world software could reasonably deal with this.

So is the idea essentially that we might as well forget about destroying barriers? What am I missing?

-tom |
|
From: Christoph B. <bar...@or...> - 2009-02-05 17:35:16
|
On Thursday, 5 February 2009, Julian Seward wrote:
> On Thursday 05 February 2009, Christoph Bartoschek wrote:
> > Hi,
> >
> > today I had to learn that the attached program is incorrect. It is not
> > allowed to destroy the barrier while not all threads have left the
> > pthread_barrier_wait() call.
> >
> > Unfortunately neither DRD nor Helgrind warn about this error. Could you
> > please improve the tools to detect such errors?
>
> Um, where is the bug in this program? To me it looks OK: the barrier
> is not destroyed until after both parent and child have passed it.
>

This also looked OK to me until now, but the standard does not seem to guarantee that it works.

I have asked about this in comp.programming.threads:

http://groups.google.com/group/comp.programming.threads/browse_thread/thread/4f65535d6192aa50/a5f4bf1e3b437c4d?lnk=st&q=#a5f4bf1e3b437c4d

My explanation is that the threads still need access to the barrier after being woken up from the wait.

When the last thread reaches the barrier, all waiting threads are woken up, but they have not yet finished with pthread_barrier_wait(). If the first thread to leave pthread_barrier_wait() destroys the barrier, the other threads cannot perform their final tasks in pthread_barrier_wait().

Christoph |
|
From: Julian S. <js...@ac...> - 2009-02-05 15:42:35
|
First of all, are you able to compile and run (on Valgrind) any simple program (eg, hello world), in 64-bit mode?

J

> ==7237910== Process terminating with default action of signal 11 (SIGSEGV):
> dumping core
> ==7237910== Access not within mapped region at address 0xFFFFFFFFFFFFFFFE
> ==7237910== at 0x9FFFFFFF0004BFC: usl_relocate1 (in /usr/ccs/bin/usla64)
> ==7237910== by 0x9FFFFFFF000722B: usl_relocate (in /usr/ccs/bin/usla64)
> ==7237910== by 0x9FFFFFFF000086F: usla_main (in /usr/ccs/bin/usla64)
> ==7237910== by 0x9FFFFFFF000024B: ustart (in /usr/ccs/bin/usla64) |
|
From: Julian S. <js...@ac...> - 2009-02-05 15:34:03
|
On Thursday 05 February 2009, Christoph Bartoschek wrote:
> Hi,
>
> today I had to learn that the attached program is incorrect. It is not
> allowed to destroy the barrier while not all threads have left the
> pthread_barrier_wait() call.
>
> Unfortunately neither DRD nor Helgrind warn about this error. Could you
> please improve the tools to detect such errors?

Um, where is the bug in this program? To me it looks OK: the barrier is not destroyed until after both parent and child have passed it.

J |
|
From: Christoph B. <bar...@or...> - 2009-02-05 15:26:51
|
Hi,

today I had to learn that the attached program is incorrect. It is not allowed to destroy the barrier while not all threads have left the pthread_barrier_wait() call.

Unfortunately neither DRD nor Helgrind warn about this error. Could you please improve the tools to detect such errors?

Christoph

#include <pthread.h>
#include <stdlib.h>

pthread_barrier_t * barrier;

void * thread(void * arg) {
    pthread_barrier_wait(barrier);
    return NULL;
}

int main() {
    pthread_t tid;
    barrier = (pthread_barrier_t *) malloc(sizeof(*barrier));
    pthread_barrier_init(barrier, NULL, 2);
    pthread_create(&tid, NULL, thread, NULL);
    pthread_barrier_wait(barrier);
    pthread_barrier_destroy(barrier);
    free(barrier);
    pthread_join(tid, NULL);
    return 0;
}
|
|
From: Ashley P. <as...@pi...> - 2009-02-05 14:46:31
|
[sending again ccing the list this time]
2009/2/4 Meseret Gebre <mez...@gm...>
> Greetings,
>
> I have a simple mpi program, please find it attached.
> I am using valgrind-3.3.0
>
> I run my code with the following command:
> mpiexec -n 2 valgrind --gen-suppressions=yes ./mpitest
--gen-suppressions=all is probably closer to what you want; it won't
prompt you for each one.
> my question is about the ( --gen-suppressions=yes).
> There are too many suppression that come from MPI and
> all I want to be able to see is checks from my code alone.
Due to the nature of how code works this isn't really possible. In most
cases the errors reported by valgrind aren't errors in the bottom level of the
stack but come from further up the call chain. If your program calls an MPI
function with bogus values, it can appear that the problem is in the MPI library,
but this often isn't the case. I'm afraid you need to look at every error
and deal with it on a case-by-case basis.
> I am very new to valgrind and I have checked to the best of my ability
> and probably have missed it, but is there a way to get valgrind to just add
> all the suppression generated from the ( --gen-suppressions=yes) command
> into a local file?
You can use the --log-file=<filename> option to send the output of valgrind
to a file. The suppressions will be interspersed with the error messages, so
you'll still need to edit these files manually. For MPI jobs I'd recommend using
--log-file=val-log.%q{XXX}, where XXX is the name of an environment variable:
most MPI implementations set a variable to the process's rank, so you can name
your files according to rank if you do this. You'll need to find out what name
mvapich uses for this.
Ashley Pittman.
|
|
From: denis j. <den...@gm...> - 2009-02-05 14:21:33
|
Hello,
On a very large piece of software that we develop on AIX 5.3, I got the error below after trying to increase --main-stacksize to 133000000 (because of previous similar errors). With a much smaller program (very, very small) I get the same error. Is it because I use xlc 9.0 as the compiler?
==7237910== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==7237910== Access not within mapped region at address 0xFFFFFFFFFFFFFFFE
==7237910== at 0x9FFFFFFF0004BFC: usl_relocate1 (in /usr/ccs/bin/usla64)
==7237910== by 0x9FFFFFFF000722B: usl_relocate (in /usr/ccs/bin/usla64)
==7237910== by 0x9FFFFFFF000086F: usla_main (in /usr/ccs/bin/usla64)
==7237910== by 0x9FFFFFFF000024B: ustart (in /usr/ccs/bin/usla64)
==7237910== If you believe this happened as a result of a stack overflow in your
==7237910== program's main thread (unlikely but possible), you can try to increase
==7237910== the size of the main thread stack using the --main-stacksize= flag.
==7237910== The main thread stack size used in this run was 134045696.
I can't increase --main-stacksize any further, as valgrind suggests doing. So my questions are: is there no chance that this will work, or is there another way to make it work with valgrind?
Thanks
|
|
From: Meseret G. <mez...@gm...> - 2009-02-04 23:15:12
|
#include <mpi.h>
#include <iostream>
#include <vector>
#include <map>
using namespace std;
int main(int argc, char* argv[]) {
    int rank;
    MPI::Init(argc, argv);
    rank = MPI::COMM_WORLD.Get_rank();
    if (rank == 0) {
        // String literals are const; 10 chars covers "hello mez" plus the trailing NUL.
        const char* hello = "hello mez";
        MPI::COMM_WORLD.Send(hello, 10, MPI::CHAR, 1, 1);
    }
    else {
        MPI::Status status;
        char hello[10];
        MPI::COMM_WORLD.Recv(hello, 10, MPI::CHAR, 0, 1, status);
        cout << "Got From Root: " << hello << endl;
    } // end if
    MPI::Finalize();
    return 0;
}
|
|
From: Tom H. <to...@co...> - 2009-02-02 18:42:29
|
James wrote:
> First, when I do the configure, I get messages
>
> VG_ARCH_MAX = amd64
> VG_ARCH_PRI = amd64
>
> Primary build target: AMD64_LINUX
> Secondary build target: X86_LINUX
>
> The machine has an Intel Core2 Duo processor, not AMD. I don't see any
> options in the configure or any of the included files that would change it.
> How do I get it configured for the correct architecture?
AMD64 is just the name we use for the 64 bit x86 architecture, which was after all created by AMD. It works just fine on 64 bit Intel processors.
> /usr/lib64/gcc/x86_64-suse-linux/4.3/../../../../x86_64-suse-linux/bin/ld:
> /opt/mpich2/lib/libmpich.a(comm_rank.o): relocation R_X86_64_32 against
> `a local symbol'
> can not be used when making a shared object; recompile with -fPIC
> /opt/mpich2/lib/libmpich.a: could not read symbols: Bad value
> collect2: ld returned 1 exit status
>
> which aborts the compilation.
See Julian's email of a few hours ago - either uninstall mpicc or configure --with-mpicc=/some/path/which/does/not/exist to stop it trying to build the mpi wrappers.
Either that or you will need to find a shared library version of the mpich library as you can't link a static library into a shared library on amd64.
Tom
--
Tom Hughes (to...@co...)
http://www.compton.nu/
|
|
From: James <ja...@ch...> - 2009-02-02 18:24:30
|
Hi,
I'm trying to install the latest release of Valgrind (3.4.0) on my machine, which
has an Intel Core2 Duo processor running SUSE 11.0 (64 bit), but am getting errors
in configuration and compilation.
First, when I do the configure, I get messages
VG_ARCH_MAX = amd64
VG_ARCH_PRI = amd64
Primary build target: AMD64_LINUX
Secondary build target: X86_LINUX
The machine has an Intel Core2 Duo processor, not AMD. I don't see any options in
the configure or any of the included files that would change it. How do I get it
configured for the correct architecture?
Second, I tried to do a make anyway, on the chance that it might work for my
particular problem even with the wrong architecture, but I get a number of warnings
about casting pointer to integer of different size, then a final error
/usr/lib64/gcc/x86_64-suse-linux/4.3/../../../../x86_64-suse-linux/bin/ld:
/opt/mpich2/lib/libmpich.a(comm_rank.o): relocation R_X86_64_32 against
`a local symbol'
can not be used when making a shared object; recompile with -fPIC
/opt/mpich2/lib/libmpich.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
which aborts the compilation.
Any suggestions on how I can get Valgrind to compile & install? The problem I need
to use Valgrind on doesn't use MPI, BTW, so that part could just be skipped if
there's a way to do it.
Thanks,
James
|
|
From: Matt F. <mat...@gm...> - 2009-02-02 16:20:18
|
Thank you, that seemed to have worked.
matt
On Monday 02 February 2009, Julian Seward wrote:
> On Monday 02 February 2009, Matt Funk wrote:
> > When i do a simple:
> > ./configure --prefix=/home/mafunk/Packages/valgrind-3.4.0/BUILD
> > it configures. However it fails during the build with:
> >
> > mpicc -g -O -fno-omit-frame-pointer -Wall -fpic -shared -m64 \
> > -I../include \
> > -o libmpiwrap-AMD64_LINUX.so libmpiwrap.c
> > libmpiwrap.c: In function 'walk_type':
> > libmpiwrap.c:651: warning: cast from pointer to integer of different size
> > libmpiwrap.c:652: warning: cast from pointer to integer of different size
> > libmpiwrap.c:657: warning: cast from pointer to integer of different size
> > libmpiwrap.c:658: warning: cast from pointer to integer of different size
> > libmpiwrap.c:663: warning: cast from pointer to integer of different size
> > libmpiwrap.c:664: warning: cast from pointer to integer of different size
> > libmpiwrap.c:669: warning: cast from pointer to integer of different size
> > libmpiwrap.c:670: warning: cast from pointer to integer of different size
> > libmpiwrap.c:675: warning: cast from pointer to integer of different size
> > libmpiwrap.c:676: warning: cast from pointer to integer of different size
> > libmpiwrap.c:681: warning: cast from pointer to integer of different size
> > libmpiwrap.c:682: warning: cast from pointer to integer of different size
> > libmpiwrap.c: In function 'maybe_complete':
> > libmpiwrap.c:1306: warning: format '%p' expects type 'void *', but
> > argument 5 has type 'MPI_Request'
> > /usr/bin/ld: /usr/local/lib/libmpich.a(comm_rank.o): relocation
> > R_X86_64_32 against `a local symbol' can not be used when making a shared
> > object; recompile with -fPIC
> > /usr/local/lib/libmpich.a: could not read symbols: Bad value
> > collect2: ld returned 1 exit status
> > make[2]: *** [libmpiwrap-AMD64_LINUX.so] Error 1
> > make[2]: Leaving directory `/home/mafunk/Packages/valgrind-3.4.0/auxprogs'
> > make[1]: *** [all-recursive] Error 1
> > make[1]: Leaving directory `/home/mafunk/Packages/valgrind-3.4.0'
> > make: *** [all] Error 2
> >
> > So that is why it tried to pass some of those flags ...
>
> Yes. Known problem. Configure again
> with --with-mpicc=/some/path/that/does/not/exist,
> so that it will not try to use mpicc at all.
>
> J
|