|
From: Carl L. <ce...@us...> - 2023-04-24 21:29:35
|
Mark:
Thanks for the pointer on IRC chat to the RC2 tarball. I must have
missed the email.
I noticed the following message on a Power 10 system.
./autogen.sh
running: aclocal
running: autoheader
running: automake -a
Unescaped left brace in regex is passed through in regex; marked by <--
HERE in m/\${ <-- HERE ([^ \t=:+{}]+)}/ at /home/carll/bin/automake
line 3936.
running: autoconf
The autoconf version is:
autoconf --version
autoconf (GNU Autoconf) 2.69
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+/Autoconf: GNU GPL version 3 or later
Note the autoconf seemed to work ok. I was still able to configure and
build Valgrind and run the testsuite.
The test failures that I saw on one of the Power 10 systems were
fixed, as expected.
I see the following on both Power 10 systems.
== 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures,
0 stdoutB failures, 2 post failures ==
memcheck/tests/bug340392 (stderr)
memcheck/tests/linux/rfcomm (stderr)
massif/tests/new-cpp (post)
massif/tests/overloaded-new (post)
The results on Power 9 and Power 8 are fine as well.
Other than the autoconf message, the RC2 testing looks good on Power.
Note, the autoconf issue doesn't seem to impact the ability to
configure and run Valgrind.
Carl
|
|
From: Mark W. <ma...@kl...> - 2023-04-27 16:29:39
|
Hi Carl,
On Mon, Apr 24, 2023 at 02:29:13PM -0700, Carl Love wrote:
> Mark:
>
> Thanks for the pointer on IRC chat to the RC2 tarball. I must have
> missed the email.
>
> I noticed the following message on a Power 10 system.
>
> ./autogen.sh
> running: aclocal
> running: autoheader
> running: automake -a
> Unescaped left brace in regex is passed through in regex; marked by <--
> HERE in m/\${ <-- HERE ([^ \t=:+{}]+)}/ at /home/carll/bin/automake
> line 3936.
> running: autoconf
>
> The autoconf version is:
>
> autoconf --version
> autoconf (GNU Autoconf) 2.69
> Copyright (C) 2012 Free Software Foundation, Inc.
> License GPLv3+/Autoconf: GNU GPL version 3 or later
>
> Note the autoconf seemed to work ok. I was still able to configure and
> build Valgrind and run the testsuite.
It is actually automake that produces the warning. But it is indeed
harmless. It is produced by a newer perl (>= 5.21.1) used with an
older automake (< 1.15.1).
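As a rough way to tell whether a given machine should show this warning, the two thresholds above can be compared against the installed versions. This is only a sketch: the helper names `parse_version` and `brace_warning_expected` are made up for illustration, and the version strings are typed in by hand rather than read from `perl -v` or `automake --version`.

```python
def parse_version(text):
    """Turn a dotted version string such as '1.15.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def brace_warning_expected(perl_version, automake_version):
    """Apply the thresholds quoted above: perl >= 5.21.1 and automake < 1.15.1."""
    perl_is_new = parse_version(perl_version) >= parse_version("5.21.1")
    automake_is_old = parse_version(automake_version) < parse_version("1.15.1")
    return perl_is_new and automake_is_old


# Hypothetical version pairs, chosen only to exercise both outcomes.
print(brace_warning_expected("5.26.3", "1.13.4"))  # True
print(brace_warning_expected("5.26.3", "1.16.1"))  # False
print(brace_warning_expected("5.16.3", "1.13.4"))  # False
```

Comparing tuples of integers rather than raw strings keeps "1.9" older than "1.15", which a plain string comparison would get wrong.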
> The test failures on Power 10 that I saw on one of the Power 10 systems
> was fixed, as expected.
>
> I see the following on both Power 10 systems.
>
> == 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures,
> 0 stdoutB failures, 2 post failures ==
> memcheck/tests/bug340392 (stderr)
> memcheck/tests/linux/rfcomm (stderr)
> massif/tests/new-cpp (post)
> massif/tests/overloaded-new (post)
So pretty close to zero fail.
bug340392 is mentioned as failing for ppc64 in
https://bugs.kde.org/show_bug.cgi?id=352364. I wonder if we can mark it
known-fail/ignore with a reference to that bug?
I cannot easily test rfcomm myself since I don't seem to have
bluetooth enabled on my powerpc setup.
new-cpp/overloaded-new (post) checks are very dependent on the version
of libstdc++ installed. I don't know how to make these tests less
fragile.
Cheers,
Mark
|
|
From: Carl L. <ce...@us...> - 2023-04-27 16:47:56
|
On Thu, 2023-04-27 at 18:29 +0200, Mark Wielaard wrote:
> Hi Carl,
>
> <snip>
> > The test failures on Power 10 that I saw on one of the Power 10
> > systems was fixed, as expected.
> >
> > I see the following on both Power 10 systems.
> >
> > == 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB
> > failures, 0 stdoutB failures, 2 post failures ==
> > memcheck/tests/bug340392 (stderr)
> > memcheck/tests/linux/rfcomm (stderr)
> > massif/tests/new-cpp (post)
> > massif/tests/overloaded-new (post)
>
> So pretty close to zero fail.
>
> bug340392 is mentioned as failing for ppc64 in in
> https://bugs.kde.org/show_bug.cgi?id=352364 I wonder if we can mark
> it known-fail/ignore with a reference to that bug?
I read thru the bugzilla. I have never delved into this bug. From the
sounds, this is very PPC specific. The effort to make this comparison
highly accurate is not trivial. The bug has been around for 8 years and
other than this test, no one seems to have needed the really high level
of accuracy in the comparison.
So, yea, I would be fine as marking this as a known fail and reference
the bug.
Carl
|
|
From: Paul F. <pj...@wa...> - 2023-04-27 20:02:06
|
On 24/04/2023 23:29, Carl Love via Valgrind-developers wrote:
> == 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB failures,
> 0 stdoutB failures, 2 post failures ==
> massif/tests/new-cpp (post)
> massif/tests/overloaded-new (post)
>
And these two might be easy to fix if you run the testcase normally and
see if you can identify the extra allocating functions and add them to
the vgtest files in the --ignore-fn list.
E.g., for new-exp there is
vgopts: --stacks=no --time-unit=B --massif-out-file=massif.out
vgopts: --ignore-fn=__part_load_locale --ignore-fn=__time_load_locale --ignore-fn=dwarf2_unwind_dyld_add_image_hook
vgopts: --ignore-fn=get_or_create_key_element --ignore-fn=_GLOBAL__sub_I_eh_alloc.cc --ignore-fn=call_init.part.0
(On FreeBSD I ended up adding a couple of extra expected for when
running GCC builds because stripped libstdc++ and no debuginfo package).
A+
Paul
|
|
From: Carl L. <ce...@us...> - 2023-04-28 17:42:13
|
Paul:
On Thu, 2023-04-27 at 22:01 +0200, Paul Floyd wrote:
> On 24/04/2023 23:29, Carl Love via Valgrind-developers wrote:
> > == 714 tests, 2 stderr failures, 0 stdout failures, 0 stderrB
> > failures,
> > 0 stdoutB failures, 2 post failures ==
> > massif/tests/new-cpp (post)
> > massif/tests/overloaded-new (post)
> >
> And these two might be easy to fix if you run the testcase normally
> and
> see if you can identify the extra allocating functions and add them
> to
> the vgtest files in the --ignore-fn list.
So, just playing around with this to try and understand what causes the
post failures.
Looking at the new-cpp.vgtest file and the output from running regtest,
it looks like the test is run with the command (edited a little to make
it readable):
valgrind --tool=massif --stacks=no --time-unit=B --massif-out-file=massif.out
--ignore-fn=__part_load_locale --ignore-fn=__time_load_locale
--ignore-fn=dwarf2_unwind_dyld_add_image_hook --ignore-fn=get_or_create_key_element
--ignore-fn=_GLOBAL__sub_I_eh_alloc.cc --ignore-fn=call_init.part.0
--ignore-fn=call_init ./new-cpp
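For anyone wanting to rebuild that command by hand, here is a minimal sketch of how the `vgopts:` lines of a .vgtest file could be joined into one command line. It is not the real vg_regtest logic, which adds further options of its own; the function name `build_command` is invented, and the `prog:` line in the sample is an assumption about what new-cpp.vgtest contains. The `vgopts:` lines are the ones quoted in this thread.

```python
import shlex

# Sample input: vgopts lines as quoted in this thread; the prog line is assumed.
SAMPLE_VGTEST = """\
prog: new-cpp
vgopts: --stacks=no --time-unit=B --massif-out-file=massif.out
vgopts: --ignore-fn=__part_load_locale --ignore-fn=__time_load_locale --ignore-fn=dwarf2_unwind_dyld_add_image_hook
vgopts: --ignore-fn=get_or_create_key_element --ignore-fn=_GLOBAL__sub_I_eh_alloc.cc --ignore-fn=call_init.part.0
vgopts: --ignore-fn=call_init
"""


def build_command(vgtest_text, tool):
    """Collect 'vgopts:' and 'prog:' lines and return the command as a list."""
    options = []
    program = None
    for line in vgtest_text.splitlines():
        key, _, value = line.partition(":")
        key = key.strip()
        value = value.strip()
        if key == "vgopts":
            # Several vgopts lines simply accumulate, in file order.
            options.extend(shlex.split(value))
        elif key == "prog":
            program = value
    if program is None:
        raise ValueError("no 'prog:' line found")
    return ["valgrind", "--tool=" + tool] + options + ["./" + program]


print(" ".join(build_command(SAMPLE_VGTEST, "massif")))
```

The tool name is passed in explicitly here because it does not appear in the .vgtest text itself.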
The command line output is:
==789558== Massif, a heap profiler
==789558== Copyright (C) 2003-2017, and GNU GPL'd, by Nicholas Nethercote
==789558== Using Valgrind-3.21.0.RC2 and LibVEX; rerun with -h for copyright info
==789558== Command: ./new-cpp
==789558==
==789558==
The contents of the output file massif.out is:
time_unit: B
#-----------
snapshot=0
#-----------
time=0
mem_heap_B=0
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=1
#-----------
time=4008
mem_heap_B=4000
mem_heap_extra_B=8
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=2
#-----------
time=8016
mem_heap_B=8000
mem_heap_extra_B=16
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=3
#-----------
time=10024
mem_heap_B=10000
mem_heap_extra_B=24
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=4
#-----------
time=12032
mem_heap_B=12000
mem_heap_extra_B=32
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=5
#-----------
time=12032
mem_heap_B=12000
mem_heap_extra_B=32
mem_stacks_B=0
heap_tree=peak
n4: 12000 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
n0: 4000 0x10000973: main (new-cpp.cpp:19)
n0: 4000 0x1000098F: main (new-cpp.cpp:20)
n0: 2000 0x100009A3: main (new-cpp.cpp:21)
n0: 2000 0x100009BB: main (new-cpp.cpp:22)
#-----------
snapshot=6
#-----------
time=16040
mem_heap_B=8000
mem_heap_extra_B=24
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=7
#-----------
time=20048
mem_heap_B=4000
mem_heap_extra_B=16
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=8
#-----------
time=22056
mem_heap_B=2000
mem_heap_extra_B=8
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=9
#-----------
time=24064
mem_heap_B=0
mem_heap_extra_B=0
mem_stacks_B=0
heap_tree=empty
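To pull the numbers out of a file like this without going through ms_print, a small parser is enough. The following is a sketch only: `parse_snapshots` and `total_bytes` are hypothetical helper names, and the sample is just snapshots 4 to 6 copied from the listing above, not a complete massif.out.

```python
NUMERIC_KEYS = {"time", "mem_heap_B", "mem_heap_extra_B", "mem_stacks_B"}

# Excerpt of the listing above (snapshots 4, 5 and 6 only).
SAMPLE = """\
#-----------
snapshot=4
#-----------
time=12032
mem_heap_B=12000
mem_heap_extra_B=32
mem_stacks_B=0
heap_tree=empty
#-----------
snapshot=5
#-----------
time=12032
mem_heap_B=12000
mem_heap_extra_B=32
mem_stacks_B=0
heap_tree=peak
n4: 12000 (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
 n0: 4000 0x10000973: main (new-cpp.cpp:19)
#-----------
snapshot=6
#-----------
time=16040
mem_heap_B=8000
mem_heap_extra_B=24
mem_stacks_B=0
heap_tree=empty
"""


def parse_snapshots(text):
    """Return one dict per snapshot with its numeric fields and heap_tree kind."""
    snapshots = []
    current = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        if not sep:
            continue  # separator lines and heap-tree detail lines
        if key == "snapshot":
            current = {"snapshot": int(value)}
            snapshots.append(current)
        elif current is not None and key in NUMERIC_KEYS:
            current[key] = int(value)
        elif current is not None and key == "heap_tree":
            current[key] = value
    return snapshots


def total_bytes(snapshot):
    """Useful heap plus extra heap plus stacks, as in the total(B) column."""
    return (snapshot["mem_heap_B"] + snapshot["mem_heap_extra_B"]
            + snapshot["mem_stacks_B"])


snapshots = parse_snapshots(SAMPLE)
peak = next(s for s in snapshots if s["heap_tree"] == "peak")
print(peak["snapshot"], total_bytes(peak))  # 5 12032
```

Unknown keys are skipped on purpose, so header lines or heap-tree lines that happen to contain an `=` do not break the parse.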
I then ran the perl script:
perl ../../massif/ms_print massif.out | sed 's/gcc[0-9]*/gcc/' | ../../tests/filter_addresses
KB
11.75^ ###########
| #
| #
| #
| :::::::#
| : #
| : #
| ::::::: # ::::::::::::
| : : # :
| : : # :
| : : # :
| : : # :
| : : # :
| : : # :
| ::::::::::::: : # : ::::::
| : : : # : :
| : : : # : :
| : : : # : : ::::::
| : : : # : : :
| : : : # : : :
0 +----------------------------------------------------------------------->KB
0 23.50
Number of snapshots: 10
Detailed snapshots: [5 (peak)]
--------------------------------------------------------------------------------
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
--------------------------------------------------------------------------------
0 0 0 0 0 0
1 4,008 4,008 4,000 8 0
2 8,016 8,016 8,000 16 0
3 10,024 10,024 10,000 24 0
4 12,032 12,032 12,000 32 0
5 12,032 12,032 12,000 32 0
99.73% (12,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
->33.24% (4,000B) 0x........: main (new-cpp.cpp:19)
|
->33.24% (4,000B) 0x........: main (new-cpp.cpp:20)
|
->16.62% (2,000B) 0x........: main (new-cpp.cpp:21)
|
->16.62% (2,000B) 0x........: main (new-cpp.cpp:22)
--------------------------------------------------------------------------------
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
--------------------------------------------------------------------------------
6 16,040 8,024 8,000 24 0
7 20,048 4,016 4,000 16 0
8 22,056 2,008 2,000 8 0
9 24,064 0 0 0 0
I don't see any explicit "post" errors printed? So I am clearly
missing something. :-)
I am hoping you can give me a hint here as I am not seeing where the
issue with the test is. Thanks.
Carl
|
|
From: Paul F. <pj...@wa...> - 2023-04-28 18:31:10
|
On 28-04-23 18:04, Carl Love wrote:
> I don't see any explicit "post" errors printed? So I am clearly
> missing something. :-)
>
> I am hoping you can give me a hint here as I am not seeing where the
> issue with the test is. Thanks.
Hi Carl
The test is supposed to be profiling calls to new. That should turn
s* p1 = new s;
s* p2 = new (std::nothrow) s;
char* c1 = new char[2000];
char* c2 = new (std::nothrow) char[2000];
into
KB
11.75^ ###########
| #
| #
| #
| :::::::#
| : #
| : #
| ::::::: # ::::::::::::
| : : # :
| : : # :
| : : # :
| : : # :
| : : # :
| : : # :
| ::::::::::::: : # : ::::::
| : : : # : :
| : : : # : :
| : : : # : : :
| : : : # : : :
| : : : # : : :
0 +----------------------------------------------------------------------->KB
0 23.50
Unfortunately the pesky libc and libstdc++ can get in the way and do
other, system dependent allocations. These allocations are for stuff
like exception handlers and dynamic loading.
The --ignore-fn options are supposed to filter out all those extras,
leaving us with only the calls to operator new and delete.
I thought that you might have a different function on power 10. But
looking at the diff from what you posted I only see
47c42
< ->33.24% (4,000B) 0x........: main (new-cpp.cpp:20)
---
> ->33.24% (4,000B) 0x........: main (new-cpp.cpp:20)
That's an extra trailing whitespace.
Could you post the post.diff files, if they aren't too big?
A+
Paul
|
|
From: Carl L. <ce...@us...> - 2023-04-28 19:57:10
|
On Fri, 2023-04-28 at 20:30 +0200, Paul Floyd wrote:
>
> On 28-04-23 18:04, Carl Love wrote:
>
> > I don't see any explicit "post" errors printed? So I am clearly
> > missing something. :-)
> >
>
< snip >
> Unfortunately the pesky libc and libstdc++ can get in the way and do
> other, system dependent allocations. These allocations are for stuff
> like exception handlers and dynamic loading.
>
> The --ignore-fn options are supposed to filter out all those extras,
> leaving us with only the calls to operator new and delete.
>
> I thought that you might have a different function on power 10. But
> looking at the diff from what you posted I only see
>
> 47c42
> < ->33.24% (4,000B) 0x........: main (new-cpp.cpp:20)
> ---
> > ->33.24% (4,000B) 0x........: main (new-cpp.cpp:20)
>
> That's an extra trailing whitespace.
>
> Could you post the post.diff files, if they aren't too big?
OK, not too big. Here is the diff.
--- new-cpp.post.exp 2023-04-21 20:54:40.000000000 -0400
+++ new-cpp.post.out 2023-04-24 16:29:56.907990371 -0400
@@ -6,54 +6,61 @@
KB
-11.75^ ###########
- | #
- | #
- | #
- | :::::::#
- | : #
- | : #
- | ::::::: # ::::::::::::
- | : : # :
- | : : # :
- | : : # :
- | : : # :
- | : : # :
- | : : # :
- | ::::::::::::: : # : ::::::
- | : : : # : :
- | : : : # : :
- | : : : # : : ::::::
- | : : : # : : :
- | : : : # : : :
+82.76^ #
+ | ::#::
+ | ::::#: :
+ | ::: ::#: ::::::::::::::::::::::::::::::::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
+ | : : ::#: :::
0 +----------------------------------------------------------------------->KB
- 0 23.50
+ 0 165.5
-Number of snapshots: 10
- Detailed snapshots: [5 (peak)]
+Number of snapshots: 12
+ Detailed snapshots: [6 (peak)]
--------------------------------------------------------------------------------
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
--------------------------------------------------------------------------------
0 0 0 0 0 0
- 1 4,008 4,008 4,000 8 0
- 2 8,016 8,016 8,000 16 0
- 3 10,024 10,024 10,000 24 0
- 4 12,032 12,032 12,000 32 0
- 5 12,032 12,032 12,000 32 0
-99.73% (12,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
-->33.24% (4,000B) 0x........: main (new-cpp.cpp:19)
+ 1 72,712 72,712 72,704 8 0
+ 2 76,720 76,720 76,704 16 0
+ 3 80,728 80,728 80,704 24 0
+ 4 82,736 82,736 82,704 32 0
+ 5 84,744 84,744 84,704 40 0
+ 6 84,744 84,744 84,704 40 0
+99.95% (84,704B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc.
+->85.79% (72,704B) 0x........: ??? (m_trampoline.S:458)
+| ->85.79% (72,704B) 0x........: call_init (dl-init.c:70)
+| ->85.79% (72,704B) 0x........: _dl_init (dl-init.c:117)
+| ->85.79% (72,704B) 0x........: _dl_start_user (in /usr/lib64/ld64.so.2)
+|
+->04.72% (4,000B) 0x........: main (new-cpp.cpp:19)
|
-->33.24% (4,000B) 0x........: main (new-cpp.cpp:20)
+->04.72% (4,000B) 0x........: main (new-cpp.cpp:20)
|
-->16.62% (2,000B) 0x........: main (new-cpp.cpp:21)
+->02.36% (2,000B) 0x........: main (new-cpp.cpp:21)
|
-->16.62% (2,000B) 0x........: main (new-cpp.cpp:22)
+->02.36% (2,000B) 0x........: main (new-cpp.cpp:22)
--------------------------------------------------------------------------------
n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B)
--------------------------------------------------------------------------------
- 6 16,040 8,024 8,000 24 0
- 7 20,048 4,016 4,000 16 0
- 8 22,056 2,008 2,000 8 0
- 9 24,064 0 0 0 0
+ 7 88,752 80,736 80,704 32 0
+ 8 92,760 76,728 76,704 24 0
+ 9 94,768 74,720 74,704 16 0
+ 10 96,776 72,712 72,704 8 0
+ 11 169,488 0 0 0 0
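A quick arithmetic check of the diff, as a sketch (the constant names are made up for illustration, the values are copied from the two files above): the whole difference between expected and actual output comes down to one 72,704-byte block allocated under call_init/_dl_init, plus its 8 bytes of extra heap.

```python
def percent(part, whole):
    """Share of `whole` as a percentage, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


EXPECTED_PEAK_TOTAL = 12_032  # snapshot 5 in new-cpp.post.exp
ACTUAL_PEAK_TOTAL = 84_744    # snapshot 6 in new-cpp.post.out
LOADER_BLOCK = 72_704         # bytes under call_init/_dl_init
LOADER_EXTRA = 8              # extra-heap bytes of that block (snapshot 1)

# The two peaks differ by exactly the loader-time block and its overhead.
print(ACTUAL_PEAK_TOTAL - EXPECTED_PEAK_TOTAL == LOADER_BLOCK + LOADER_EXTRA)  # True

# Percentages shown in the detailed snapshots.
print(percent(LOADER_BLOCK, ACTUAL_PEAK_TOTAL))  # 85.79
print(percent(4_000, ACTUAL_PEAK_TOTAL))         # 4.72
print(percent(4_000, EXPECTED_PEAK_TOTAL))       # 33.24

# Peak values on the y axis, in KB.
print(EXPECTED_PEAK_TOTAL / 1024)                # 11.75
print(round(ACTUAL_PEAK_TOTAL / 1024, 2))        # 82.76
```

So the four `main` allocations are unchanged in size; only their share of the total shrinks because the denominator grew.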
|
|
From: Paul F. <pj...@wa...> - 2023-04-29 06:20:22
|
On 28-04-23 21:33, Carl Love wrote:
> .
> +->85.79% (72,704B) 0x........: ??? (m_trampoline.S:458)
> +| ->85.79% (72,704B) 0x........: call_init (dl-init.c:70)
This should be filtered by
vgopts: --ignore-fn=call_init
I need to do some debugging to see what is happening - I can reproduce
the error on one of the gccfarm machines.
A+
Paul
|
|
From: Paul F. <pj...@wa...> - 2023-04-29 16:22:27
|
On 29-04-23 08:20, Paul Floyd wrote:
>
>
> On 28-04-23 21:33, Carl Love wrote:
>
>> .
>> +->85.79% (72,704B) 0x........: ??? (m_trampoline.S:458)
>> +| ->85.79% (72,704B) 0x........: call_init (dl-init.c:70)
>
> The should be filtered by
>
> vgopts: --ignore-fn=call_init
>
> I need to do some debugging to see what is happening - I can reproduce
> the error on one of the gccfarm machines.
I see what is happening now. The stack in question is
==2756940== at 0x48A4C8C: malloc (vg_replace_malloc.c:431)
==2756940== by 0x58025633: ??? (m_trampoline.S:458)
==2756940== by 0x4007D17: call_init (dl-init.c:70)
==2756940== by 0x4007D17: _dl_init (dl-init.c:117)
==2756940== by 0x40311E7: _dl_start_user (in /usr/lib/powerpc64-linux-gnu/ld64.so.1)
Note the identical addresses for call_init and _dl_init. I believe that
means that call_init is inlined.
This bit of code in ms_main.c skips over call_init
// top has no fnname => search for the first entry that has a fnname
for (i = *top; i < n_ips && !top_has_fnname; i++) {
top_has_fnname = VG_(get_fnname)(ep, ips[i], &fnname);
}
The workaround is to add _dl_init to the ignore functions.
Otherwise I think that the above loop needs to be modified to use
VG_(next_IIPC)(InlIPCursor *iipc)
I've created a bugzilla item for this
https://bugs.kde.org/show_bug.cgi?id=469146
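A toy model of the problem, outside Valgrind and not using any of its APIs, may make the effect easier to see. It assumes, as described above, that the plain name lookup for address 0x4007D17 reports only the enclosing function `_dl_init`, while an inline-aware walk would also see `call_init`. All names in the sketch (`STACK`, `names_checked`, `is_ignored`) are invented for illustration.

```python
# Each frame pairs an address with the function names covering it: inlined
# callee first, enclosing out-of-line function last. An empty list stands
# for a frame without a function name ("???").
STACK = [
    (0x58025633, []),                        # ??? (m_trampoline.S:458)
    (0x4007D17, ["call_init", "_dl_init"]),  # call_init inlined into _dl_init
    (0x40311E7, ["_dl_start_user"]),
]


def names_checked(stack, inline_aware):
    """Names compared against the ignore list for the first named frame."""
    for _addr, names in stack:
        if names:
            # Without inline awareness only the enclosing function is seen.
            return names if inline_aware else names[-1:]
    return []


def is_ignored(stack, ignore_fns, inline_aware=False):
    """True if the allocation would be dropped under the given ignore list."""
    return any(name in ignore_fns
               for name in names_checked(stack, inline_aware))


print(is_ignored(STACK, {"call_init"}))                     # False: the bug
print(is_ignored(STACK, {"call_init", "_dl_init"}))         # True: workaround
print(is_ignored(STACK, {"call_init"}, inline_aware=True))  # True: proposed fix
```

The first call reproduces the failing test, the second corresponds to adding `_dl_init` to the ignore list, and the third shows what an inline-aware loop would achieve in this simplified model.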
A+
Paul
|