|
From: Mark W. <ma...@so...> - 2023-09-02 00:03:57
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=8228fe7f696b30c7b6b6daf576fc189bf8d6f8c2

commit 8228fe7f696b30c7b6b6daf576fc189bf8d6f8c2
Author: Mark Wielaard <ma...@kl...>
Date:   Fri Sep 1 19:10:17 2023 +0200

    Explicitly load libc and any sonames that contain mandatory specs

    We really need symtab for glibc and ld.so libraries early for redir.
    Some distros move these into separate debuginfo files, which means we
    need to fully load them early.

    https://bugs.kde.org/show_bug.cgi?id=473745

Diff:
---
 NEWS                              |  1 +
 coregrind/m_debuginfo/debuginfo.c | 41 ++++++++++++++++++---------------------
 coregrind/m_redir.c               | 23 ++++++++++++++++------
 coregrind/pub_core_debuginfo.h    |  2 ++
 coregrind/pub_core_redir.h        |  2 +-
 5 files changed, 40 insertions(+), 29 deletions(-)

diff --git a/NEWS b/NEWS
index 6f13d53569..832d24e45a 100644
--- a/NEWS
+++ b/NEWS
@@ -52,6 +52,7 @@ are not entered into bugzilla tend to get forgotten about or ignored.
 472963  Broken regular expression in configure.ac
 473604  Fix bug472219.c compile failure with Clang 16
 473677  make check compile failure with Clang 16 based on GCC 13.x
+473745  must-be-redirected function - strlen
 473870  FreeBSD 14 applications fail early at startup
 473944  Handle mold linker split RW PT_LOAD segments correctly
 n-i-bz  Allow arguments with spaces in .valgrindrc files
diff --git a/coregrind/m_debuginfo/debuginfo.c b/coregrind/m_debuginfo/debuginfo.c
index c37e50b9d3..0c4eb99c0d 100644
--- a/coregrind/m_debuginfo/debuginfo.c
+++ b/coregrind/m_debuginfo/debuginfo.c
@@ -1445,27 +1445,34 @@ ULong VG_(di_notify_mmap)( Addr a, Bool allow_SkFileV, Int use_fd )
    }
 }

+/* Load DI if it hasn't already been been loaded.  */
+void VG_(di_load_di)( DebugInfo *di )
+{
+   if (di->deferred) {
+      di->deferred = False;
+      ML_(read_elf_debug) (di);
+      ML_(canonicaliseTables)( di );
+
+      /* Check invariants listed in
+         Comment_on_IMPORTANT_REPRESENTATIONAL_INVARIANTS in
+         priv_storage.h. */
+      check_CFSI_related_invariants(di);
+      ML_(finish_CFSI_arrays)(di);
+   }
+}
+
 /* Load DI if it has a text segment containing A and DI hasn't already
    been loaded.  */

 void VG_(load_di)( DebugInfo *di, Addr a)
 {
-   if (!di->deferred
-       || !di->text_present
+   if (!di->text_present
        || di->text_size <= 0
        || di->text_avma > a
        || a >= di->text_avma + di->text_size)
       return;

-   di->deferred = False;
-   ML_(read_elf_debug) (di);
-   ML_(canonicaliseTables)( di );
-
-   /* Check invariants listed in
-      Comment_on_IMPORTANT_REPRESENTATIONAL_INVARIANTS in
-      priv_storage.h. */
-   check_CFSI_related_invariants(di);
-   ML_(finish_CFSI_arrays)(di);
+   VG_(di_load_di)(di);
 }

 /* Attempt to load DebugInfo with a text segment containing A,
@@ -1477,17 +1484,7 @@ void VG_(addr_load_di)( Addr a )
    di = VG_(find_DebugInfo)(VG_(current_DiEpoch)(), a);

    if (di != NULL)
-      if (di->deferred) {
-         di->deferred = False;
-         ML_(read_elf_debug) (di);
-         ML_(canonicaliseTables)( di );
-
-         /* Check invariants listed in
-            Comment_on_IMPORTANT_REPRESENTATIONAL_INVARIANTS in
-            priv_storage.h. */
-         check_CFSI_related_invariants(di);
-         ML_(finish_CFSI_arrays)(di);
-      }
+      VG_(di_load_di)(di);
 }

 /* Unmap is simpler - throw away any SegInfos intersecting
diff --git a/coregrind/m_redir.c b/coregrind/m_redir.c
index 37c67f4c13..cef241b4f8 100644
--- a/coregrind/m_redir.c
+++ b/coregrind/m_redir.c
@@ -255,7 +255,7 @@ typedef
 typedef
    struct _TopSpec {
       struct _TopSpec* next; /* linked list */
-      const DebugInfo* seginfo; /* symbols etc */
+      DebugInfo* seginfo; /* symbols etc */
       Spec* specs; /* specs pulled out of seginfo */
       Bool mark; /* transient temporary used during deletion */
    }
@@ -312,7 +312,7 @@ static void show_active ( const HChar* left, const Active* act );
 static void handle_maybe_load_notifier( const HChar* soname,
                                         const HChar* symbol, Addr addr );

-static void handle_require_text_symbols ( const DebugInfo* );
+static void handle_require_text_symbols ( DebugInfo* );

 /*------------------------------------------------------------*/
 /*--- NOTIFICATIONS                                        ---*/
@@ -324,7 +324,7 @@ void generate_and_add_actives (
         Spec* specs,
         TopSpec* parent_spec,
         /* debuginfo and the owning TopSpec */
-        const DebugInfo* di,
+        DebugInfo* di,
         TopSpec* parent_sym
      );

@@ -385,7 +385,7 @@ static HChar const* advance_to_comma ( HChar const* c ) {
    topspecs list, and (2) figure out what new binding are now active,
    and, as a result, add them to the actives mapping. */

-void VG_(redir_notify_new_DebugInfo)( const DebugInfo* newdi )
+void VG_(redir_notify_new_DebugInfo)( DebugInfo* newdi )
 {
    Bool ok, isWrap, isGlobal;
    Int i, nsyms, becTag, becPrio;
@@ -421,6 +421,12 @@ void VG_(redir_notify_new_DebugInfo)( const DebugInfo* newdi )
    newdi_soname = VG_(DebugInfo_get_soname)(newdi);
    vg_assert(newdi_soname != NULL);

+   /* libc is special, because it contains some of the core redirects.
+      Make sure it is fully loaded.  */
+   if (0 == VG_(strcmp)(newdi_soname, libc_soname) ||
+       0 == VG_(strcmp)(newdi_soname, pthread_soname))
+      VG_(di_load_di)(newdi);
+
 #ifdef ENABLE_INNER
    {
       /* When an outer Valgrind is executing an inner Valgrind, the
@@ -814,7 +820,7 @@ void generate_and_add_actives (
         Spec* specs,
         TopSpec* parent_spec,
         /* seginfo and the owning TopSpec */
-        const DebugInfo* di,
+        DebugInfo* di,
         TopSpec* parent_sym
      )
 {
@@ -846,6 +852,11 @@ void generate_and_add_actives (

       sp->mark = VG_(string_match)( sp->from_sopatt, soname );
       anyMark = anyMark || sp->mark;
+
+      /* The symtab might be in a separate debuginfo file. Make sure the
+         debuginfo is fully loaded.  */
+      if (sp->mark && sp->mandatory)
+         VG_(di_load_di)(di);
    }

    /* shortcut: if none of the sonames match, there will be no bindings. */
@@ -1792,7 +1803,7 @@ void handle_maybe_load_notifier( const HChar* soname,
    symbols that satisfy any --require-text-symbol= specifications that
    apply to it, and abort the run with an error message if not.
 */
-static void handle_require_text_symbols ( const DebugInfo* di )
+static void handle_require_text_symbols ( DebugInfo* di )
 {
    /* First thing to do is figure out which, if any,
       --require-text-symbol specification strings apply to this
diff --git a/coregrind/pub_core_debuginfo.h b/coregrind/pub_core_debuginfo.h
index 6e93bb93c5..4d6ebda816 100644
--- a/coregrind/pub_core_debuginfo.h
+++ b/coregrind/pub_core_debuginfo.h
@@ -78,6 +78,8 @@ extern void VG_(di_notify_vm_protect)( Addr a, SizeT len, UInt prot );

 extern void VG_(addr_load_di)( Addr a );

+extern void VG_(di_load_di)( DebugInfo *di );
+
 extern void VG_(load_di)( DebugInfo *di, Addr a );

 extern void VG_(di_discard_ALL_debuginfo)( void );
diff --git a/coregrind/pub_core_redir.h b/coregrind/pub_core_redir.h
index b88ca98f92..0cf0b009ea 100644
--- a/coregrind/pub_core_redir.h
+++ b/coregrind/pub_core_redir.h
@@ -61,7 +61,7 @@
 //--------------------------------------------------------------------

 /* Notify the module of a new DebugInfo (called from m_debuginfo). */
-extern void VG_(redir_notify_new_DebugInfo)( const DebugInfo* );
+extern void VG_(redir_notify_new_DebugInfo)( DebugInfo* );

 /* Notify the module of the disappearance of a DebugInfo (also called
    from m_debuginfo). */
|
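The refactoring in the commit above funnels every caller through one idempotent "load it if it is still deferred" helper, with the address-range check layered on top. A minimal standalone sketch of that pattern follows. It is not the Valgrind code: `ToyDebugInfo`, `toy_load_di`, `toy_load_di_for_addr` and the `loads_performed` counter are hypothetical names invented for illustration, and the real load work is replaced by a counter increment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a debuginfo record; only the fields the
   sketch needs. */
typedef struct {
   bool          deferred;        /* true until the expensive load has run */
   bool          text_present;
   unsigned long text_avma;       /* start of the text segment */
   unsigned long text_size;
   int           loads_performed; /* counts expensive loads, for the demo */
} ToyDebugInfo;

/* Unconditional "make sure it is loaded": safe to call repeatedly. */
static void toy_load_di(ToyDebugInfo *di)
{
   assert(di != NULL);
   if (di->deferred) {
      di->deferred = false;
      di->loads_performed++;   /* stands in for reading the debuginfo */
   }
}

/* Address-filtered variant: only load if A lies in the text segment. */
static void toy_load_di_for_addr(ToyDebugInfo *di, unsigned long a)
{
   assert(di != NULL);
   if (!di->text_present
       || di->text_size == 0
       || di->text_avma > a
       || a >= di->text_avma + di->text_size)
      return;
   toy_load_di(di);
}
```

Keeping the deferred check inside the unconditional helper means callers that must force a load (such as the redirection setup described in the commit message) and callers that load on demand by address share the same one-shot behaviour.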
|
From: Mark W. <ma...@so...> - 2023-09-02 00:03:46
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=a01d8e01fceb9e8e1c5fba67d9322fb2c45e9c83

commit a01d8e01fceb9e8e1c5fba67d9322fb2c45e9c83
Author: Aaron Merey <am...@re...>
Date:   Wed Aug 30 14:49:09 2023 -0400

    Fix lazy debuginfo loading on ppc64le

    Lazy debuginfo loading introduced in commit 60f7e89ba32 assumed that
    either describe_IP or find_DiCfSI will be called before stacktrace
    printing.  describe_IP and find_DiCfSI cause debuginfo to be lazily
    loaded before symtab lookup occurs during stacktraces.

    However this assumption does not hold true on ppc64le, resulting in
    debuginfo failing to load in time for stacktraces.

    Fix this by loading debuginfo during get_StackTrace_wrk on ppc arches.

Diff:
---
 coregrind/m_stacktrace.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/coregrind/m_stacktrace.c b/coregrind/m_stacktrace.c
index 308bebdd86..0ec6f5993a 100644
--- a/coregrind/m_stacktrace.c
+++ b/coregrind/m_stacktrace.c
@@ -772,6 +772,8 @@ UInt VG_(get_StackTrace_wrk) ( ThreadId tid_if_known,
 #  endif
    Addr fp_min = sp - VG_STACK_REDZONE_SZB;

+   VG_(addr_load_di)(ip);
+
    /* Snaffle IPs from the client's stack into ips[0 .. max_n_ips-1],
       stopping when the trail goes cold, which we guess to be when FP
       is not a reasonable stack location. */
@@ -913,6 +915,7 @@ UInt VG_(get_StackTrace_wrk) ( ThreadId tid_if_known,
             play safe, a la x86/amd64 above.  See
             extensive comments above. */
          RECURSIVE_MERGE(cmrf,ips,i);
+         VG_(addr_load_di)(ip);
          continue;
       }

|
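The idea behind the fix above, making sure the object covering each frame address has its debuginfo loaded before any symbol lookup, can be sketched in isolation. This is only an illustration under simplifying assumptions: it does one pass over an already collected list of addresses rather than loading while unwinding, and `ToyObject`, `toy_find_object` and `toy_preload_for_trace` are hypothetical names, not Valgrind functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mapped object with lazily loaded debuginfo. */
typedef struct {
   unsigned long start;     /* first address covered */
   unsigned long size;      /* number of bytes covered */
   bool          deferred;  /* true until debuginfo has been loaded */
} ToyObject;

/* Return the object whose range contains IP, or NULL if none does. */
static ToyObject *toy_find_object(ToyObject *objs, size_t n, unsigned long ip)
{
   for (size_t i = 0; i < n; i++)
      if (ip >= objs[i].start && ip - objs[i].start < objs[i].size)
         return &objs[i];
   return NULL;
}

/* Make sure debuginfo is loaded for every frame address in IPS before any
   symbol lookup happens.  Returns how many objects were newly loaded. */
static size_t toy_preload_for_trace(ToyObject *objs, size_t n_objs,
                                    const unsigned long *ips, size_t n_ips)
{
   size_t newly_loaded = 0;

   assert(objs != NULL && ips != NULL);
   for (size_t i = 0; i < n_ips; i++) {
      ToyObject *obj = toy_find_object(objs, n_objs, ips[i]);
      if (obj != NULL && obj->deferred) {
         obj->deferred = false;   /* stands in for the real load */
         newly_loaded++;
      }
   }
   return newly_loaded;
}
```

Addresses that fall outside every known object are simply skipped, and an object hit by several frames is loaded only once.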
|
From: Philippe W. <phi...@so...> - 2023-09-01 19:05:46
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=c0b2c786d38002a20f845a9fb739b8659fa87bcc

commit c0b2c786d38002a20f845a9fb739b8659fa87bcc
Author: Philippe Waroquiers <phi...@sk...>
Date:   Fri Sep 1 20:23:46 2023 +0200

    Fix 473944 Handle mold linker split RW PT_LOAD segments correctly

    Condition to consider segments will be merged has to be more specific
    than just having a page rounded file offset p_offset.

    Regtested on debian, somewhat poorly due to the amount of tests failing due to:
      473745 must-be-redirected function - strlen - for valgrind 3.22 but not 3.21

Diff:
---
 NEWS                            |  3 +-
 coregrind/m_debuginfo/readelf.c | 72 ++++++++++++++++++++++++++++++++---------
 2 files changed, 59 insertions(+), 16 deletions(-)

diff --git a/NEWS b/NEWS
index 519ed664c1..6f13d53569 100644
--- a/NEWS
+++ b/NEWS
@@ -12,7 +12,7 @@ AMD64/macOS 10.13 and nanoMIPS/Linux.

 * A new configure option --with-gdbscripts-dir lets you install
   the gdb valgrind python monitor scripts in a specific location.
-  For example an distro could use it to install the scripts in a
+  For example a distro could use it to install the scripts in a
   safe load location --with-gdbscripts-dir=%{_datadir}/gdb/auto-load
   It is also possible to configure --without-gdb-scripts-dir so no
   .debug_gdb_scripts section is added to the vgpreload library and
@@ -53,6 +53,7 @@ are not entered into bugzilla tend to get forgotten about or ignored.
 473604  Fix bug472219.c compile failure with Clang 16
 473677  make check compile failure with Clang 16 based on GCC 13.x
 473870  FreeBSD 14 applications fail early at startup
+473944  Handle mold linker split RW PT_LOAD segments correctly
 n-i-bz  Allow arguments with spaces in .valgrindrc files

 To see details of a given bug, visit
diff --git a/coregrind/m_debuginfo/readelf.c b/coregrind/m_debuginfo/readelf.c
index ac72f98fb5..a4c79efd0f 100644
--- a/coregrind/m_debuginfo/readelf.c
+++ b/coregrind/m_debuginfo/readelf.c
@@ -32,6 +32,7 @@
 #include "pub_core_basics.h"
 #include "pub_core_vki.h"
 #include "pub_core_vkiscnums.h"
+#include "pub_core_debuglog.h"
 #include "pub_core_debuginfo.h"
 #include "pub_core_libcbase.h"
 #include "pub_core_libcprint.h"
@@ -3793,6 +3794,7 @@ Bool ML_(check_elf_and_get_rw_loads) ( Int fd, const HChar* filename, Int * rw_l
    DiOffT phdr_mioff = 0;
    UWord phdr_mnent = 0U;
    UWord phdr_ment_szB = 0U;
+   ElfXX_Phdr previous_rw_a_phdr;

    res = False;

@@ -3830,6 +3832,9 @@ Bool ML_(check_elf_and_get_rw_loads) ( Int fd, const HChar* filename, Int * rw_l
    phdr_mnent = ehdr_m.e_phnum;
    phdr_ment_szB = ehdr_m.e_phentsize;

+   /* Sets p_memsz to 0 to indicate we have not yet a previous a_phdr. */
+   previous_rw_a_phdr.p_memsz = 0;
+
    for (i = 0U; i < phdr_mnent; i++) {
       ElfXX_Phdr a_phdr;
       ML_(img_get)(&a_phdr, mimg,
@@ -3841,22 +3846,59 @@ Bool ML_(check_elf_and_get_rw_loads) ( Int fd, const HChar* filename, Int * rw_l
             if (((a_phdr.p_flags & (PF_R | PF_W)) == (PF_R | PF_W)) &&
                 ((a_phdr.p_flags & flag_x) == 0)) {
                ++*rw_load_count;
-            }
+               if (VG_(debugLog_getLevel)() > 1)
+                  VG_(message)(Vg_DebugMsg, "check_elf_and_get_rw_loads: "
+                               "++*rw_load_count to %d for %s "
+                               "p_vaddr %#lx p_offset %lu, p_filesz %lu\n",
+                               *rw_load_count, filename,
+                               (UWord)a_phdr.p_vaddr, (UWord)a_phdr.p_offset,
+                               (UWord)a_phdr.p_filesz);
+               /*
+                * Hold your horses
+                * Just because The ELF file contains 2 RW PT_LOAD segments
+                * doesn't mean that Valgrind will also make 2 calls to
+                * VG_(di_notify_mmap): in some cases, the 2 NSegments will get
+                * merged and VG_(di_notify_mmap) only gets called once.
+                * How to detect that the segments will be merged ?
+                * Logically, they will be merged if the first segment ends
+                * at the beginning of the second segment:
+                *   Seg1 virtual address + Seg1 segment_size
+                *      == Seg2 virtual address.
+                * However, it is not very clear how the file section will be
+                * loaded: the PT_LOAD specifies a file size and a memory size.
+                * Logically, the memory size should be used in the above
+                * condition, but strangely enough, in some cases the file size
+                * can be smaller than the memory size. And that then result in
+                * an "anonymous" mapping done between the 2 segments that
+                * otherwise would be consecutive.
+                * This has been seen in an executable linked by the mold linker
+                * (see bug 473944). In this case, the file segments were loaded
+                * with a "page rounded up" file size (observed on RHEL 8.6,
+                * ld-2.28.so, mold 1.5.1).
+                * However, in FreeBSD with lld (see 452802 and 473944), rounding
+                * up p_filesz in the below condition makes at least one test
+                * fail.
+                * As on the mold case, the below condition correctly ensures
+                * the 2 different segments loaded separately are both counted
+                * here, we use the non rounded up p_filesz.
+                * This is all a nightmare/hack. Something cleaner should be
+                * done than trying to guess here if segments will or will not
+                * be merged later depending on how the loader will load
+                * with or without rounding up.
+                * */
+               if (previous_rw_a_phdr.p_memsz > 0 &&
+                   ehdr_m.e_type == ET_EXEC &&
+                   previous_rw_a_phdr.p_vaddr + previous_rw_a_phdr.p_filesz
+                   == a_phdr.p_vaddr)
+               {
+                  --*rw_load_count;
+                  if (VG_(debugLog_getLevel)() > 1)
+                     VG_(message)(Vg_DebugMsg, "check_elf_and_get_rw_loads: "
+                                  " --*rw_load_count to %d for %s\n",
+                                  *rw_load_count, filename);
+               }

-            /*
-             * Hold your horses
-             * Just because The ELF file contains 2 RW PT_LOAD segments it
-             * doesn't mean that Valgrind will also make 2 calls to
-             * VG_(di_notify_mmap). If the stars are all aligned
-             * (which usually means that the ELF file is the client
-             * executable with the segment offset for the
-             * second PT_LOAD falls exactly on 0x1000) then the NSegements
-             * will get merged and VG_(di_notify_mmap) only gets called once. */
-            if (*rw_load_count == 2 &&
-                ehdr_m.e_type == ET_EXEC &&
-                a_phdr.p_offset == VG_PGROUNDDN(a_phdr.p_offset) )
-            {
-               *rw_load_count = 1;
+               previous_rw_a_phdr = a_phdr;
             }
          }
       }
|
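The counting rule introduced by the patch above can be tried out on made-up program headers. The following is a rough sketch, assuming a trimmed-down header struct in place of the real ELF types: `ToyPhdr`, its `is_rw_load` flag and `toy_count_rw_loads` are hypothetical, and `is_exec` stands in for the `e_type == ET_EXEC` test. A segment is treated as merged with the previous RW segment only when it starts exactly at the previous segment's `p_vaddr + p_filesz`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down program header: just the fields used below. */
typedef struct {
   uint64_t p_vaddr;
   uint64_t p_filesz;
   uint64_t p_memsz;
   int      is_rw_load;   /* nonzero for a readable+writable, non-exec PT_LOAD */
} ToyPhdr;

/* Count RW PT_LOAD segments, treating a segment that starts exactly where
   the previous RW segment's file-backed part ends as merged with it. */
static int toy_count_rw_loads(const ToyPhdr *ph, size_t n, int is_exec)
{
   int count = 0;
   ToyPhdr prev = { 0, 0, 0, 0 };   /* p_memsz == 0 means "no previous yet" */

   assert(ph != NULL || n == 0);
   for (size_t i = 0; i < n; i++) {
      if (!ph[i].is_rw_load)
         continue;
      count++;
      if (prev.p_memsz > 0 && is_exec
          && prev.p_vaddr + prev.p_filesz == ph[i].p_vaddr)
         count--;   /* contiguous with the previous one: expect a merge */
      prev = ph[i];
   }
   return count;
}
```

With two RW segments at 0x1000 and 0x2000, a first segment whose `p_filesz` is 0x1000 ends exactly where the second begins and the pair counts once; if `p_filesz` is 0x0f00 while `p_memsz` is 0x1000, there is a gap in the file-backed range and both segments are counted, which is the split case the commit message describes.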
|
From: Carl L. <ce...@us...> - 2023-09-01 18:59:57
|
Mark:

On Fri, 2023-09-01 at 16:21 +0200, Mark Wielaard wrote:
> Hi Carl,
>
> On Thu, 2023-08-31 at 15:38 -0700, Carl Love wrote:
> > So, I then tried to run the same test on a Power 8LE system Ubuntu
> > 20.04.5 LTS (Focal Fossa). I get:
> >
> > valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-
> > current
> >
> > valgrind: Fatal error at startup: a function redirection
> > valgrind: which is mandatory for this platform-tool combination
> > valgrind: cannot be set up. Details of the redirection are:
> > valgrind:
> > valgrind: A must-be-redirected function
> > valgrind: whose name matches the pattern: strlen
> > valgrind: in an object with soname matching: ld64.so.2
> > valgrind: was not found whilst processing
> > valgrind: symbols from the object with soname: ld64.so.2
> > valgrind:
> > valgrind: Possible fixes: (1, short term): install glibc's
> > debuginfo
> > valgrind: package on this machine. (2, longer term): ask the
> > packagers
> > valgrind: for your Linux distribution to please in future ship a
> > non-
> > valgrind: stripped ld.so (or whatever the dynamic linker .so is
> > called)
> > valgrind: that exports the above-named function using the standard
> > valgrind: calling conventions for this platform. The package you
> > need
> > valgrind: to install for fix (1) is called
> > valgrind:
> > valgrind: On Debian, Ubuntu: libc6-dbg
> > valgrind: On SuSE, openSuSE, Fedora, RHEL: glibc-debuginfo
> > valgrind:
> > valgrind: Note that if you are debugging a 32 bit process on a
> > valgrind: 64 bit system, you will need a corresponding 32 bit
> > debuginfo
> > valgrind: package (e.g. libc6-dbg:i386).
> > valgrind:
> > valgrind: Cannot continue -- exiting now. Sorry.
> >
> >
> > When I put in my print statements, I see the call to
> > read_elf_symtab__normal instead of read_elf_symtab__ppc64be_linux
> > as
> > expected. It appears that some of the image file is read as I see a
> > second call to di_notify_ACHIEVE_ACCEPT_STATE, read_elf_object
> > which I
> > don't see on the BE system before the run fails.
>
> So the above is indeed not architecture, but Debian/Ubuntu specific.
> It is tracked as
> https://bugs.kde.org/show_bug.cgi?id=473745
>
>
> It is because the ld.so symtab is in a separate dbg package, which
> (now) isn't loaded early anymore when resolving the hardwired
> redirects. It doesn't happen on other distros because they keep
> symtab
> in ld.so.

I added the attached patch and tested on the four different platforms.
The git tree for all four systems was:

commit 053cf5ff31e4a2d65726af431824bf30172d21ed (HEAD -> master)
Author: Mark Wielaard <ma...@kl...>
Date:   Fri Sep 1 19:10:17 2023 +0200

    Explicitly load libc and any sonames that contain mandatory specs

    https://bugs.kde.org/show_bug.cgi?id=473745

commit d76ddc0981862bde160a92baf362d3baf2633368
Author: Aaron Merey <am...@re...>
Date:   Wed Aug 30 14:49:09 2023 -0400

    Fix lazy debuginfo loading on ppc64le

    Lazy debuginfo loading introduced in commit 60f7e89ba32 assumed that
    either describe_IP or find_DiCfSI will be called before stacktrace
    printing. describe_IP and find_DiCfSI cause debuginfo to be lazily
    loaded before symtab lookup occurs during stacktraces.

    However this assumption does not hold true on ppc64le, resulting in
    debuginfo failing to load in time for stacktraces.

    Fix this by loading debuginfo during get_StackTrace_wrk on ppc arches.

commit c934430d56c2add25002ea8e321bd8bdab80fc99 (origin/master, origin/HEAD)
Author: Paul Floyd <pj...@wa...>
Date:   Thu Aug 31 15:32:21 2023 +0200

    Bug 473870 - FreeBSD 14 applications fail early at startup

    FreeBSD recently started adding some functions using @gnu_indirect_function,
    specifically strpcmp which was causing this crash. When running and
    encountering this ifunc Valgrind looked for the ifunc_handler.
    But there wasn't one for FreeBSD so Valgrind asserted.

The test results are:

machine               pre-lazy-load   current mainline    with ppc        with Explicitly-load-
                                      (as of 8/31/2023)   debuginfo fix   libc-and-any-sonames

Power 8 LE  Red Hat Enterprise Linux Server 7.9 (Maipo)
  tests               707             708                 708             708
  stderr failures     4               280                 247             4
  stdout failures     0               54                  54              0
  stderrB failures    13              16                  16              13
  stdoutB failures    0               11                  12              0
  post failures       9               13                  9               9

Power 8 BE  Ubuntu 20.04.5 LTS (Focal Fossa)
  tests               742             743                 743             743
  stderr failures     2               671                 671             671
  stdout failures     0               152                 152             152
  stderrB failures    0               14                  14              14
  stdoutB failures    2               20                  20              20
  post failures       9               43                  43              43

Power 9 LE  Ubuntu 20.04.5 LTS (Focal Fossa)
  tests               711             712                 712             712
  stderr failures     4               280                 247             4
  stdout failures     0               54                  54              0
  stderrB failures    13              16                  16              13
  stdoutB failures    0               12                  12              0
  post failures       9               13                  9               9

Power 10 LE  Red Hat Enterprise Linux 9.0 (Plow)
  tests               719             720                 720             720
  stderr failures     2               42                  2               2
  stdout failures     0               0                   0               0
  stderrB failures    2               2                   2               2
  stdoutB failures    10              10                  10              10
  post failures       0               3                   0               0

The Explicitly-load-libc-and-any-sonames-that-contain-ma.patch seems to
fix the issues across the various OS distributions for the LE
machines.

It does appear that there is a separate issue with the original patch
to lazily load the debug info on PowerPC BE. Hopefully we can sort
that issue out when Aaron gets back from vacation.

Thanks for your help with the latest patch.

                    Carl
|
|
From: Mark W. <ma...@kl...> - 2023-09-01 17:47:42
|
On Fri, 2023-09-01 at 16:21 +0200, Mark Wielaard wrote:
> So the above is indeed not architecture, but Debian/Ubuntu specific.
> It is tracked as https://bugs.kde.org/show_bug.cgi?id=473745
>
> It is because the ld.so symtab is in a separate dbg package, which
> (now) isn't loaded early anymore when resolving the hardwired
> redirects. It doesn't happen on other distros because they keep symtab
> in ld.so.

I have attached a patch that seems to work for me.
|
|
From: Mark W. <ma...@kl...> - 2023-09-01 14:21:34
|
Hi Carl,

On Thu, 2023-08-31 at 15:38 -0700, Carl Love wrote:
> So, I then tried to run the same test on a Power 8LE system Ubuntu
> 20.04.5 LTS (Focal Fossa). I get:
>
> valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-
> current
>
> valgrind: Fatal error at startup: a function redirection
> valgrind: which is mandatory for this platform-tool combination
> valgrind: cannot be set up. Details of the redirection are:
> valgrind:
> valgrind: A must-be-redirected function
> valgrind: whose name matches the pattern: strlen
> valgrind: in an object with soname matching: ld64.so.2
> valgrind: was not found whilst processing
> valgrind: symbols from the object with soname: ld64.so.2
> valgrind:
> valgrind: Possible fixes: (1, short term): install glibc's debuginfo
> valgrind: package on this machine. (2, longer term): ask the
> packagers
> valgrind: for your Linux distribution to please in future ship a non-
> valgrind: stripped ld.so (or whatever the dynamic linker .so is
> called)
> valgrind: that exports the above-named function using the standard
> valgrind: calling conventions for this platform. The package you need
> valgrind: to install for fix (1) is called
> valgrind:
> valgrind: On Debian, Ubuntu: libc6-dbg
> valgrind: On SuSE, openSuSE, Fedora, RHEL: glibc-debuginfo
> valgrind:
> valgrind: Note that if you are debugging a 32 bit process on a
> valgrind: 64 bit system, you will need a corresponding 32 bit
> debuginfo
> valgrind: package (e.g. libc6-dbg:i386).
> valgrind:
> valgrind: Cannot continue -- exiting now. Sorry.
>
>
> When I put in my print statements, I see the call to
> read_elf_symtab__normal instead of read_elf_symtab__ppc64be_linux as
> expected. It appears that some of the image file is read as I see a
> second call to di_notify_ACHIEVE_ACCEPT_STATE, read_elf_object which I
> don't see on the BE system before the run fails.

So the above is indeed not architecture, but Debian/Ubuntu specific.
It is tracked as https://bugs.kde.org/show_bug.cgi?id=473745

It is because the ld.so symtab is in a separate dbg package, which
(now) isn't loaded early anymore when resolving the hardwired
redirects. It doesn't happen on other distros because they keep symtab
in ld.so.

Cheers,

Mark
|
|
From: Paul F. <pa...@so...> - 2023-09-01 06:26:48
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=8a545a3684024f4a54bb20b7ad1d0fe9949bbede

commit 8a545a3684024f4a54bb20b7ad1d0fe9949bbede
Author: Paul Floyd <pj...@wa...>
Date:   Fri Sep 1 08:26:24 2023 +0200

    FreeBSD: add common failure causes to README.freebsd

    Also fix the name of one of the fields of vki_kinfo_vmentry.

Diff:
---
 README.freebsd                          | 15 ++++++++++++++-
 coregrind/m_aspacemgr/aspacemgr-linux.c |  2 +-
 include/vki/vki-freebsd.h               |  2 +-
 3 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/README.freebsd b/README.freebsd
index eb6a510ada..ce1e435f5d 100644
--- a/README.freebsd
+++ b/README.freebsd
@@ -130,7 +130,20 @@ These are much easier. They just contain a POST_MEM_WRITE macro for each
 output argument.


-1. Running regression tests
+1. Frequent causes of problems
+
+- New _umtx_op codes. Valgrind will print "WARNING: _umtx_op unsupported value".
+  See syswrap-freebsd.c and add new cases for the new codes.
+- Additions to auxv. Depending on the entry it may need to be simply copied
+  from the host to the guest, it may need to be modified for the guest or
+  it may need to be ignored. See initimg-freebsd.c.
+- ELF PT_LOAD mappings. Either Valgrind will assert or there will be no source
+  information in error reports. See VG_(di_notify_mmap) in debuginfo.c
+- Because they contain many deliberate errors the regression tests are prone
+  to change with changes of compiler. Liberal use of 'volatile' and
+  '-Wno-warning-flag' can help - see configure.ac
+
+2. Running regression tests

 In order to run all of the regression tests you will need to install
 the following packages
diff --git a/coregrind/m_aspacemgr/aspacemgr-linux.c b/coregrind/m_aspacemgr/aspacemgr-linux.c
index ae38d8bd05..ba8964928e 100644
--- a/coregrind/m_aspacemgr/aspacemgr-linux.c
+++ b/coregrind/m_aspacemgr/aspacemgr-linux.c
@@ -3943,7 +3943,7 @@ static void parse_procselfmaps (
       endPlusOne = (UWord)kve->kve_end;
       foffset = kve->kve_offset;
       filename = kve->kve_path;
-      dev = kve->kve_fsid_freebsd11;
+      dev = kve->kve_vn_fsid_freebsd11;
       ino = kve->kve_fileid;
       if (filename[0] != '/') {
          filename = NULL;
diff --git a/include/vki/vki-freebsd.h b/include/vki/vki-freebsd.h
index 92ca8a6878..b30b2933ec 100644
--- a/include/vki/vki-freebsd.h
+++ b/include/vki/vki-freebsd.h
@@ -2170,7 +2170,7 @@ struct vki_kinfo_vmentry {
    ULong kve_end;
    ULong kve_offset;
    ULong kve_fileid;
-   UInt kve_fsid_freebsd11;
+   UInt kve_vn_fsid_freebsd11;
    int kve_flags;
    int kve_resident;
    int kve_private_resident;
|
|
From: Carl L. <ce...@us...> - 2023-08-31 22:38:32
|
On Thu, 2023-08-31 at 10:31 -0700, Carl Love wrote:
> Mark, Aaron:
>
> So, I tried running the doublefree test by hand with the intention of
> then adding some debug prints to see which routines were being
> called.
> I am seeing the following:
>
> valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-
> current
>
> valgrind: m_debuginfo/image.c:1106 (vgModuleLocal_img_valid):
> Assertion 'img != NULL' failed.
> Segmentation fault
>
> I rolled back the git tree to the commit prior to the initial patch
> to
> do the lazy load,
>
> commit 6ce0979884a8f246c80a098333ceef1a7b7f694d
> Author: Paul Floyd <pj...@wa...>
> Date: Mon Jul 24 22:06:00 2023 +0200
>
> Bug 472219 - Syscall param ppoll(ufds.events) points to
> uninitialised byte(s
>
> Add checks that (p)poll fd is not negative. If it is negative,
> don't check
> the events field.
>
> I re-compiled, re-installed and tested again and get:
>
> valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-
> current
> ==124807== Invalid free() / delete / delete[] / realloc()
> ==124807== at 0x409B680: free (vg_replace_malloc.c:974)
> ==124807== by 0x1000063B: main (doublefree.c:10)
> ==124807== Address 0x42f0040 is 0 bytes inside a block of size 177
> free'd
> ==124807== at 0x409B680: free (vg_replace_malloc.c:974)
> ==124807== by 0x1000063B: main (doublefree.c:10)
> ==124807== Block was alloc'd at
> ==124807== at 0x409858C: malloc (vg_replace_malloc.c:431)
> ==124807== by 0x1000061B: main (doublefree.c:8)
> ==124807==
>
> So it seems with the initial patch and the PPC patch we are hitting
> an
> assertion issue. I will try and pursue a bit more.
The system I was testing on is a Power 8 BE system running
Red Hat Enterprise Linux Server 7.9 (Maipo).
The assertion is in function ML_(img_valid), file
coregrind/m_debuginfo/image.c. I put a print statement in before each
of the 18 calls to determine which of the calls fails. The failure is
in readelf.c, line ~ 609
Bool get_elf_symbol_info (... )
{
   ...
   /* Now we want to know what's at that offset in the .opd
      section. We can't look in the running image since it won't
      necessarily have been mapped. But we can consult the oimage.
      opd_img is the start address of the .opd in the oimage.
      Hence: */
   ULong fn_descr[2]; /* is actually 3 words, but we need only 2 */
   VG_(printf)("CARLL, img_valid 2\n");
   if (!ML_(img_valid)(escn_opd->img, escn_opd->ioff + offset_in_opd,
                       sizeof(fn_descr))) {
      if (TRACE_SYMTAB_ENABLED) {
         HChar* sym_name = ML_(img_strdup)(escn_strtab->img,
                                           "di.gesi.6b", sym_name_ioff);
         TRACE_SYMTAB(" ignore -- invalid OPD fn_descr offset: %s\n",
                      sym_name);
         if (sym_name) ML_(dinfo_free)(sym_name);
      }
      return False;
   }
   ...
The function is called from
static
__attribute__((unused)) /* not referred to on all targets */
void read_elf_symtab__ppc64be_linux(
struct _DebugInfo* di, const HChar* tab_name,
DiSlice* escn_symtab,
DiSlice* escn_strtab,
DiSlice* escn_opd, /* ppc64be-linux only */
Bool symtab_in_debug
)
{
...
}
in the same file. An #if defined(VGP_ppc64be_linux) check selects which of
the two functions to use
# if defined(VGP_ppc64be_linux)
read_elf_symtab = read_elf_symtab__ppc64be_linux;
# else
read_elf_symtab = read_elf_symtab__normal;
# endif
in function read_elf_object, which is called from
di_notify_ACHIEVE_ACCEPT_STATE in debuginfo.c.
I believe we need to call read_elf_debug to actually load the image. I
am not seeing any calls to read_elf_debug. It is called in load_di,
addr_load_di and load_all_debuginfo. I don't see any of these
functions getting called. describe_IP calls load_di or addr_load_di;
find_DiCfSI will call load_di. Again, I don't see describe_IP or
find_DiCfSI being called.
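To make the failure mode above concrete: the assertion fires because the validity check is handed a slice whose image pointer is NULL. The following is a minimal, self-contained C sketch of that situation and of a caller that checks the slice first. All names here (ToyImage, ToySlice, toy_img_valid, toy_opd_entry_readable) are hypothetical stand-ins, not Valgrind's types or functions, and this is only an illustration of the pattern, not a proposed fix for readelf.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT Valgrind's real types. */
typedef struct {
    size_t size;            /* total number of bytes in the image */
} ToyImage;

typedef struct {
    ToyImage *img;          /* NULL if no image backs this slice */
    size_t    ioff;         /* offset of the slice inside the image */
    size_t    szB;          /* size of the slice in bytes */
} ToySlice;

/* Same shape as the failing check: a NULL image is treated as a
   programming error and aborts via assert. */
static bool toy_img_valid(const ToyImage *img, size_t offset, size_t len)
{
    assert(img != NULL);
    return offset <= img->size && len <= img->size - offset;
}

/* Defensive caller (an assumption for illustration): report a slice
   with no backing image as "not readable" instead of reaching the
   assert inside toy_img_valid. */
static bool toy_opd_entry_readable(const ToySlice *opd,
                                   size_t offset_in_opd, size_t len)
{
    if (opd == NULL || opd->img == NULL)
        return false;
    return toy_img_valid(opd->img, opd->ioff + offset_in_opd, len);
}
```

With a slice whose img is NULL, toy_opd_entry_readable returns false, whereas calling toy_img_valid directly on that slice would abort in the same way as the report above.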
----------------------------------------
So, I then tried to run the same test on a Power 8 LE system running Ubuntu
20.04.5 LTS (Focal Fossa). I get:
valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-
current
valgrind: Fatal error at startup: a function redirection
valgrind: which is mandatory for this platform-tool combination
valgrind: cannot be set up. Details of the redirection are:
valgrind:
valgrind: A must-be-redirected function
valgrind: whose name matches the pattern: strlen
valgrind: in an object with soname matching: ld64.so.2
valgrind: was not found whilst processing
valgrind: symbols from the object with soname: ld64.so.2
valgrind:
valgrind: Possible fixes: (1, short term): install glibc's debuginfo
valgrind: package on this machine. (2, longer term): ask the
packagers
valgrind: for your Linux distribution to please in future ship a non-
valgrind: stripped ld.so (or whatever the dynamic linker .so is
called)
valgrind: that exports the above-named function using the standard
valgrind: calling conventions for this platform. The package you need
valgrind: to install for fix (1) is called
valgrind:
valgrind: On Debian, Ubuntu: libc6-dbg
valgrind: On SuSE, openSuSE, Fedora, RHEL: glibc-debuginfo
valgrind:
valgrind: Note that if you are debugging a 32 bit process on a
valgrind: 64 bit system, you will need a corresponding 32 bit
debuginfo
valgrind: package (e.g. libc6-dbg:i386).
valgrind:
valgrind: Cannot continue -- exiting now. Sorry.
When I put in my print statements, I see the call to
read_elf_symtab__normal instead of read_elf_symtab__ppc64be_linux as
expected. It appears that some of the image file is read as I see a
second call to di_notify_ACHIEVE_ACCEPT_STATE, read_elf_object which I
don't see on the BE system before the run fails.
Carl
|
|
From: Carl L. <ce...@us...> - 2023-08-31 22:38:31
|
Aaron, Mark:
On Wed, 2023-08-30 at 15:48 -0700, Carl Love wrote:
> Thanks for taking a look at the issue. I tested the patch on a variety
> of machines and get mixed results. Here is what I am seeing before the
> commit to add the lazy loading, with the current Valgrind mainline
> (includes the lazy commit) and with the patch to fix the lazy load on
> Power:
>
> machine pre-lazy-load current mainline with ppc debuginfo fix
> Power 8 LE 707 tests, 708 tests, 708 tests
> 4 stderr failures, 280 stderr failures, 247 stderr failures,
> 0 stdout failures, 54 stdout failures, 54 stdout failures,
> 13 stderrB failures, 16 stderrB failures, 16 stderrB failures,
> 0 stdoutB failures, 11 stdoutB failures, 12 stdoutB failures
> 9 post failures 13 post failures 9 post failures
>
> Power 8 BE 742 tests, 743 tests, 743 tests,
> 2 stderr failures, 671 stderr failures, 671 stderr failures,
> 0 stdout failures, 152 stdout failures, 152 stdout failures,
> 0 stderrB failures, 14 stderrB failures, 14 stderrB failures,
> 2 stdoutB failures, 20 stdoutB failures, 20 stdoutB failures,
> 9 post failures 43 post failures 43 post failures
>
> Power 9 LE 711 tests, 712 tests, 712 tests,
> 4 stderr failures, 280 stderr failures, 247 stderr failures,
> 0 stdout failures, 54 stdout failures, 54 stdout failures,
> 13 stderrB failures, 16 stderrB failures, 16 stderrB failures,
> 0 stdoutB failures, 12 stdoutB failures, 12 stdoutB failures
> 9 post failures 13 post failures 9 post failures
>
> Power 10 LE 719 tests 720 tests, 720 tests,
> 2 stderr failures, 42 stderr failures, 2 stderr failures,
> 0 stdout failures, 0 stdout failures, 0 stdout failures,
> 2 stderrB failures, 2 stderrB failures, 2 stderrB failures,
> 10 stdoutB failures, 10 stdoutB failures, 10 stdoutB failures,
> 0 post failures 3 post failures 0 post failures
I was thinking about what else could cause the differences in the test
results. I was wondering if the OS distribution might be an issue.
So, I tried some different OS distributions on the same hardware.
First here is the OS distribution for the above testing.
The Power 8 BE system is Red Hat Enterprise Linux Server 7.9 (Maipo)
The Power 8 LE system is Ubuntu 20.04.5 LTS (Focal Fossa)
The Power 9 LE system is Ubuntu 20.04.5 LTS (Focal Fossa)
The Power 10 LE system Red Hat Enterprise Linux 9.0 (Plow)
I did some additional testing on Power 9 LE and Power 10 LE with
different OS distributions with the PPC fix patch applied.
Power 9 LE Red Hat Enterprise Linux 8.7 (Ootpa)
== 714 tests, 4 stderr failures, 0 stdout failures, 0 stderrB failures,
0 stdoutB failures, 9 post failures ==
Power 10 LE Ubuntu 22.04.2 LTS
== 721 tests, 303 stderr failures, 62 stdout failures, 11 stderrB
failures, 14 stdoutB failures, 0 post failures ==
So, it seems that Red Hat works well on Power 9 and Power 10. Ubuntu
doesn't work well on Power 10, Power 9 or Power 8. There seems to be
an OS issue, not a timing issue, that is causing the differences on the
various platforms that I tested.
Carl
|
|
From: Carl L. <ce...@us...> - 2023-08-31 17:40:26
|
Mark, Aaron:
So, I tried running the doublefree test by hand with the intention of
then adding some debug prints to see which routines were being called.
I am seeing the following:
valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-
current
valgrind: m_debuginfo/image.c:1106 (vgModuleLocal_img_valid):
Assertion 'img != NULL' failed.
Segmentation fault
I rolled back the git tree to the commit prior to the initial patch to
do the lazy load,
commit 6ce0979884a8f246c80a098333ceef1a7b7f694d
Author: Paul Floyd <pj...@wa...>
Date: Mon Jul 24 22:06:00 2023 +0200
Bug 472219 - Syscall param ppoll(ufds.events) points to
uninitialised byte(s
Add checks that (p)poll fd is not negative. If it is negative,
don't check
the events field.
I re-compiled, re-installed and tested again and get:
valgrind --tool=memcheck -q ./memcheck/tests/doublefree > out-current
==124807== Invalid free() / delete / delete[] / realloc()
==124807== at 0x409B680: free (vg_replace_malloc.c:974)
==124807== by 0x1000063B: main (doublefree.c:10)
==124807== Address 0x42f0040 is 0 bytes inside a block of size 177
free'd
==124807== at 0x409B680: free (vg_replace_malloc.c:974)
==124807== by 0x1000063B: main (doublefree.c:10)
==124807== Block was alloc'd at
==124807== at 0x409858C: malloc (vg_replace_malloc.c:431)
==124807== by 0x1000061B: main (doublefree.c:8)
==124807==
So it seems with the initial patch and the PPC patch we are hitting an
assertion issue. I will try and pursue a bit more.
Carl
|
|
From: Carl L. <ce...@us...> - 2023-08-31 17:07:44
|
Mark:
On Thu, 2023-08-31 at 17:43 +0200, Mark Wielaard wrote:
> Hi,
>
> On Thu, Aug 31, 2023 at 04:14:53PM +0200, Mark Wielaard wrote:
> > It also doesn't seem to work for me on a power9 f38 system. Which
> > is
> > surprising, since theoretically I think it should work. The
> > difference between ppc64le and other architectures is that all
> > other
> > architectures use VG_(use_CF_info) for unwinding, which will
> > indirectly load the debuginfo for the pc. So explicitly loading it
> > for
> > the pc in the ppc case should have worked, but it doesn't... :{
> >
> > I'll keep poking if there is some other difference with the other
> > architectures.
>
> I take that back. I didn't apply Aaron's patch correctly (I had some
> local hacks that conflicted with the second part). With a clean
> current trunk and Aaron's patch applied the results look pretty good:
>
> == 712 tests, 4 stderr failures, 0 stdout failures, 0 stderrB
> failures, 0 stdoutB failures, 0 post failures ==
> memcheck/tests/bug340392 (stderr)
> memcheck/tests/linux/debuginfod-check (stderr)
> helgrind/tests/pth_mempcpy_false_races (stderr)
> drd/tests/std_thread2 (stderr)
>
> ...checking makefile consistency
> ...checking header files and include directives
> make: *** [Makefile:1438: regtest] Error 1
>
> I think memcheck/tests/bug340392,
> helgrind/tests/pth_mempcpy_false_races and drd/tests/std_thread2 are
> known failures.
>
> memcheck/tests/linux/debuginfod-check.stderr.diff:
>
> --- debuginfod-check.stderr.exp 2023-04-27 15:25:16.209181780 +0000
> +++ debuginfod-check.stderr.out 2023-08-31 14:21:46.438006283 +0000
> @@ -2,5 +2,5 @@
> at 0x........: main (debuginfod-check.c:5)
> Address 0x........ is 1 bytes before a block of size 1 alloc'd
> at 0x........: malloc (vg_replace_malloc.c:...)
> - by 0x........: main (debuginfod-check.c:4)
> + ...
>
> so that is interesting, have to figure out why the explicit
> debuginfod
> testcase fails. But the rest does look good with Aaron's patch
> applied.
>
> Carl, can you look if the patch applied cleanly for you?
I have test directories on each machine. I did a git pull, compiled,
ran the tests, then applied the fix patch, compiled, ran the tests, then I
rolled back the git repository to the commit prior to the initial
commit, compiled and ran the tests.
I didn't see any issues when I applied the PPC fix patch.
So today, I cloned the current valgrind tree into an empty directory and
applied the PPC fix patch. The patch applied without any issues. Then I
configured, compiled and installed valgrind. I reran the tests on each
platform and got results identical to those I posted yesterday. Looking at
the variability in the results before and after the PPC fix patch just
makes me wonder if there is a timing issue given what the patch did?
Valgrind is a single threaded program as far as I know so I am puzzled
how it could be a timing issue. I have tried running the tests
multiple times on the various platforms and always get consistent
results.
I will see if I can play around with the patch some today to see if I
can find anything.
Carl
|
|
From: Mark W. <ma...@kl...> - 2023-08-31 15:43:37
|
Hi,
On Thu, Aug 31, 2023 at 04:14:53PM +0200, Mark Wielaard wrote:
> It also doesn't seem to work for me on a power9 f38 system. Which is
> surprising, since theoretically I think it should work. The
> difference between ppc64le and other architectures is that all other
> architectures use VG_(use_CF_info) for unwinding, which will
> indirectly load the debuginfo for the pc. So explicitly loading it for
> the pc in the ppc case should have worked, but it doesn't... :{
>
> I'll keep poking if there is some other difference with the other
> architectures.
I take that back. I didn't apply Aaron's patch correctly (I had some
local hacks that conflicted with the second part). With a clean
current trunk and Aaron's patch applied the results look pretty good:
== 712 tests, 4 stderr failures, 0 stdout failures, 0 stderrB failures, 0 stdoutB failures, 0 post failures ==
memcheck/tests/bug340392 (stderr)
memcheck/tests/linux/debuginfod-check (stderr)
helgrind/tests/pth_mempcpy_false_races (stderr)
drd/tests/std_thread2 (stderr)
...checking makefile consistency
...checking header files and include directives
make: *** [Makefile:1438: regtest] Error 1
I think memcheck/tests/bug340392,
helgrind/tests/pth_mempcpy_false_races and drd/tests/std_thread2 are
known failures.
memcheck/tests/linux/debuginfod-check.stderr.diff:
--- debuginfod-check.stderr.exp 2023-04-27 15:25:16.209181780 +0000
+++ debuginfod-check.stderr.out 2023-08-31 14:21:46.438006283 +0000
@@ -2,5 +2,5 @@
at 0x........: main (debuginfod-check.c:5)
Address 0x........ is 1 bytes before a block of size 1 alloc'd
at 0x........: malloc (vg_replace_malloc.c:...)
- by 0x........: main (debuginfod-check.c:4)
+ ...
so that is interesting, have to figure out why the explicit debuginfod
testcase fails. But the rest does look good with Aaron's patch
applied.
Carl, can you look if the patch applied cleanly for you?
Cheers,
Mark
|
|
From: Floyd, P. <pj...@wa...> - 2023-08-31 14:50:32
|
Hi Mark
On 27/08/2023 17:46, Mark Wielaard wrote:
> And hopefully nightly test results and/or buildbot builders.
>
>> * the long running question of what to do with macOS
> Any testers/developers/volunteers?
One main developer but intermittent progress. Occasional other contributors.
After being just about totally broken on macOS versions from 10.15 onwards,
memcheck is now just about usable for small non-GUI applications on Intel
hardware. Ongoing work on Apple's ARM architecture but I don't have an ARM mac
so I can't test that.
>> * memcheck aligned and sized checks plus maybe c23 free_sized and
>> free_sized_aligned plus Linux aligned_alloc
>>
>> * detect whether a debug version of libstdc++ is being used and then
>> use that to automatically turn on or off mismatch detection
> That is an interesting idea.
I need to do more testing to see if detecting debuginfo would be enough, or
whether we also need to know if the compiler used optimization.
>> And I expect the usual steady stream of smaller fixes.
> On irc we also discussed having a "rolling release branch" where we
> put small/essential bug fixes (some of which some distros now backport
> themselves).
I'd be interested in that. On FreeBSD I maintain two ports, one that is mostly
stable and based on official Valgrind releases and a second one that is a bit
more "rolling" (actually it doesn't roll as fast as it should, lack of time
etc.). That second one is based on a GH repo that shadows the sourceware repo.
I'm planning to switch over to use sourceware snapshots the next time I bump
that port (RSN after de-borking FreeBSD 14/15 Valgrind amd64 today).
The big advantage for me is that I don't have to maintain any icky patchsets
for the port. I can just fix things and push to sourceware and then base off
that. Also I don't have any constraints like LTS and paying customers.
Do you think it will be feasible for such a release branch to satisfy most
distro packagers' needs?
I do occasionally look to see what distros have in their patchsets - I merged
one from Debian a few days ago.
A+
Paul
|
|
From: Mark W. <ma...@kl...> - 2023-08-31 14:15:04
|
Hi Aaron, Hi Carl,
On Wed, Aug 30, 2023 at 03:48:20PM -0700, Carl Love via Valgrind-developers wrote:
> On Wed, 2023-08-30 at 15:09 -0400, Aaron Merey wrote:
> > Sorry for the delay. I'm currently away for the next couple
> > weeks, however I was able to take a look at these regressions.
Thanks. But don't feel you have to come back early just for this
technical issue. We might not be as quick as you, but we should be
able to figure it out :)
> > It looks like debuginfo is not always lazily loaded on ppc64le
> > since it's possible for neither describe_IP or find_DiCfSI to be
> > called before symtab lookups during stacktrace. describe_IP and
> > find_DiCfSI contain calls to lazily load debuginfo, so if they are
> > not called before stacktrace printing it results in missing
> > debuginfo and lower quality stacktraces.
> >
> > I've attached a patch that fixed the regressions for me when I
> > tested this on a ppc64le machine. It adds lazy debuginfo loading
> > during ppc get_StackTrace_wrk.
>
> > Thanks for taking a look at the issue. I tested the patch on a variety
> of machines and get mixed results. Here is what I am seeing before the
> commit to add the lazy loading, with the current Valgrind mainline
> (includes the lazy commit) and with the patch to fix the lazy load on
> Power: [...]
It also doesn't seem to work for me on a power9 f38 system. Which is
surprising, since theoretically I think it should work. The
difference between ppc64le and other architectures is that all other
architectures use VG_(use_CF_info) for unwinding, which will
indirectly load the debuginfo for the pc. So explicitly loading it for
the pc in the ppc case should have worked, but it doesn't... :{
I'll keep poking if there is some other difference with the other
architectures.
Cheers,
Mark
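The difference described above, where one unwinding path triggers the lazy debuginfo load as a side effect and the other never touches debuginfo, can be sketched with a small toy model in C. Everything in it (ToyDebugInfo, toy_load_di, the three toy_unwind_* functions, toy_has_line_info) is a hypothetical name chosen for illustration; it does not reproduce Valgrind's data structures, VG_(use_CF_info), or Aaron's patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; none of these names are Valgrind's. */
typedef struct {
    bool full_debuginfo_loaded;  /* has the deferred load happened yet? */
    int  load_count;             /* how many times the real load ran */
} ToyDebugInfo;

/* Idempotent lazy load: the expensive work happens at most once. */
static void toy_load_di(ToyDebugInfo *di)
{
    assert(di != NULL);
    if (!di->full_debuginfo_loaded) {
        di->full_debuginfo_loaded = true;
        di->load_count++;
    }
}

/* Unwinder that consults unwind info: the lookup loads debuginfo
   as a side effect. */
static void toy_unwind_with_cfi(ToyDebugInfo *di)
{
    toy_load_di(di);
}

/* Unwinder that follows the stack back chain directly: it never
   consults debuginfo, so nothing gets loaded. */
static void toy_unwind_backchain(ToyDebugInfo *di)
{
    (void)di;
}

/* Back-chain unwinder with an explicit load added up front. */
static void toy_unwind_backchain_explicit(ToyDebugInfo *di)
{
    toy_load_di(di);
    toy_unwind_backchain(di);
}

/* File/line information is only available once the load has run. */
static bool toy_has_line_info(const ToyDebugInfo *di)
{
    return di->full_debuginfo_loaded;
}
```

In this model, a stack trace printed after toy_unwind_backchain has no line information, while the other two paths do, and repeated calls never load more than once.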
|
|
From: Paul F. <pa...@so...> - 2023-08-31 11:44:46
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=c934430d56c2add25002ea8e321bd8bdab80fc99
commit c934430d56c2add25002ea8e321bd8bdab80fc99
Author: Paul Floyd <pj...@wa...>
Date: Thu Aug 31 15:32:21 2023 +0200
Bug 473870 - FreeBSD 14 applications fail early at startup
FreeBSD recently started adding some functions using @gnu_indirect_function,
specifically strpcmp which was causing this crash.
When running and encountering this ifunc Valgrind looked for the
ifunc_handler. But there wasn't one for FreeBSD so Valgrind asserted.
Diff:
---
NEWS | 1 +
coregrind/vg_preloaded.c | 22 +++++++++++++++++++++-
2 files changed, 22 insertions(+), 1 deletion(-)
diff --git a/NEWS b/NEWS
index 96eb06af8a..519ed664c1 100644
--- a/NEWS
+++ b/NEWS
@@ -52,6 +52,7 @@ are not entered into bugzilla tend to get forgotten about or ignored.
 472963 Broken regular expression in configure.ac
 473604 Fix bug472219.c compile failure with Clang 16
 473677 make check compile failure with Clang 16 based on GCC 13.x
+473870 FreeBSD 14 applications fail early at startup
 n-i-bz Allow arguments with spaces in .valgrindrc files
 To see details of a given bug, visit
diff --git a/coregrind/vg_preloaded.c b/coregrind/vg_preloaded.c
index a792081b11..1c727966f3 100644
--- a/coregrind/vg_preloaded.c
+++ b/coregrind/vg_preloaded.c
@@ -238,7 +238,27 @@ void VG_REPLACE_FUNCTION_ZU(libSystemZdZaZddylib, arc4random_addrandom)(unsigned
 #elif defined(VGO_freebsd)
-// nothing specific currently
+#if (FREEBSD_VERS >= FREEBSD_14)
+
+void * VG_NOTIFY_ON_LOAD(ifunc_wrapper) (void);
+void * VG_NOTIFY_ON_LOAD(ifunc_wrapper) (void)
+{
+   OrigFn fn;
+   Addr result = 0;
+   Addr fnentry;
+
+   /* Call the original indirect function and get it's result */
+   VALGRIND_GET_ORIG_FN(fn);
+   CALL_FN_W_v(result, fn);
+
+   fnentry = result;
+
+   VALGRIND_DO_CLIENT_REQUEST_STMT(VG_USERREQ__ADD_IFUNC_TARGET,
+                                   fn.nraddr, fnentry, 0, 0, 0);
+   return (void*)result;
+}
+
+#endif
 #elif defined(VGO_solaris)
|
|
From: Carl L. <ce...@us...> - 2023-08-30 22:48:42
|
Aaron:
On Wed, 2023-08-30 at 15:09 -0400, Aaron Merey wrote:
> Hi Carl,
>
> Sorry for the delay. I'm currently away for the next couple weeks,
> however
> I was able to take a look at these regressions.
>
> It looks like debuginfo is not always lazily loaded on ppc64le since
> it's
> possible for neither describe_IP or find_DiCfSI to be called before
> symtab
> lookups during stacktrace. describe_IP and find_DiCfSI contain calls
> to lazily load debuginfo, so if they are not called before stacktrace
> printing
> it results in missing debuginfo and lower quality stacktraces.
>
> I've attached a patch that fixed the regressions for me when I tested
> this on
> a ppc64le machine. It adds lazy debuginfo loading during ppc
> get_StackTrace_wrk.
>
Thanks for taking a look at the issue. I tested the patch on a variety
of machines and get mixed results. Here is what I am seeing before the
commit to add the lazy loading, with the current Valgrind mainline
(includes the lazy commit) and with the patch to fix the lazy load on
Power:
machine       pre-lazy-load          current mainline       with ppc debuginfo fix
Power 8 LE    707 tests,             708 tests,             708 tests
              4 stderr failures,     280 stderr failures,   247 stderr failures,
              0 stdout failures,     54 stdout failures,    54 stdout failures,
              13 stderrB failures,   16 stderrB failures,   16 stderrB failures,
              0 stdoutB failures,    11 stdoutB failures,   12 stdoutB failures
              9 post failures        13 post failures       9 post failures
Power 8 BE    742 tests,             743 tests,             743 tests,
              2 stderr failures,     671 stderr failures,   671 stderr failures,
              0 stdout failures,     152 stdout failures,   152 stdout failures,
              0 stderrB failures,    14 stderrB failures,   14 stderrB failures,
              2 stdoutB failures,    20 stdoutB failures,   20 stdoutB failures,
              9 post failures        43 post failures       43 post failures
Power 9 LE    711 tests,             712 tests,             712 tests,
              4 stderr failures,     280 stderr failures,   247 stderr failures,
              0 stdout failures,     54 stdout failures,    54 stdout failures,
              13 stderrB failures,   16 stderrB failures,   16 stderrB failures,
              0 stdoutB failures,    12 stdoutB failures,   12 stdoutB failures
              9 post failures        13 post failures       9 post failures
Power 10 LE   719 tests              720 tests,             720 tests,
              2 stderr failures,     42 stderr failures,    2 stderr failures,
              0 stdout failures,     0 stdout failures,     0 stdout failures,
              2 stderrB failures,    2 stderrB failures,    2 stderrB failures,
              10 stdoutB failures,   10 stdoutB failures,   10 stdoutB failures,
              0 post failures        3 post failures        0 post failures
So the patch has mixed results in fixing the issue. It feels like
there is still a timing issue to me. Perhaps there needs to be a check
to see if the lazy load has completed before the use? Just throwing
out ideas here.
Anyway, it sounds like you are out of the office for a while. I am fine
with waiting until you are back to work on this some more. No need to
mess up your time off. I don't think there is a release coming soon so
I think we have some time to get this fixed up.
Thanks for the help with the initial patch fix.
Carl
|
|
From: Aaron M. <am...@re...> - 2023-08-30 19:10:08
|
Hi Carl,
Sorry for the delay. I'm currently away for the next couple weeks, however
I was able to take a look at these regressions.
It looks like debuginfo is not always lazily loaded on ppc64le since it's
possible for neither describe_IP or find_DiCfSI to be called before symtab
lookups during stacktrace. describe_IP and find_DiCfSI contain calls
to lazily load debuginfo, so if they are not called before stacktrace printing
it results in missing debuginfo and lower quality stacktraces.
I've attached a patch that fixed the regressions for me when I tested this on
a ppc64le machine. It adds lazy debuginfo loading during ppc
get_StackTrace_wrk.
Aaron
On Tue, Aug 29, 2023 at 3:15 PM Carl Love <ce...@us...> wrote:
>
> Mark, Paul, Aaron:
>
> On Sun, 2023-08-27 at 17:36 +0200, Mark Wielaard wrote:
> > Hi Paul,
> >
> > On Sat, Aug 26, 2023 at 06:26:29AM +0200, Paul Floyd wrote:
> > > I was just looking at valgrind-testresults and there was a jump in
> > > the number of failures on ppc64le on Aug 17th, just after the
> > > deferred debuginfo reading change.
> > >
> > > One random example
> > >
> > > =================================================
> > > ./valgrind-old/drd/tests/tc16_byterace.stderr.diff
> > > =================================================
> > > --- tc16_byterace.stderr.exp 2023-08-17 03:01:09.168107928 +0000
> > > +++ tc16_byterace.stderr.out 2023-08-17 03:28:20.030515805 +0000
> > > @@ -1,8 +1,7 @@
> > >
> > > Conflicting load by thread 1 at 0x........ size 1
> > > at 0x........: main (tc16_byterace.c:34)
> > > -Location 0x........ is 0 bytes inside bytes[4],
> > > -a global variable declared at tc16_byterace.c:7
> > > +Allocation context: BSS section of tc16_byterace
> > >
> > > It does look to me like this is a debuginfo issue.
> >
> > You are correct, this was caused by:
> >
> > commit 60f7e89ba32b54d73b9e36d49e28d0f559ade0b9
> > Author: Aaron Merey <am...@re...>
> > Date: Fri Jun 30 18:31:42 2023 -0400
> >
> > Support lazy reading and downloading of DWARF debuginfo
> >
> > That commit shouldn't have been architecture specific, but it
> > apparently was. I put some early analysis into the bug
> >
> > https://bugs.kde.org/show_bug.cgi?id=471807#c16
> >
> > The patch depends on a call to find_DiCfSI triggering a full
> > debuginfo load.
> > find_DiCfSI is (indirectly called) when ML_(get_CFA) is called.
> > It looks like ppc64le doesn't call ML_(get_CFA) because we have the
> > following in
> > coregrind/m_debuginfo/d3basics.c
> >
> > #if defined(VGP_ppc32_linux) || defined(VGP_ppc64be_linux) \
> > || defined(VGP_ppc64le_linux)
> > /* Valgrind on ppc32/ppc64 currently doesn't use unwind
> > info. */
> > uw1 = ML_(read_Addr)((UChar*)regs->sp);
> > #else
> > uw1 = ML_(get_CFA)(regs->ip, regs->sp, regs->fp, 0,
> > ~(UWord) 0);
> > #endif
>
> I verified that the patch from Aaron causes regression failures on
> Power 9 and Power 10. Per the comment above, not sure why PowerPC does
> not support the get_CFA call? Unfortunately, I don't know much about
> callgrind or the debuginfo stuff. Not obvious to me at first glance
> how to fix the issue.
>
> I would be happy to help test a patch or work on a patch if someone has
> specific suggestions on how to fix the issue on PowerPC.
>
> Carl
>
|
|
From: Mark W. <ma...@kl...> - 2023-08-30 08:16:21
|
Sourceware infrastructure community updates for Q3 2023
- Sourceware 25 Roadmap
- git source code integrity
- inbox.sourceware.org vs HTML email
- Continuous glibc src and manual snapshots
- Conservancy BBB server for Sourceware projects
- Working on individual tech sovereignty together
- Sourceware Overseers Open Office hour
= Sourceware 25 Roadmap
Preparing Sourceware for the next 25 years.
In the last couple of years we have started to diversify our hardware
partners, setup new services using containers and isolated VMs,
investigated secure supply chain issues, added redundant mirrors,
created a non-profit home, collected funds, invested in open
communication, open office hours and introduced community oversight by
a Sourceware Project Leadership Committee with the help from the
Software Freedom Conservancy.
https://sourceware.org/sourceware-25-roadmap.html
= git source code integrity
gitsigur for protecting git repo integrity. With comparisons,
developer workflow examples and composition possibilities for
gitsigur, b4 and sigstore.
https://inbox.sourceware.org/ZJ3Tihvu6GbOb8%2F...@el.../
Sourceware now also allows signed git pushes (in addition to signed
git commits).
= inbox.sourceware.org vs HTML email
HTML email. Most Sourceware projects allow it, if there is at least a
text/plain alternative. But public-inbox is not so forgiving, it only
allows plain-text emails, HTML is rejected by default. So the
https://inbox.sourceware.org archive was incomplete.
We now have a filter that removes redundant HTML parts before storing
in public-inbox. And we re-imported missing emails to make the archive
complete.
But please don't send HTML email. It will make DKIM verification of
your email impossible.
= Continuous glibc src and manual snapshots
glibc is the latest Sourceware project that provides continuous
snapshots from current git with both source archives and manuals.
https://snapshots.sourceware.org
This helps to make sure the release process always works and that
manuals can be produced in various formats.
Thanks to OSUOSL for hosting the snapshots server.
= Conservancy BBB server for Sourceware projects
The Software Freedom Conservancy is extending the use of their Big
Blue Button instance https://bbb.sfconservancy.org/ to Sourceware
projects that want to host video meetings.
https://sfconservancy.org/news/2023/aug/15/exit-zoom/
Please contact ove...@so... for instructions on how to create an
account for your project.
Note: Anyone is able to join a meeting, accounts are only required to
create new meetings.
= Working on individual tech sovereignty together
Valgrind was picked for a FUTO https://futo.org Microgrant, which has
been donated to Sourceware through the Software Freedom Conservancy
for maintaining and expanding the infrastructure for Valgrind and
other core toolchain and developer tool projects.
https://www.youtube.com/watch?v=aYzzOfQehPg
If you want to donate to Sourceware please see
https://sfconservancy.org/donate and become a Conservancy Sustainer or
give directly by mentioning Sourceware as comment or on the memo line.
= Sourceware Overseers Open Office hour
Every second Friday of the month is the Sourceware Overseers Open
Office hour in #overseers on irc.libera.chat from 18:00 till 19:00
UTC. The next one will be Friday September 4th.
Please feel free to drop by with any Sourceware services and hosting
questions. Feedback and questions about the Sourceware 25 Roadmap are
also very appreciated.
https://sourceware.org/sourceware-25-roadmap.html
Of course you are welcome to drop into the #overseers channel at any
time and we can also be reached through email and bugzilla:
https://sourceware.org/mission.html#organization
If you aren't already and want to keep up to date on Sourceware
infrastructure services then please also subscribe to the overseers
mailinglist. https://sourceware.org/mailman/listinfo/overseers
We are also on the fediverse these days:
https://fosstodon.org/@sourceware
The Sourceware Project Leadership Committee also meets once a month to
discuss all community input. The committee will set priorities and
decide how to spend any funds, negotiate with hardware and service
partners, create budgets together with the Conservancy, or decide when
a new fundraising campaign is needed.
Up till now we have been able to add new services without needing to
use any of the collected funds. Our hardware partners have also been
very generous with providing extra servers when requested.
The current committee includes Frank Ch. Eigler, Christopher Faylor,
Ian Kelling, Ian Lance Taylor, Tom Tromey, Jon Turney, Mark J.
Wielaard and Elena Zannoni.
|
|
From: Paul F. <pa...@so...> - 2023-08-30 06:06:32
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=acf2c99ec39f3301e51152b1e78156616e8d6129
commit acf2c99ec39f3301e51152b1e78156616e8d6129
Author: Paul Floyd <pj...@wa...>
Date: Wed Aug 30 08:06:00 2023 +0200
FreeBSD: a bit of cleanup of README.freebsd
Diff:
---
README.freebsd | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/README.freebsd b/README.freebsd
index d197efcaf3..eb6a510ada 100644
--- a/README.freebsd
+++ b/README.freebsd
@@ -14,7 +14,7 @@ cd /usr/ports/devel/valgrind && make install clean
 Building Valgrind
 ~~~~~~~~~~~~~~~~~
-Install ports for autoconf, automake, libtool and gmake.
+Install ports for autotools, gmake and python.
 $ sh autogen.sh
 $ ./configure --prefix=/where/ever
@@ -137,13 +137,13 @@ the following packages
 gdb
 gsed
-In addition to running "make" you will need to run
-"make check" to build the regression test exectutables
-and "make regtest". Again, more details can be seen in
+In addition to running "gmake" you will need to run
+"gmake check" to build the regression test exectutables
+and "gmake regtest". Again, more details can be seen in
 README_DEVELOPERS.
 If you want to run the 'nightly' script (see nightly/README.txt)
-you will need to install coreutils and modify the
+you will need to install gcp and coreutils and modify the
 nightly/conf/freebsd.* files. The default configuration
 sends an e-mail to the valgrind-testresults mailing list.
|
|
From: Carl L. <ce...@us...> - 2023-08-29 19:15:52
|
Mark, Paul, Aaron:
On Sun, 2023-08-27 at 17:36 +0200, Mark Wielaard wrote:
> Hi Paul,
>
> On Sat, Aug 26, 2023 at 06:26:29AM +0200, Paul Floyd wrote:
> > I was just looking at valgrind-testresults and there was a jump in
> > the number of failures on ppc64le on Aug 17th, just after the
> > deferred debuginfo reading change.
> >
> > One random example
> >
> > =================================================
> > ./valgrind-old/drd/tests/tc16_byterace.stderr.diff
> > =================================================
> > --- tc16_byterace.stderr.exp 2023-08-17 03:01:09.168107928 +0000
> > +++ tc16_byterace.stderr.out 2023-08-17 03:28:20.030515805 +0000
> > @@ -1,8 +1,7 @@
> >
> > Conflicting load by thread 1 at 0x........ size 1
> > at 0x........: main (tc16_byterace.c:34)
> > -Location 0x........ is 0 bytes inside bytes[4],
> > -a global variable declared at tc16_byterace.c:7
> > +Allocation context: BSS section of tc16_byterace
> >
> > It does look to me like this is a debuginfo issue.
>
> You are correct, this was caused by:
>
> commit 60f7e89ba32b54d73b9e36d49e28d0f559ade0b9
> Author: Aaron Merey <am...@re...>
> Date: Fri Jun 30 18:31:42 2023 -0400
>
> Support lazy reading and downloading of DWARF debuginfo
>
> That commit shouldn't have been architecture specific, but it
> apparently was. I put some early analysis into the bug
>
> https://bugs.kde.org/show_bug.cgi?id=471807#c16
>
> The patch depends on a call to find_DiCfSI triggering a full
> debuginfo load.
> find_DiCfSI is (indirectly called) when ML_(get_CFA) is called.
> It looks like ppc64le doesn't call ML_(get_CFA) because we have the
> following in
> coregrind/m_debuginfo/d3basics.c
>
> #if defined(VGP_ppc32_linux) || defined(VGP_ppc64be_linux) \
> || defined(VGP_ppc64le_linux)
> /* Valgrind on ppc32/ppc64 currently doesn't use unwind
> info. */
> uw1 = ML_(read_Addr)((UChar*)regs->sp);
> #else
> uw1 = ML_(get_CFA)(regs->ip, regs->sp, regs->fp, 0,
> ~(UWord) 0);
> #endif
I verified that the patch from Aaron causes regression failures on
Power 9 and Power 10. Per the comment above, not sure why PowerPC does
not support the get_CFA call? Unfortunately, I don't know much about
callgrind or the debuginfo stuff. Not obvious to me at first glance
how to fix the issue.
I would be happy to help test a patch or work on a patch if someone has
specific suggestions on how to fix the issue on PowerPC.
Carl
|
|
From: Jojo R <rj...@gm...> - 2023-08-29 07:47:14
|
Hi,

We are glad to open source our RVV implementation here again:

https://github.com/rjiejie/valgrind-riscv64

Four kinds of extra ISA extensions were added in this repo:

RV64Zfh        : Half-precision floating-point
RV64Xthead [1] : T-HEAD vendor extension for RV64G
RV64V0p7 [2]   : Vector 0.7.1
RV64V [3]      : Vector 1.0

[1] https://github.com/T-head-Semi/thead-extension-spec
[2] https://github.com/riscv/riscv-v-spec/releases/tag/0.7.1
[3] https://github.com/riscv/riscv-v-spec/releases/tag/v1.0

Regards

--Jojo

On 2023/7/17 15:05, Jojo R wrote:
>
> Hi,
>
> Sorry for the late reply,
>
> I have been pushing the progress of the valgrind RVV implementation 😄
> We finished the first version and tested it with the full RVV intrinsics spec.
>
> For real projects and developers, we implemented the first usable,
> fully functional RVV valgrind with the dirty call method,
> and we will experiment with or optimize the RVV implementation towards
> an ideal RVV design.
>
> Back to the RVV RFC, we are happy to share our design thinking, see
> attachment for more details :)
>
> Regards
>
> --Jojo
>
> On 2023/4/21 17:25, Jojo R wrote:
>>
>> Hi,
>>
>> We are considering adding the RVV/Vector [1] feature to valgrind, and
>> there are some challenges.
>> RVV is like ARM's SVE [2] programming model: it is scalable/VLA, which
>> means the vector length is agnostic.
>> ARM's SVE is not supported in valgrind :(
>>
>> There are three major issues in implementing the RVV instruction set in
>> Valgrind, as follows:
>>
>> 1. Scalable vector register width VLENB
>> 2. Runtime changing property of LMUL and SEW
>> 3. Lack of proper VEX IR to represent all vector operations
>>
>> We propose applicable methods to solve 1 and 2. As for 3, we explore
>> several possible but maybe imperfect approaches to handle different
>> cases.
>>
>> We start from 1. As each guest register should be described in the
>> VEXGuestState struct, the vector registers with scalable width of
>> VLENB can be added into VEXGuestState as arrays using an allowable
>> maximum length like 2048/4096.
>>
>> The actual available access range can be determined at Valgrind
>> startup time by querying the CPU for its vector capability or some
>> suitable setup steps.
>>
>>
>> To solve problem 2, we are inspired by already-proven techniques in
>> QEMU, where translation blocks are broken up when certain critical
>> CSRs are set. Because the guest code to IR translation relies on the
>> precise value of LMUL/SEW and they may change within a basic block,
>> we can break up the basic block each time we encounter a vsetvl{i}
>> instruction and return to the scheduler to execute the translated
>> code and update LMUL/SEW. Accordingly, translation cache management
>> should be refactored to detect changes of LMUL/SEW and invalidate
>> the outdated code cache. Without loss of generality, the LMUL/SEW
>> should be encoded into a ULong flag such that other architectures
>> can leverage this flag to store their arch-dependent information. The
>> TTentry struct should also take the flag into account for both
>> insertion and deletion. By doing this, the flag carries the newest
>> LMUL/SEW throughout the simulation and can be passed to the disassembly
>> functions using the VEXArchInfo struct such that we can get the real
>> and newest value of LMUL and SEW to facilitate our translation.
>>
>> Also, some architecture-related code should be taken care of. For
>> example, in the m_dispatch part, the disp_cp_xindir function looks up
>> the code cache using hardcoded assembly by checking the requested
>> guest state IP and translation cache entry address with no more
>> constraints. Many other modules should be checked to ensure that an
>> update of LMUL/SEW is instantly visible to the essential parts of
>> Valgrind.
>>
>>
>> The last remaining big issue is 3, for which we introduce some ad-hoc
>> approaches. We summarize these approaches into three
>> types as follows:
>>
>> 1. Break down a vector instruction to scalar VEX IR ops.
>> 2. Break down a vector instruction to fixed-length VEX IR ops.
>> 3. Use dirty helpers to realize vector instructions.
>>
>> The very first method theoretically exists but is probably not
>> applicable as the number of IR ops explodes when a large VLENB is
>> adopted. Imagine a configuration of VLENB=512, SEW=8, LMUL=8: the VL
>> is 512 * 8 / 8 = 512, meaning that a single vector instruction turns
>> into 512 scalar instructions and each scalar instruction would be
>> expanded to multiple IRs. To make things worse, the tool
>> instrumentation will insert more IRs between adjacent scalar IR ops.
>> As a result, a real-world application with lots of vector
>> instructions is likely to be slowed down thousands of times.
>> Therefore, the other two methods are more promising and
>> we will discuss them below.
>>
>> 2 and 3 are not mutually exclusive as we may choose a suitable method
>> from them to implement a vector instruction regarding its concrete
>> behavior. To explain these methods in detail, we present some
>> instances to illustrate their pros and cons.
>>
>> In terms of method 2, we have real values of VLENB/LMUL/SEW. The
>> simple case is VLENB <= 256 and LMUL=1, where many SIMD IR ops are
>> available and can be directly applied to represent vector operations.
>> However, even when VLENB is restricted to 128, it still exceeds the
>> maximum SIMD width of 256 supported by VEX IR if LMUL>2. Hence, here
>> are two variants of method 2 to deal with long vectors:
>>
>>
>> *2.1* Add more SIMD IR ops such as 1024/2048/4096, and translate
>> vector instructions in the granularity of VLENB. Accordingly,
>> VLENB=4096 with LMUL=2 is fulfilled by two 4096 SIMD VEX IR ops.
>>
>> * *pros*: it encourages the VEX backend to generate more compact and
>>   efficient SIMD code (maybe). In particular, it accommodates mask and
>>   gather/scatter (indexed) instructions by delivering more
>>   information in the IR itself.
>> * *cons*: too many new IR ops need to be introduced in VEX as each
>>   op of different length should implement its add/sub/mul variants.
>>   New data types to denote long vectors are necessary too, causing
>>   difficulties in both VEX backend register allocation and tool
>>   instrumentation.
>>
>> *2.2* Break down long vectors to multiple repeated SIMD ops. For
>> instance, a vadd.vv vector instruction with VLENB=256/LMUL=2/SEW=8 is
>> composed of four operators of Iop_Add8x16 type.
>>
>> * *pros:* less effort is required in register allocation and tool
>>   instrumentation. The VEX frontend is able to notify the backend
>>   to generate efficient vector instructions by existing Iops. It
>>   better trades off the complexity of adding many long vector IR
>>   ops and the benefit of generating high-efficiency host code.
>> * *cons:* it is hard to describe a mask operation given that the
>>   mask is pretty flexible (the least significant bit of each
>>   segment of v0). Additionally, gather/scatter instructions may
>>   have similar problems in appropriately dividing index registers.
>>   There are various corner cases left here such as widening
>>   arithmetic operations (widening SIMD IR ops are currently not
>>   compatible) and the vstart CSR register. When using fixed-length IR
>>   ops to comprise a vector instruction, we inevitably have to tell
>>   each IR op the position, encoded in vstart, from which it can start
>>   to process the data. We can use vstart as a normal guest state
>>   virtual register to calculate each op's start position as a guard
>>   IRExpr or obtain the value of vstart like what we do for LMUL/SEW.
>>   Nevertheless, it is non-trivial to decompose a vector instruction
>>   concisely.
>>
>> In short, both 2.1 and 2.2 confront a dilemma in reducing the engineering
>> effort of refactoring Valgrind elegantly as well as implementing the
>> vector instruction set efficiently. The same obstacles exist in ARM SVE
>> as they are scalable vector instructions and flexible in many ways.
>>
>> The final solution is the dirty helper. It is undoubtedly practical
>> and requires possibly the least engineering effort in dealing with
>> so many details in Valgrind. In this design, each instruction is
>> completed using inline assembly running the same instruction on
>> the host. Moreover, tool instrumentation already handles IRDirty
>> except that new fields should be added in the _IRDirty struct to indicate
>> strided/indexed/masked memory accesses and arithmetic operations.
>>
>> * *pros:* it supports all instructions without bothering to build
>>   complicated IR expressions and statements. It executes vector
>>   instructions using the host CPU to get acceleration to some extent.
>>   Besides, we do not need to add a VEX backend to translate new IRs
>>   to vector instructions.
>> * *cons:* the dirty helper always keeps its operations in a black
>>   box such that tools can never see what happens in a dirty helper.
>>   For memcheck, for example, the bit-precision merit is lost once it
>>   meets a dirty helper as the V-bit propagation chain adopts a pretty
>>   coarse determination strategy. On the other hand, it is also not
>>   an elegant way to implement the entire ISA extension in dirty
>>   helpers.
>>
>> In summary, we are far from reaching a truly applicable solution for adding
>> vector extensions in Valgrind. We need to do detailed and
>> comprehensive estimations on different vector instruction categories.
>>
>> Any feedback is welcome in github [3] also.
>>
>>
>> [1] https://github.com/riscv/riscv-v-spec
>>
>> [2]
>> https://community.arm.com/arm-research/b/articles/posts/the-arm-scalable-vector-extension-sve
>>
>> [3] https://github.com/petrpavlu/valgrind-riscv64/issues/17
>>
>>
>> Thanks.
>>
>> Jojo
>>
>>
>>
>> _______________________________________________
>> Valgrind-developers mailing list
>> Val...@li...
>> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
>
>
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers |
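The vector-length arithmetic in the RFC quoted above can be checked with a small C sketch. It assumes, as the RFC's own numbers imply, that "VLENB" there denotes the register width in bits and that LMUL is a whole number (fractional LMUL is not modelled). The helper names `vlmax` and `simd_ops_needed` are hypothetical and exist only for this illustration; they are not Valgrind or VEX functions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of elements one vector instruction
   processes, VLMAX = VLEN * LMUL / SEW.
   vlen_bits: width of one vector register in bits (the RFC's "VLENB")
   sew_bits:  selected element width in bits
   lmul:      whole-number register grouping factor */
static uint32_t vlmax(uint32_t vlen_bits, uint32_t sew_bits, uint32_t lmul)
{
   assert(sew_bits != 0 && (vlen_bits * lmul) % sew_bits == 0);
   return vlen_bits * lmul / sew_bits;
}

/* Hypothetical helper: number of fixed-width SIMD IR ops needed to
   cover one register group, rounding up. simd_bits is the width of
   the chosen IR op, e.g. 128 for a V128 op such as Iop_Add8x16. */
static uint32_t simd_ops_needed(uint32_t vlen_bits, uint32_t lmul,
                                uint32_t simd_bits)
{
   assert(simd_bits != 0);
   return (vlen_bits * lmul + simd_bits - 1) / simd_bits;
}
```

With the RFC's figures, `vlmax(512, 8, 8)` yields the 512 scalar operations of method 1, and `simd_ops_needed(256, 2, 128)` yields the four 128-bit operations of variant 2.2.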
|
From: Paul F. <pa...@so...> - 2023-08-29 05:56:47
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=69ac8e03a2e22175337acf999f8c8814143ba08a

commit 69ac8e03a2e22175337acf999f8c8814143ba08a
Author: Paul Floyd <pj...@wa...>
Date: Tue Aug 29 07:55:31 2023 +0200

FreeBSD: add memcheck suppression for libc newlocale still reachable memory

Diff:
---
 freebsd.supp | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/freebsd.supp b/freebsd.supp
index 10d4a10454..471fbe6fd3 100644
--- a/freebsd.supp
+++ b/freebsd.supp
@@ -43,12 +43,21 @@
    fun:posix_fallocate
 }
 {
-   MEMCHECK-LIBX-REACHABLE-2
+   MEMCHECK-LIBC-REACHABLE-2
    Memcheck:Leak
    match-leak-kinds: reachable
    fun:malloc
-   obj:/lib/libc.so.7
-   obj:/lib/libc.so.7
-   obj:/lib/libc.so.7
+   obj:*/lib*/libc.so.7
+   obj:*/lib*/libc.so.7
+   obj:*/lib*/libc.so.7
    fun:fwrite
 }
+# when calling std::locale::facet::_S_create_c_locale
+{
+   MEMCHECK-LIBC-REACHABLE-3
+   Memcheck:Leak
+   match-leak-kinds: reachable
+   fun:calloc
+   ...
+   fun:newlocale
+}
|
|
From: Paul F. <pa...@so...> - 2023-08-27 16:03:37
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=5b43a08b64d2038e384bdcc229dfac3b826ae32a

commit 5b43a08b64d2038e384bdcc229dfac3b826ae32a
Author: Paul Floyd <pj...@wa...>
Date: Sun Aug 27 18:03:07 2023 +0200

Linux: remove a couple of cpmpiler warnings

Diff:
---
 massif/tests/Makefile.am | 2 +-
 memcheck/tests/realloc_size_zero.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/massif/tests/Makefile.am b/massif/tests/Makefile.am
index cc79beceb4..f8deeb5766 100644
--- a/massif/tests/Makefile.am
+++ b/massif/tests/Makefile.am
@@ -91,7 +91,7 @@ AM_CXXFLAGS += $(AM_FLAG_M3264_PRI)
 bug469146_SOURCES = bug469146.cpp
 # -fno-optimize-sibling-calls because otherwise some platforms will have
 # tail call optimization which messes up --ignore-fn
-bug469146_CXXFLAGS = $(AM_CXXFLAGS) -O2 -fno-optimize-sibling-calls
+bug469146_CXXFLAGS = $(AM_CXXFLAGS) -O2 -fno-optimize-sibling-calls @FLAG_W_NO_USE_AFTER_FREE@
 new_cpp_SOURCES = new-cpp.cpp
 overloaded_new_SOURCES = overloaded-new.cpp
 # pre C++11 compilers don't have exception specs
diff --git a/memcheck/tests/realloc_size_zero.c b/memcheck/tests/realloc_size_zero.c
index c9d8e74777..3e25d85a61 100644
--- a/memcheck/tests/realloc_size_zero.c
+++ b/memcheck/tests/realloc_size_zero.c
@@ -22,7 +22,7 @@ int main(void)
    }

    errno = 0;
-   volatile void *ptr = NULL;
+   void *ptr = NULL;
    volatile size_t size = 0U;
    char *p2 = realloc(ptr, size);
    if (p2) {
|
|
From: Mark W. <ma...@kl...> - 2023-08-27 15:47:00
|
Hi Paul,

On Tue, Aug 22, 2023 at 03:58:26PM +0200, Floyd, Paul wrote:
> On 22/08/2023 15:16, Mark Wielaard wrote:
> >There are a couple of larger features that could use more people to
> >take a look:
> >
> >- AVX-512 support, incomplete and we lost contact with the original
> >  developer: https://bugs.kde.org/show_bug.cgi?id=383010
> >- RISC-V port, seems to have dedicated developers:
> >  https://bugs.kde.org/show_bug.cgi?id=468575
> >  There also is active development to extend it with
> >  Vector Register support
> >  https://sourceforge.net/p/valgrind/mailman/valgrind-developers/thread/20230526135944.1959407-5-fei2.wu%40intel.com/
> >- Loongarch64 port, seems pretty complete, split out in 40 commits:
> >  https://bugs.kde.org/show_bug.cgi?id=457504
> >
> >Unfortunately I cannot promise to have time before October to look at
> >all of these. So if others could take a look and report on status that
> >would be great.
>
> Do we have any contacts at Intel (or AMD) for help with AVX512?

I don't have any, but I'll ask around.

> My wishlist for the October release of 3.22 includes all of the above plus
>
> * get at least one dev each for RISC-V and Loongson on board with
>   sourceware git write access in order to be able to support the
>   platforms directly

And hopefully nightly test results and/or buildbot builders.

> * the long running question of what to do with macOS

Any testers/developers/volunteers?

> * memcheck aligned and sized checks plus maybe C23 free_sized and
>   free_aligned_sized plus Linux aligned_alloc
>
> * detect whether a debug version of libstdc++ is being used and then
>   use that to automatically turn on or off mismatch detection

That is an interesting idea.

> And I expect the usual steady stream of smaller fixes.

On irc we also discussed having a "rolling release branch" where we
put small/essential bug fixes (some of which some distros now backport
themselves).

Cheers,

Mark
|
|
From: Mark W. <ma...@kl...> - 2023-08-27 15:36:24
|
Hi Paul,
On Sat, Aug 26, 2023 at 06:26:29AM +0200, Paul Floyd wrote:
> I was just looking at valgrind-testresults and there was a jump in
> the number of failures on ppc64le on Aug 17th, just after the
> deferred debuginfo reading change.
>
> One random example
>
> =================================================
> ./valgrind-old/drd/tests/tc16_byterace.stderr.diff
> =================================================
> --- tc16_byterace.stderr.exp 2023-08-17 03:01:09.168107928 +0000
> +++ tc16_byterace.stderr.out 2023-08-17 03:28:20.030515805 +0000
> @@ -1,8 +1,7 @@
>
> Conflicting load by thread 1 at 0x........ size 1
> at 0x........: main (tc16_byterace.c:34)
> -Location 0x........ is 0 bytes inside bytes[4],
> -a global variable declared at tc16_byterace.c:7
> +Allocation context: BSS section of tc16_byterace
>
> It does look to me like this is a debuginfo issue.
You are correct, this was caused by:
commit 60f7e89ba32b54d73b9e36d49e28d0f559ade0b9
Author: Aaron Merey <am...@re...>
Date: Fri Jun 30 18:31:42 2023 -0400
Support lazy reading and downloading of DWARF debuginfo
That commit shouldn't have been architecture specific, but it
apparently was. I put some early analysis into the bug
https://bugs.kde.org/show_bug.cgi?id=471807#c16
The patch depends on a call to find_DiCfSI triggering a full debuginfo load.
find_DiCfSI is (indirectly) called when ML_(get_CFA) is called.
It looks like ppc64le doesn't call ML_(get_CFA) because we have the following in
coregrind/m_debuginfo/d3basics.c
#if defined(VGP_ppc32_linux) || defined(VGP_ppc64be_linux) \
|| defined(VGP_ppc64le_linux)
/* Valgrind on ppc32/ppc64 currently doesn't use unwind info. */
uw1 = ML_(read_Addr)((UChar*)regs->sp);
#else
uw1 = ML_(get_CFA)(regs->ip, regs->sp, regs->fp, 0, ~(UWord) 0);
#endif
Cheers,
Mark
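The dependency described in this message can be pictured with a toy C model. It is only a sketch under stated assumptions: every `toy_*` name is hypothetical, nothing here is Valgrind code, and the real lazy-loading logic is more involved. It shows why a platform that finds the caller's frame without consulting CFI never triggers the deferred load, so later variable descriptions have no full debuginfo to draw on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a per-object debuginfo record (hypothetical). */
typedef struct {
   bool fully_loaded;  /* set once the deferred DWARF read has happened */
} ToyDebugInfo;

/* Toy stand-in for the deferred full debuginfo load. */
static void toy_load_full(ToyDebugInfo *di)
{
   di->fully_loaded = true;
}

/* Toy stand-in for the CFI-based path: the CFI lookup is what forces
   the deferred load as a side effect. */
static unsigned long toy_get_cfa(ToyDebugInfo *di, unsigned long sp)
{
   assert(di != NULL);
   if (!di->fully_loaded)
      toy_load_full(di);
   return sp;  /* the real CFA computation is omitted */
}

/* Toy stand-in for the ppc path: read a word from the stack and
   never touch CFI, so nothing triggers the load. */
static unsigned long toy_read_addr(const unsigned long *sp)
{
   return *sp;
}

/* Returns whether full debuginfo is available after locating the
   caller's frame on a platform that does or does not use CFI. */
static bool toy_loaded_after_unwind(ToyDebugInfo *di, bool uses_cfi)
{
   unsigned long fake_stack[1] = { 0 };
   if (uses_cfi)
      (void)toy_get_cfa(di, (unsigned long)fake_stack);
   else
      (void)toy_read_addr(fake_stack);
   return di->fully_loaded;
}
```

In this model the load happens only on the CFI branch, which mirrors the `#else` arm of the d3basics.c snippet; a fix would need some other trigger for the load on platforms that take the first arm.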
|