From: Andreas A. <ar...@so...> - 2023-07-06 15:19:08

https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=cb684b50e7d4d845b56abea72fd9b9925fed644e

commit cb684b50e7d4d845b56abea72fd9b9925fed644e
Author: Andreas Arnez <ar...@li...>
Date:   Mon May 22 19:49:08 2023 +0200

    Bug 470132 - s390x: Increase test coverage for VGM

    Add more tests for the VGM instruction, to verify the fix for the VGM
    wrap-around case.  Also test setting unused bits in the I2 and I3
    fields, to check that Valgrind ignores them as it should.

Diff:
---
 none/tests/s390x/vec2.c          | 26 ++++++++++++++++++++++++++
 none/tests/s390x/vec2.stdout.exp | 20 ++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/none/tests/s390x/vec2.c b/none/tests/s390x/vec2.c
index 73b04dee49..0d549cb235 100644
--- a/none/tests/s390x/vec2.c
+++ b/none/tests/s390x/vec2.c
@@ -301,6 +301,31 @@ static void test_all_fp_int_conversions()
 #undef TEST_EXEC
 #undef TEST_GENERATE
 
+/* -- Vector generate mask -- */
+
+#define XTEST(insn, i2, i3) \
+   do { \
+      ulong_v out = vec_ini; \
+      puts(#insn " " #i2 "," #i3); \
+      __asm__(#insn " %[out]," #i2 "," #i3 : [out] "+v"(out) : :); \
+      printf("\t%016lx %016lx\n", out[0], out[1]); \
+   } while (0)
+
+static void test_all_generate_mask()
+{
+   XTEST(vgmb, 2, 1);
+   XTEST(vgmb, 0xf7, 0x30);
+   XTEST(vgmb, 0, 0);
+   XTEST(vgmh, 3, 2);
+   XTEST(vgmh, 15, 15);
+   XTEST(vgmf, 4, 3);
+   XTEST(vgmf, 16, 17);
+   XTEST(vgmg, 55, 63);
+   XTEST(vgmg, 43, 55);
+   XTEST(vgmg, 63, 2);
+}
+
+#undef XTEST
 
 int main()
 {
@@ -310,5 +335,6 @@ int main()
    test_all_double_bitshifts();
    test_all_int_fp_conversions();
    test_all_fp_int_conversions();
+   test_all_generate_mask();
    return 0;
 }

diff --git a/none/tests/s390x/vec2.stdout.exp b/none/tests/s390x/vec2.stdout.exp
index b32cbe1bc0..7b894b9519 100644
--- a/none/tests/s390x/vec2.stdout.exp
+++ b/none/tests/s390x/vec2.stdout.exp
@@ -166,3 +166,23 @@ vcsfp 0
 vcsfp 8
 	00ffffff - - -
 	00000004 - - -
+vgmb 2,1
+	ffffffffffffffff ffffffffffffffff
+vgmb 0xf7,0x30
+	8181818181818181 8181818181818181
+vgmb 0,0
+	8080808080808080 8080808080808080
+vgmh 3,2
+	ffffffffffffffff ffffffffffffffff
+vgmh 15,15
+	0001000100010001 0001000100010001
+vgmf 4,3
+	ffffffffffffffff ffffffffffffffff
+vgmf 16,17
+	0000c0000000c000 0000c0000000c000
+vgmg 55,63
+	00000000000001ff 00000000000001ff
+vgmg 43,55
+	00000000001fff00 00000000001fff00
+vgmg 63,2
+	e000000000000001 e000000000000001
From: Andreas A. <ar...@so...> - 2023-07-06 15:19:06

https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=6635fc58345ba2c36589f0bef4d326166e947023

commit 6635fc58345ba2c36589f0bef4d326166e947023
Author: Andreas Arnez <ar...@li...>
Date:   Mon May 22 18:57:35 2023 +0200

    Bug 470132 - s390x: Fix the wrap-around case in VGM

    Valgrind's implementation of VGM is incomplete:

    * It doesn't support generating a wrap-around bit mask.  Such a mask
      should result when the ending bit position is smaller than the
      starting bit position.  Valgrind runs into an assertion failure
      instead.

    * It doesn't ignore unused bits in the I2 and I3 fields of the
      instruction, as it should.

    Fix this by re-implementing the main logic in s390_irgen_VGM().

Diff:
---
 NEWS                       |  1 +
 VEX/priv/guest_s390_toIR.c | 57 ++++++++++++++++++----------------------------
 2 files changed, 23 insertions(+), 35 deletions(-)

diff --git a/NEWS b/NEWS
index a4e7533115..783612fbb9 100644
--- a/NEWS
+++ b/NEWS
@@ -38,6 +38,7 @@ are not entered into bugzilla tend to get forgotten about or ignored.
 469146  massif --ignore-fn does not ignore inlined functions
 469768  Make it possible to install gdb scripts in a different location
 470121  Can't run callgrind_control with valgrind 3.21.0 because of perl errors
+470132  s390x: Assertion failure on VGM instruction
 470520  Multiple realloc zero errors crash in MC_(eq_Error)
 470713  Failure on the Yosys project: valgrind: m_libcfile.c:1802
         (Bool vgPlain_realpath(const HChar *, HChar *)):

diff --git a/VEX/priv/guest_s390_toIR.c b/VEX/priv/guest_s390_toIR.c
index 11dda41ef5..d9d746c38a 100644
--- a/VEX/priv/guest_s390_toIR.c
+++ b/VEX/priv/guest_s390_toIR.c
@@ -16388,50 +16388,37 @@ s390_irgen_VGBM(UChar v1, UShort i2, UChar m3 __attribute__((unused)))
 static const HChar *
 s390_irgen_VGM(UChar v1, UShort i2, UChar m3)
 {
-   UChar  from  = (i2 & 0xff00) >> 8;
-   UChar  to    = (i2 & 0x00ff);
-   ULong  value = 0UL;
-   IRType type  = s390_vr_get_type(m3);
-   vassert(from <= to);
-
-   UChar maxIndex = 0;
-   switch (type) {
-   case Ity_I8:
-      maxIndex = 7;
-      break;
-   case Ity_I16:
-      maxIndex = 15;
-      break;
-   case Ity_I32:
-      maxIndex = 31;
-      break;
-   case Ity_I64:
-      maxIndex = 63;
-      break;
-   default:
-      vpanic("s390_irgen_VGM: unknown type");
-   }
-
-   for(UChar index = from; index <= to; index++) {
-      value |= (1ULL << (maxIndex - index));
-   }
-
-   IRExpr *fillValue;
-   switch (type) {
-   case Ity_I8:
+   s390_insn_assert("vgm", m3 <= 3);
+
+   UChar max_idx = (8 << m3) - 1;
+   UChar from    = max_idx & (i2 >> 8);
+   UChar to      = max_idx & i2;
+   ULong all_one = (1ULL << max_idx << 1) - 1;
+   ULong value   = (all_one >> from) ^ (all_one >> to >> 1);
+
+   /* In case of wrap-around we now have a value that needs inverting:
+                 to     from
+                 V       V
+      00000111111111110000000000000000 */
+   if (to < from)
+      value ^= all_one;
+
+   IRExpr* fillValue;
+   switch (m3) {
+   case 0:
       fillValue = mkU8(value);
       break;
-   case Ity_I16:
+   case 1:
       fillValue = mkU16(value);
       break;
-   case Ity_I32:
+   case 2:
       fillValue = mkU32(value);
       break;
-   case Ity_I64:
+   case 3:
       fillValue = mkU64(value);
       break;
    default:
-      vpanic("s390_irgen_VGM: unknown type");
+      vpanic("s390_irgen_VGM: unknown element size");
    }
    s390_vr_fill(v1, fillValue);
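The rewritten mask logic is easy to model outside Valgrind. Below is a hypothetical Python sketch of the same computation (function name and harness are mine, not Valgrind's); the printed values match the vec2 test expectations from the companion commit:

```python
def vgm(i2, m3):
    """Hypothetical Python model of the fixed VGM mask computation.

    m3 selects the element size (0=byte, 1=halfword, 2=word,
    3=doubleword); i2 carries the start bit in its high byte and the
    end bit in its low byte, bit 0 being the MSB (IBM bit numbering).
    """
    assert m3 <= 3
    max_idx = (8 << m3) - 1            # highest bit index in an element
    frm = max_idx & (i2 >> 8)          # unused I2 bits are ignored
    to = max_idx & i2                  # unused I3 bits are ignored
    all_one = (1 << max_idx << 1) - 1  # all-ones element
    value = (all_one >> frm) ^ (all_one >> to >> 1)
    if to < frm:                       # wrap-around case: invert
        value ^= all_one
    return value

# Cross-checked against none/tests/s390x/vec2.stdout.exp:
print(hex(vgm((2 << 8) | 1, 0)))        # vgmb 2,1       -> 0xff
print(hex(vgm((0xf7 << 8) | 0x30, 0)))  # vgmb 0xf7,0x30 -> 0x81
print(hex(vgm((16 << 8) | 17, 2)))      # vgmf 16,17     -> 0xc000
print(hex(vgm((63 << 8) | 2, 3)))       # vgmg 63,2      -> 0xe000000000000001
```

Note how the wrap-around case (end bit smaller than start bit), which used to trip the `vassert(from <= to)`, now simply inverts the straight mask.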
From: Wu, F. <fe...@in...> - 2023-07-06 12:40:15
On 5/29/2023 11:29 AM, Wu, Fei wrote:
> On 5/28/2023 1:06 AM, Petr Pavlu wrote:
>> On 21. Apr 23 17:25, Jojo R wrote:
>>> We are considering adding the RVV/Vector [1] feature to Valgrind;
>>> there are some challenges.
>>> RVV has a programming model like ARM's SVE [2]: it is scalable/VLA,
>>> meaning the vector length is agnostic (not fixed at compile time).
>>> ARM's SVE is not supported in valgrind :(
>>>
>>> There are three major issues in implementing RVV instruction set in Valgrind
>>> as following:
>>>
>>> 1. Scalable vector register width VLENB
>>> 2. Runtime changing property of LMUL and SEW
>>> 3. Lack of proper VEX IR to represent all vector operations
>>>
>>> We propose applicable methods to solve 1 and 2. As for 3, we explore several
>>> possible but maybe imperfect approaches to handle different cases.
>>>
I did a very basic prototype of vlen (variable-length) vector IR,
particularly for the RISC-V Vector extension (RVV):
* Define new IROps such as Iop_VAdd8/16/32/64; the difference from the
existing SIMD versions is that no element count is encoded in the op,
unlike e.g. Iop_Add8x32.
* Define a new IR type Ity_VLen alongside existing types such as
Ity_I64 and Ity_V256.
* Define a new class HRcVecVLen in HRegClass for vlen vector registers.
The real length is embedded in both the IROp and the IRType for vlen
ops/types; it is decided at runtime and already known when handling an
insn such as vadd. This allows more flexibility, e.g. the backend can
issue an extra vsetvl if necessary.
With the above, an RVV instruction in the guest can be passed from the
frontend, through memcheck, to the backend, which generates the final
RVV insn during host isel; a very basic testcase has been tested.
Now come the complexities:
1. RVV has the concept of LMUL, which groups multiple (or partial)
vector registers: e.g. when LMUL==2, v2 really means v2+v3. This
complicates register allocation.
2. RVV uses the "implicit" v0 register for the mask; its content must
be loaded into exactly "v0", and no other register, if host isel wants
to leverage RVV insns. This implicitness in the ISA requires more
explicitness in the Valgrind implementation.
For #1 (LMUL), a new register-allocation algorithm that understands
register groups could be added; it would be great if someone is willing
to try that, though I'm not sure how much effort it would take. The
other way is to split a grouped op into multiple ops that each take
only one vector register: taking vadd as an example, one vadd with
LMUL=2 becomes two vadds running with LMUL=1. This still works for the
widening insns, and most arithmetic insns can be covered this way. The
exception is the register-gather insn vrgather, which we can handle by
other means, e.g. scalar code or a helper.
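The splitting idea can be sketched in hypothetical Python, with vector registers modeled as lists of elements (the register names and element width are assumptions for illustration, not Valgrind or RVV code):

```python
# Toy model: split one vadd at LMUL=2 into two vadds at LMUL=1.
# With LMUL=2, register group "vN" means the concatenation of vN and vN+1.
VLEN_ELEMS = 4  # elements per single vector register (assumed)

def vadd_lmul1(vd, vs1, vs2):
    """Element-wise add on a single vector register (LMUL=1)."""
    for i in range(VLEN_ELEMS):
        vd[i] = (vs1[i] + vs2[i]) & 0xffffffff

def vadd_lmul2(regs, d, s1, s2):
    """One grouped op becomes one LMUL=1 op per register in the group."""
    for k in range(2):
        vadd_lmul1(regs[d + k], regs[s1 + k], regs[s2 + k])

regs = [[0] * VLEN_ELEMS for _ in range(32)]
regs[4] = [1, 2, 3, 4];  regs[5] = [5, 6, 7, 8]
regs[6] = [10] * 4;      regs[7] = [20] * 4
vadd_lmul2(regs, 2, 4, 6)           # "vadd v2,v4,v6" with LMUL=2
print(regs[2], regs[3])             # [11, 12, 13, 14] [25, 26, 27, 28]
```

This is exactly why vrgather is the awkward case: its result elements may come from anywhere in the whole group, so the per-register split above no longer decomposes cleanly.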
For #2 (the v0 mask), one way is to handle the mask at the very
beginning, in guest_riscv64_toIR.c, similar to what the AVX port does:
a) Read the whole dest register, without the mask.
b) Generate the unmasked result by running the op without the mask.
c) Apply the mask to a) and b) to generate the final dest.
By doing this, masked insns are converted to unmasked ones; more insns
are generated, but the performance should be acceptable. There are
still exceptions, e.g. vadc (Add-with-Carry), where v0 is used not as a
mask but as a carry; as mentioned above, it's okay to use other means
for a few insns. Eventually we can pass the v0 mask down to the backend
if that proves to be a better solution.
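The a/b/c recipe above can be sketched as follows (hypothetical Python; elements as list entries, the v0 mask as a bit list -- a model of the idea, not of any actual Valgrind API):

```python
def apply_masked_op(op, old_dest, src1, src2, v0_mask):
    """Convert a masked vector op into an unmasked op plus a blend.

    a) take the whole old destination register,
    b) compute the unmasked result of the op,
    c) per element, select the unmasked result where the v0 bit is set,
       otherwise keep the old destination value.
    """
    unmasked = [op(a, b) for a, b in zip(src1, src2)]   # step b
    return [u if m else o                               # step c
            for u, o, m in zip(unmasked, old_dest, v0_mask)]

old = [100, 100, 100, 100]                              # step a
res = apply_masked_op(lambda a, b: a + b,
                      old, [1, 2, 3, 4], [10, 10, 10, 10],
                      [1, 0, 1, 0])
print(res)  # [11, 100, 13, 100]
```

A real port would also have to honor the vector policy bits (mask-undisturbed vs. agnostic), which this sketch ignores.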
This approach introduces a bunch of new vlen vector IROps, especially
arithmetic ones such as vadd. My goal is a good solution that takes a
reasonable amount of time to reach usable status, yet can still evolve
and is generic enough for other vector ISAs. Any comments?
Best Regards,
Fei.
>>> We start from 1. Since each guest register must be described in the
>>> VEXGuestState struct, the vector registers with a scalable width of
>>> VLENB can be added to VEXGuestState as arrays with an allowed maximum
>>> length like 2048/4096.
>>
>> Size of VexGuestRISCV64State is currently 592 bytes. Adding these large
>> vector registers will bump it by 32*2048/8=8192 bytes.
>>
> Yes, that's the reason the vlen is set to 128 in my RFC patches;
> that's the largest room for vectors in the current design.
>
>> The baseblock layout in VEX is: the guest state, two equal sized areas
>> for shadow state and then a spill area. The RISC-V port accesses the
>> baseblock in generated code via x8/s0. The register is set to the
>> address of the baseblock+2048 (file
>> coregrind/m_dispatch/dispatch-riscv64-linux.S). The extra offset is
>> a small optimization to utilize the fact that load/store instructions in
>> RVI have a signed offset in range [-2048,2047]. The end result is that
>> it is possible to access the baseblock data using only a single
>> instruction.
>>
> Nice design.
>
>> Adding the new vector registers will cause that more instructions will
>> be necessary. For instance, accessing any shadow guest state would
>> naively require a sequence of LUI+ADDI+LOAD/STORE.
>>
>> I suspect this could affect performance quite a bit and might need some
>> optimizing.
>>
> Yes. Can we separate the vector registers from the other ones, i.e.
> would it be possible to use two baseblocks? Or we could do some
> experiments to measure the overhead.
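The arithmetic behind the baseblock trick discussed above is easy to check numerically (a hypothetical Python sketch, not Valgrind code):

```python
# RVI loads/stores take a signed 12-bit immediate: offsets -2048..2047.
# With x8/s0 holding (baseblock + 2048), one instruction can reach:
BIAS = 2048
lo = BIAS + (-2048)   # lowest reachable baseblock offset
hi = BIAS + 2047      # highest reachable baseblock offset
print(lo, hi)         # 0 4095 -> a 4 KiB window from the baseblock start

# The current guest state is 592 bytes; adding 32 vector registers of
# 2048 bits each grows it by:
growth = 32 * 2048 // 8
print(growth)         # 8192 bytes -- the shadow areas then fall outside
                      # the 4 KiB window, needing LUI+ADDI+LOAD/STORE
```

This is the reason the thread worries about extra instructions for shadow-state accesses once large vector registers are added.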
>
>>>
>>> The actual available access range can be determined at Valgrind startup time
>>> by querying the CPU for its vector capability or some suitable setup steps.
>>
>> Something to consider is that the virtual CPU provided by Valgrind does
>> not necessarily need to match the host CPU. For instance, VEX could
>> hardcode that its vector registers are only 128 bits in size.
>>
>> I was originally hoping that this is how support for the V extension
>> could be added, but the LMUL grouping looks to break this model.
>>
> Originally I had the same idea, but hardware with a vlen of 128 cannot
> run software built for a larger vlen: e.g. clang has the option
> -riscv-v-vector-bits-min, and if it is set to 256, the compiler assumes
> the underlying hardware has a vlen of at least 256.
>
>>>
>>>
>>> To solve problem 2, we are inspired by already-proven techniques in QEMU,
>>> where translation blocks are broken up when certain critical CSRs are set.
>>> Because the guest-code-to-IR translation relies on the precise values
>>> of LMUL/SEW, and they may change within a basic block, we can break up
>>> the basic block each time a vsetvl{i} instruction is encountered and
>>> return to the scheduler to execute the translated code and update
>>> LMUL/SEW. Accordingly, translation cache management should be
>>> refactored to detect changes of LMUL/SEW and invalidate outdated code
>>> cache. Without loss of generality, LMUL/SEW should be encoded into a
>>> ULong flag such that other architectures can leverage this flag to
>>> store their arch-dependent information. The TTEntry struct should also
>>> take the flag into account, for both insertion and deletion. By doing
>>> this, the flag carries the newest LMUL/SEW throughout the simulation
>>> and can be passed to the disassembly functions via the VEXArchInfo
>>> struct, so that we get the real, newest values of LMUL and SEW to
>>> facilitate our translation.
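The block-splitting scheme proposed above can be sketched as follows (hypothetical Python; instructions modeled as strings, purely to illustrate the idea of ending a translation block at every vsetvl{i}):

```python
def split_superblocks(insns):
    """Break the instruction stream after every vsetvl{i}, so that each
    translated block sees a single, fixed LMUL/SEW configuration."""
    blocks, cur = [], []
    for insn in insns:
        cur.append(insn)
        if insn.split()[0] in ("vsetvl", "vsetvli"):
            blocks.append(cur)   # end the block; the scheduler then
            cur = []             # updates LMUL/SEW before continuing
    if cur:
        blocks.append(cur)
    return blocks

stream = ["ld a0,0(sp)", "vsetvli t0,a0,e32,m2", "vadd.vv v2,v4,v6",
          "vsetvli t0,a0,e8,m1", "vadd.vv v1,v2,v3"]
for b in split_superblocks(stream):
    print(b)
```

Each resulting block can then be translated (and cached) under one LMUL/SEW value, which is what makes the cached translation invalid when that value changes.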
>>>
>>> Also, some architecture-specific code needs to be taken care of. For
>>> example, in the m_dispatch part, the disp_cp_xindir function looks up
>>> the code cache in hand-written assembly, checking only the requested
>>> guest state IP against the translation cache entry address, with no
>>> further constraints. Many other modules should be checked to ensure
>>> that an in-time update of LMUL/SEW is immediately visible to the
>>> essential parts of Valgrind.
>>>
>>>
>>> The last remaining big issue is 3, for which we introduce some ad-hoc
>>> approaches. We summarize these approaches into three types, as follows:
>>>
>>> 1. Break down a vector instruction to scalar VEX IR ops.
>>> 2. Break down a vector instruction to fixed-length VEX IR ops.
>>> 3. Use dirty helpers to realize vector instructions.
>>
>> I would also look at adding new VEX IR ops for scalable vector
>> instructions. In particular, if it could be shown that RVV and SVE can
>> use same new ops then it could make a good argument for adding them.
>>
>> Perhaps interesting is if such new scalable vector ops could also
>> represent fixed operations on other architectures, but that is just me
>> thinking out loud.
>>
> It's a good idea to consolidate all the vector/SIMD handling, but the
> challenge is to verify its feasibility and to speed up the adoption,
> as that is likely to take more effort and a longer time. Is there
> anyone with knowledge or experience of other ISAs such as AVX/SVE in
> Valgrind who can share the pain and gain? Or we could do a quick
> prototype.
>
> Thanks,
> Fei.
>
>>> [...]
>>> In summary, we are still far from a truly applicable solution for
>>> adding vector extensions to Valgrind. We need to do detailed and
>>> comprehensive estimations on the different vector instruction
>>> categories.
>>>
>>> Any feedback is welcome in github [3] also.
>>>
>>>
>>> [1] https://github.com/riscv/riscv-v-spec
>>>
>>> [2] https://community.arm.com/arm-research/b/articles/posts/the-arm-scalable-vector-extension-sve
>>>
>>> [3] https://github.com/petrpavlu/valgrind-riscv64/issues/17
>>
>> Sorry for not being more helpful at this point. As mentioned in the
>> GitHub issue, I still need to get myself more familiar with RVV and how
>> Valgrind handles vector instructions.
>>
>> Thanks,
>> Petr
>>
>>
>>
>> _______________________________________________
>> Valgrind-developers mailing list
>> Val...@li...
>> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
>
>
>
From: Paul F. <pj...@wa...> - 2023-07-04 05:29:46

Hi,

I just pushed a change to the web pages that adds this info.

A+
Paul
From: Paul F. <pa...@so...> - 2023-07-02 11:02:04

https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=73ec73ed7fe20ec6427dba63e52534136f3c19bd

commit 73ec73ed7fe20ec6427dba63e52534136f3c19bd
Author: Paul Floyd <pj...@wa...>
Date:   Sun Jul 2 12:59:40 2023 +0200

    FreeBSD: add default to configure.ac FreeBSD 13 versions

    Also add comment to README.freebsd about ensuring that jails set
    "uname -r" to be something compatible with the normal
    RELEASE/STABLE/CURRENT releases.

Diff:
---
 README.freebsd | 4 ++++
 configure.ac   | 4 ++++
 2 files changed, 8 insertions(+)

diff --git a/README.freebsd b/README.freebsd
index 90eefc89b9..d197efcaf3 100644
--- a/README.freebsd
+++ b/README.freebsd
@@ -21,6 +21,10 @@ $ ./configure --prefix=/where/ever
 $ gmake
 $ gmake install
 
+If you are using a jail for building, make sure that it is configured so that
+"uname -r" returns a string that matches the pattern "XX.Y-*" where XX is the
+major version (12, 13, 14 ...) and Y is the minor version (0, 1, 2, 3).
+
 Known Limitations (June 2022)
 
 0. Be aware that if you use a wrapper script and run Valgrind on the wrapper

diff --git a/configure.ac b/configure.ac
index 1d4164a7d8..4dbb1753c7 100755
--- a/configure.ac
+++ b/configure.ac
@@ -444,6 +444,10 @@ case "${host_os}" in
         AC_DEFINE([FREEBSD_VERS], FREEBSD_13_2, [FreeBSD version])
         freebsd_vers=$freebsd_13_2
         ;;
+     *)
+        AC_MSG_RESULT([unsupported (${kernel})])
+        AC_MSG_ERROR([Valgrind works on FreeBSD 10.x to 14.x])
+        ;;
      esac
      ;;
   14.*)
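The "XX.Y-*" version pattern from the README addition above can be modeled with a small sketch (hypothetical Python, not part of the Valgrind sources), showing which jail `uname -r` strings configure would and wouldn't recognize:

```python
import re

# "XX.Y-*": major version, dot, minor version, dash, anything after.
FREEBSD_RELEASE = re.compile(r"^\d+\.\d+-.*$")

for uname_r in ("13.2-RELEASE", "14.0-CURRENT",
                "13.2-RELEASE-p1", "mycustomjail"):
    ok = bool(FREEBSD_RELEASE.match(uname_r))
    print(uname_r, "->", "recognized" if ok else "NOT recognized")
```

A jail reporting something like "mycustomjail" would fall through to the new `*)` default case and abort configure with a clear error instead of producing a broken build.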
From: Feiyang C. <chr...@gm...> - 2023-06-30 09:57:47

Hi,

I sent patches v5, which were rebased on master and squashed into 40
commits. I am working on supporting vector on LoongArch64 now.

https://bugs.kde.org/show_bug.cgi?id=457504

Thanks,
Feiyang
From: Andreas A. <ar...@so...> - 2023-06-28 14:20:55

https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=b4cc7815ba722426c5456831f858a2aeceb3761f

commit b4cc7815ba722426c5456831f858a2aeceb3761f
Author: Andreas Arnez <ar...@li...>
Date:   Thu Jun 15 17:24:53 2023 +0200

    Bug 470978 - s390x: Link the tools with -Wl,--s390-pgste

    Programs that require the PGSTE mode to be enabled may currently fail
    under Valgrind.  In particular this affects qemu-kvm.  While it is
    also possible to enable the PGSTE mode globally with

      sysctl vm.allocate_pgste=1

    the problem can more easily be prevented by linking the Valgrind
    tools with -Wl,--s390-pgste.  Add a configure check if the linker
    supports this, and activate the flag if it does.

    To verify the intended result, the following shell command can be
    used to list the executables having this flag set:

      find . -type f -perm -u+x -execdir \
        /bin/sh -c 'readelf -lW $0 2>/dev/null | grep PGSTE' {} \; -print

Diff:
---
 Makefile.tool.am |  2 +-
 NEWS             |  1 +
 configure.ac     | 20 ++++++++++++++++++++
 3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/Makefile.tool.am b/Makefile.tool.am
index df95029138..4ce6d5ab0d 100644
--- a/Makefile.tool.am
+++ b/Makefile.tool.am
@@ -78,7 +78,7 @@ TOOL_LDFLAGS_ARM64_LINUX = \
 	$(TOOL_LDFLAGS_COMMON_LINUX) @FLAG_M64@
 
 TOOL_LDFLAGS_S390X_LINUX = \
-	$(TOOL_LDFLAGS_COMMON_LINUX) @FLAG_M64@
+	$(TOOL_LDFLAGS_COMMON_LINUX) @FLAG_M64@ @FLAG_S390_PGSTE@
 
 TOOL_LDFLAGS_X86_DARWIN = \
 	$(TOOL_LDFLAGS_COMMON_DARWIN) -arch i386

diff --git a/NEWS b/NEWS
index c22c82131d..a4e7533115 100644
--- a/NEWS
+++ b/NEWS
@@ -43,6 +43,7 @@ are not entered into bugzilla tend to get forgotten about or ignored.
         (Bool vgPlain_realpath(const HChar *, HChar *)):
         Assertion 'resolved' failed
 470830  Don't print actions vgdb me ... continue for vgdb --multi mode
+470978  s390x: Valgrind cannot start qemu-kvm when "sysctl vm.allocate_pgste=0"
 
 To see details of a given bug, visit
   https://bugs.kde.org/show_bug.cgi?id=XXXXXX

diff --git a/configure.ac b/configure.ac
index 0cf84a1c00..1d4164a7d8 100755
--- a/configure.ac
+++ b/configure.ac
@@ -3096,6 +3096,26 @@ AC_SUBST([FLAG_NO_BUILD_ID], [""])
 fi
 CFLAGS=$safe_CFLAGS
 
+# On s390x, if the linker supports -Wl,--s390-pgste, then we build the
+# tools with that flag.  This enables running programs that need it, such
+# as qemu-kvm.
+if test x$VGCONF_PLATFORM_PRI_CAPS = xS390X_LINUX; then
+AC_MSG_CHECKING([if the linker accepts -Wl,--s390-pgste])
+safe_CFLAGS=$CFLAGS
+CFLAGS="-Wl,--s390-pgste"
+
+AC_LINK_IFELSE(
+[AC_LANG_PROGRAM([ ], [return 0;])],
+[
+  AC_SUBST([FLAG_S390_PGSTE], ["-Wl,--s390-pgste"])
+  AC_MSG_RESULT([yes])
+], [
+  AC_SUBST([FLAG_S390_PGSTE], [""])
+  AC_MSG_RESULT([no])
+])
+CFLAGS=$safe_CFLAGS
+fi
+
 # does the ppc assembler support "mtocrf" et al?
 AC_MSG_CHECKING([if ppc32/64 as supports mtocrf/mfocrf])
From: Mark W. <ma...@so...> - 2023-06-15 15:50:01

https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=bf0c73231b76e293a103ed8b2178975c7032f669

commit bf0c73231b76e293a103ed8b2178975c7032f669
Author: Mark Wielaard <ma...@kl...>
Date:   Fri Jun 9 15:21:57 2023 +0200

    Don't print action vgdb me ... and continuing ... in vgdb --multi mode

    Guard each (action) vgdb me ... VG_(umsg) printing with
    !(VG_(clo_launched_with_multi))

    https://bugs.kde.org/show_bug.cgi?id=470830

Diff:
---
 NEWS                                |  4 +++-
 coregrind/m_errormgr.c              |  6 ++++--
 coregrind/m_gdbserver/m_gdbserver.c |  9 +++++----
 coregrind/m_libcassert.c            |  3 ++-
 coregrind/m_main.c                  | 10 ++++++----
 5 files changed, 20 insertions(+), 12 deletions(-)

diff --git a/NEWS b/NEWS
index 52ee38ab8b..c22c82131d 100644
--- a/NEWS
+++ b/NEWS
@@ -40,7 +40,9 @@ are not entered into bugzilla tend to get forgotten about or ignored.
 470121  Can't run callgrind_control with valgrind 3.21.0 because of perl errors
 470520  Multiple realloc zero errors crash in MC_(eq_Error)
 470713  Failure on the Yosys project: valgrind: m_libcfile.c:1802
-        (Bool vgPlain_realpath(const HChar *, HChar *)): Assertion 'resolved' failed
+        (Bool vgPlain_realpath(const HChar *, HChar *)):
+        Assertion 'resolved' failed
+470830  Don't print actions vgdb me ... continue for vgdb --multi mode
 
 To see details of a given bug, visit
   https://bugs.kde.org/show_bug.cgi?id=XXXXXX

diff --git a/coregrind/m_errormgr.c b/coregrind/m_errormgr.c
index 6be637190a..63c0e4eaa7 100644
--- a/coregrind/m_errormgr.c
+++ b/coregrind/m_errormgr.c
@@ -526,9 +526,11 @@ void do_actions_on_error(const Error* err, Bool allow_db_attach)
    if (VG_(clo_vgdb) != Vg_VgdbNo
        && allow_db_attach
        && VG_(clo_vgdb_error) <= n_errs_shown) {
-      VG_(umsg)("(action on error) vgdb me ... \n");
+      if (!(VG_(clo_launched_with_multi)))
+         VG_(umsg)("(action on error) vgdb me ... \n");
       VG_(gdbserver)( err->tid );
-      VG_(umsg)("Continuing ...\n");
+      if (!(VG_(clo_launched_with_multi)))
+         VG_(umsg)("Continuing ...\n");
    }
 
    /* Or maybe we want to generate the error's suppression? */

diff --git a/coregrind/m_gdbserver/m_gdbserver.c b/coregrind/m_gdbserver/m_gdbserver.c
index f8fbc5af23..5d0973e9ed 100644
--- a/coregrind/m_gdbserver/m_gdbserver.c
+++ b/coregrind/m_gdbserver/m_gdbserver.c
@@ -602,11 +602,11 @@ void VG_(gdbserver_prerun_action) (ThreadId tid)
    // Using VG_(clo_vgdb_error) allows the user to control if gdbserver
    // stops after a fork.
    if ((VG_(clo_vgdb_error) == 0
-        || (VgdbStopAtiS(VgdbStopAt_Startup, VG_(clo_vgdb_stop_at))))
-       && !(VG_(clo_launched_with_multi))) {
+        || (VgdbStopAtiS(VgdbStopAt_Startup, VG_(clo_vgdb_stop_at))))) {
       /* The below call allows gdb to attach at startup
          before the first guest instruction is executed. */
-      VG_(umsg)("(action at startup) vgdb me ... \n");
+      if (!(VG_(clo_launched_with_multi)))
+         VG_(umsg)("(action at startup) vgdb me ... \n");
       VG_(gdbserver)(tid);
    } else {
       /* User has activated gdbserver => initialize now the FIFOs
@@ -975,7 +975,8 @@ void VG_(gdbserver_report_fatal_signal) (const vki_siginfo_t *info,
       return;
    }
 
-   VG_(umsg)("(action on fatal signal) vgdb me ... \n");
+   if (!(VG_(clo_launched_with_multi)))
+      VG_(umsg)("(action on fatal signal) vgdb me ... \n");
 
    /* indicate to gdbserver that there is a signal */
    gdbserver_signal_encountered (info);

diff --git a/coregrind/m_libcassert.c b/coregrind/m_libcassert.c
index 35f37f88df..0b04bfcc1d 100644
--- a/coregrind/m_libcassert.c
+++ b/coregrind/m_libcassert.c
@@ -282,7 +282,8 @@ static void exit_wrk( Int status, Bool gdbserver_call_allowed)
    if (status != 0
        && VgdbStopAtiS(VgdbStopAt_ValgrindAbExit, VG_(clo_vgdb_stop_at))) {
       if (VG_(gdbserver_init_done)()) {
-         VG_(umsg)("(action at valgrind abnormal exit) vgdb me ... \n");
+         if (!(VG_(clo_launched_with_multi)))
+            VG_(umsg)("(action at valgrind abnormal exit) vgdb me ... \n");
          VG_(gdbserver) (atid);
       } else {
          VG_(umsg)("(action at valgrind abnormal exit)\n"

diff --git a/coregrind/m_main.c b/coregrind/m_main.c
index a857e5afeb..b8751341a0 100644
--- a/coregrind/m_main.c
+++ b/coregrind/m_main.c
@@ -2258,12 +2258,14 @@ void shutdown_actions_NORETURN( ThreadId tid,
    /* Final call to gdbserver, if requested. */
    if (VG_(gdbserver_stop_at) (VgdbStopAt_Abexit) && tid_exit_code (tid) != 0) {
-      VG_(umsg)("(action at abexit, exit code %d) vgdb me ... \n",
-                tid_exit_code (tid));
+      if (!(VG_(clo_launched_with_multi)))
+         VG_(umsg)("(action at abexit, exit code %d) vgdb me ... \n",
+                   tid_exit_code (tid));
       VG_(gdbserver) (tid);
    } else if (VG_(gdbserver_stop_at) (VgdbStopAt_Exit)) {
-      VG_(umsg)("(action at exit, exit code %d) vgdb me ... \n",
-                tid_exit_code (tid));
+      if (!(VG_(clo_launched_with_multi)))
+         VG_(umsg)("(action at exit, exit code %d) vgdb me ... \n",
+                   tid_exit_code (tid));
       VG_(gdbserver) (tid);
    }
    VG_(threads)[tid].status = VgTs_Empty;
From: Mark W. <ma...@so...> - 2023-06-15 15:02:05

https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=5a97c06080078aab8adfcc8985aecce7bfa5a738

commit 5a97c06080078aab8adfcc8985aecce7bfa5a738
Author: Tulio Magno Quites Machado Filho <tu...@re...>
Date:   Wed Jun 14 11:28:38 2023 -0300

    s390x: Replace an absolute jump with a relative one

    The bne instruction expects an absolute target address and isn't
    well-suited for implementing a short-range jump, such as the one in
    XCHG_M_R().  Replace it with jne, which expects a relative address
    that can be correctly computed at link time.

    Interestingly, the jump is almost never taken.  If it were, it would
    crash the test.  However, linkers may complain when relocating the
    target address used in bne.

Diff:
---
 helgrind/tests/tc11_XCHG.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/helgrind/tests/tc11_XCHG.c b/helgrind/tests/tc11_XCHG.c
index f6ff1c9846..08e34a0b57 100644
--- a/helgrind/tests/tc11_XCHG.c
+++ b/helgrind/tests/tc11_XCHG.c
@@ -81,7 +81,7 @@
    __asm__ __volatile__( \
       "0: l 0,%[global]\n\t" \
       "   cs 0,%[local],%[global]\n\t" \
-     "   bne 0b\n\t" \
+     "   jne 0b\n\t" \
       "   lr %[local],0\n\t" \
       : /*out*/ [global]"+m"(_addr), [local]"+d"(_lval) \
       : /*in*/ \
From: Mark W. <ma...@kl...> - 2023-06-14 21:26:37

Hi,

On Thu, Jun 08, 2023 at 04:55:38PM +0200, Floyd, Paul wrote:
> On 16/05/2023 06:43, Nicholas Nethercote wrote:
> >
> > Are there any consequences of note for Valgrind? Judging by this
> > paragraph, not particularly:
> >
> > > Sourceware will continue its long standing mission of providing
> > > free software infrastructure to the projects it supports, and this
> > > will not change moving forward. The affiliation with SFC will be
> > > transparent to the projects hosted on Sourceware. Project admins
> > > will keep being in charge of how they utilize the services
> > > Sourceware provides.
> >
> > Is that right?

Yeah, it really is about the infrastructure, not about how projects use
the infrastructure. But if we want any changes to the services provided
we can always ask.

Maybe one concrete thing might be for Sourceware/SFC to hold the
valgrind.org domain for the project, so no one individual is responsible
for keeping it valid (although that isn't a big burden, just a
convenient way to reduce the "bus factor").

> > I have been thinking a bit recently about the fact that Valgrind
> > doesn't have any explicit governance structure or decision-making
> > processes, and how it would be good to have some.
>
> I've been listening to a few 'Oxide and Friends' podcasts recently
> (which has a heavy Rust slant), and yes, it would be good to have
> some more in the way of governance.
>
> But first it would be even better to have more developers.

Yes, but having a bit more visible "governance" might help with that,
so people who join have a better view of what to expect.

I admit I am still acting as if Julian is the BDFL. If we do something
really bad he will certainly step in :) But that is cheating a little,
I guess. In practice our governance is having consensus around the
DEVELOPER and processes READMEs. Doing releases twice a year at fixed
dates/months (and the video chats planning those) does seem to work
well.

What we don't really have is a process for when there isn't clear
consensus, which means we never really make radical changes. Also we
don't have enough reviewers for bigger changes (there are still two
ports pending).

Cheers,

Mark
From: LATHUILIERE B. <bru...@ed...> - 2023-06-12 12:52:34

Hi,

I like the idea of adding verrou to the variant list. You can get the
source and documentation from GitHub:
https://github.com/edf-hpc/verrou/

The direct link to the documentation of the latest version:
http://edf-hpc.github.io/verrou/vr-manual.html
(Sooner or later I will change the link, to keep the documentation of
old versions.)

The main references about verrou are:

- François Févotte and Bruno Lathuilière. Debugging and optimization of
  HPC programs with the Verrou tool. In International Workshop on
  Software Correctness for HPC Applications (Correctness), Denver, CO,
  USA, Nov. 2019. DOI: 10.1109/Correctness49594.2019.00006
  https://hal.science/hal-02044101/

- François Févotte and Bruno Lathuilière. Studying the numerical quality
  of an industrial computing code: A case study on code_aster. In 10th
  International Workshop on Numerical Software Verification (NSV), pages
  61--80, Heidelberg, Germany, July 2017.
  DOI: 10.1007/978-3-319-63501-9_5
  https://www.fevotte.net/publications/fevotte2017a.pdf

- François Févotte and Bruno Lathuilière. VERROU: a CESTAC evaluation
  without recompilation. In International Symposium on Scientific
  Computing, Computer Arithmetics and Verified Numerics (SCAN), Uppsala,
  Sweden, September 2016.
  https://www.fevotte.net/publications/fevotte2016.pdf

And if you are interested in the required number of samples, you should
read the following paper (not specific to verrou):

- Devan Sohier, Pablo De Oliveira Castro, François Févotte, Bruno
  Lathuilière, Eric Petit, and Olivier Jamond. Confidence intervals for
  stochastic arithmetic. ACM Transactions on Mathematical Software,
  47(2), 2021. https://hal.science/hal-01827319

++
Bruno Lathuilière

-----Original message-----
From: pj...@wa... <pj...@wa...>
Sent: Monday, 12 June 2023 11:26
To: val...@li...
Subject: Re: [Valgrind-developers] RFC: support scalable vector model / riscv vector

On 01/06/2023 13:13, LATHUILIERE Bruno via Valgrind-developers wrote:
> I don't know if my experience is the one you expect; nevertheless I
> will try to share it. I'm the main developer of a valgrind tool called
> verrou (url: https://github.com/edf-hpc/verrou ), which currently only
> works with the x86_64 architecture. From the user's point of view,
> verrou enables estimating the effect of floating-point rounding-error
> propagation. (If you are interested in the subject, there are
> documentation and publications.)
[snip]

Interesting, I don't remember having seen anything on verrou. I need to
look more at the docs and publications. I'll add a link to
https://valgrind.org/downloads/variants.html (which is a bit out of
date).

A+
Paul
|
From: Floyd, P. <pj...@wa...> - 2023-06-12 09:26:16
|
On 01/06/2023 13:13, LATHUILIERE Bruno via Valgrind-developers wrote: > I don't know if my experience is the one you expect, nevertheless I will try to share it. > I'm the main developer of a valgrind tool called verrou (url: https://github.com/edf-hpc/verrou ) which currently only works with x86_64 architecture. > From user's point of view, verrou enables to estimate the effect of the floating-point rounding error propagation (If you are interested by the subject, there are documentation and publication). [snip] Interesting, I don't remember having seen anything on verrou. I need to look more at the doc and publications. I'll add a link to https://valgrind.org/downloads/variants.html (which is a bit out of date) A+ Paul |
|
From: Paul F. <pa...@so...> - 2023-06-09 11:20:26
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=3df8a00a4ed7dbe436f28d8b3db72e679eb1b427 commit 3df8a00a4ed7dbe436f28d8b3db72e679eb1b427 Author: Paul Floyd <pj...@wa...> Date: Fri Jun 9 13:17:58 2023 +0200 470121 - Can't run callgrind_control with valgrind 3.21.0 because of perl errors Diff: --- NEWS | 1 + callgrind/callgrind_control.in | 94 +++++++++++++++++++++++++----------------- 2 files changed, 57 insertions(+), 38 deletions(-) diff --git a/NEWS b/NEWS index 4c5635dde1..52ee38ab8b 100644 --- a/NEWS +++ b/NEWS @@ -37,6 +37,7 @@ are not entered into bugzilla tend to get forgotten about or ignored. 469049 link failure on ppc64 (big endian) valgrind 3.20 469146 massif --ignore-fn does not ignore inlined functions 469768 Make it possible to install gdb scripts in a different location +470121 Can't run callgrind_control with valgrind 3.21.0 because of perl errors 470520 Multiple realloc zero errors crash in MC_(eq_Error) 470713 Failure on the Yosys project: valgrind: m_libcfile.c:1802 (Bool vgPlain_realpath(const HChar *, HChar *)): Assertion 'resolved' failed diff --git a/callgrind/callgrind_control.in b/callgrind/callgrind_control.in index 083ffa29fc..bee6661efb 100644 --- a/callgrind/callgrind_control.in +++ b/callgrind/callgrind_control.in @@ -29,6 +29,12 @@ use File::Basename; # vgdb_exe will be set to a vgdb found 'near' the callgrind_control file my $vgdb_exe = ""; +my $vgdbPrefixOption = ""; +my $cmd = ""; +my %cmd; +my %cmdline; +my $pid = -1; +my @pids = (); sub getCallgrindPids { @@ -50,6 +56,8 @@ sub getCallgrindPids { close LIST; } +my $headerPrinted = 0; + sub printHeader { if ($headerPrinted) { return; } $headerPrinted = 1; @@ -95,11 +103,17 @@ sub printHelp { # Parts more or less copied from cg_annotate (author: Nicholas Nethercote) # +my $event = ""; +my $events = ""; +my %events = (); +my @events = (); +my @show_events = (); +my @show_order = (); + sub prepareEvents { @events = split(/\s+/, $events); - %events = (); - $n = 0; + my $n = 0; 
foreach $event (@events) { $events{$event} = $n; $n++; @@ -178,7 +192,7 @@ sub print_events ($) { my ($CC_col_widths) = @_; - foreach my $i (@show_order) { + foreach my $i (@show_order) { my $event = $events[$i]; my $event_width = length($event); my $col_width = $CC_col_widths->[$i]; @@ -209,7 +223,7 @@ if (-x $controldir . "/vgdb") { # To find the list of active pids, we need to have # the --vgdb-prefix option if given. -$vgdbPrefixOption = ""; +my $arg = ""; foreach $arg (@ARGV) { if ($arg =~ /^--vgdb-prefix=.*$/) { $vgdbPrefixOption=$arg; @@ -219,15 +233,19 @@ foreach $arg (@ARGV) { getCallgrindPids; -$requestEvents = 0; -$requestDump = 0; -$switchInstr = 0; -$headerPrinted = 0; -$dumpHint = ""; +my $requestEvents = 0; +my $requestDump = 0; +my $switchInstr = 0; +my $dumpHint = ""; +my $printBacktrace = 0; +my $printStatus = 0; +my $switchInstrMode = ""; +my $requestKill = ""; +my $requestZero = ""; -$verbose = 0; +my $verbose = 0; -%spids = (); +my %spids = (); foreach $arg (@ARGV) { if ($arg =~ /^-/) { if ($requestDump == 1) { $requestDump = 2; } @@ -329,8 +347,8 @@ foreach $arg (@ARGV) { } if (defined $cmd{$arg}) { $spids{$arg} = 1; next; } - $nameFound = 0; - foreach $p (@pids) { + my $nameFound = 0; + foreach my $p (@pids) { if ($cmd{$p} =~ /$arg$/) { $nameFound = 1; $spids{$p} = 1; @@ -353,11 +371,11 @@ if (scalar @pids == 0) { exit; } -@spids = keys %spids; +my @spids = keys %spids; if (scalar @spids >0) { @pids = @spids; } -$vgdbCommand = ""; -$waitForAnswer = 0; +my $vgdbCommand = ""; +my $waitForAnswer = 0; if ($requestDump) { $vgdbCommand = "dump"; if ($dumpHint ne "") { $vgdbCommand .= " ".$dumpHint; } @@ -371,7 +389,7 @@ if ($printStatus || $printBacktrace || $requestEvents) { } foreach $pid (@pids) { - $pidstr = "PID $pid: "; + my $pidstr = "PID $pid: "; if ($pid >0) { print $pidstr.$cmdline{$pid}; } if ($vgdbCommand eq "") { @@ -385,24 +403,24 @@ foreach $pid (@pids) { } open RESULT, $vgdb_exe . 
" $vgdbPrefixOption --pid=$pid $vgdbCommand|"; - @tids = (); - $ctid = 0; - %fcount = (); - %func = (); - %calls = (); - %events = (); - @events = (); - @threads = (); - %totals = (); - - $exec_bbs = 0; - $dist_bbs = 0; - $exec_calls = 0; - $dist_calls = 0; - $dist_ctxs = 0; - $dist_funcs = 0; - $threads = ""; - $events = ""; + my @tids = (); + my $tid; + my $ctid = 0; + my %fcount = (); + my %func = (); + my %calls = (); + my @threads = (); + my %totals = (); + my $totals_width = []; + + my $exec_bbs = 0; + my $dist_bbs = 0; + my $exec_calls = 0; + my $dist_calls = 0; + my $dist_ctxs = 0; + my $dist_funcs = 0; + my $threads = ""; + my $instrumentation = ""; while(<RESULT>) { if (/function-(\d+)-(\d+): (.+)$/) { @@ -485,10 +503,10 @@ foreach $pid (@pids) { } print "Backtrace for Thread $tid\n"; - $i = $fcount{$tid}; - $c = 0; + my $i = $fcount{$tid}; + my $c = 0; while($i>0 && $c<100) { - $fc = substr(" $c",-2); + my $fc = substr(" $c",-2); print " [$fc] "; if ($requestEvents >0) { print_CC($events{$tid,$i-1}, $totals_width); |
|
From: Floyd, P. <pj...@wa...> - 2023-06-08 14:56:11
|
On 16/05/2023 06:43, Nicholas Nethercote wrote: > Hi, > > Are there any consequences of note for Valgrind? Judging by this > paragraph, not particularly: > > > Sourceware will continue its long standing mission of providing free > software infrastructure to the projects it supports, and this will not > change moving forward. The affiliation with SFC will be transparent to > the projects hosted on Sourceware. Project admins will keep being in > charge of how they utilize the services Sourceware provides. > > Is that right? > > I have been thinking a bit recently about the fact that Valgrind > doesn't have any explicit governance structure or decision-making > processes, and how it would be good to have some. > developers Hi Nick I've been listening to a few 'Oxide and Friends' podcasts recently (which have a heavy Rust slant), and yes, it would be good to have some more in the way of governance. But first it would be even better to have more developers. A+ Paul |
|
From: Paul F. <pa...@so...> - 2023-06-07 20:55:00
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=8bc3a55d50570ef3e3077a7bdfb6354895a56878 commit 8bc3a55d50570ef3e3077a7bdfb6354895a56878 Author: Paul Floyd <pj...@wa...> Date: Wed Jun 7 22:54:22 2023 +0200 Merge error, missing continuation in Makefile.am Diff: --- memcheck/tests/freebsd/Makefile.am | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/memcheck/tests/freebsd/Makefile.am b/memcheck/tests/freebsd/Makefile.am index f515a684ec..722cc5d51e 100644 --- a/memcheck/tests/freebsd/Makefile.am +++ b/memcheck/tests/freebsd/Makefile.am @@ -100,7 +100,7 @@ EXTRA_DIST = \ bug464476_rel_symlink.vgtest \ bug464476_rel_symlink.stderr.exp \ bug464476_rel_symlink.stdout.exp \ - memalign.vgtest memalign.stderr.exp + memalign.vgtest memalign.stderr.exp \ bug470713.vgtest bug470713.stderr.exp \ bug470713.stdout.exp |
|
From: Paul F. <pa...@so...> - 2023-06-07 20:29:52
|
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=840ccb9915c675fd7db527107e6b38343fafdf86 commit 840ccb9915c675fd7db527107e6b38343fafdf86 Author: Paul Floyd <pj...@wa...> Date: Wed Jun 7 22:27:08 2023 +0200 Bug 470713 - Failure on the Yosys project: valgrind: m_libcfile.c:1802 (Bool vgPlain_realpath(const HChar *, HChar *)): Assertion 'resolved' failed When using sysctl kern proc pathname with the pid of the guest or -1 we need to intercept the call otherwise the syscall will return the path of the memcheck tool and not the path of the guest. This uses VG_(realpath), which asserts if it doesn't get valid input pointers. sysctl kern proc pathname can use a NULL pointer in order to determine the length of the path (so users can allocate the minumum necessary). The NULL pointer was being passed on to VG_(realpath) without being checked, resulting in an assert. Diff: --- .gitignore | 1 + NEWS | 2 ++ coregrind/m_syswrap/syswrap-freebsd.c | 13 +++++++++ memcheck/tests/freebsd/Makefile.am | 5 +++- memcheck/tests/freebsd/bug470713.cpp | 44 +++++++++++++++++++++++++++++ memcheck/tests/freebsd/bug470713.stderr.exp | 0 memcheck/tests/freebsd/bug470713.stdout.exp | 2 ++ memcheck/tests/freebsd/bug470713.vgtest | 3 ++ 8 files changed, 69 insertions(+), 1 deletion(-) diff --git a/.gitignore b/.gitignore index 6d73324cea..9e16ac126d 100644 --- a/.gitignore +++ b/.gitignore @@ -1341,6 +1341,7 @@ /memcheck/tests/freebsd/452275 /memcheck/tests/freebsd/access /memcheck/tests/freebsd/bug464476 +/memcheck/tests/freebsd/bug470713 /memcheck/tests/freebsd/capsicum /memcheck/tests/freebsd/chflags /memcheck/tests/freebsd/chmod_chown diff --git a/NEWS b/NEWS index 09f8c71370..4c5635dde1 100644 --- a/NEWS +++ b/NEWS @@ -38,6 +38,8 @@ are not entered into bugzilla tend to get forgotten about or ignored. 
469146 massif --ignore-fn does not ignore inlined functions 469768 Make it possible to install gdb scripts in a different location 470520 Multiple realloc zero errors crash in MC_(eq_Error) +470713 Failure on the Yosys project: valgrind: m_libcfile.c:1802 + (Bool vgPlain_realpath(const HChar *, HChar *)): Assertion 'resolved' failed To see details of a given bug, visit https://bugs.kde.org/show_bug.cgi?id=XXXXXX diff --git a/coregrind/m_syswrap/syswrap-freebsd.c b/coregrind/m_syswrap/syswrap-freebsd.c index fd4dff4da4..6b9f3d2109 100644 --- a/coregrind/m_syswrap/syswrap-freebsd.c +++ b/coregrind/m_syswrap/syswrap-freebsd.c @@ -1987,6 +1987,19 @@ static Bool sysctl_kern_proc_pathname(HChar *out, SizeT *len) { const HChar *exe_name = VG_(resolved_exename); + if (!len) { + return False; + } + + if (!out) { + HChar tmp[VKI_PATH_MAX]; + if (!VG_(realpath)(exe_name, tmp)) { + return False; + } + *len = VG_(strlen)(tmp)+1; + return True; + } + if (!VG_(realpath)(exe_name, out)) { return False; } diff --git a/memcheck/tests/freebsd/Makefile.am b/memcheck/tests/freebsd/Makefile.am index 2259e1efb8..f515a684ec 100644 --- a/memcheck/tests/freebsd/Makefile.am +++ b/memcheck/tests/freebsd/Makefile.am @@ -101,6 +101,8 @@ EXTRA_DIST = \ bug464476_rel_symlink.stderr.exp \ bug464476_rel_symlink.stdout.exp \ memalign.vgtest memalign.stderr.exp + bug470713.vgtest bug470713.stderr.exp \ + bug470713.stdout.exp check_PROGRAMS = \ statfs pdfork_pdkill getfsstat inlinfo inlinfo_nested.so extattr \ @@ -108,7 +110,7 @@ check_PROGRAMS = \ linkat scalar_fork scalar_thr_exit scalar_abort2 scalar_pdfork \ scalar_vfork stat file_locking_wait6 utimens access chmod_chown \ misc get_set_context utimes static_allocs fexecve errno_aligned_allocs \ - setproctitle sctp sctp2 bug464476 memalign + setproctitle sctp sctp2 bug464476 memalign bug470713 AM_CFLAGS += $(AM_FLAG_M3264_PRI) AM_CXXFLAGS += $(AM_FLAG_M3264_PRI) @@ -122,6 +124,7 @@ inlinfo_nested_so_CFLAGS = $(AM_CFLAGS) -fPIC 
@FLAG_W_NO_UNINITIALIZED@ inlinfo_nested_so_LDFLAGS = -Wl,-rpath,$(top_builddir)/memcheck/tests/freebsd -shared -fPIC bug464476_SOURCES = bug464476.cpp +bug470713_SOURCES = bug470713.cpp if FREEBSD_VERS_13_PLUS check_PROGRAMS += realpathat scalar_13_plus eventfd1 eventfd2 diff --git a/memcheck/tests/freebsd/bug470713.cpp b/memcheck/tests/freebsd/bug470713.cpp new file mode 100644 index 0000000000..67a544926a --- /dev/null +++ b/memcheck/tests/freebsd/bug470713.cpp @@ -0,0 +1,44 @@ +// roughly based on the code for Firefox class BinaryPath +// https://searchfox.org/mozilla-central/source/xpcom/build/BinaryPath.h#185 + +#include <iostream> +#include <sys/types.h> +#include <sys/sysctl.h> +#include <limits.h> +#include <string> +#include <memory> + +using std::cerr; +using std::cout; +using std::string; + +int main(int argc, char **argv) +{ + int mib[] = { CTL_KERN, KERN_PROC, KERN_PROC_PATHNAME, -1}; + size_t len; + + if (sysctl(mib, 4, NULL, &len, NULL, 0) != 0) { + cout << "sysctl failed to get path length: " << strerror(errno) << '\n'; + return -1; + } + + std::unique_ptr<char[]> aResult(new char[len]); + + if (sysctl(mib, 4, aResult.get(), &len, NULL, 0) != 0) { + cout << "sysctl failed to get path: " << strerror(errno) << '\n'; + return -1; + } + + if (string(aResult.get()) == argv[1]) { + cout << "OK\n"; + } else { + cout << "Not OK aResult " << aResult << " argv[1] " << argv[1] << '\n'; + } + + if (sysctl(mib, 4, NULL, NULL, NULL, 0) != -1) { + cout << "OK syscall failed\n"; + return -1; + } else { + cout << "sysctl succeeded when it should have failed\n"; + } +} diff --git a/memcheck/tests/freebsd/bug470713.stderr.exp b/memcheck/tests/freebsd/bug470713.stderr.exp new file mode 100644 index 0000000000..e69de29bb2 diff --git a/memcheck/tests/freebsd/bug470713.stdout.exp b/memcheck/tests/freebsd/bug470713.stdout.exp new file mode 100644 index 0000000000..2ba70ed13d --- /dev/null +++ b/memcheck/tests/freebsd/bug470713.stdout.exp @@ -0,0 +1,2 @@ +OK +OK syscall 
failed diff --git a/memcheck/tests/freebsd/bug470713.vgtest b/memcheck/tests/freebsd/bug470713.vgtest new file mode 100644 index 0000000000..b85043a5ab --- /dev/null +++ b/memcheck/tests/freebsd/bug470713.vgtest @@ -0,0 +1,3 @@ +prog: bug470713 +vgopts: -q +args: `pwd`/bug470713 |
|
From: LATHUILIERE B. <bru...@ed...> - 2023-06-05 18:07:30
|
-----Original message----- From: fe...@in... <fe...@in...> Sent: Monday 5 June 2023 03:24 To: LATHUILIERE Bruno <bru...@ed...>; val...@li...; val...@li...; Petr Pavlu <pet...@da...>; Jojo R <rj...@gm...>; pa...@so...; yun...@al...; zha...@al... Subject: Re: [Valgrind-developers] RFC: support scalable vector model / riscv vector >On 6/1/2023 7:13 PM, LATHUILIERE Bruno via Valgrind-developers wrote: >> >> -------- Original message -------- >> Subject: Re: [Valgrind-developers] RFC: support scalable vector model / >> riscv vector >> Date: 2023-05-29 05:29 >> From: "Wu, Fei" <fe...@in...> >> To: Petr Pavlu <pet...@da...>, Jojo R <rj...@gm...> >> From a valgrind tool developer's point of view, we need to replace all floating-point operations (fpo) by our own modified fpo implemented with C++ functions. One C++ function has 1, 2 or 3 floating point input values and one floating point output value. > >Do you use libvex_BackEnd() to translate the insn to host, e.g. >host_riscv64_isel.c to select the host insn, Is there any difference of processing flow between verrou and memcheck? I do not use (at least directly) the functions defined in host_*_isel.c. The fact that Verrou is not yet portable to other architectures comes down to two reasons: - In the C++ call we use intrinsics for the fma. - We need to compile with --enable-only64bit. As I do not need to use verrou on 32-bit architectures, I have postponed the problem. >> As we have to replace all VEX fpo, the way SSE and AVX are handled has consequences for us. For each kind of fpo (add,sub,mul,div,sqrt)x(float,double), we have to replace the VEX op for the following variants: scalar, SSE low lane, SSE, AVX. It is painful but possible via code generation. Thanks to the multiple VEX ops it is possible to select only one type of instruction (it can be useful to 1- get a speed up, 2- know if floating point errors come from scalar or vector instructions). 
>> >> On the other hand, for fma operations (madd,msub)x(float,double) we have less work to do, as valgrind does the un-vectorisation for us, but it is impossible to instrument scalar or vector ops selectively. >As these insns are un-vectorised, are there any other issues besides the >1 (performance) & 2 (original type) mentioned above? I want to make sure whether there is any risk in the un-vectorisation design, e.g. when the vector length is large such as 2k vlen on rvv. As a user of the valgrind framework (i.e. a tool developer), I've no idea about this kind of limitation. Being able to develop a valgrind tool without strong architecture knowledge is a strength of the valgrind framework. >> We could think that the multiple VEX ops enable performance improvements via vectorisation of the C++ call, but that is not possible at the moment (at least to my knowledge). Indeed, with the valgrind API I don't know how I can get the floating-point values in the register without applying un-vectorisation: to get the values in the AVX register, I do an awful sequence of Iop_V256to64_0, Iop_V256to64_1, Iop_V256to64_2, Iop_V256to64_3 for the 2 arguments. As it is not possible to do an IRStmt_Dirty call with a function with 9 args (9 = 2*4+1: 2 args for a binary operation, times 4 for the vector length, plus 1 for the result), I do a first call to copy the 4 values of the first arg somewhere, then a second one to perform the 4 C++ calls. >> Due to the algorithm inside the C++ calls it could be tricky to vectorise, but I didn't even try because of the sequence of Iop_V256to64_*. >For memcheck, the process is as follows if we put it simply: > toIR -> instrumentation -> Backend isel To my understanding, the memcheck tool does only the instrumentation stage; the toIR and backend isel stages are done by the valgrind framework. 
>If the vector insn is split into scalar at the stage of toIR just as I did in this series, the advantage looks obvious as I only need to deal with this single stage and leverage the existing code to handle the scalar >version, the disadvantage is that it might lose some opportunities to optimize, e.g. >* toIR - introduce extra temp variables for generated scalars >* instrumentation - for memcheck, the key is to trace the V+A bits instead of the real results of the ops, the ideal case is V+A of the whole vector can be checked together w/o breaking it to scalars You pinpoint the main difference between verrou and memcheck. The verrou instrumentation cannot be seen as trace generation: we actually modify the floating-point behaviour. >* Backend isel - the ideal case is to use the vector insn on host for guest vector insn, but I'm not sure how much effort will be taken to achieve this. > >Thank you again for this sharing. I hope the discussion can help both of us, and others. I hope so. > >Best regards, >Fei. Best regards, Bruno Lathuilière 
|
|
From: Mark W. <ma...@kl...> - 2023-06-05 09:11:10
|
Sourceware infrastructure community updates for Q2 2023 - https://dwarfstd.org/ joins Sourceware - https://snapshots.sourceware.org/ - Simpler b4 setup - Sourceware joins Software Freedom Conservancy - Sourceware joins the fediverse @sou...@fo... - Mirrors and Software Heritage - Open Office Hours (this Friday) = dwarfstd.org joins Sourceware The DWARF Debugging Standard is now hosted on Sourceware. This includes git.dwarfstd.org, wiki.dwarfstd.org and lists.dwarfstd.org. Sourceware already hosted the old dwarf2 archives https://sourceware.org/legacy-ml/dwarf2/ = snapshots.sourceware.org Thanks to OSUOSL we now have a snapshots server to publish static artifacts built from current git repos in isolated containers. It can be used as an alternative to git hooks or cron jobs to generate snapshots for things like: GNU poke code and doc snapshots: https://snapshots.sourceware.org/gnupoke/trunk/latest/ elfutils code coverage: https://snapshots.sourceware.org/elfutils/coverage/latest/ libabigail website, manuals and api docs: https://snapshots.sourceware.org/libabigail/html-doc/latest/ valgrind snapshots and manuals: https://snapshots.sourceware.org/valgrind/trunk/latest/ DWARF draft spec: https://snapshots.sourceware.org/dwarfstd/dwarf-spec/latest/ The container files and build steps are defined through the builder project. https://inbox.sourceware.org/202...@gn... = Simpler b4 setup Previously the guidance for adding b4 support through inbox.sourceware.org was to add per-project mailing-list b4 settings. But if all your projects have an inbox.sourceware.org mailinglist you can simply use: $ git config --global b4.midmask https://inbox.sourceware.org/%s $ git config --global b4.linkmask https://inbox.sourceware.org/%s This works because public-inbox knows about all message-ids independent of which mailinglist they were posted to, and b4 just needs the message-id. Thanks to Thomas Schwinge for the hint. 
= Sourceware joins Software Freedom Conservancy https://sfconservancy.org/news/2023/may/15/sourceware-joins-sfc/ As the fiscal host of Sourceware, Software Freedom Conservancy will provide a home for fundraising, legal protection and governance that will benefit all projects under Sourceware's care. We share one mission: developing, distributing and advocating for Software Freedom. Together we will offer a worry-free, friendly home for core toolchain and developer tool projects. There will be no big changes; this is just an opportunity to protect confidence in the long-term future of Sourceware. There is a small budget already available which we would like to use for extra redundancy and backup services. But we are happy to discuss other ideas like those mentioned in the original proposal and the Sourceware technical roadmap. https://inbox.sourceware.org/Yw5Q4b%2F2...@el.../ https://inbox.sourceware.org/YrL...@wi.../ = Sourceware joins the fediverse Sourceware joined the fediverse at @sou...@fo... https://fosstodon.org/@sourceware The account will be used for Sourceware announcements and notices about downtime and temporary issues with our network. = Mirrors and Software Heritage We added two rsync and http mirrors in China https://sourceware.org/mirrors.html And the Software Heritage project https://www.softwareheritage.org/ started archiving the active git repos. We are working on also adding the (historic) subversion and cvs archives. This is in addition to the mirrors at SourceHut https://sr.ht/~sourceware/ Thanks to Paul Wise for getting the ball rolling. https://sourceware.org/bugzilla/show_bug.cgi?id=29618 = Overseers Open Office hours Every second Friday of the month is the Overseers Open Office hour in #overseers on irc.libera.chat from 18:00 till 19:00 UTC. That is this Friday, June 9th. 
Of course you are welcome to drop into the #overseers channel at any time and we can also be reached through email and bugzilla: https://sourceware.org/mission.html#organization If you aren't already and want to keep up to date on Sourceware infrastructure services then please also subscribe to the overseers mailinglist. https://sourceware.org/mailman/listinfo/overseers |
|
From: Wu, F. <fe...@in...> - 2023-06-05 01:24:04
|
On 6/1/2023 7:13 PM, LATHUILIERE Bruno via Valgrind-developers wrote: > > -------- Courriel original -------- > Objet: Re: [Valgrind-developers] RFC: support scalable vector model / riscv vector > Date: 2023-05-29 05:29 > De: "Wu, Fei" <fe...@in...> > À: Petr Pavlu <pet...@da...>, Jojo R <rj...@gm...> > Cc: pa...@so..., yun...@al..., val...@li..., > val...@li..., zha...@al... > >> On 5/28/2023 1:06 AM, Petr Pavlu wrote: >>> On 21. Apr 23 17:25, Jojo R wrote: >>>> The last remaining big issue is 3, which we introduce some ad-hoc >>>> approaches to deal with. We summarize these approaches into three >>>> types as >>>> following: >>>> >>>> 1. Break down a vector instruction to scalar VEX IR ops. >>>> 2. Break down a vector instruction to fixed-length VEX IR ops. >>>> 3. Use dirty helpers to realize vector instructions. >>> >>> I would also look at adding new VEX IR ops for scalable vector >>> instructions. In particular, if it could be shown that RVV and SVE can >>> use same new ops then it could make a good argument for adding them. >>> >>> Perhaps interesting is if such new scalable vector ops could also >>> represent fixed operations on other architectures, but that is just me >>> thinking out loud. >>> >> It's a good idea to consolidate all vector/simd together, the challenge is to verify its feasibility and to speedup the adaption progress, as it's supposed to take more efforts and longer time. Is there anyone with knowledge or experience of other ISA such as avx/sve on valgrind >can share the pain and gain, or we can do some quick prototype? >> >> Thanks, >> Fei. > > Hi, > > I don't know if my experience is the one you expect, nevertheless I will try to share it. Hi Bruno, Thank you for sharing this, it's definitely worth reading. > I'm the main developer of a valgrind tool called verrou (url: https://github.com/edf-hpc/verrou ) which currently only works with x86_64 architecture. 
> From user's point of view, verrou enables to estimate the effect of the floating-point rounding error propagation (If you are interested by the subject, there are documentation and publication). > It looks interesting, good job. > From valgrind tool developer's point of view, we need to replace all floating-point operations (fpo) by our own modified fpo implemented with C++ functions. One C++ function has 1,2 or 3 floating point input values and one floating point output value. > Do you use libvex_BackEnd() to translate the insn to host, e.g. host_riscv64_isel.c to select the host insn, Is there any difference of processing flow between verrou and memcheck? > As we have to replace all VEX fpo, the way we handle with SSE and AVX has consequences for us. For each kind of fpo (add,sub,mul,div,sqrt)x(float,double), we have to replace VEX op for the following variants : scalar, SSE low lane, SSE, AVX. It is painful but possible via code generation. Thanks to the multiple VEX ops it is possible to select only one type of instruction (it can be useful to 1- get speed up, 2- know if floating point errors come from scalar or vector instructions). > > On the other hand, for fma operations (madd,msub)x(float,double) we have less work to do, as valgrind do the un-vectorisation for us, but it is impossible to instrument selectively scalar or vector ops. As these insns are un-vectorised, are there any other issues besides the 1 (performance) & 2 (original type) mentioned above? I want to make sure if there is any risk of the un-vectorisation design, e.g. when the vector length is large such as 2k vlen on rvv. > We could think that the multiple VEX ops enable performance improvements via the vectorisation of C++ call, but it is not now possible (at least to my knowledge). 
Indeed, with the valgrind API I don't know how I can get the floating-point values in the register without applying un-vectorisation : To get the values in the AVX register, I do an awful sequence of Iop_V256to64_0, Iop_V256to64_1, Iop_V256to64_2, Iop_V256to64_3 for the 2 arguments. As it is not possible to do a IRStmt_Dirty call with a function with 9 args (9=2*4+1 2 for a binary operation, 4 for the vector length and 1 for the result), I do a first call to copy the 4 values of the first arg somewhere then a second one to perform the 4 C++ calls. > Due to the algorithm inside the C++ calls it could be tricky to vectorise, but I even didn't try because of the sequence of Iop_V256to64_*. For memcheck, the process is as follows if we put it simple: toIR -> instrumentation -> Backend isel If the vector insn is split into scalar at the stage of toIR just as I did in this series, the advantage looks obvious as I only need to deal with this single stage and leverage the existing code to handle the scalar version, the disadvantage is that it might lose some opportunities to optimize, e.g. * toIR - introduce extra temp variables for generated scalars * instrumentation - for memcheck, the key is to trace the V+A bits instead of the real results of the ops, the ideal case is V+A of the whole vector can be checked together w/o breaking it to scalars * Backend isel - the ideal case is to use the vector insn on host for guest vector insn, but I'm not sure how much effort will be taken to achieve this. > In my dreams I would like Iop_ to convert a V256 or V128 type to an aligned pointer on floating point args. > > So, I don't know if my experience can be useful for you, but if someone has a better solution to my needs it will be useful at least ... to me :) > Thank you again for this sharing. I hope the discussion can help both of us, and others. Best regards, Fei. 
> Best regards,
> Bruno Lathuilière
>
> This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval.
>
> If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message.
>
> E-mail communication cannot be guaranteed to be timely, secure, error- or virus-free.
>
> _______________________________________________
> Valgrind-developers mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
From: Mark W. <ma...@so...> - 2023-06-02 10:05:59
https://sourceware.org/git/gitweb.cgi?p=valgrind.git;h=453c7111133ce9dc5dce043e03b7b58efdbf46cd

commit 453c7111133ce9dc5dce043e03b7b58efdbf46cd
Author: Mark Wielaard <ma...@kl...>
Date:   Thu Jun 1 16:10:56 2023 +0200

    memcheck: Handle Err_ReallocSizeZero in MC_(eq_Error)

    When a realloc size zero error is emitted, MC_(eq_Error) is called to
    see if the errors can be deduplicated. This crashed since
    Err_ReallocSizeZero wasn't handled. Handle it like Err_Free.

    Also add a testcase for this case and test with both
    --realloc-zero-bytes-frees=yes and --realloc-zero-bytes-frees=no,
    which will report a different number of errors.

    https://bugs.kde.org/show_bug.cgi?id=470520

Diff:
---
 .gitignore                                            |  1 +
 NEWS                                                  |  1 +
 memcheck/mc_errors.c                                  |  1 +
 memcheck/tests/Makefile.am                            |  7 +++++++
 memcheck/tests/realloc_size_zero_again.c              | 15 +++++++++++++++
 memcheck/tests/realloc_size_zero_again_no.stderr.exp  | 18 ++++++++++++++++++
 memcheck/tests/realloc_size_zero_again_no.stdout.exp  |  0
 memcheck/tests/realloc_size_zero_again_no.vgtest      |  2 ++
 memcheck/tests/realloc_size_zero_again_yes.stderr.exp | 18 ++++++++++++++++++
 memcheck/tests/realloc_size_zero_again_yes.stdout.exp |  0
 memcheck/tests/realloc_size_zero_again_yes.vgtest     |  2 ++
 11 files changed, 65 insertions(+)

diff --git a/.gitignore b/.gitignore
index 076e168ded..6d73324cea 100644
--- a/.gitignore
+++ b/.gitignore
@@ -953,6 +953,7 @@
 /memcheck/tests/post-syscall
 /memcheck/tests/reach_thread_register
 /memcheck/tests/realloc_size_zero
+/memcheck/tests/realloc_size_zero_again
 /memcheck/tests/realloc_size_zero_mismatch
 /memcheck/tests/realloc1
 /memcheck/tests/realloc2
diff --git a/NEWS b/NEWS
index ea9fc7c868..09f8c71370 100644
--- a/NEWS
+++ b/NEWS
@@ -37,6 +37,7 @@ are not entered into bugzilla tend to get forgotten about or ignored.
 469049  link failure on ppc64 (big endian) valgrind 3.20
 469146  massif --ignore-fn does not ignore inlined functions
 469768  Make it possible to install gdb scripts in a different location
+470520  Multiple realloc zero errors crash in MC_(eq_Error)

 To see details of a given bug, visit
   https://bugs.kde.org/show_bug.cgi?id=XXXXXX
diff --git a/memcheck/mc_errors.c b/memcheck/mc_errors.c
index 00d6ec301e..65210a2209 100644
--- a/memcheck/mc_errors.c
+++ b/memcheck/mc_errors.c
@@ -1041,6 +1041,7 @@ Bool MC_(eq_Error) ( VgRes res, const Error* e1, const Error* e2 )
       case Err_IllegalMempool:
       case Err_Overlap:
       case Err_Cond:
+      case Err_ReallocSizeZero:
          return True;

       case Err_FishyValue:
diff --git a/memcheck/tests/Makefile.am b/memcheck/tests/Makefile.am
index 71c38acbaf..5a17fd35d4 100644
--- a/memcheck/tests/Makefile.am
+++ b/memcheck/tests/Makefile.am
@@ -291,8 +291,14 @@ EXTRA_DIST = \
 	realloc_size_zero.vgtest \
 	realloc_size_zero_yes.stderr.exp realloc_size_zero_yes.stdout.exp \
 	realloc_size_zero_yes.vgtest \
+	realloc_size_zero_again_yes.stderr.exp \
+	realloc_size_zero_again_yes.stdout.exp \
+	realloc_size_zero_again_yes.vgtest \
 	realloc_size_zero_no.stderr.exp realloc_size_zero_no.stdout.exp \
 	realloc_size_zero_no.vgtest \
+	realloc_size_zero_again_no.stderr.exp \
+	realloc_size_zero_again_no.stdout.exp \
+	realloc_size_zero_again_no.vgtest \
 	realloc_size_zero_off.stderr.exp realloc_size_zero_off.stdout.exp \
 	realloc_size_zero_off.vgtest \
 	realloc_size_zero_mismatch.stderr.exp \
@@ -459,6 +465,7 @@ check_PROGRAMS = \
 	posix_memalign \
 	post-syscall \
 	realloc_size_zero realloc_size_zero_mismatch \
+	realloc_size_zero_again \
 	realloc1 realloc2 realloc3 \
 	recursive-merge \
 	resvn_stack \
diff --git a/memcheck/tests/realloc_size_zero_again.c b/memcheck/tests/realloc_size_zero_again.c
new file mode 100644
index 0000000000..782d4bde5f
--- /dev/null
+++ b/memcheck/tests/realloc_size_zero_again.c
@@ -0,0 +1,15 @@
+#include <stdlib.h>
+
+int
+main ()
+{
+  char *p = malloc (1024);
+
+  for (int i = 3; i >= 0; i--)
+    for (int j = 0; j <= 3; j++)
+      {
+        char *q = realloc (p, i * j * 512);
+        p = q;
+      }
+
+  free (p);
+}
diff --git a/memcheck/tests/realloc_size_zero_again_no.stderr.exp b/memcheck/tests/realloc_size_zero_again_no.stderr.exp
new file mode 100644
index 0000000000..b9c061d1ad
--- /dev/null
+++ b/memcheck/tests/realloc_size_zero_again_no.stderr.exp
@@ -0,0 +1,18 @@
+realloc() with size 0
+   at 0x........: realloc (vg_replace_malloc.c:...)
+   ...
+ Address 0x........ is 0 bytes inside a block of size 1,024 alloc'd
+   at 0x........: malloc (vg_replace_malloc.c:...)
+   ...
+
+ERROR SUMMARY: 7 errors from 1 contexts (suppressed: 0 from 0)
+
+7 errors in context 1 of 1:
+realloc() with size 0
+   at 0x........: realloc (vg_replace_malloc.c:...)
+   ...
+ Address 0x........ is 0 bytes inside a block of size 1,024 alloc'd
+   at 0x........: malloc (vg_replace_malloc.c:...)
+   ...
+
+ERROR SUMMARY: 7 errors from 1 contexts (suppressed: 0 from 0)
diff --git a/memcheck/tests/realloc_size_zero_again_no.stdout.exp b/memcheck/tests/realloc_size_zero_again_no.stdout.exp
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/memcheck/tests/realloc_size_zero_again_no.vgtest b/memcheck/tests/realloc_size_zero_again_no.vgtest
new file mode 100644
index 0000000000..f1757b6c19
--- /dev/null
+++ b/memcheck/tests/realloc_size_zero_again_no.vgtest
@@ -0,0 +1,2 @@
+prog: realloc_size_zero_again
+vgopts: -q -s --realloc-zero-bytes-frees=no
diff --git a/memcheck/tests/realloc_size_zero_again_yes.stderr.exp b/memcheck/tests/realloc_size_zero_again_yes.stderr.exp
new file mode 100644
index 0000000000..d40aa24550
--- /dev/null
+++ b/memcheck/tests/realloc_size_zero_again_yes.stderr.exp
@@ -0,0 +1,18 @@
+realloc() with size 0
+   at 0x........: realloc (vg_replace_malloc.c:...)
+   ...
+ Address 0x........ is 0 bytes inside a block of size 1,024 alloc'd
+   at 0x........: malloc (vg_replace_malloc.c:...)
+   ...
+
+ERROR SUMMARY: 5 errors from 1 contexts (suppressed: 0 from 0)
+
+5 errors in context 1 of 1:
+realloc() with size 0
+   at 0x........: realloc (vg_replace_malloc.c:...)
+   ...
+ Address 0x........ is 0 bytes inside a block of size 1,024 alloc'd
+   at 0x........: malloc (vg_replace_malloc.c:...)
+   ...
+
+ERROR SUMMARY: 5 errors from 1 contexts (suppressed: 0 from 0)
diff --git a/memcheck/tests/realloc_size_zero_again_yes.stdout.exp b/memcheck/tests/realloc_size_zero_again_yes.stdout.exp
new file mode 100644
index 0000000000..e69de29bb2
diff --git a/memcheck/tests/realloc_size_zero_again_yes.vgtest b/memcheck/tests/realloc_size_zero_again_yes.vgtest
new file mode 100644
index 0000000000..215392ed62
--- /dev/null
+++ b/memcheck/tests/realloc_size_zero_again_yes.vgtest
@@ -0,0 +1,2 @@
+prog: realloc_size_zero_again
+vgopts: -q -s --realloc-zero-bytes-frees=yes
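For the testcase above, the 7 errors reported with --realloc-zero-bytes-frees=no follow directly from counting the loop iterations whose requested size i*j*512 is zero. A quick standalone check (not part of the commit):

```c
#include <assert.h>

/* Count how many of the 16 realloc calls in realloc_size_zero_again.c
 * request size 0: i runs 3..0, j runs 0..3, and the size is i*j*512. */
static int count_zero_size_reallocs(void) {
    int zeros = 0;
    for (int i = 3; i >= 0; i--)
        for (int j = 0; j <= 3; j++)
            if (i * j * 512 == 0)
                zeros++;
    return zeros;
}
```

i=0 contributes four zero-size calls and j=0 contributes three more (for i = 3, 2, 1), giving the 7 errors in the =no expected output. With =yes the count differs because a zero-size realloc frees the block, so some of the later calls operate on a NULL pointer rather than a live block.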
From: LATHUILIERE B. <bru...@ed...> - 2023-06-01 11:29:19
-------- Original message --------
Subject: Re: [Valgrind-developers] RFC: support scalable vector model / riscv vector
Date: 2023-05-29 05:29
From: "Wu, Fei" <fe...@in...>
To: Petr Pavlu <pet...@da...>, Jojo R <rj...@gm...>
Cc: pa...@so..., yun...@al..., val...@li..., val...@li..., zha...@al...

>On 5/28/2023 1:06 AM, Petr Pavlu wrote:
>> On 21. Apr 23 17:25, Jojo R wrote:
>>> The last remaining big issue is 3, which we introduce some ad-hoc
>>> approaches to deal with. We summarize these approaches into three
>>> types as following:
>>>
>>> 1. Break down a vector instruction to scalar VEX IR ops.
>>> 2. Break down a vector instruction to fixed-length VEX IR ops.
>>> 3. Use dirty helpers to realize vector instructions.
>>
>> I would also look at adding new VEX IR ops for scalable vector
>> instructions. In particular, if it could be shown that RVV and SVE can
>> use the same new ops then it could make a good argument for adding them.
>>
>> Perhaps interesting is if such new scalable vector ops could also
>> represent fixed operations on other architectures, but that is just me
>> thinking out loud.
>>
>It's a good idea to consolidate all vector/SIMD handling together; the
>challenge is to verify its feasibility and to speed up the adaptation
>process, as it's expected to take more effort and a longer time. Is there
>anyone with knowledge or experience of other ISAs such as AVX/SVE on
>valgrind who can share the pain and gain, or could we do some quick
>prototype?
>
>Thanks,
>Fei.

Hi,

I don't know if my experience is the one you expect; nevertheless I will
try to share it. I'm the main developer of a valgrind tool called verrou
(url: https://github.com/edf-hpc/verrou ), which currently only works with
the x86_64 architecture.

From the user's point of view, verrou makes it possible to estimate the
effect of floating-point rounding error propagation (if you are interested
in the subject, there are documentation and publications).
From the valgrind tool developer's point of view, we need to replace all
floating-point operations (fpo) with our own modified fpo implemented as C++
functions. One C++ function has 1, 2 or 3 floating-point input values and one
floating-point output value.

As we have to replace all VEX fpo, the way we handle SSE and AVX has
consequences for us. For each kind of fpo (add,sub,mul,div,sqrt)x(float,double),
we have to replace the VEX op for the following variants: scalar, SSE low lane,
SSE, AVX. It is painful but possible via code generation. Thanks to the
multiple VEX ops it is possible to select only one type of instruction (this
can be useful to 1- get a speed-up, 2- know whether floating-point errors come
from scalar or vector instructions).

On the other hand, for fma operations (madd,msub)x(float,double) we have less
work to do, as valgrind does the un-vectorisation for us, but it is impossible
to instrument scalar or vector ops selectively.

We could think that the multiple VEX ops enable performance improvements via
vectorisation of the C++ call, but that is not currently possible (at least to
my knowledge). Indeed, with the valgrind API I don't know how I can get the
floating-point values in the register without applying un-vectorisation: to
get the values in the AVX register, I do an awful sequence of Iop_V256to64_0,
Iop_V256to64_1, Iop_V256to64_2, Iop_V256to64_3 for the 2 arguments. As it is
not possible to do an IRStmt_Dirty call with a function with 9 args
(9 = 2*4+1: 2 for a binary operation, 4 for the vector length and 1 for the
result), I do a first call to copy the 4 values of the first arg somewhere,
then a second one to perform the 4 C++ calls. Due to the algorithm inside the
C++ calls it could be tricky to vectorise, but I didn't even try because of
the sequence of Iop_V256to64_*.

In my dreams I would like an Iop_ to convert a V256 or V128 type to an aligned
pointer on floating-point args.
So, I don't know if my experience can be useful for you, but if someone has
a better solution to my needs it will be useful at least ... to me :)

Best regards,
Bruno Lathuilière
From: Wu, F. <fe...@in...> - 2023-05-29 03:29:55
On 5/28/2023 1:06 AM, Petr Pavlu wrote:
> On 21. Apr 23 17:25, Jojo R wrote:
>> We are considering adding the RVV/Vector [1] feature in valgrind; there are
>> some challenges.
>> RVV is like ARM's SVE [2] programming model: it is scalable/VLA, which means
>> the vector length is agnostic.
>> ARM's SVE is not supported in valgrind :(
>>
>> There are three major issues in implementing RVV instruction set in Valgrind
>> as following:
>>
>> 1. Scalable vector register width VLENB
>> 2. Runtime changing property of LMUL and SEW
>> 3. Lack of proper VEX IR to represent all vector operations
>>
>> We propose applicable methods to solve 1 and 2. As for 3, we explore several
>> possible but maybe imperfect approaches to handle different cases.
>>
>> We start from 1. As each guest register should be described in VEXGuestState
>> struct, the vector registers with scalable width of VLENB can be added into
>> VEXGuestState as arrays using an allowable maximum length like 2048/4096.
>
> Size of VexGuestRISCV64State is currently 592 bytes. Adding these large
> vector registers will bump it by 32*2048/8=8192 bytes.
>
Yes, that's the reason the vlen is set to 128 in my RFC patches; that's
the largest room for vectors in the current design.
> The baseblock layout in VEX is: the guest state, two equal sized areas
> for shadow state and then a spill area. The RISC-V port accesses the
> baseblock in generated code via x8/s0. The register is set to the
> address of the baseblock+2048 (file
> coregrind/m_dispatch/dispatch-riscv64-linux.S). The extra offset is
> a small optimization to utilize the fact that load/store instructions in
> RVI have a signed offset in range [-2048,2047]. The end result is that
> it is possible to access the baseblock data using only a single
> instruction.
>
Nice design.
> Adding the new vector registers will mean that more instructions are
> necessary. For instance, accessing any shadow guest state would
> naively require a sequence of LUI+ADDI+LOAD/STORE.
>
> I suspect this could affect performance quite a bit and might need some
> optimizing.
>
Yes. Can we separate the vector registers from the other ones? Is it
possible to use two baseblocks? Or we could do some experiments to
measure the overhead.
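The addressing trick Petr describes can be sketched in plain C (illustrative only, not the actual generated code): RVI load/store instructions take a signed 12-bit immediate, so with the base register holding &baseblock + 2048, every guest-state byte in the first 4 KiB is reachable with a single instruction.

```c
#include <assert.h>

/* RVI I-type loads/stores take a signed 12-bit immediate,
 * i.e. offsets in [-2048, 2047]. */
static int fits_in_simm12(long off) { return off >= -2048 && off <= 2047; }

/* Offset emitted in the load/store when the base register holds
 * &baseblock + 2048, for a guest-state byte at state_off. */
static long insn_offset(long state_off) { return state_off - 2048; }
```

So baseblock bytes [0, 4095] are reachable in one instruction; adding 8 KiB of vector state pushes the shadow areas past that window, which is why the LUI+ADDI+LOAD/STORE sequences mentioned above would become necessary.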
>>
>> The actual available access range can be determined at Valgrind startup time
>> by querying the CPU for its vector capability or some suitable setup steps.
>
> Something to consider is that the virtual CPU provided by Valgrind does
> not necessarily need to match the host CPU. For instance, VEX could
> hardcode that its vector registers are only 128 bits in size.
>
> I was originally hoping that this is how support for the V extension
> could be added, but the LMUL grouping looks to break this model.
>
Originally I had the same idea, but hardware with a 128-bit vlen cannot
run software built for a larger vlen; e.g. clang has the option
-riscv-v-vector-bits-min, and if it's set to 256, it assumes the
underlying hardware has a vlen of at least 256.
>>
>>
>> To solve problem 2, we are inspired by already-proven techniques in QEMU,
>> where translation blocks are broken up when certain critical CSRs are set.
>> Because the guest code to IR translation relies on the precise value of
>> LMUL/SEW and they may change within a basic block, we can break up the basic
>> block each time encountering a vsetvl{i} instruction and return to the
>> scheduler to execute the translated code and update LMUL/SEW. Accordingly,
>> translation cache management should be refactored to detect the changing of
>> LMUL/SEW to invalidate outdated code cache. Without losing the generality,
>> the LMUL/SEW should be encoded into an ULong flag such that other
>> architectures can leverage this flag to store their arch-dependent
>> information. The TTentry struct should also take the flag into account no
>> matter insertion or deletion. By doing this, the flag carries the newest
>> LMUL/SEW throughout the simulation and can be passed to disassemble
>> functions using the VEXArchInfo struct such that we can get the real and
>> newest value of LMUL and SEW to facilitate our translation.
>>
>> Also, some architecture-related code should be taken care of. Like
>> m_dispatch part, disp_cp_xindir function looks up code cache using hardcoded
>> assembly by checking the requested guest state IP and translation cache
>> entry address with no more constraints. Many other modules should be checked
>> to ensure the in-time update of LMUL/SEW is instantly visible to essential
>> parts in Valgrind.
>>
>>
>> The last remaining big issue is 3, which we introduce some ad-hoc approaches
>> to deal with. We summarize these approaches into three types as following:
>>
>> 1. Break down a vector instruction to scalar VEX IR ops.
>> 2. Break down a vector instruction to fixed-length VEX IR ops.
>> 3. Use dirty helpers to realize vector instructions.
>
> I would also look at adding new VEX IR ops for scalable vector
> instructions. In particular, if it could be shown that RVV and SVE can
> use same new ops then it could make a good argument for adding them.
>
> Perhaps interesting is if such new scalable vector ops could also
> represent fixed operations on other architectures, but that is just me
> thinking out loud.
>
It's a good idea to consolidate all vector/SIMD handling together; the
challenge is to verify its feasibility and to speed up the adaptation
process, as it's expected to take more effort and a longer time. Is there
anyone with knowledge or experience of other ISAs such as AVX/SVE on
valgrind who can share the pain and gain, or could we do some quick
prototype?
Thanks,
Fei.
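The LMUL/SEW-in-a-ULong-flag scheme proposed in the quoted text could be sketched like this; the field layout and names are purely hypothetical, not from any existing patch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: 3 bits for the SEW field and 3 bits for the
 * LMUL field of vtype, packed into the low bits of the per-translation
 * ULong flag; the remaining bits stay free for other architectures'
 * arch-dependent information. */
enum { FLAG_SEW_SHIFT = 0, FLAG_LMUL_SHIFT = 3 };

static uint64_t mk_vec_flag(uint64_t sew_field, uint64_t lmul_field) {
    return (sew_field << FLAG_SEW_SHIFT) | (lmul_field << FLAG_LMUL_SHIFT);
}
static uint64_t flag_sew(uint64_t f)  { return (f >> FLAG_SEW_SHIFT) & 7; }
static uint64_t flag_lmul(uint64_t f) { return (f >> FLAG_LMUL_SHIFT) & 7; }
```

Translation-cache lookup and invalidation would then compare this flag alongside the guest IP, so code translated under one LMUL/SEW setting is never reused under another.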
>> [...]
>> In summary, we are far from reaching a truly applicable solution for adding vector
>> extensions in Valgrind. We need to do detailed and comprehensive estimations
>> on different vector instruction categories.
>>
>> Any feedback is welcome in github [3] also.
>>
>>
>> [1] https://github.com/riscv/riscv-v-spec
>>
>> [2] https://community.arm.com/arm-research/b/articles/posts/the-arm-scalable-vector-extension-sve
>>
>> [3] https://github.com/petrpavlu/valgrind-riscv64/issues/17
>
> Sorry for not being more helpful at this point. As mentioned in the
> GitHub issue, I still need to get myself more familiar with RVV and how
> Valgrind handles vector instructions.
>
> Thanks,
> Petr
From: Wu, F. <fe...@in...> - 2023-05-29 02:24:43
On 5/28/2023 1:11 AM, Petr Pavlu wrote:
> On 26. May 23 21:59, Fei Wu wrote:
>> I'm from Intel RISC-V team and working on a RISC-V International
>> development partner project to add RISC-V vector (RVV) support on
>> Valgrind, the target tool is memcheck. My work bases on commit
>> 71272b252977 of Petr's riscv64-linux branch, many thanks to Petr for his
>> great work first.
>> https://github.com/petrpavlu/valgrind-riscv64
>>
>> This RFC is a starting point of RVV support on Valgrind. It's far from
>> complete, which will take huge time, but I do think it's more effective
>> to have some real code for discussion, so this series adds the RVV
>> support to run memcpy/strcmp/strcpy/strlen/strncpy in:
>> https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/master/examples
>>
>> The whole idea is splitting the vector instructions into scalar
>> instructions which have already been well supported on Petr's branch;
>> the correctness of binary translation (tool=none) is simple to ensure,
>> but the logic of tool=memcheck should not be broken, and one of the keys
>> is to deal with the instructions with mask:
>>
>> [...]
>>
>> At last, if the performance is tolerable, is this the right way to go?
>
> Have you seen the recent mail about RVV to this list from Jojo [1]? It
> has some discussion on breaking vector operations down to scalars too.
>
Thank you for pointing it out, it's a nice writeup; now I have subscribed to
the mailing list in order to capture all the traffic.
> It seems you are both looking at the same topic. It would be good if you
> can cooperate on this, if that is not already the case.
>
Agreed, I will sync up with him.

Thanks,
Fei.

> [1] https://sourceforge.net/p/valgrind/mailman/valgrind-developers/thread/84b7a55c-1868-ca14-2626-ffb88925741a%40linux.alibaba.com/
>
> Thanks,
> Petr
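The masked-instruction handling elided above ("[...]") could, for the splitting approach, look roughly like this standalone sketch. It is illustrative C, not actual VEX IR, and it assumes a mask-undisturbed policy where inactive elements keep the old destination value (RVV also allows a mask-agnostic policy, selected via vma):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative masked add over vl elements: active elements (mask bit
 * set) are computed; inactive elements keep the destination's previous
 * value. A tool like memcheck must model this so that the V bits of
 * untouched lanes are not clobbered when the op is split into scalars. */
static void masked_add(const int64_t *a, const int64_t *b,
                       const uint8_t *mask, int64_t *dst, int vl) {
    for (int i = 0; i < vl; i++)
        if (mask[i])
            dst[i] = a[i] + b[i];
        /* else: dst[i] left undisturbed */
}
```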
From: Petr P. <pet...@da...> - 2023-05-27 17:25:57
On 21. Apr 23 17:25, Jojo R wrote:
> We are considering adding the RVV/Vector [1] feature in valgrind; there are
> some challenges.
> RVV is like ARM's SVE [2] programming model: it is scalable/VLA, which means
> the vector length is agnostic.
> ARM's SVE is not supported in valgrind :(
>
> There are three major issues in implementing RVV instruction set in Valgrind
> as following:
>
> 1. Scalable vector register width VLENB
> 2. Runtime changing property of LMUL and SEW
> 3. Lack of proper VEX IR to represent all vector operations
>
> We propose applicable methods to solve 1 and 2. As for 3, we explore several
> possible but maybe imperfect approaches to handle different cases.
>
> We start from 1. As each guest register should be described in VEXGuestState
> struct, the vector registers with scalable width of VLENB can be added into
> VEXGuestState as arrays using an allowable maximum length like 2048/4096.
Size of VexGuestRISCV64State is currently 592 bytes. Adding these large
vector registers will bump it by 32*2048/8=8192 bytes.
The baseblock layout in VEX is: the guest state, two equal sized areas
for shadow state and then a spill area. The RISC-V port accesses the
baseblock in generated code via x8/s0. The register is set to the
address of the baseblock+2048 (file
coregrind/m_dispatch/dispatch-riscv64-linux.S). The extra offset is
a small optimization to utilize the fact that load/store instructions in
RVI have a signed offset in range [-2048,2047]. The end result is that
it is possible to access the baseblock data using only a single
instruction.
Adding the new vector registers will mean that more instructions are
necessary. For instance, accessing any shadow guest state would
naively require a sequence of LUI+ADDI+LOAD/STORE.
I suspect this could affect performance quite a bit and might need some
optimizing.
>
> The actual available access range can be determined at Valgrind startup time
> by querying the CPU for its vector capability or some suitable setup steps.
Something to consider is that the virtual CPU provided by Valgrind does
not necessarily need to match the host CPU. For instance, VEX could
hardcode that its vector registers are only 128 bits in size.
I was originally hoping that this is how support for the V extension
could be added, but the LMUL grouping looks to break this model.
>
>
> To solve problem 2, we are inspired by already-proven techniques in QEMU,
> where translation blocks are broken up when certain critical CSRs are set.
> Because the guest code to IR translation relies on the precise value of
> LMUL/SEW and they may change within a basic block, we can break up the basic
> block each time encountering a vsetvl{i} instruction and return to the
> scheduler to execute the translated code and update LMUL/SEW. Accordingly,
> translation cache management should be refactored to detect the changing of
> LMUL/SEW to invalidate outdated code cache. Without losing the generality,
> the LMUL/SEW should be encoded into an ULong flag such that other
> architectures can leverage this flag to store their arch-dependent
> information. The TTentry struct should also take the flag into account no
> matter insertion or deletion. By doing this, the flag carries the newest
> LMUL/SEW throughout the simulation and can be passed to disassemble
> functions using the VEXArchInfo struct such that we can get the real and
> newest value of LMUL and SEW to facilitate our translation.
>
> Also, some architecture-related code should be taken care of. Like
> m_dispatch part, disp_cp_xindir function looks up code cache using hardcoded
> assembly by checking the requested guest state IP and translation cache
> entry address with no more constraints. Many other modules should be checked
> to ensure the in-time update of LMUL/SEW is instantly visible to essential
> parts in Valgrind.
>
>
> The last remaining big issue is 3, which we introduce some ad-hoc approaches
> to deal with. We summarize these approaches into three types as following:
>
> 1. Break down a vector instruction to scalar VEX IR ops.
> 2. Break down a vector instruction to fixed-length VEX IR ops.
> 3. Use dirty helpers to realize vector instructions.
I would also look at adding new VEX IR ops for scalable vector
instructions. In particular, if it could be shown that RVV and SVE can
use same new ops then it could make a good argument for adding them.
Perhaps interesting is if such new scalable vector ops could also
represent fixed operations on other architectures, but that is just me
thinking out loud.
> [...]
> In summary, we are far from reaching a truly applicable solution for adding vector
> extensions in Valgrind. We need to do detailed and comprehensive estimations
> on different vector instruction categories.
>
> Any feedback is welcome in github [3] also.
>
>
> [1] https://github.com/riscv/riscv-v-spec
>
> [2] https://community.arm.com/arm-research/b/articles/posts/the-arm-scalable-vector-extension-sve
>
> [3] https://github.com/petrpavlu/valgrind-riscv64/issues/17
Sorry for not being more helpful at this point. As mentioned in the
GitHub issue, I still need to get myself more familiar with RVV and how
Valgrind handles vector instructions.
Thanks,
Petr
From: Petr P. <pet...@da...> - 2023-05-27 17:21:08
On 26. May 23 21:59, Fei Wu wrote:
> I'm from Intel RISC-V team and working on a RISC-V International
> development partner project to add RISC-V vector (RVV) support on
> Valgrind, the target tool is memcheck. My work bases on commit
> 71272b252977 of Petr's riscv64-linux branch, many thanks to Petr for his
> great work first.
> https://github.com/petrpavlu/valgrind-riscv64
>
> This RFC is a starting point of RVV support on Valgrind, It's far from
> complete, which will take huge time, but I do think it's more effective
> to have some real code for discussion, so this series adds the RVV
> support to run memcpy/strcmp/strcpy/strlen/strncpy in:
> https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/master/examples
>
> The whole idea is splitting the vector instructions into scalar
> instructions which have already been well supported on Petr's branch,
> the correctness of binary translation (tool=none) is simple to ensure,
> but the logic of tool=memcheck should not be broken, one of the keys is
> to deal with the instructions with mask:
>
> [...]
>
> At last, if the performance is tolerable, is this the right way to go?

Have you seen the recent mail about RVV to this list from Jojo [1]? It
has some discussion on breaking vector operations down to scalars too.

It seems you are both looking at the same topic. It would be good if you
can cooperate on this, if that is not already the case.

[1] https://sourceforge.net/p/valgrind/mailman/valgrind-developers/thread/84b7a55c-1868-ca14-2626-ffb88925741a%40linux.alibaba.com/

Thanks,
Petr