From: Stuart F. <smf...@nt...> - 2023-07-19 11:39:43

I am trying to find which of my systems will run Valgrind. I know it will not run on my AMD FX-8370 and AMD FX-4350 systems. Does anyone know if it should run on my AMD Ryzen 5 5600X (see failure below)? I have access to an Intel Core i7 laptop (Haswell); would I stand a better chance with that? I am reluctant to move my whole project to the laptop if there is no chance of Valgrind working there too.

==5096== Memcheck, a memory error detector
==5096== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==5096== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==5096== Command: QtWeather -s moira2
==5096==
vex amd64->IR: unhandled instruction bytes: 0xC4 0xE2 0x7D 0xDC 0xC9 0x48 0x39 0xD1 0x73 0x37
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=1 VEX.L=1 VEX.nVVVV=0x0 ESC=0F38
vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0
==5096== valgrind: Unrecognised instruction at address 0x5ee6282.
==5096==    at 0x5EE6282: aeshash256_ge32(long long __vector(4), unsigned char const*, unsigned long) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x5FEBFD1: QFactoryLoaderPrivate::updateSinglePath(QString const&) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x5FE8403: QFactoryLoader::update() (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x5FE8906: QFactoryLoader::QFactoryLoader(char const*, QString const&, Qt::CaseSensitivity) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x54DC115: QPlatformIntegrationFactory::keys(QString const&) (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x54A1A36: init_platform(QString const&, QString const&, QString const&, int&, char**) (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x54A58DF: QGuiApplicationPrivate::createPlatformIntegration() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x54A6517: QGuiApplicationPrivate::createEventDispatcher() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x5F64804: QCoreApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x54A9979: QGuiApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x4C28708: QApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Widgets.so.6.4.0)
==5096==    by 0x124098: main (in /usr/bin/QtWeather)
==5096== Your program just tried to execute an instruction that Valgrind
==5096== did not recognise. There are two possible reasons for this.
==5096== 1. Your program has a bug and erroneously jumped to a non-code
==5096==    location. If you are running Memcheck and you just saw a
==5096==    warning about a bad jump, it's probably your program's fault.
==5096== 2. The instruction is legitimate but Valgrind doesn't handle it,
==5096==    i.e. it's Valgrind's fault. If you think this is the case or
==5096==    you are not sure, please let us know and we'll try to fix it.
==5096== Either way, Valgrind will now raise a SIGILL signal which will
==5096== probably kill your program.
==5096==
...

Thanks
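The "unhandled instruction bytes" line above actually identifies the problem precisely. The sketch below (an editorial illustration, not Valgrind code) unpacks the three-byte VEX prefix of `C4 E2 7D DC` by hand and shows it is the 256-bit form of AESENC, i.e. the VAES extension. Zen 3 (the Ryzen 5 5600X) advertises VAES, and Qt 6 presumably selects this AES-based hash path at run time when the CPU supports it, which would explain why the same binary runs under Valgrind on older CPUs; a Haswell i7 has AES-NI but not VAES, so Qt should fall back to a path Valgrind 3.21 can decode.

```python
# Decode the first four bytes reported by Valgrind: C4 E2 7D DC
insn = [0xC4, 0xE2, 0x7D, 0xDC]

assert insn[0] == 0xC4              # 0xC4 introduces a three-byte VEX prefix
mmmmm = insn[1] & 0x1F              # opcode-map selector (low 5 bits of byte 1)
vex_l = (insn[2] >> 2) & 1          # vector length: 0 = 128-bit, 1 = 256-bit
pp    = insn[2] & 0x3               # implied legacy prefix: 0b01 = 0x66
vvvv  = (~(insn[2] >> 3)) & 0xF     # inverted extra operand register field
opcode = insn[3]

opmap = {0x01: "0F", 0x02: "0F38", 0x03: "0F3A"}[mmmmm]
print(opmap, hex(opcode), "VEX.L =", vex_l, "pp =", pp)
# prints: 0F38 0xdc VEX.L = 1 pp = 1
# 0x66-prefixed 0F38 /0xDC is AESENC; VEX.L=1 makes it the 256-bit
# (ymm) form, which requires the VAES extension, not plain AES-NI.
```

This matches Valgrind's own diagnosis in the log (VEX=1 VEX.L=1 ESC=0F38, PFX.66=1).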
From: Wu, F. <fe...@in...> - 2023-07-19 01:25:15

On 7/19/2023 3:08 AM, Petr Pavlu wrote:
> On 11. Jul 23 19:28, Wu, Fei wrote:
>> On 7/11/2023 4:50 AM, Petr Pavlu wrote:
>>> On 6. Jul 23 20:39, Wu, Fei wrote:
>>>> [...]
>>>>
>>>> This approach will introduce a bunch of new vlen Vector IRs, especially the arithmetic IRs such as vadd, my goal is for a good solution which takes reasonable time to reach usable status, yet still be able to evolve and generic enough for other vector ISA. Any comments?
>
> This personally looks to me as a right direction. Supporting scalable vector extensions in Valgrind as a first-class citizen would be my preferred choice. I think it is something that will be needed to handle Arm SVE and RISC-V RVV well. On the other hand, it is likely the most complex approach and could take time to iron out.
>
>>> Could you please share a repository with your changes or send them to me as patches? I have a few questions but I think it might be easier for me first to see the actual code.
>>>
>> Please see attachment. It's a very raw version to just verify the idea, mask is not added but expected to be done as mentioned above, it's based on commit 71272b2529 on your branch, patch 0013 is the key.
>
> Thanks for sharing this code. The previous discussions and this series introduces a new concept of translating client code per some CPU state. That is something I spent most time thinking about.
>
> I can see it is indeed necessary for RVV. In particular, this "versioning" of translations allows that Valgrind IR can statically express an element type of each vector operation, i.e. that it is an operation on I32, F64, ... An alternative would be to try to express the type dynamically in IR. That should be still somewhat manageable in the toIR frontend but I have a hard time seeing how it would work for the instrumentation and codegen.
>
> The versioning should work well for RVV translations because my expectation is that most RVV loops will consist of a call to vsetvli (with a static vtype), followed by some actual vector operations. Such a block then requires only one translation.
>
> This is however true only if translations are versioned just per vtype, without vl. If I understood correctly, the patches version them per vl too but it isn't clear to me conceptually if this is really necessary.

Yes, this series does version vl. It helps the situation such as in the last patch: it can break a large vl into multiple small-vl operations, in case the backend doesn't have a register allocation algorithm for LMUL>1.

> For instance, I think VAdd8 could look as follows: VAdd8(<len>, <in1>, <in2>, <flags?>) where <len> is something as IRExpr_Get(OFFB_VL, Ity_I64).
>
> Another problem which I noticed is that blocks containing no RVV instructions are also versioned. Consider the following:
>
> while (true) {
>   // (1) some RVV code which can set vtype to different values
>   // (2) a large chunk of non-RVV code
> }
>
> The code in (2) will currently have multiple same translations for each residue left in vtype by (1).

Yes, indeed. This is one place to optimize.

> In general, I think the concept of allowing translations per some CPU state could be useful in other cases and for other architectures too. For RISC-V, it could be beneficial for floating-point operations. My expectation is that regular RISC-V FP code will have instructions with encoded rm=DYN and always executed with frm=RNE. The current approach is that the toIR frontend generates an IR which reads the rounding mode from frm and remaps it to the Valgrind's representation. The codegen then does the opposite. The idea here is that the frontend would know the actual rounding mode and could create IR which has directly this mode, for instance, AddF64(Irrm_NEAREST, <in1>, <in2>). The codegen then doesn't need to know how to handle any dynamic rounding modes as they become static.
>
> I plan to look further into this series. Specifically, I'd like to have a stab at adding some basic support for Arm SVE to get a better understanding if this is generic enough.

Great, I will add more RVV support if it's proved to be the right direction, and thank you for the review.

Thanks,
Fei.

> Thanks,
> Petr
From: Petr P. <pet...@da...> - 2023-07-18 19:26:03

On 11. Jul 23 19:28, Wu, Fei wrote:
> On 7/11/2023 4:50 AM, Petr Pavlu wrote:
>> On 6. Jul 23 20:39, Wu, Fei wrote:
>>> [...]
>>>
>>> This approach will introduce a bunch of new vlen Vector IRs, especially the arithmetic IRs such as vadd, my goal is for a good solution which takes reasonable time to reach usable status, yet still be able to evolve and generic enough for other vector ISA. Any comments?

This personally looks to me as a right direction. Supporting scalable vector extensions in Valgrind as a first-class citizen would be my preferred choice. I think it is something that will be needed to handle Arm SVE and RISC-V RVV well. On the other hand, it is likely the most complex approach and could take time to iron out.

>> Could you please share a repository with your changes or send them to me as patches? I have a few questions but I think it might be easier for me first to see the actual code.
>>
> Please see attachment. It's a very raw version to just verify the idea, mask is not added but expected to be done as mentioned above, it's based on commit 71272b2529 on your branch, patch 0013 is the key.

Thanks for sharing this code. The previous discussions and this series introduces a new concept of translating client code per some CPU state. That is something I spent most time thinking about.

I can see it is indeed necessary for RVV. In particular, this "versioning" of translations allows that Valgrind IR can statically express an element type of each vector operation, i.e. that it is an operation on I32, F64, ... An alternative would be to try to express the type dynamically in IR. That should be still somewhat manageable in the toIR frontend but I have a hard time seeing how it would work for the instrumentation and codegen.

The versioning should work well for RVV translations because my expectation is that most RVV loops will consist of a call to vsetvli (with a static vtype), followed by some actual vector operations. Such a block then requires only one translation.

This is however true only if translations are versioned just per vtype, without vl. If I understood correctly, the patches version them per vl too but it isn't clear to me conceptually if this is really necessary. For instance, I think VAdd8 could look as follows: VAdd8(<len>, <in1>, <in2>, <flags?>) where <len> is something as IRExpr_Get(OFFB_VL, Ity_I64).

Another problem which I noticed is that blocks containing no RVV instructions are also versioned. Consider the following:

while (true) {
  // (1) some RVV code which can set vtype to different values
  // (2) a large chunk of non-RVV code
}

The code in (2) will currently have multiple same translations for each residue left in vtype by (1).

In general, I think the concept of allowing translations per some CPU state could be useful in other cases and for other architectures too. For RISC-V, it could be beneficial for floating-point operations. My expectation is that regular RISC-V FP code will have instructions with encoded rm=DYN and always executed with frm=RNE. The current approach is that the toIR frontend generates an IR which reads the rounding mode from frm and remaps it to the Valgrind's representation. The codegen then does the opposite. The idea here is that the frontend would know the actual rounding mode and could create IR which has directly this mode, for instance, AddF64(Irrm_NEAREST, <in1>, <in2>). The codegen then doesn't need to know how to handle any dynamic rounding modes as they become static.

I plan to look further into this series. Specifically, I'd like to have a stab at adding some basic support for Arm SVE to get a better understanding if this is generic enough.

Thanks,
Petr
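The trade-off discussed above, versioning translations per vtype versus per (vtype, vl), can be pictured with a toy translation cache. This is purely an editorial illustration (names and keys are invented, none of this is VEX code): keying on vl means a strip-mined loop whose tail iteration runs with a smaller vl gets a second translation of the same block, while keying on vtype alone reuses one translation and reads vl from the guest state at run time.

```python
# Toy model of a translation cache for blocks whose generated code may
# depend on CPU state (here: the RVV vtype and vl registers).
translations = {}

def translate(pc, vtype, vl, version_vl):
    """Return (possibly generating) the translation for a guest block."""
    key = (pc, vtype, vl) if version_vl else (pc, vtype)
    if key not in translations:
        translations[key] = f"host-code@{key}"   # stand-in for generated code
    return translations[key]

# A strip-mined loop revisits the same block with vl = 4, 4, 4, 3 (tail).
for vl in [4, 4, 4, 3]:
    translate(pc=0x1000, vtype="e32m1", vl=vl, version_vl=True)
print(len(translations))   # 2: one translation for vl=4, one for the tail

translations.clear()
for vl in [4, 4, 4, 3]:
    translate(pc=0x1000, vtype="e32m1", vl=vl, version_vl=False)
print(len(translations))   # 1: vl is read dynamically, e.g. via
                           # IRExpr_Get(OFFB_VL, Ity_I64) as suggested above
```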
From: Wu, F. <fe...@in...> - 2023-07-18 01:44:56

On 7/11/2023 7:28 PM, Wu, Fei wrote:
> On 7/11/2023 4:50 AM, Petr Pavlu wrote:
>> On 6. Jul 23 20:39, Wu, Fei wrote:
>>> On 5/29/2023 11:29 AM, Wu, Fei wrote:
>>>> On 5/28/2023 1:06 AM, Petr Pavlu wrote:
>>>>> On 21. Apr 23 17:25, Jojo R wrote:
>>>>>> We consider to add RVV/Vector [1] feature in valgrind, there are some challenges. RVV like ARM's SVE [2] programming model, it's scalable/VLA, that means the vector length is agnostic. ARM's SVE is not supported in valgrind :(
>>>>>>
>>>>>> There are three major issues in implementing RVV instruction set in Valgrind as following:
>>>>>>
>>>>>> 1. Scalable vector register width VLENB
>>>>>> 2. Runtime changing property of LMUL and SEW
>>>>>> 3. Lack of proper VEX IR to represent all vector operations
>>>>>>
>>>>>> We propose applicable methods to solve 1 and 2. As for 3, we explore several possible but maybe imperfect approaches to handle different cases.
>>>>>>
>>> I did a very basic prototype for vlen Vector-IR, particularly on RISC-V Vector (RVV):
>>>
>>> * Define new iops such as Iop_VAdd8/16/32/64, the difference from existing SIMD version is that no element number is specified like Iop_Add8x32
>>>
>>> * Define new IR type Ity_VLen along side existing types such as Ity_I64, Ity_V256
>>>
>>> * Define new class HRcVecVLen in HRegClass for vlen vector registers
>>>
>>> The real length is embedded in both IROp and IRType for vlen ops/types, it's runtime-decided and already known when handling insn such as vadd, this leads to more flexibility, e.g. backend can issue extra vsetvl if necessary.
>>>
>>> With the above, RVV instruction in the guest can be passed from frontend, to memcheck, to the backend, and generate the final RVV insn during host isel, a very basic testcase has been tested.
>>>
>>> Now here comes to the complexities:
>>>
>>> 1. RVV has the concept of LMUL, which groups multiple (or partial) vector registers, e.g. when LMUL==2, v2 means the real v2+v3. This complicates the register allocation.
>>>
>>> 2. RVV uses the "implicit" v0 for mask, its content must be loaded to the exact "v0" register instead of any other ones if host isel wants to leverage RVV insn, this implicitness in ISA requires more explicitness in Valgrind implementation.
>>>
>>> For #1 LMUL, a new register allocation algorithm for it can be added, and it will be great if someone is willing to try it, I'm not sure how much effort it will take. The other way is splitting it into multiple ops which only takes one vector register, taking vadd for example, 2 vadd will run with LMUL=1 for one vadd with LMUL=2, this is still okay for the widening insn, most of the arithmetic insns can be covered in this way. The exception could be register gather insn vrgather, which we can consult other ways for it, e.g. scalar or helper.
>>>
>>> For #2 v0 mask, one way is to handle the mask in the very beginning at guest_riscv64_toIR.c, similar to what AVX port does:
>>>
>>> a) Read the whole dest register without mask
>>> b) Generate unmasked result by running op without mask
>>> c) Applying mask to a,b and generate the final dest
>>>
>>> by doing this, insn with mask is converted to non-mask ones, although more insns are generated but the performance should be acceptable. There are still exceptions, e.g. vadc (Add-with-Carry), v0 is not used as mask but as carry, but just as mentioned above, it's okay to use other ways for a few insns. Eventually, we can pass v0 mask down to the backend if it's proved a better solution.
>>>
>>> This approach will introduce a bunch of new vlen Vector IRs, especially the arithmetic IRs such as vadd, my goal is for a good solution which takes reasonable time to reach usable status, yet still be able to evolve and generic enough for other vector ISA. Any comments?
>>
>> Could you please share a repository with your changes or send them to me as patches? I have a few questions but I think it might be easier for me first to see the actual code.
>>
> Please see attachment. It's a very raw version to just verify the idea, mask is not added but expected to be done as mentioned above, it's based on commit 71272b2529 on your branch, patch 0013 is the key.
>
Hi Petr,

Have you taken a look? Any comments?

Thanks,
Fei.

> btw, I will setup a repository but it takes a few days to pass the internal process.
>
> Thanks,
> Fei.
>
>> Thanks,
>> Petr
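The a/b/c mask-handling scheme quoted in this thread (read the whole destination, run the op unmasked, then merge under the mask) can be sketched in a few lines. The sketch below is an editorial illustration, not Valgrind code, and it implements the "undisturbed" merge policy, i.e. masked-off elements keep their old destination values; real RVV behaviour also depends on the vma/vta settings.

```python
def masked_vadd(vd, vs1, vs2, v0_mask):
    """Emulate a masked vector add as unmasked-op + merge."""
    old = list(vd)                                  # (a) read whole dest register
    unmasked = [a + b for a, b in zip(vs1, vs2)]    # (b) run the op without a mask
    # (c) apply the mask: active elements take the new result,
    # masked-off elements keep the old destination value
    return [u if m else o for u, o, m in zip(unmasked, old, v0_mask)]

result = masked_vadd([9, 9, 9, 9], [1, 2, 3, 4], [10, 20, 30, 40], [1, 0, 1, 0])
print(result)   # [11, 9, 33, 9]
```

As the thread notes, this turns a masked instruction into mask-free IR at the cost of extra operations, and it does not cover cases like vadc where v0 is a carry input rather than a mask.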
From: Pavankumar S V <pav...@gm...> - 2023-07-12 13:27:03

Hello,

I'm working on a multithreaded embedded application running on a Linux platform. It has an infinite 'for' loop to keep the main thread alive. Each iteration of this loop takes a different amount of time to execute; some iterations take far too long, and there are spikes in the execution time now and then. I'm trying to improve the performance (to get a consistent execution time) by figuring out the reason for the spikes, so I decided to explore profilers to understand which functions take too much time. I tried gprof, strace, perf etc., but none of them gave me the expected profiling report.

Question 1: My expectation from profilers: I want to see the time consumed by each (user-space) function of my application. Many of these functions invoke system calls, so I also want to know the time consumed by each system call and who invokes the time-consuming ones. Is this possible with callgrind?

I have followed these steps to generate profiling data with callgrind:

1. I limit the infinite 'for' loop to a few thousand iterations and return from main() so that the callgrind output gets written.
2. I compile the program with these flags: -O0 -g -fno-inline-functions
3. I run my application with this command: valgrind --tool=callgrind -q --collect-systime=yes --trace-children=yes taskset 0x1 application_name
4. Around 150 callgrind.out.X files are generated, with different values for X.
5. I take the callgrind.out.X file with the least value of X, assuming that this one has the profiling data of the main thread. (When I checked the other files, they did not have the main() function in their profiled data.)
6. I open the output file with kcachegrind: kcachegrind callgrind.out.X

After checking, the points below made me doubt the correctness of the profiling data:

- There is a function called inside the 'for' loop which I know takes a lot of time (it uses ioctl() calls every time, and testing confirmed this), but the callgrind output shows it taking very little time.
- I also added test code (a 'for' loop that spins for a while every time it is called and consumes a significant amount of time) to one function. gprof confirms that this function consumes a lot of time, but according to callgrind it takes very little time.

Question 2: Please let me know where I'm going wrong, or whether I should do anything more to get correct profiling data from callgrind.

Question 3: Why are so many callgrind.out.X files generated? How do I identify which file is for the main thread? How do I get only one output file, like gprof?

Thank you

Best Regards,
Pavankumar S V
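An editorial note on Question 3: with --trace-children=yes, callgrind writes one callgrind.out.&lt;pid&gt; file per process, so the ~150 files correspond to forked/exec'd child processes rather than threads (per-thread splitting is a separate option, --separate-threads=yes, which appends a thread id to the file name). The lowest pid is usually, though not guaranteed to be, the parent process that ran main(). A small sketch of that selection step, with hypothetical file names:

```python
import re

# Hypothetical listing produced by a --trace-children=yes run
files = ["callgrind.out.5123", "callgrind.out.5121", "callgrind.out.5200"]

def pid_of(name):
    """Extract the numeric pid suffix from a callgrind output file name."""
    m = re.match(r"callgrind\.out\.(\d+)$", name)
    return int(m.group(1))

# Sort numerically, not lexically: "callgrind.out.999" > "callgrind.out.5121"
main_file = min(files, key=pid_of)
print(main_file)   # callgrind.out.5121
```

To get a single predictable file, one option is to drop --trace-children=yes (if the children are not interesting) and set --callgrind-out-file to a fixed name.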
From: Wu, F. <fe...@in...> - 2023-07-11 11:29:25

On 7/11/2023 4:50 AM, Petr Pavlu wrote:
> On 6. Jul 23 20:39, Wu, Fei wrote:
>> On 5/29/2023 11:29 AM, Wu, Fei wrote:
>>> On 5/28/2023 1:06 AM, Petr Pavlu wrote:
>>>> On 21. Apr 23 17:25, Jojo R wrote:
>>>>> We consider to add RVV/Vector [1] feature in valgrind, there are some challenges. RVV like ARM's SVE [2] programming model, it's scalable/VLA, that means the vector length is agnostic. ARM's SVE is not supported in valgrind :(
>>>>>
>>>>> There are three major issues in implementing RVV instruction set in Valgrind as following:
>>>>>
>>>>> 1. Scalable vector register width VLENB
>>>>> 2. Runtime changing property of LMUL and SEW
>>>>> 3. Lack of proper VEX IR to represent all vector operations
>>>>>
>>>>> We propose applicable methods to solve 1 and 2. As for 3, we explore several possible but maybe imperfect approaches to handle different cases.
>>>>>
>> I did a very basic prototype for vlen Vector-IR, particularly on RISC-V Vector (RVV):
>>
>> * Define new iops such as Iop_VAdd8/16/32/64, the difference from existing SIMD version is that no element number is specified like Iop_Add8x32
>>
>> * Define new IR type Ity_VLen along side existing types such as Ity_I64, Ity_V256
>>
>> * Define new class HRcVecVLen in HRegClass for vlen vector registers
>>
>> The real length is embedded in both IROp and IRType for vlen ops/types, it's runtime-decided and already known when handling insn such as vadd, this leads to more flexibility, e.g. backend can issue extra vsetvl if necessary.
>>
>> With the above, RVV instruction in the guest can be passed from frontend, to memcheck, to the backend, and generate the final RVV insn during host isel, a very basic testcase has been tested.
>>
>> Now here comes to the complexities:
>>
>> 1. RVV has the concept of LMUL, which groups multiple (or partial) vector registers, e.g. when LMUL==2, v2 means the real v2+v3. This complicates the register allocation.
>>
>> 2. RVV uses the "implicit" v0 for mask, its content must be loaded to the exact "v0" register instead of any other ones if host isel wants to leverage RVV insn, this implicitness in ISA requires more explicitness in Valgrind implementation.
>>
>> For #1 LMUL, a new register allocation algorithm for it can be added, and it will be great if someone is willing to try it, I'm not sure how much effort it will take. The other way is splitting it into multiple ops which only takes one vector register, taking vadd for example, 2 vadd will run with LMUL=1 for one vadd with LMUL=2, this is still okay for the widening insn, most of the arithmetic insns can be covered in this way. The exception could be register gather insn vrgather, which we can consult other ways for it, e.g. scalar or helper.
>>
>> For #2 v0 mask, one way is to handle the mask in the very beginning at guest_riscv64_toIR.c, similar to what AVX port does:
>>
>> a) Read the whole dest register without mask
>> b) Generate unmasked result by running op without mask
>> c) Applying mask to a,b and generate the final dest
>>
>> by doing this, insn with mask is converted to non-mask ones, although more insns are generated but the performance should be acceptable. There are still exceptions, e.g. vadc (Add-with-Carry), v0 is not used as mask but as carry, but just as mentioned above, it's okay to use other ways for a few insns. Eventually, we can pass v0 mask down to the backend if it's proved a better solution.
>>
>> This approach will introduce a bunch of new vlen Vector IRs, especially the arithmetic IRs such as vadd, my goal is for a good solution which takes reasonable time to reach usable status, yet still be able to evolve and generic enough for other vector ISA. Any comments?
>
> Could you please share a repository with your changes or send them to me as patches? I have a few questions but I think it might be easier for me first to see the actual code.
>
Please see attachment. It's a very raw version to just verify the idea, mask is not added but expected to be done as mentioned above, it's based on commit 71272b2529 on your branch, patch 0013 is the key.

btw, I will setup a repository but it takes a few days to pass the internal process.

Thanks,
Fei.

> Thanks,
> Petr
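For readers less familiar with RVV, the LMUL grouping that recurs in this thread determines how many architectural registers one operand name covers (with LMUL=2, "v2" is the v2+v3 pair), and together with VLEN and SEW it bounds the vector length a vsetvli can grant. The sketch below is a simplified editorial illustration of that rule; it ignores fractional LMUL and the vl-setting latitude the specification allows implementations.

```python
def vsetvli(avl, vlen_bits, sew_bits, lmul):
    """Return the vl granted for an application vector length `avl`
    (simplified: vl = min(AVL, VLMAX))."""
    vlmax = (vlen_bits // sew_bits) * lmul   # elements per grouped operand
    return min(avl, vlmax)

# VLEN=128, SEW=32, LMUL=2: one operand name covers two registers and
# holds 8 elements, so a strip-mined loop over 10 elements runs two
# iterations, with vl = 8 and then vl = 2 for the tail.
print(vsetvli(10, vlen_bits=128, sew_bits=32, lmul=2))   # 8
print(vsetvli(2,  vlen_bits=128, sew_bits=32, lmul=2))   # 2
```

The tail iteration is exactly the case where versioning translations per vl, as discussed earlier in the thread, produces a second translation of the same loop body.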
From: Petr P. <pet...@da...> - 2023-07-10 21:06:01

On 6. Jul 23 20:39, Wu, Fei wrote:
> On 5/29/2023 11:29 AM, Wu, Fei wrote:
>> On 5/28/2023 1:06 AM, Petr Pavlu wrote:
>>> On 21. Apr 23 17:25, Jojo R wrote:
>>>> We consider to add RVV/Vector [1] feature in valgrind, there are some challenges. RVV like ARM's SVE [2] programming model, it's scalable/VLA, that means the vector length is agnostic. ARM's SVE is not supported in valgrind :(
>>>>
>>>> There are three major issues in implementing RVV instruction set in Valgrind as following:
>>>>
>>>> 1. Scalable vector register width VLENB
>>>> 2. Runtime changing property of LMUL and SEW
>>>> 3. Lack of proper VEX IR to represent all vector operations
>>>>
>>>> We propose applicable methods to solve 1 and 2. As for 3, we explore several possible but maybe imperfect approaches to handle different cases.
>>>>
> I did a very basic prototype for vlen Vector-IR, particularly on RISC-V Vector (RVV):
>
> * Define new iops such as Iop_VAdd8/16/32/64, the difference from existing SIMD version is that no element number is specified like Iop_Add8x32
>
> * Define new IR type Ity_VLen along side existing types such as Ity_I64, Ity_V256
>
> * Define new class HRcVecVLen in HRegClass for vlen vector registers
>
> The real length is embedded in both IROp and IRType for vlen ops/types, it's runtime-decided and already known when handling insn such as vadd, this leads to more flexibility, e.g. backend can issue extra vsetvl if necessary.
>
> With the above, RVV instruction in the guest can be passed from frontend, to memcheck, to the backend, and generate the final RVV insn during host isel, a very basic testcase has been tested.
>
> Now here comes to the complexities:
>
> 1. RVV has the concept of LMUL, which groups multiple (or partial) vector registers, e.g. when LMUL==2, v2 means the real v2+v3. This complicates the register allocation.
>
> 2. RVV uses the "implicit" v0 for mask, its content must be loaded to the exact "v0" register instead of any other ones if host isel wants to leverage RVV insn, this implicitness in ISA requires more explicitness in Valgrind implementation.
>
> For #1 LMUL, a new register allocation algorithm for it can be added, and it will be great if someone is willing to try it, I'm not sure how much effort it will take. The other way is splitting it into multiple ops which only takes one vector register, taking vadd for example, 2 vadd will run with LMUL=1 for one vadd with LMUL=2, this is still okay for the widening insn, most of the arithmetic insns can be covered in this way. The exception could be register gather insn vrgather, which we can consult other ways for it, e.g. scalar or helper.
>
> For #2 v0 mask, one way is to handle the mask in the very beginning at guest_riscv64_toIR.c, similar to what AVX port does:
>
> a) Read the whole dest register without mask
> b) Generate unmasked result by running op without mask
> c) Applying mask to a,b and generate the final dest
>
> by doing this, insn with mask is converted to non-mask ones, although more insns are generated but the performance should be acceptable. There are still exceptions, e.g. vadc (Add-with-Carry), v0 is not used as mask but as carry, but just as mentioned above, it's okay to use other ways for a few insns. Eventually, we can pass v0 mask down to the backend if it's proved a better solution.
>
> This approach will introduce a bunch of new vlen Vector IRs, especially the arithmetic IRs such as vadd, my goal is for a good solution which takes reasonable time to reach usable status, yet still be able to evolve and generic enough for other vector ISA. Any comments?

Could you please share a repository with your changes or send them to me as patches? I have a few questions but I think it might be easier for me first to see the actual code.

Thanks,
Petr
From: Wu, F. <fe...@in...> - 2023-07-06 12:40:15
|
On 5/29/2023 11:29 AM, Wu, Fei wrote:
> On 5/28/2023 1:06 AM, Petr Pavlu wrote:
>> On 21. Apr 23 17:25, Jojo R wrote:
>>> We consider to add RVV/Vector [1] feature in valgrind, there are
>>> some challenges.
>>> RVV like ARM's SVE [2] programming model, it's scalable/VLA, that
>>> means the vector length is agnostic.
>>> ARM's SVE is not supported in valgrind :(
>>>
>>> There are three major issues in implementing RVV instruction set in
>>> Valgrind as following:
>>>
>>> 1. Scalable vector register width VLENB
>>> 2. Runtime changing property of LMUL and SEW
>>> 3. Lack of proper VEX IR to represent all vector operations
>>>
>>> We propose applicable methods to solve 1 and 2. As for 3, we explore
>>> several possible but maybe imperfect approaches to handle different
>>> cases.

I did a very basic prototype for vlen Vector-IR, particularly on RISC-V
Vector (RVV):

* Define new iops such as Iop_VAdd8/16/32/64, the difference from the
  existing SIMD version is that no element number is specified like
  Iop_Add8x32
* Define new IR type Ity_VLen alongside existing types such as Ity_I64,
  Ity_V256
* Define new class HRcVecVLen in HRegClass for vlen vector registers

The real length is embedded in both IROp and IRType for vlen ops/types;
it's runtime-decided and already known when handling an insn such as
vadd. This leads to more flexibility, e.g. the backend can issue an
extra vsetvl if necessary. With the above, an RVV instruction in the
guest can be passed from the frontend, to memcheck, to the backend, and
generate the final RVV insn during host isel; a very basic testcase has
been tested.

Now here come the complexities:

1. RVV has the concept of LMUL, which groups multiple (or partial)
vector registers, e.g. when LMUL==2, v2 means the real v2+v3. This
complicates the register allocation.

2. RVV uses the "implicit" v0 for mask; its content must be loaded to
the exact "v0" register instead of any other one if host isel wants to
leverage RVV insns. This implicitness in the ISA requires more
explicitness in the Valgrind implementation.

For #1 LMUL, a new register allocation algorithm for it can be added,
and it will be great if someone is willing to try it, I'm not sure how
much effort it will take. The other way is splitting it into multiple
ops which only take one vector register each; taking vadd for example,
2 vadd will run with LMUL=1 for one vadd with LMUL=2. This is still
okay for the widening insns, and most of the arithmetic insns can be
covered in this way. The exception could be the register gather insn
vrgather, for which we can consult other ways, e.g. scalar or helper.

For #2 v0 mask, one way is to handle the mask in the very beginning at
guest_riscv64_toIR.c, similar to what the AVX port does:

a) Read the whole dest register without mask
b) Generate unmasked result by running op without mask
c) Apply mask to a,b and generate the final dest

By doing this, an insn with mask is converted to non-mask ones;
although more insns are generated, the performance should be
acceptable. There are still exceptions, e.g. vadc (Add-with-Carry),
where v0 is not used as mask but as carry, but just as mentioned above,
it's okay to use other ways for a few insns. Eventually, we can pass
the v0 mask down to the backend if it's proved a better solution.

This approach will introduce a bunch of new vlen Vector IRs, especially
the arithmetic IRs such as vadd. My goal is a good solution which takes
reasonable time to reach usable status, yet is still able to evolve and
is generic enough for other vector ISAs. Any comments?

Best Regards,
Fei.

>>> We start from 1. As each guest register should be described in the
>>> VEXGuestState struct, the vector registers with scalable width of
>>> VLENB can be added into VEXGuestState as arrays using an allowable
>>> maximum length like 2048/4096.
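A minimal scalar model of the a/b/c mask lowering described above, with invented names (this is not VEX IR): read the old destination, compute the unmasked result, then select per element on the v0 bit, which gives RVV's "undisturbed" behaviour for masked-off elements.

```c
#include <stdint.h>
#include <stddef.h>

/* Masked 8-bit vadd modelled as unmasked op + merge.  v0 packs one
 * mask bit per element, LSB-first as in RVV. */
static void vadd8_masked(uint8_t *vd, const uint8_t *vs1,
                         const uint8_t *vs2, const uint8_t *v0, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        uint8_t old      = vd[i];                      /* (a) read old dest  */
        uint8_t unmasked = (uint8_t)(vs1[i] + vs2[i]); /* (b) unmasked op    */
        int     active   = (v0[i / 8] >> (i % 8)) & 1;
        vd[i] = active ? unmasked : old;               /* (c) apply the mask */
    }
}
```

vadc is the exception called out in the mail: there v0 supplies a carry-in rather than selecting elements, so this merge pattern does not apply to it.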
>>
>> Size of VexGuestRISCV64State is currently 592 bytes. Adding these
>> large vector registers will bump it by 32*2048/8=8192 bytes.
>>
> Yes, that's the reason in my RFC patches the vlen is set to 128;
> that's the largest room for vector in the current design.
>
>> The baseblock layout in VEX is: the guest state, two equal sized
>> areas for shadow state and then a spill area. The RISC-V port
>> accesses the baseblock in generated code via x8/s0. The register is
>> set to the address of the baseblock+2048 (file
>> coregrind/m_dispatch/dispatch-riscv64-linux.S). The extra offset is
>> a small optimization to utilize the fact that load/store
>> instructions in RVI have a signed offset in range [-2048,2047]. The
>> end result is that it is possible to access the baseblock data using
>> only a single instruction.
>>
> Nice design.
>
>> Adding the new vector registers will cause that more instructions
>> will be necessary. For instance, accessing any shadow guest state
>> would naively require a sequence of LUI+ADDI+LOAD/STORE.
>>
>> I suspect this could affect performance quite a bit and might need
>> some optimizing.
>>
> Yes, can we separate the vector registers from the other ones, is it
> able to use two baseblocks? Or we can do some experiments to measure
> the overhead.
>
>>> The actual available access range can be determined at Valgrind
>>> startup time by querying the CPU for its vector capability or some
>>> suitable setup steps.
>>
>> Something to consider is that the virtual CPU provided by Valgrind
>> does not necessarily need to match the host CPU. For instance, VEX
>> could hardcode that its vector registers are only 128 bits in size.
>>
>> I was originally hoping that this is how support for the V extension
>> could be added, but the LMUL grouping looks to break this model.
>>
> Originally I had the same idea, but 128 vlen hardware cannot run
> software built for a larger vlen; e.g. clang has the option
> -riscv-v-vector-bits-min, and if it's set to 256, then it assumes the
> underlying hardware has at least 256 vlen.
>
>>> To solve problem 2, we are inspired by already-proven techniques in
>>> QEMU, where translation blocks are broken up when certain critical
>>> CSRs are set. Because the guest code to IR translation relies on
>>> the precise value of LMUL/SEW and they may change within a basic
>>> block, we can break up the basic block each time encountering a
>>> vsetvl{i} instruction and return to the scheduler to execute the
>>> translated code and update LMUL/SEW. Accordingly, translation cache
>>> management should be refactored to detect the changing of LMUL/SEW
>>> to invalidate outdated code cache. Without losing the generality,
>>> the LMUL/SEW should be encoded into an ULong flag such that other
>>> architectures can leverage this flag to store their arch-dependent
>>> information. The TTentry struct should also take the flag into
>>> account no matter insertion or deletion. By doing this, the flag
>>> carries the newest LMUL/SEW throughout the simulation and can be
>>> passed to disassemble functions using the VEXArchInfo struct such
>>> that we can get the real and newest value of LMUL and SEW to
>>> facilitate our translation.
>>>
>>> Also, some architecture-related code should be taken care of. Like
>>> the m_dispatch part, the disp_cp_xindir function looks up the code
>>> cache using hardcoded assembly by checking the requested guest
>>> state IP and translation cache entry address with no more
>>> constraints. Many other modules should be checked to ensure the
>>> in-time update of LMUL/SEW is instantly visible to essential parts
>>> in Valgrind.
>>>
>>> The last remaining big issue is 3, for which we introduce some
>>> ad-hoc approaches. We summarize these approaches into three types
>>> as following:
>>>
>>> 1. Break down a vector instruction to scalar VEX IR ops.
>>> 2. Break down a vector instruction to fixed-length VEX IR ops.
>>> 3. Use dirty helpers to realize vector instructions.
>>
>> I would also look at adding new VEX IR ops for scalable vector
>> instructions. In particular, if it could be shown that RVV and SVE
>> can use the same new ops then it could make a good argument for
>> adding them.
>>
>> Perhaps interesting is if such new scalable vector ops could also
>> represent fixed operations on other architectures, but that is just
>> me thinking out loud.
>>
> It's a good idea to consolidate all vector/simd together; the
> challenge is to verify its feasibility and to speed up the adaptation
> progress, as it's supposed to take more effort and a longer time. Is
> there anyone with knowledge or experience of other ISAs such as
> avx/sve on valgrind who can share the pain and gain, or can we do
> some quick prototype?
>
> Thanks,
> Fei.
>
>>> [...]
>>> In summary, it is far to reach a truly applicable solution in
>>> adding vector extensions in Valgrind. We need to do detailed and
>>> comprehensive estimations on different vector instruction
>>> categories.
>>>
>>> Any feedback is welcome in github [3] also.
>>>
>>> [1] https://github.com/riscv/riscv-v-spec
>>>
>>> [2] https://community.arm.com/arm-research/b/articles/posts/the-arm-scalable-vector-extension-sve
>>>
>>> [3] https://github.com/petrpavlu/valgrind-riscv64/issues/17
>>
>> Sorry for not being more helpful at this point. As mentioned in the
>> GitHub issue, I still need to get myself more familiar with RVV and
>> how Valgrind handles vector instructions.
>>
>> Thanks,
>> Petr
>>
>> _______________________________________________
>> Valgrind-developers mailing list
>> Val...@li...
>> https://lists.sourceforge.net/lists/listinfo/valgrind-developers
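The proposal above to encode LMUL/SEW into a ULong flag that tags translation-cache entries can be sketched as follows. The field layout and all names here are invented for illustration (a real port would likely mirror the vtype CSR encoding instead); LMUL is stored as LMUL*8 so that the fractional settings 1/8..1/2 stay integral.

```c
#include <stdint.h>

/* Pack the runtime-variable vector state into one 64-bit flags word. */
static uint64_t mk_arch_flags(uint32_t sew_bits, uint32_t lmul_x8)
{
    return ((uint64_t)sew_bits << 8) | (lmul_x8 & 0xff);
}

static uint32_t flags_sew(uint64_t f)   { return (uint32_t)(f >> 8); }
static uint32_t flags_lmul8(uint64_t f) { return (uint32_t)(f & 0xff); }

/* A translation-table entry then matches only if both the guest IP and
 * the flags agree, so code translated under one LMUL/SEW setting is
 * never reused under another. */
static int tt_entry_matches(uint64_t entry_ip, uint64_t entry_flags,
                            uint64_t ip, uint64_t flags)
{
    return entry_ip == ip && entry_flags == flags;
}
```

Breaking the basic block at every vsetvl{i} (as the proposal does, following QEMU) guarantees the flags word is constant within any single translation, which is what makes a per-entry tag sufficient.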
From: Tom H. <to...@co...> - 2023-07-03 11:36:54
On 03/07/2023 10:42, Daniel Fishman wrote: > Thanks for the pointer. I just commented out the line for the mentioned syscall > number from valgrind's syscall table, and this workaround was enough to solve > the problem. Since the custom syscall doesn't modify its parameters and doesn't > seem to write anything in user space, it seems that writing a wrapper > for it is not > strictly necessary - or very useful for that matter, since in any case > it won't be > possible to submit a valgrind patch for the problem. Well pread will be reading user memory so the wrapper would be checking that the memory it was given was valid, and that the file descriptor argument is valid. Not doing that won't break anything of course, it just means you may not detect some problems in your program. > Beyond this problem, maybe it could be useful if upon encountering an impossible > problem (the one when valgrind writes: "valgrind: the 'impossible' happened"), > valgrind will send a user to read the file README_MISSING_SYSCALL_OR_IOCTL > in addition to telling him to read FAQ. Had I been aware of this file > before, I would have known how to solve the problem myself. Well sure, but the chances that a random SEGV in valgrind are caused by a syscall issue are probably less than 1% so doing that would mostly just be completely misleading. Tom -- Tom Hughes (to...@co...) http://compton.nu/ |
From: Daniel F. <qua...@gm...> - 2023-07-03 09:47:41
On Mon, Jul 3, 2023 at 12:31 AM John Reiser <jr...@bi...> wrote: > Please show the complete output of "uname -a". > In the linux git source code repository > url = git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git > there is no git tag "v3.10.0"; only "v3.10" and "v3.10.1", "v3.10.2", etc. Yes, this distribution is based on a modified kernel and not on the official one. The full name from uname is 3.10.0-1150.15.2cpx86_64 > After checking out the code via "git checkout v3.10.1", then "grep -sr setprocns ." > shows that the linux source code has no lines that contain "setprocns". > So there is an error in part of your claim. Indeed, it seems that this syscall is specific to this distribution - upon encountering the problem I went straight to distribution's kernel source to check the syscall table and didn't realize that it may be different from the vanilla kernel. After comparing distribution's syscall table to original 3.10.x sources I see that the distribution ported some of the syscalls from later kernels (highest syscall on 3.10.x is 350, while the distribution has syscalls with higher numbers - which are not even consecutive), and added this one new custom syscall. This explains why valgrind 3.10.0 works on this distribution - it just doesn't implement wrappers for syscalls higher than 355, and therefore there is no collision with a custom syscall which hijacks syscall number from a syscall that in later kernel version is used for a different syscall (preadv2). > 2. The valgrind syscall table is in coregrind/m_syswrap/syswrap-x86-linux.c > and syswrap-amd64-linux.c. Therefore, modify the source code of valgrind > to call utsname() during initialization, and alter the table > static SyscallTableEntry syscall_table[] = { ... }; > accordingly. Probably 'static' must be removed. Also, 'const' should > be removed if necessary. [Why isn't the table 'const' in the first place?] Thanks for the pointer. 
I just commented out the line for the mentioned syscall number from valgrind's syscall table, and this workaround was enough to solve the problem. Since the custom syscall doesn't modify its parameters and doesn't seem to write anything in user space, it seems that writing a wrapper for it is not strictly necessary - or very useful for that matter, since in any case it won't be possible to submit a valgrind patch for the problem. Beyond this problem, maybe it could be useful if upon encountering an impossible problem (the one when valgrind writes: "valgrind: the 'impossible' happened"), valgrind will send a user to read the file README_MISSING_SYSCALL_OR_IOCTL in addition to telling him to read FAQ. Had I been aware of this file before, I would have known how to solve the problem myself. |
From: John R. <jr...@bi...> - 2023-07-02 21:30:48
> On a machine that has an old linux kernel, when valgrind 3.21.0 runs an > executable that contains a call to syscall 378 - valgrind fails after > being killed by a fatal signal. > > The kernel on the machine is 3.10.0 x86_64 (the system is based on RedHad 5, > I think), libc 2.17, and the executable itself is 32 bit. Please show the complete output of "uname -a". In the linux git source code repository url = git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git there is no git tag "v3.10.0"; only "v3.10" and "v3.10.1", "v3.10.2", etc. > On this particular kernel, syscall 378 happens to be mapped to setprocns, After checking out the code via "git checkout v3.10.1", then "grep -sr setprocns ." shows that the linux source code has no lines that contain "setprocns". So there is an error in part of your claim. > valgrind thinks that an executable is trying to execute syscall preadv2, which > is indeed mapped to 378 on newer kernels, but doesn't exist in linux 3.10.0 It is correct that "grep -sr preadv2 ." shows no matching lines in linux v3.10.1. However, "cd arch/x86; grep -sr 378 ." shows no syscall numbered 378. Please verify what you are talking about. "cd arch/x86; grep -sr setns ." does show these syscalls with a related name: ./syscalls/syscall_64.tbl:308 common setns sys_setns ./syscalls/syscall_32.tbl:346 i386 setns sys_setns > Older versions of valgrind (for example, valgrind 3.10.0) don't have this > problem, and succeed to execute the same executable on this machine. Therefore one solution is to use a version of valgrind that is contemporaneous with your kernel. Check the list of versions and dates for both linux and valgrind, and find the best match. > What can I do to fix the problem? Unfortunately I am stuck with having to > use such an old system, and therefore using newer kernel is not an option. 1. Double check the versions that you claim, and prove against the official sources. 2. 
The valgrind syscall table is in coregrind/m_syswrap/syswrap-x86-linux.c
and syswrap-amd64-linux.c. Therefore, modify the source code of valgrind
to call utsname() during initialization, and alter the table

    static SyscallTableEntry syscall_table[] = { ... };

accordingly. Probably 'static' must be removed. Also, 'const' should be
removed if necessary. [Why isn't the table 'const' in the first place?]

Note that any change in syscall numbers creates a giant incompatibility
for any app that is built using "the other" assignments. No old apps can
be relied on to run on newer systems, and no new apps can be relied on
to run on older systems. That's a disaster, and it is *NOT* "waiting to
happen"; it has already happened.
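The runtime check suggested above — call utsname() during initialization and adjust the table — could look roughly like this. The 4.6 cutoff reflects the mainline kernel that introduced preadv2 (syscall 378 on 32-bit x86); the actual table surgery is omitted since it depends on Valgrind internals, and all function names here are illustrative.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Return nonzero if the kernel release string is at least maj.min. */
static int kernel_at_least(const char *release, int want_maj, int want_min)
{
    int maj = 0, min = 0;
    if (sscanf(release, "%d.%d", &maj, &min) != 2)
        return 0;
    return maj > want_maj || (maj == want_maj && min >= want_min);
}

/* Decide at startup whether slot 378 should keep the preadv2 wrapper
 * (mainline >= 4.6) or be dropped, so a distro-specific syscall in
 * that slot is reported as unhandled instead of being mis-wrapped. */
static int keep_preadv2_entry(void)
{
    struct utsname un;
    if (uname(&un) != 0)
        return 1;  /* cannot tell: keep the stock table */
    return kernel_at_least(un.release, 4, 6);
}
```

This keeps the stock behaviour on modern kernels while degrading gracefully on the 3.10-based distribution described in the thread.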
From: Daniel F. <qua...@gm...> - 2023-07-02 11:40:10
Hello,

On a machine that has an old linux kernel, when valgrind 3.21.0 runs an
executable that contains a call to syscall 378, valgrind fails after
being killed by a fatal signal.

The kernel on the machine is 3.10.0 x86_64 (the system is based on
RedHat 5, I think), libc 2.17, and the executable itself is 32 bit. On
this particular kernel, syscall 378 happens to be mapped to setprocns,
while valgrind thinks that the executable is trying to execute syscall
preadv2, which is indeed mapped to 378 on newer kernels, but doesn't
exist in linux 3.10.0.

Older versions of valgrind (for example, valgrind 3.10.0) don't have
this problem, and succeed in executing the same executable on this
machine. According to the release notes it seems that this platform is
supported.

What can I do to fix the problem? Unfortunately I am stuck with having
to use such an old system, and therefore using a newer kernel is not an
option.

A test program and valgrind's report with the fatal signal are attached.
gcc 6.3.0 was used for compilation (both the executable and valgrind).
From: Simon S. <sim...@gn...> - 2023-06-29 18:25:45
Am 29.06.2023 um 18:19 schrieb Mark Wielaard: > Hi Simon, > > On Thu, Jun 29, 2023 at 05:46:59PM +0200, Simon Sobisch wrote: >> Am 29.06.2023 um 15:10 schrieb John Reiser: >>>> Running valgrind on GnuCOBOL errors out with >>>> >>>> vex amd64->IR: unhandled instruction bytes: >>>> 0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F >>>> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0 >>>> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE >>>> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0 >>>> valgrind: Unrecognised instruction at address 0x4e75f20. >>>> at 0x4E75F20: cob_string_init (strings.c:742) >>> >>>> 132 (gdb) disassemble /s >>>> 133 Dump of assembler code for function cob_string_init: >>>> 134 ../../libcob/strings.c: >>>> 135 741 { >>>> 136 742 string_dst_copy = *dst; >>>> 137 => 0x0000000004e75f20 <+0>: vmovdqu64 (%rdi),%xmm0 >>> >>>> Is there anything I can do this to still run the application >>>> with valgrind or do I need to wait for a hotfix? > > vmovdqu64 is part of AVX512, see this bug: > https://bugs.kde.org/show_bug.cgi?id=valgrind-avx512 > (yes, it has been reported so many times that it has its own alias) Whoa! Thanks for pointing this out (I have not found that on the user list, but that is likely because of the exact instruction I've searched for). So the workaround seems to be to compile sources with gcc -march=native -mno-avx512f -mno-avx512dq -mno-avx512cd -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-avx512vbmi2 -mno-avx512vnni -mno-avx512bitalg -mno-avx5124fmaps -mno-avx5124vnniw -mno-avx5124vbmi -mno-avx512vpopcntdq or use an -march that is "generic" and get slower code when running outside of valgrind. > There are patches, but the original submitter isn't working on it > anymore. So we need someone to pick up the code and go through the > feedback to get it integrated. Hm, as far as I see all the feedback is already considered, no? 
Sadly I'm not in the position to finish that and _guess_ that there's no .patch file which I could directly apply to the last release, is there? Thanks, Simon |
From: Mark W. <ma...@kl...> - 2023-06-29 16:19:36
Hi Simon, On Thu, Jun 29, 2023 at 05:46:59PM +0200, Simon Sobisch wrote: > Am 29.06.2023 um 15:10 schrieb John Reiser: > >>Running valgrind on GnuCOBOL errors out with > >> > >>vex amd64->IR: unhandled instruction bytes: > >> 0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F > >>vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0 > >>vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE > >>vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0 > >>valgrind: Unrecognised instruction at address 0x4e75f20. > >> at 0x4E75F20: cob_string_init (strings.c:742) > > > >>132 (gdb) disassemble /s > >>133 Dump of assembler code for function cob_string_init: > >>134 ../../libcob/strings.c: > >>135 741 { > >>136 742 string_dst_copy = *dst; > >>137 => 0x0000000004e75f20 <+0>: vmovdqu64 (%rdi),%xmm0 > > > >>Is there anything I can do this to still run the application > >>with valgrind or do I need to wait for a hotfix? vmovdqu64 is part of AVX512, see this bug: https://bugs.kde.org/show_bug.cgi?id=valgrind-avx512 (yes, it has been reported so many times that it has its own alias) There are patches, but the original submitter isn't working on it anymore. So we need someone to pick up the code and go through the feedback to get it integrated. Thanks, Mark |
From: Simon S. <sim...@gn...> - 2023-06-29 15:47:13
Sorry, I should have made this explicit! The error initially was seen with $> valgrind --version valgrind-3.20.0 which was then updated to $> valgrind --version valgrind-3.21.0 where this output below (100% identical to 3.20.0) came from. Both Valgrind and GnuCOBOL were compiled with gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4) GNU assembler version 2.30-117.el8 on Linux 4.18.0-348.el8.x86_64 Simon Am 29.06.2023 um 15:10 schrieb John Reiser: >> Running valgrind on GnuCOBOL errors out with >> >> vex amd64->IR: unhandled instruction bytes: >> 0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F >> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0 >> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE >> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0 >> valgrind: Unrecognised instruction at address 0x4e75f20. >> at 0x4E75F20: cob_string_init (strings.c:742) > >> 132 (gdb) disassemble /s >> 133 Dump of assembler code for function cob_string_init: >> 134 ../../libcob/strings.c: >> 135 741 { >> 136 742 string_dst_copy = *dst; >> 137 => 0x0000000004e75f20 <+0>: vmovdqu64 (%rdi),%xmm0 > >> Is there anything I can do this to still run the application with >> valgrind or do I need to wait for a hotfix? > > As always: report the version of valgrind. Run "valgrind --version", > then copy+paste the output here. The version is the #1 clue for any > investigation. |
From: John R. <jr...@bi...> - 2023-06-29 13:10:15
> Running valgrind on GnuCOBOL errors out with > > vex amd64->IR: unhandled instruction bytes: > 0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F > vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0 > vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE > vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0 > valgrind: Unrecognised instruction at address 0x4e75f20. > at 0x4E75F20: cob_string_init (strings.c:742) > 132 (gdb) disassemble /s > 133 Dump of assembler code for function cob_string_init: > 134 ../../libcob/strings.c: > 135 741 { > 136 742 string_dst_copy = *dst; > 137 => 0x0000000004e75f20 <+0>: vmovdqu64 (%rdi),%xmm0 > Is there anything I can do this to still run the application with valgrind or do I need to wait for a hotfix? As always: report the version of valgrind. Run "valgrind --version", then copy+paste the output here. The version is the #1 clue for any investigation. |
From: Simon S. <sim...@gn...> - 2023-06-29 08:12:57
Running valgrind on GnuCOBOL errors out with

vex amd64->IR: unhandled instruction bytes:
      0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
valgrind: Unrecognised instruction at address 0x4e75f20.
   at 0x4E75F20: cob_string_init (strings.c:742)

Using vgdb there were the following details:

132 (gdb) disassemble /s
133 Dump of assembler code for function cob_string_init:
134 ../../libcob/strings.c:
135 741     {
136 742         string_dst_copy = *dst;
137 => 0x0000000004e75f20 <+0>:   vmovdqu64 (%rdi),%xmm0
138
139 744         string_ptr = NULL;
140    0x0000000004e75f26 <+6>:   movq $0x0,0x244e6f(%rip)    # 0x50bada0 <string_ptr>
141
142 742         string_dst_copy = *dst;
143    0x0000000004e75f31 <+17>:  vmovaps %xmm0,0x244e47(%rip)    # 0x50bad80 <string_dst_copy>
144    0x0000000004e75f39 <+25>:  mov 0x10(%rdi),%rax
145    0x0000000004e75f3d <+29>:  mov %rax,0x244e4c(%rip)    # 0x50bad90 <string_dst_copy+16>
146
147 743         string_dst = &string_dst_copy;
148    0x0000000004e75f44 <+36>:  lea 0x244e35(%rip),%rax    # 0x50bad80 <string_dst_copy>
149    0x0000000004e75f4b <+43>:  mov %rax,0x244e56(%rip)    # 0x50bada8 <string_dst>

Is there anything I can do to still run the application with valgrind,
or do I need to wait for a hotfix?

Thanks,
Simon
From: mamsds <ma...@ou...> - 2023-06-26 13:36:27
I just tried running the program with Valgrind on a machine with Debian 12 (bookworm). It ships with valgrind-3.19.0. I did exactly the same and the issue is gone. So I think the issue can be considered closed. Also, on the "bugs" you mentioned in the previous email. Note that this project is still under development and its README.md targets Debian only. The issues you are facing are either because it is still under development or you are not using Debian to build it. Alex On Sun, 2023-06-25 at 08:34 -0700, John Reiser wrote: > > Upgrade valgrind *TODAY*. The current version is valgrind-3.21.0. > > On a RaspberryPi model 3 in 32-bit mode (armhf) running Debian 11 > (bullseye), > then "apt-get install valgrind" installs valgrind-3.16.1 which is > much better > than the valgrind-3.7.0 which complained "not implemented" for the > "pac" app. > > On the same machine (1 GiB RAM, 4 CPU), valgrind-3.22.0.GIT can be > built > from source git://sourceware.org/git/valgrind.git. It takes less > than > one hour if you invoke "make -j4" to use all 4 CPU; no dynamic paging > is used. > > > > _______________________________________________ > Valgrind-users mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-users |
From: John R. <jr...@bi...> - 2023-06-25 15:35:10
> Upgrade valgrind *TODAY*. The current version is valgrind-3.21.0. On a RaspberryPi model 3 in 32-bit mode (armhf) running Debian 11 (bullseye), then "apt-get install valgrind" installs valgrind-3.16.1 which is much better than the valgrind-3.7.0 which complained "not implemented" for the "pac" app. On the same machine (1 GiB RAM, 4 CPU), valgrind-3.22.0.GIT can be built from source git://sourceware.org/git/valgrind.git. It takes less than one hour if you invoke "make -j4" to use all 4 CPU; no dynamic paging is used. |
From: John R. <jr...@bi...> - 2023-06-24 21:37:35
> I am using Debian on RaspberryPi and everything is from the official
> apt package manager.
>
> Hardware architecture: armv7l GNU/Linux
> OS version: Raspbian GNU/Linux 11 (bullseye)
> Libmicrohttpd: stable, 0.9.72-2 armhf
> Valgrind: valgrind-3.7.0

Upgrade valgrind *TODAY*. The current version is valgrind-3.21.0.
Valgrind-3.7.0 was released in November 2011: commit
261bffdb4c2a52014ee10b4d68a75db0ec5834e60. It is a waste of everyone's
time to chase "not implemented" in software that is over 11 years old
and has been updated frequently since then.

re: https://github.com/alex-lt-kong/public-address-client
Fix your bugs:

1. README.md: libao-devel must be installed, else "#include ao/ao.h" fails.

2. gcc ./src/utils.c -c -O2 -Wall -pedantic -Wextra -Wc++-compat -fsanitize=address -g
./src/utils.c: In function ‘handle_sound_name_queue’:
./src/utils.c:155:75: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘size_t’ {aka ‘long unsigned int’} [-Wformat=]
  155 |   syslog(LOG_INFO, "Currently playing: [%s], current sound_queue_size: %d",
      |                                                                        ~^
      |                                                                         |
      |                                                                         int
      |                                                                        %ld
  156 |          sound_realpath, qs);
      |                          ~~
      |                          |
      |                          size_t {aka long unsigned int}
./src/utils.c:169:1: warning: control reaches end of non-void function [-Wreturn-type]
  169 | }
      | ^

3. libasan is required.

4. libasan must be first in the list presented to /usr/bin/ld, else it
does not work correctly: "ASan runtime does not come first in initial
library list; you should either link runtime to your application or
manually preload it with LD_PRELOAD."

-----
pac.out: $(SRC_DIR)/main.c queue.o utils.o
	$(CC) -lasan $(SRC_DIR)/main.c queue.o utils.o -o pac.out $(CFLAGS) $(LDFLAGS) $(SANITIZER)
-----

--
From: ISHIKAWA,chiaki <ish...@yk...> - 2023-06-24 20:52:57
On 2023/06/25 2:09, mamsds wrote: > Hi John, > > I am using Debian on RaspberryPi and everything is from the official > apt package manager. > > Hardware architecture: armv7l GNU/Linux > OS version: Raspbian GNU/Linux 11 (bullseye) > Libmicrohttpd: stable, 0.9.72-2 armhf > Valgrind: valgrind-3.7.0 > > I can also share the code that I was testing: > https://github.com/alex-lt-kong/public-address-client > > > > Exact invocation and surrounding output: > > # valgrind --leak-check=yes --log-file=/tmp/valgrind.rpt > $HOME/bin/public-address-client/pac.out > Error accepting connection: Function not implemented > Error accepting connection: Function not implemented > Error accepting connection: Function not implemented > Error accepting connection: Function not implemented > Error accepting connection: Function not implemented > Error accepting connection: Function not implemented > Error accepting connection: Function not implemented > Error accepting connection: Function not implemented > Error accepting connection: Function not implemented > Error accepting connection: Function not implemented > [a lot more identical rows...] 
> > > > strace output: > > 3454 cacheflush(0x42714b70, 0x42714ca0, 0) = 0 > 3454 cacheflush(0x42714ca0, 0x42714d20, 0) = 0 > 3454 cacheflush(0x42714d20, 0x42714dec, 0) = 0 > 3454 cacheflush(0x42714df0, 0x42714f28, 0) = 0 > 3454 cacheflush(0x42714f28, 0x427150e0, 0) = 0 > 3454 getpid() = 3454 > 3454 write(1026, "==3454== \n", 10) = 10 > 3454 getpid() = 3454 > 3454 getpid() = 3454 > 3454 getpid() = 3454 > 3454 getpid() = 3454 > 3454 write(1026, "==3454== HEAP SUMMARY:\n==3454== "..., 170) = 170 > 3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV > STOP], 8) = 0 > 3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], > NULL, 8) = 0 > 3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV > STOP], 8) = 0 > 3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], > NULL, 8) = 0 > 3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV > STOP], 8) = 0 > 3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], > NULL, 8) = 0 > [The last row repeats a lot of times] > > > > I am not very sure on how to count twenty syscalls since the beginning > of the error though. > > Please let me know if you notice any issues. I will leave the mail for > a while then I will file a report to bugs.kde.org. > > Thanks, > Alex > > On Fri, 2023-06-23 at 12:54 -0700, John Reiser wrote: >>> ... each time a client makes a [HTTP] request, Valgrind complains >>> "Error >>> accepting connection: Function not implemented" and my program >>> fails to >>> handle the request as a result. >> Which versions of each of these pieces are you running: >> Libmicrohttpd, valgrind, >> hardware architecture, OS? >> >> Please give the exact copy+paste of the invocation of valgrind that >> fails, >> together with the output from Terminal that surrounds the complaint >> "Error accepting connection: Function not implemented". 
>> >> Please run under /usr/bin/strace, and report the twenty system calls >> that are run >> shortly before the complaint: >> strace -f -o strace.out valgrind ./my_app args... >> You may wish to compare versus the output from strace on the same >> command >> but without using 'valgrind'. >> >> Then the best way to gain attention of valgrind *developers* >> is to put all that info into a bug report at: >> https://bugs.kde.org/ , >> and post here in the mailing list the URL of the bug report that you >> created. >> I think it may be wiser to discuss the details in the web pub report mechanism. But for now, I have a question. - Does your program run without valgrind? (i.e. does it accept connection from outside?) If so, please capture the syscalls in that scenario and try capturing the syscalls when it accepts the connection. Compare that syscalls with the syscalls when your program is run under valgrind. Then you will see where your program's behavior under valgrind deviates from the normal flow. I *THINK* the issue is related to a possible timeout in the library which does not occur in the code usually and may not be handled very well. I have seen some cases where the slowdown under valgrind is like x20 and due to this, the ordinary program execution disrupted so much that the program bails out due to timeout. It occurs quite often of testing of thunderbird mail client, for example. Judicious use of larger timeout values often fixed the similar issues for me in the past. Given that you need to post so many contextual information, I think filing the bug to the kde bug reporting system would be wiser. Chiaki |
From: mamsds <ma...@ou...> - 2023-06-24 17:09:56
Hi John, I am using Debian on RaspberryPi and everything is from the official apt package manager. Hardware architecture: armv7l GNU/Linux OS version: Raspbian GNU/Linux 11 (bullseye) Libmicrohttpd: stable, 0.9.72-2 armhf Valgrind: valgrind-3.7.0 I can also share the code that I was testing: https://github.com/alex-lt-kong/public-address-client Exact invocation and surrounding output: # valgrind --leak-check=yes --log-file=/tmp/valgrind.rpt $HOME/bin/public-address-client/pac.out Error accepting connection: Function not implemented Error accepting connection: Function not implemented Error accepting connection: Function not implemented Error accepting connection: Function not implemented Error accepting connection: Function not implemented Error accepting connection: Function not implemented Error accepting connection: Function not implemented Error accepting connection: Function not implemented Error accepting connection: Function not implemented Error accepting connection: Function not implemented [a lot more identical rows...] 
strace output:

3454 cacheflush(0x42714b70, 0x42714ca0, 0) = 0
3454 cacheflush(0x42714ca0, 0x42714d20, 0) = 0
3454 cacheflush(0x42714d20, 0x42714dec, 0) = 0
3454 cacheflush(0x42714df0, 0x42714f28, 0) = 0
3454 cacheflush(0x42714f28, 0x427150e0, 0) = 0
3454 getpid() = 3454
3454 write(1026, "==3454== \n", 10) = 10
3454 getpid() = 3454
3454 getpid() = 3454
3454 getpid() = 3454
3454 getpid() = 3454
3454 write(1026, "==3454== HEAP SUMMARY:\n==3454== "..., 170) = 170
3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV STOP], 8) = 0
3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
[The last two rows repeat many times]

I am not very sure how to count the twenty syscalls before the error,
though. Please let me know if you notice any issues. I will leave the mail
up for a while and then file a report to bugs.kde.org.

Thanks,
Alex

On Fri, 2023-06-23 at 12:54 -0700, John Reiser wrote:
> > ... each time a client makes a [HTTP] request, Valgrind complains "Error
> > accepting connection: Function not implemented" and my program fails to
> > handle the request as a result.
>
> Which versions of each of these pieces are you running:
> Libmicrohttpd, valgrind, hardware architecture, OS?
>
> Please give the exact copy+paste of the invocation of valgrind that fails,
> together with the output from Terminal that surrounds the complaint
> "Error accepting connection: Function not implemented".
>
> Please run under /usr/bin/strace, and report the twenty system calls
> that are run shortly before the complaint:
> strace -f -o strace.out valgrind ./my_app args...
> You may wish to compare versus the output from strace on the same command
> but without using 'valgrind'.
>
> Then the best way to gain attention of valgrind *developers*
> is to put all that info into a bug report at:
> https://bugs.kde.org/ ,
> and post here in the mailing list the URL of the bug report that you created.
From: John R. <jr...@bi...> - 2023-06-23 20:11:25
> ... each time a client makes a [HTTP] request, Valgrind complains "Error
> accepting connection: Function not implemented" and my program fails to
> handle the request as a result.

Which versions of each of these pieces are you running:
Libmicrohttpd, valgrind, hardware architecture, OS?

Please give the exact copy+paste of the invocation of valgrind that fails,
together with the output from Terminal that surrounds the complaint
"Error accepting connection: Function not implemented".

Please run under /usr/bin/strace, and report the twenty system calls
that are run shortly before the complaint:
strace -f -o strace.out valgrind ./my_app args...
You may wish to compare versus the output from strace on the same command
but without using 'valgrind'.

Then the best way to gain attention of valgrind *developers*
is to put all that info into a bug report at:
https://bugs.kde.org/ ,
and post here in the mailing list the URL of the bug report that you created.

--
From: mamsds <ma...@ou...> - 2023-06-23 15:50:00
Hi,

I am trying to use Valgrind to check my program that uses GNU's
[Libmicrohttpd](https://www.gnu.org/software/libmicrohttpd/) to handle HTTP
requests. However, each time a client makes a request, Valgrind complains
"Error accepting connection: Function not implemented" and my program fails
to handle the request as a result.

Is this expected? If the answer is yes, is there anything I can do to make
Valgrind work?

Thanks,
Best regards,
Alex Kong
From: Wu, F. <fe...@in...> - 2023-06-05 01:24:05
On 6/1/2023 7:13 PM, LATHUILIERE Bruno via Valgrind-developers wrote:
>
> -------- Original message --------
> Subject: Re: [Valgrind-developers] RFC: support scalable vector model / riscv vector
> Date: 2023-05-29 05:29
> From: "Wu, Fei" <fe...@in...>
> To: Petr Pavlu <pet...@da...>, Jojo R <rj...@gm...>
> Cc: pa...@so..., yun...@al..., val...@li...,
> val...@li..., zha...@al...
>
>> On 5/28/2023 1:06 AM, Petr Pavlu wrote:
>>> On 21. Apr 23 17:25, Jojo R wrote:
>>>> The last remaining big issue is 3, for which we introduce some
>>>> ad-hoc approaches. We summarize these approaches into three types
>>>> as follows:
>>>>
>>>> 1. Break down a vector instruction to scalar VEX IR ops.
>>>> 2. Break down a vector instruction to fixed-length VEX IR ops.
>>>> 3. Use dirty helpers to realize vector instructions.
>>>
>>> I would also look at adding new VEX IR ops for scalable vector
>>> instructions. In particular, if it could be shown that RVV and SVE can
>>> use the same new ops, then it could make a good argument for adding them.
>>>
>>> Perhaps interesting is whether such new scalable vector ops could also
>>> represent fixed operations on other architectures, but that is just me
>>> thinking out loud.
>>
>> It's a good idea to consolidate all vector/simd together; the challenge
>> is to verify its feasibility and to speed up the adaptation process, as
>> it is likely to take more effort and a longer time. Is there anyone
>> with knowledge or experience of other ISAs such as avx/sve on valgrind
>> who can share the pain and gain, or could we do a quick prototype?
>>
>> Thanks,
>> Fei.
>
> Hi,
>
> I don't know if my experience is the one you expect; nevertheless I will
> try to share it.

Hi Bruno,

Thank you for sharing this, it's definitely worth reading.

> I'm the main developer of a valgrind tool called verrou (url:
> https://github.com/edf-hpc/verrou ) which currently only works with the
> x86_64 architecture.
> From the user's point of view, verrou makes it possible to estimate the
> effect of floating-point rounding error propagation (if you are
> interested in the subject, there are documentation and publications).

It looks interesting, good job.

> From the valgrind tool developer's point of view, we need to replace all
> floating-point operations (fpo) with our own modified fpo implemented as
> C++ functions. One C++ function has 1, 2 or 3 floating-point input
> values and one floating-point output value.

Do you use libvex_BackEnd() to translate the insn to host, e.g.
host_riscv64_isel.c to select the host insn? Is there any difference in
the processing flow between verrou and memcheck?

> As we have to replace all VEX fpo, the way we handle SSE and AVX has
> consequences for us. For each kind of fpo (add,sub,mul,div,sqrt) x
> (float,double), we have to replace the VEX op for the following
> variants: scalar, SSE low lane, SSE, AVX. It is painful but possible via
> code generation. Thanks to the multiple VEX ops it is possible to select
> only one type of instruction (which can be useful to 1- get a speed up,
> 2- know whether floating-point errors come from scalar or vector
> instructions).
>
> On the other hand, for fma operations (madd,msub) x (float,double) we
> have less work to do, as valgrind does the un-vectorisation for us, but
> it is impossible to instrument scalar or vector ops selectively.

As these insns are un-vectorised, are there any other issues besides the 1
(performance) and 2 (original type) mentioned above? I want to be sure
whether there is any risk in the un-vectorisation design, e.g. when the
vector length is large, such as a 2k vlen on rvv.

> We could think that the multiple VEX ops would enable performance
> improvements via vectorisation of the C++ call, but this is not
> currently possible (at least to my knowledge).
> Indeed, with the valgrind API I don't know how I can get the
> floating-point values in the register without applying un-vectorisation:
> to get the values in the AVX register, I do an awful sequence of
> Iop_V256to64_0, Iop_V256to64_1, Iop_V256to64_2, Iop_V256to64_3 for the
> 2 arguments. As it is not possible to do an IRStmt_Dirty call with a
> function with 9 args (9 = 2*4+1: 2 for a binary operation, 4 for the
> vector length and 1 for the result), I do a first call to copy the 4
> values of the first arg somewhere, then a second one to perform the 4
> C++ calls.
> Due to the algorithm inside the C++ calls it could be tricky to
> vectorise, but I didn't even try because of the sequence of
> Iop_V256to64_*.

For memcheck, the process is as follows, to put it simply:

toIR -> instrumentation -> Backend isel

If the vector insn is split into scalars at the toIR stage, just as I did
in this series, the advantage looks obvious: I only need to deal with this
single stage and can leverage the existing code to handle the scalar
version. The disadvantage is that it might lose some opportunities to
optimize, e.g.:

* toIR - introduces extra temp variables for the generated scalars
* instrumentation - for memcheck, the key is to trace the V+A bits instead
  of the real results of the ops; the ideal case is that the V+A bits of
  the whole vector can be checked together w/o breaking it into scalars
* Backend isel - the ideal case is to use a vector insn on the host for a
  guest vector insn, but I'm not sure how much effort that will take

> In my dreams I would like an Iop_ to convert a V256 or V128 type to an
> aligned pointer on the floating-point args.
>
> So, I don't know if my experience can be useful for you, but if someone
> has a better solution to my needs it will be useful at least ... to me :)

Thank you again for sharing this. I hope the discussion can help both of
us, and others.

Best regards,
Fei.
> Best regards,
> Bruno Lathuilière
>
> ____________________________________________________
>
> This message and any attachments (the 'Message') are intended solely for
> the addressees. The information contained in this Message is
> confidential. Any use of information contained in this Message not in
> accord with its purpose, any dissemination or disclosure, either whole
> or partial, is prohibited except with formal approval.
>
> If you are not the addressee, you may not copy, forward, disclose or use
> any part of it. If you have received this message in error, please
> delete it and all copies from your system and notify the sender
> immediately by return message.
>
> E-mail communication cannot be guaranteed to be timely, secure, or free
> of errors or viruses.

_______________________________________________
Valgrind-developers mailing list
Val...@li...
https://lists.sourceforge.net/lists/listinfo/valgrind-developers