From: Simon S. <sim...@gn...> - 2023-06-29 08:12:57
|
Running valgrind on GnuCOBOL errors out with

  vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F
  vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
  vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
  vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
  valgrind: Unrecognised instruction at address 0x4e75f20.
     at 0x4E75F20: cob_string_init (strings.c:742)

Using vgdb there were the following details:

  (gdb) disassemble /s
  Dump of assembler code for function cob_string_init:
  ../../libcob/strings.c:
  741     {
  742       string_dst_copy = *dst;
  => 0x0000000004e75f20 <+0>:     vmovdqu64 (%rdi),%xmm0

  744       string_ptr = NULL;
     0x0000000004e75f26 <+6>:     movq   $0x0,0x244e6f(%rip)        # 0x50bada0 <string_ptr>

  742       string_dst_copy = *dst;
     0x0000000004e75f31 <+17>:    vmovaps %xmm0,0x244e47(%rip)        # 0x50bad80 <string_dst_copy>
     0x0000000004e75f39 <+25>:    mov    0x10(%rdi),%rax
     0x0000000004e75f3d <+29>:    mov    %rax,0x244e4c(%rip)        # 0x50bad90 <string_dst_copy+16>

  743       string_dst = &string_dst_copy;
     0x0000000004e75f44 <+36>:    lea    0x244e35(%rip),%rax        # 0x50bad80 <string_dst_copy>
     0x0000000004e75f4b <+43>:    mov    %rax,0x244e56(%rip)        # 0x50bada8 <string_dst>

Is there anything I can do to still run the application with valgrind, or do I need to wait for a hotfix?

Thanks,
Simon |
From: mamsds <ma...@ou...> - 2023-06-26 13:36:27
|
I just tried running the program with Valgrind on a machine with Debian 12 (bookworm), which ships valgrind-3.19.0. I did exactly the same thing and the issue is gone, so I think the issue can be considered closed.

Also, regarding the "bugs" you mentioned in the previous email: note that this project is still under development and its README.md targets Debian only. The issues you are facing arise either because it is still under development or because you are not using Debian to build it.

Alex

On Sun, 2023-06-25 at 08:34 -0700, John Reiser wrote:
> > Upgrade valgrind *TODAY*. The current version is valgrind-3.21.0.
>
> On a RaspberryPi model 3 in 32-bit mode (armhf) running Debian 11 (bullseye),
> then "apt-get install valgrind" installs valgrind-3.16.1 which is much better
> than the valgrind-3.7.0 which complained "not implemented" for the "pac" app.
>
> On the same machine (1 GiB RAM, 4 CPU), valgrind-3.22.0.GIT can be built
> from source git://sourceware.org/git/valgrind.git. It takes less than
> one hour if you invoke "make -j4" to use all 4 CPU; no dynamic paging is used. |
From: John R. <jr...@bi...> - 2023-06-25 15:35:10
|
> Upgrade valgrind *TODAY*. The current version is valgrind-3.21.0.

On a RaspberryPi model 3 in 32-bit mode (armhf) running Debian 11 (bullseye),
then "apt-get install valgrind" installs valgrind-3.16.1 which is much better
than the valgrind-3.7.0 which complained "not implemented" for the "pac" app.

On the same machine (1 GiB RAM, 4 CPU), valgrind-3.22.0.GIT can be built
from source git://sourceware.org/git/valgrind.git. It takes less than
one hour if you invoke "make -j4" to use all 4 CPU; no dynamic paging is used. |
From: John R. <jr...@bi...> - 2023-06-24 21:37:35
|
> I am using Debian on RaspberryPi and everything is from the official
> apt package manager.
>
> Hardware architecture: armv7l GNU/Linux
> OS version: Raspbian GNU/Linux 11 (bullseye)
> Libmicrohttpd: stable, 0.9.72-2 armhf
> Valgrind: valgrind-3.7.0

Upgrade valgrind *TODAY*. The current version is valgrind-3.21.0.
Valgrind-3.7.0 was released in November 2011:
commit 261bffdb4c2a52014ee10b4d68a75db0ec5834e60.
It is a waste of everyone's time to chase "not implemented" in software
that is over 11 years old and has been updated frequently since then.

re: https://github.com/alex-lt-kong/public-address-client

Fix your bugs:

1. README.md: libao-devel must be installed, else "#include ao/ao.h" fails.

2. gcc ./src/utils.c -c -O2 -Wall -pedantic -Wextra -Wc++-compat -fsanitize=address -g

   ./src/utils.c: In function ‘handle_sound_name_queue’:
   ./src/utils.c:155:75: warning: format ‘%d’ expects argument of type ‘int’, but argument 4 has type ‘size_t’ {aka ‘long unsigned int’} [-Wformat=]
     155 |     syslog(LOG_INFO, "Currently playing: [%s], current sound_queue_size: %d",
         |                                                                          ~^
         |                                                                           |
         |                                                                           int
         |                                                                          %ld
     156 |            sound_realpath, qs);
         |                            ~~
         |                            |
         |                            size_t {aka long unsigned int}
   ./src/utils.c:169:1: warning: control reaches end of non-void function [-Wreturn-type]
     169 | }
         | ^

3. libasan is required.

4. libasan must be first in the list presented to /usr/bin/ld, else it does not work correctly:
   "ASan runtime does not come first in initial library list; you should either link runtime
   to your application or manually preload it with LD_PRELOAD."

-----
pac.out: $(SRC_DIR)/main.c queue.o utils.o
	$(CC) -lasan $(SRC_DIR)/main.c queue.o utils.o -o pac.out $(CFLAGS) $(LDFLAGS) $(SANITIZER)
-----

-- |
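The two compiler warnings quoted in item 2 can be illustrated with a minimal sketch. It is not code from the project: `format_queue_status` and its parameters are hypothetical names, and `snprintf` stands in for the `syslog` call so the result can be inspected. It uses `%zu`, the C99 conversion for `size_t` (gcc's suggested `%ld` only happens to match on some ABIs), and returns a value on every path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not taken from the project: formats a status line
   into buf. Returns the number of characters written (excluding the
   terminating NUL), or -1 if formatting failed or the output was truncated. */
int format_queue_status(char *buf, size_t buf_len,
                        const char *sound_path, size_t queue_size)
{
    assert(buf != NULL && sound_path != NULL);

    /* %zu matches size_t on every conforming C99 implementation. */
    int n = snprintf(buf, buf_len,
                     "Currently playing: [%s], current sound_queue_size: %zu",
                     sound_path, queue_size);
    if (n < 0 || (size_t)n >= buf_len) {
        return -1;   /* error or truncation */
    }
    return n;        /* every control path returns a value */
}
```

With a sufficiently large buffer the function returns the length of the formatted line; with a buffer that is too small it returns -1 instead of leaving the caller with a silently truncated message.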
From: ISHIKAWA,chiaki <ish...@yk...> - 2023-06-24 20:52:57
|
On 2023/06/25 2:09, mamsds wrote:
> Hi John,
>
> I am using Debian on RaspberryPi and everything is from the official
> apt package manager.
>
> Hardware architecture: armv7l GNU/Linux
> OS version: Raspbian GNU/Linux 11 (bullseye)
> Libmicrohttpd: stable, 0.9.72-2 armhf
> Valgrind: valgrind-3.7.0
>
> I can also share the code that I was testing:
> https://github.com/alex-lt-kong/public-address-client
>
> Exact invocation and surrounding output:
>
> # valgrind --leak-check=yes --log-file=/tmp/valgrind.rpt $HOME/bin/public-address-client/pac.out
> Error accepting connection: Function not implemented
> Error accepting connection: Function not implemented
> Error accepting connection: Function not implemented
> Error accepting connection: Function not implemented
> Error accepting connection: Function not implemented
> Error accepting connection: Function not implemented
> Error accepting connection: Function not implemented
> Error accepting connection: Function not implemented
> Error accepting connection: Function not implemented
> Error accepting connection: Function not implemented
> [a lot more identical rows...]
>
> strace output:
>
> 3454 cacheflush(0x42714b70, 0x42714ca0, 0) = 0
> 3454 cacheflush(0x42714ca0, 0x42714d20, 0) = 0
> 3454 cacheflush(0x42714d20, 0x42714dec, 0) = 0
> 3454 cacheflush(0x42714df0, 0x42714f28, 0) = 0
> 3454 cacheflush(0x42714f28, 0x427150e0, 0) = 0
> 3454 getpid() = 3454
> 3454 write(1026, "==3454== \n", 10) = 10
> 3454 getpid() = 3454
> 3454 getpid() = 3454
> 3454 getpid() = 3454
> 3454 getpid() = 3454
> 3454 write(1026, "==3454== HEAP SUMMARY:\n==3454== "..., 170) = 170
> 3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV STOP], 8) = 0
> 3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
> 3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV STOP], 8) = 0
> 3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
> 3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV STOP], 8) = 0
> 3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
> [The last row repeats a lot of times]
>
> I am not very sure on how to count twenty syscalls since the beginning
> of the error though.
>
> Please let me know if you notice any issues. I will leave the mail for
> a while then I will file a report to bugs.kde.org.
>
> Thanks,
> Alex
>
> On Fri, 2023-06-23 at 12:54 -0700, John Reiser wrote:
>>> ... each time a client makes a [HTTP] request, Valgrind complains "Error
>>> accepting connection: Function not implemented" and my program fails to
>>> handle the request as a result.
>> Which versions of each of these pieces are you running:
>> Libmicrohttpd, valgrind, hardware architecture, OS?
>>
>> Please give the exact copy+paste of the invocation of valgrind that fails,
>> together with the output from Terminal that surrounds the complaint
>> "Error accepting connection: Function not implemented".
>>
>> Please run under /usr/bin/strace, and report the twenty system calls
>> that are run shortly before the complaint:
>> strace -f -o strace.out valgrind ./my_app args...
>> You may wish to compare versus the output from strace on the same command
>> but without using 'valgrind'.
>>
>> Then the best way to gain attention of valgrind *developers*
>> is to put all that info into a bug report at:
>> https://bugs.kde.org/ ,
>> and post here in the mailing list the URL of the bug report that you created.
>>

I think it may be wiser to discuss the details in the web bug report mechanism. But for now, I have a question.

- Does your program run without valgrind? (i.e., does it accept connections from outside?)

If so, please capture the syscalls in that scenario, in particular around the point where it accepts the connection. Compare those syscalls with the syscalls made when your program is run under valgrind. Then you will see where your program's behavior under valgrind deviates from the normal flow.

I *THINK* the issue is related to a possible timeout in the library which does not usually occur and may not be handled very well. I have seen some cases where the slowdown under valgrind is about 20x, and because of this the ordinary program execution is disrupted so much that the program bails out due to a timeout. This occurs quite often when testing the Thunderbird mail client, for example. Judicious use of larger timeout values has often fixed similar issues for me in the past.

Given that you need to post so much contextual information, I think filing the bug in the KDE bug reporting system would be wiser.

Chiaki |
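The suggestion above to use larger timeout values under valgrind can be sketched as follows. This is only an illustration under stated assumptions: `effective_timeout_ms` is a hypothetical helper, the caller decides how to detect that it is running under a tool (the `RUNNING_ON_VALGRIND` macro from `valgrind/valgrind.h` is one possible source for that flag), and the factor of 20 merely echoes the slowdown mentioned in the message.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: returns base_ms unchanged for a native run, or
   base_ms multiplied by slowdown when under_tool is non-zero. The result
   is clamped to INT_MAX so the multiplication cannot overflow. */
int effective_timeout_ms(int base_ms, int under_tool, int slowdown)
{
    assert(base_ms >= 0 && slowdown >= 1);

    if (!under_tool) {
        return base_ms;
    }
    if (base_ms > INT_MAX / slowdown) {
        return INT_MAX;   /* clamp instead of overflowing */
    }
    return base_ms * slowdown;
}
```

A 500 ms native timeout would become 10000 ms when called as `effective_timeout_ms(500, 1, 20)`, which gives a server loop more room before it gives up on a slow, instrumented run.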
From: mamsds <ma...@ou...> - 2023-06-24 17:09:56
|
Hi John,

I am using Debian on RaspberryPi and everything is from the official apt package manager.

Hardware architecture: armv7l GNU/Linux
OS version: Raspbian GNU/Linux 11 (bullseye)
Libmicrohttpd: stable, 0.9.72-2 armhf
Valgrind: valgrind-3.7.0

I can also share the code that I was testing:
https://github.com/alex-lt-kong/public-address-client

Exact invocation and surrounding output:

# valgrind --leak-check=yes --log-file=/tmp/valgrind.rpt $HOME/bin/public-address-client/pac.out
Error accepting connection: Function not implemented
Error accepting connection: Function not implemented
Error accepting connection: Function not implemented
Error accepting connection: Function not implemented
Error accepting connection: Function not implemented
Error accepting connection: Function not implemented
Error accepting connection: Function not implemented
Error accepting connection: Function not implemented
Error accepting connection: Function not implemented
Error accepting connection: Function not implemented
[a lot more identical rows...]

strace output:

3454 cacheflush(0x42714b70, 0x42714ca0, 0) = 0
3454 cacheflush(0x42714ca0, 0x42714d20, 0) = 0
3454 cacheflush(0x42714d20, 0x42714dec, 0) = 0
3454 cacheflush(0x42714df0, 0x42714f28, 0) = 0
3454 cacheflush(0x42714f28, 0x427150e0, 0) = 0
3454 getpid() = 3454
3454 write(1026, "==3454== \n", 10) = 10
3454 getpid() = 3454
3454 getpid() = 3454
3454 getpid() = 3454
3454 getpid() = 3454
3454 write(1026, "==3454== HEAP SUMMARY:\n==3454== "..., 170) = 170
3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV STOP], 8) = 0
3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV STOP], 8) = 0
3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
3454 rt_sigprocmask(SIG_SETMASK, NULL, ~[ILL TRAP BUS FPE KILL SEGV STOP], 8) = 0
3454 rt_sigprocmask(SIG_SETMASK, ~[ILL TRAP BUS FPE KILL SEGV STOP], NULL, 8) = 0
[The last row repeats a lot of times]

I am not very sure on how to count twenty syscalls since the beginning of the error though.

Please let me know if you notice any issues. I will leave the mail for a while then I will file a report to bugs.kde.org.

Thanks,
Alex

On Fri, 2023-06-23 at 12:54 -0700, John Reiser wrote:
> > ... each time a client makes a [HTTP] request, Valgrind complains "Error
> > accepting connection: Function not implemented" and my program fails to
> > handle the request as a result.
>
> Which versions of each of these pieces are you running:
> Libmicrohttpd, valgrind, hardware architecture, OS?
>
> Please give the exact copy+paste of the invocation of valgrind that fails,
> together with the output from Terminal that surrounds the complaint
> "Error accepting connection: Function not implemented".
>
> Please run under /usr/bin/strace, and report the twenty system calls
> that are run shortly before the complaint:
> strace -f -o strace.out valgrind ./my_app args...
> You may wish to compare versus the output from strace on the same command
> but without using 'valgrind'.
>
> Then the best way to gain attention of valgrind *developers*
> is to put all that info into a bug report at:
> https://bugs.kde.org/ ,
> and post here in the mailing list the URL of the bug report that you created.
> |
From: John R. <jr...@bi...> - 2023-06-23 20:11:25
|
> ... each time a client makes a [HTTP] request, Valgrind complains "Error
> accepting connection: Function not implemented" and my program fails to
> handle the request as a result.

Which versions of each of these pieces are you running:
Libmicrohttpd, valgrind, hardware architecture, OS?

Please give the exact copy+paste of the invocation of valgrind that fails,
together with the output from Terminal that surrounds the complaint
"Error accepting connection: Function not implemented".

Please run under /usr/bin/strace, and report the twenty system calls
that are run shortly before the complaint:
    strace -f -o strace.out valgrind ./my_app args...
You may wish to compare versus the output from strace on the same command
but without using 'valgrind'.

Then the best way to gain attention of valgrind *developers*
is to put all that info into a bug report at:
    https://bugs.kde.org/ ,
and post here in the mailing list the URL of the bug report that you created.

-- |
From: mamsds <ma...@ou...> - 2023-06-23 15:50:00
|
Hi,

I am trying to use Valgrind to check my program that uses GNU's [Libmicrohttpd](https://www.gnu.org/software/libmicrohttpd/) to handle HTTP requests.

However, each time a client makes a request, Valgrind complains "Error accepting connection: Function not implemented" and my program fails to handle the request as a result.

Is this expected? If the answer is yes, is there anything I can do to make Valgrind work?

Thanks,

Best regards,
Alex Kong |
From: Wu, F. <fe...@in...> - 2023-06-05 01:24:05
|
On 6/1/2023 7:13 PM, LATHUILIERE Bruno via Valgrind-developers wrote:
>
> -------- Original message --------
> Subject: Re: [Valgrind-developers] RFC: support scalable vector model / riscv vector
> Date: 2023-05-29 05:29
> From: "Wu, Fei" <fe...@in...>
> To: Petr Pavlu <pet...@da...>, Jojo R <rj...@gm...>
> Cc: pa...@so..., yun...@al..., val...@li..., val...@li..., zha...@al...
>
>> On 5/28/2023 1:06 AM, Petr Pavlu wrote:
>>> On 21. Apr 23 17:25, Jojo R wrote:
>>>> The last remaining big issue is 3, which we introduce some ad-hoc
>>>> approaches to deal with. We summarize these approaches into three
>>>> types as following:
>>>>
>>>> 1. Break down a vector instruction to scalar VEX IR ops.
>>>> 2. Break down a vector instruction to fixed-length VEX IR ops.
>>>> 3. Use dirty helpers to realize vector instructions.
>>>
>>> I would also look at adding new VEX IR ops for scalable vector
>>> instructions. In particular, if it could be shown that RVV and SVE can
>>> use same new ops then it could make a good argument for adding them.
>>>
>>> Perhaps interesting is if such new scalable vector ops could also
>>> represent fixed operations on other architectures, but that is just me
>>> thinking out loud.
>>>
>> It's a good idea to consolidate all vector/simd together, the challenge is to verify its feasibility and to speedup the adaption progress, as it's supposed to take more efforts and longer time. Is there anyone with knowledge or experience of other ISA such as avx/sve on valgrind who can share the pain and gain, or we can do some quick prototype?
>>
>> Thanks,
>> Fei.
>
> Hi,
>
> I don't know if my experience is the one you expect, nevertheless I will try to share it.

Hi Bruno,

Thank you for sharing this, it's definitely worth reading.

> I'm the main developer of a valgrind tool called verrou (url: https://github.com/edf-hpc/verrou ) which currently only works with x86_64 architecture.
> From user's point of view, verrou enables to estimate the effect of the floating-point rounding error propagation (If you are interested by the subject, there are documentation and publication).
>

It looks interesting, good job.

> From valgrind tool developer's point of view, we need to replace all floating-point operations (fpo) by our own modified fpo implemented with C++ functions. One C++ function has 1,2 or 3 floating point input values and one floating point output value.
>

Do you use libvex_BackEnd() to translate the insn to host, e.g. host_riscv64_isel.c to select the host insn? Is there any difference in processing flow between verrou and memcheck?

> As we have to replace all VEX fpo, the way we handle with SSE and AVX has consequences for us. For each kind of fpo (add,sub,mul,div,sqrt)x(float,double), we have to replace VEX op for the following variants : scalar, SSE low lane, SSE, AVX. It is painful but possible via code generation. Thanks to the multiple VEX ops it is possible to select only one type of instruction (it can be useful to 1- get speed up, 2- know if floating point errors come from scalar or vector instructions).
>
> On the other hand, for fma operations (madd,msub)x(float,double) we have less work to do, as valgrind do the un-vectorisation for us, but it is impossible to instrument selectively scalar or vector ops.

As these insns are un-vectorised, are there any other issues besides the 1 (performance) & 2 (original type) mentioned above? I want to make sure whether there is any risk in the un-vectorisation design, e.g. when the vector length is large such as 2k vlen on rvv.

> We could think that the multiple VEX ops enable performance improvements via the vectorisation of C++ call, but it is not now possible (at least to my knowledge). Indeed, with the valgrind API I don't know how I can get the floating-point values in the register without applying un-vectorisation : To get the values in the AVX register, I do an awful sequence of Iop_V256to64_0, Iop_V256to64_1, Iop_V256to64_2, Iop_V256to64_3 for the 2 arguments. As it is not possible to do a IRStmt_Dirty call with a function with 9 args (9=2*4+1 2 for a binary operation, 4 for the vector length and 1 for the result), I do a first call to copy the 4 values of the first arg somewhere then a second one to perform the 4 C++ calls.
> Due to the algorithm inside the C++ calls it could be tricky to vectorise, but I even didn't try because of the sequence of Iop_V256to64_*.

For memcheck, the process is as follows, to put it simply:

  toIR -> instrumentation -> Backend isel

If the vector insn is split into scalars at the toIR stage, just as I did in this series, the advantage looks obvious: I only need to deal with this single stage and can leverage the existing code to handle the scalar version. The disadvantage is that it might lose some opportunities to optimize, e.g.

* toIR - introduces extra temp variables for generated scalars
* instrumentation - for memcheck, the key is to trace the V+A bits instead of the real results of the ops; the ideal case is that the V+A bits of the whole vector can be checked together w/o breaking it into scalars
* Backend isel - the ideal case is to use the vector insn on the host for the guest vector insn, but I'm not sure how much effort it will take to achieve this.

> In my dreams I would like Iop_ to convert a V256 or V128 type to an aligned pointer on floating point args.
>
> So, I don't know if my experience can be useful for you, but if someone has a better solution to my needs it will be useful at least ... to me :)
>

Thank you again for this sharing. I hope the discussion can help both of us, and others.

Best regards,
Fei.
> Best regards,
> Bruno Lathuilière |
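The "break a vector operation down into scalar operations" approach discussed in this thread can be pictured with a small sketch in plain C. It is not VEX IR and not verrou's code: `vec256d`, `apply_lanewise` and `checked_add` are hypothetical names, and the four-lane struct merely stands in for a 256-bit register holding four doubles.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4   /* a 256-bit vector of doubles has four 64-bit lanes */

/* Hypothetical stand-in for a 256-bit vector register of doubles. */
typedef struct { double lane[LANES]; } vec256d;

/* A scalar binary floating-point operation, e.g. an instrumented add. */
typedef double (*scalar_binop)(double, double);

/* Un-vectorised evaluation: the scalar operation is applied to each pair
   of lanes in turn, so one vector op costs LANES scalar calls. */
vec256d apply_lanewise(scalar_binop op, vec256d a, vec256d b)
{
    assert(op != NULL);

    vec256d result;
    for (size_t i = 0; i < LANES; i++) {
        result.lane[i] = op(a.lane[i], b.lane[i]);
    }
    return result;
}

/* Stand-in for a modified floating-point add; a real tool would apply
   its own rounding or checking logic here. */
double checked_add(double x, double y)
{
    return x + y;
}
```

The sketch makes the trade-off in the thread visible: the scalar helper is reused unchanged, but the work grows linearly with the number of lanes, which is the concern raised for very long vectors.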
From: Mark W. <ma...@kl...> - 2023-04-29 00:15:24
|
We are pleased to announce a new release of Valgrind, version 3.21.0,
available from https://valgrind.org/downloads/current.html.

See the release notes below for details of changes.

Our thanks to all those who contribute to Valgrind's development. This
release represents a great deal of time, energy and effort on the part
of many people.

Happy and productive debugging and profiling,

-- The Valgrind Developers

~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This release supports X86/Linux, AMD64/Linux, ARM32/Linux, ARM64/Linux,
PPC32/Linux, PPC64BE/Linux, PPC64LE/Linux, S390X/Linux, MIPS32/Linux,
MIPS64/Linux, ARM/Android, ARM64/Android, MIPS32/Android, X86/Android,
X86/Solaris, AMD64/Solaris, AMD64/MacOSX 10.12, X86/FreeBSD and
AMD64/FreeBSD. There is also preliminary support for X86/macOS 10.13,
AMD64/macOS 10.13 and nanoMIPS/Linux.

* ==================== CORE CHANGES ===================

* When GDB is used to debug a program running under valgrind using
  the valgrind gdbserver, GDB will automatically load some
  python code provided in valgrind defining GDB front end commands
  corresponding to the valgrind monitor commands.
  These GDB front end commands accept the same format as
  the monitor commands directly sent to the Valgrind gdbserver.
  These GDB front end commands provide a better integration
  in the GDB command line interface, so as to use for example
  GDB auto-completion, command specific help, searching for
  a command or command help matching a regexp, ...
  For relevant monitor commands, GDB will evaluate arguments
  to make the use of monitor commands easier.
  For example, instead of having to print the address of a variable
  to pass it to a subsequent monitor command, the GDB front end
  command will evaluate the address argument.
  It is for example possible to do:
    (gdb) memcheck who_points_at &some_struct sizeof(some_struct)
  instead of:
    (gdb) p &some_struct
    $2 = (some_struct_type *) 0x1130a0 <some_struct>
    (gdb) p sizeof(some_struct)
    $3 = 40
    (gdb) monitor who_points_at 0x1130a0 40

* The vgdb utility now supports extended-remote protocol when
  invoked with --multi. In this mode the GDB run command is
  supported, which means you don't need to run gdb and valgrind
  from different terminals. So for example to start your program
  in gdb and run it under valgrind you can do:
    $ gdb prog
    (gdb) set remote exec-file prog
    (gdb) set sysroot /
    (gdb) target extended-remote | vgdb --multi
    (gdb) start

* The behaviour of realloc with a size of zero can now
  be changed for tools that intercept malloc. Those
  tools are memcheck, helgrind, drd, massif and dhat.
  Realloc implementations generally do one of two things
   - free the memory like free() and return NULL
     (GNU libc and ptmalloc).
   - either free the memory and then allocate a
     minimum sized block or just return the
     original pointer. Return NULL if the
     allocation of the minimum sized block fails
     (jemalloc, musl, snmalloc, Solaris, macOS).
  When Valgrind is configured and built it will
  try to match the OS and libc behaviour. However
  if you are using a non-default library to replace
  malloc and family (e.g., musl on a glibc Linux or
  tcmalloc on FreeBSD) then you can use a command line
  option to change the behaviour of Valgrind:
    --realloc-zero-bytes-frees=yes|no [yes on Linux glibc, no otherwise]

* ================== PLATFORM CHANGES =================

* Make the address space limit on FreeBSD amd64 128Gbytes
  (the same as Linux and Solaris, it was 32Gbytes)

* ==================== TOOL CHANGES ===================

* Memcheck:
  - When doing a delta leak_search, it is now possible to only
    output the new loss records compared to the previous leak search.
    This is available in the memcheck monitor command 'leak_search'
    by specifying the "new" keyword or in your program by using
    the client request VALGRIND_DO_NEW_LEAK_CHECK.
    Whenever a "delta" leak search is done (i.e. when specifying
    "new" or "increased" or "changed" in the monitor command),
    the new loss records have a "new" marker.
  - Valgrind now contains python code that defines GDB memcheck
    front end monitor commands. See CORE CHANGES.
  - Performs checks for the use of realloc with a size of zero.
    This is non-portable and a source of errors. If memcheck
    detects such a usage it will generate an error
      realloc() with size 0
    followed by the usual callstacks.
    A switch has been added to allow this to be turned off:
      --show-realloc-size-zero=yes|no [yes]

* Helgrind:
  - The option --history-backtrace-size=<number> allows configuring
    the number of entries to record in the stack traces of "old"
    accesses. Previously, this number was hardcoded to 8.
  - Valgrind now contains python code that defines GDB helgrind
    front end monitor commands. See CORE CHANGES.

* Cachegrind:
  - `--cache-sim=no` is now the default. The cache simulation is old and
    unlikely to match any real modern machine. This means only the `Ir`
    event is gathered by default, but that is by far the most useful
    event.
  - `cg_annotate`, `cg_diff`, and `cg_merge` have been rewritten in
    Python. As a result, they all have more flexible command line
    argument handling, e.g. supporting `--show-percs` and
    `--no-show-percs` forms as well as the existing `--show-percs=yes`
    and `--show-percs=no`.
  - `cg_annotate` has some functional changes.
    - It's much faster, e.g. 3-4x on common cases.
    - It now supports diffing (with `--diff`, `--mod-filename`, and
      `--mod-funcname`) and merging (by passing multiple data files).
    - It now provides more information at the file and function level.
      There are now "File:function" and "Function:file" sections. These
      are very useful for programs that use inlining a lot.
    - Support for user-annotated files and the `-I`/`--include` option
      has been removed, because it was of little use and blocked other
      improvements.
    - The `--auto` option is renamed `--annotate`, though the old
      `--auto=yes`/`--auto=no` forms are still supported.
  - `cg_diff` and `cg_merge` are now deprecated, because `cg_annotate`
    now does a better job of diffing and merging.
  - The Cachegrind output file format has changed very slightly, but in
    ways nobody is likely to notice.

* Callgrind:
  - Valgrind now contains python code that defines GDB callgrind
    front end monitor commands. See CORE CHANGES.

* Massif:
  - Valgrind now contains python code that defines GDB massif
    front end monitor commands. See CORE CHANGES.

* DHAT:
  - A new kind of user request has been added which allows you to
    override the 1024 byte limit on access count histograms for blocks
    of memory. The client request is DHAT_HISTOGRAM_MEMORY.

* ==================== FIXED BUGS ====================

The following bugs have been fixed or resolved. Note that "n-i-bz"
stands for "not in bugzilla" -- that is, a bug that was reported to us
but never got a bugzilla entry. We encourage you to file bugs in
bugzilla (https://bugs.kde.org/enter_bug.cgi?product=valgrind) rather
than mailing the developers (or mailing lists) directly -- bugs that
are not entered into bugzilla tend to get forgotten about or ignored.

170510  Don't warn about ioctl of size 0 without direction hint
241072  List tools in --help output
327548  false positive while destroying mutex
382034  Testcases build fixes for musl
351857  confusing error message about valid command line option
374596  inconsistent RDTSCP support on x86_64
392331  Spurious lock not held error from inside pthread_cond_timedwait
397083  Likely false positive "uninitialised value(s)" for __wmemchr_avx2 and __wmemcmp_avx2_movbe
400793  pthread_rwlock_timedwrlock false positive
419054  Unhandled syscall getcpu on arm32
433873  openat2 syscall unimplemented on Linux
434057  Add stdio mode to valgrind's gdbserver
435441  valgrind fails to interpose malloc on musl 1.2.2 due to weak symbol name and no libc soname
436413  Warn about realloc of size zero
439685  compiler warning in callgrind/main.c
444110  priv/guest_ppc_toIR.c:36198:31: warning: duplicated 'if' condition.
444487  hginfo test detects an extra lock inside data symbol "_rtld_local"
444488  Use glibc.pthread.stack_cache_size tunable
444568  drd/tests/pth_barrier_thr_cr fails on Fedora 38
445743  "The impossible happened: mutex is locked simultaneously by two threads" while using mutexes with priority inheritance and signals
449309  Missing loopback device ioctl(s)
459476  vgdb: allow address reuse to avoid "address already in use" errors
460356  s390: Sqrt32Fx4 -- cannot reduce tree
462830  WARNING: unhandled amd64-freebsd syscall: 474
463027  broken check for MPX instruction support in assembler
464103  Enhancement: add a client request to DHAT to mark memory to be histogrammed
464476  Firefox fails to start under Valgrind
464609  Valgrind memcheck should support Linux pidfd_open
464680  Show issues caused by memory policies like selinux deny_execmem
464859  Build failures with GCC-13 (drd tsan_unittest)
464969  D language demangling
465435  m_libcfile.c:66 (vgPlain_safe_fd): Assertion 'newfd >= VG_(fd_hard_limit)' failed.
466104  aligned_alloc problems, part 1
467036  Add time cost statistics for Regtest
467482  Build failure on aarch64 Alpine
467714  fdleak_* and rlimit tests fail when parent process has more than 64 descriptors opened
467839  Gdbserver: Improve compatibility of library directory name
468401  [PATCH] Add a style file for clang-format
468556  Build failure for vgdb
468606  build: remove "Valgrind relies on GCC" check/output
469097  ppc64(be) doesn't support SCV syscall instruction
n-i-bz  FreeBSD rfork syscall fail with EINVAL or ENOSYS rather than VG_(unimplemented)

To see details of a given bug, visit
  https://bugs.kde.org/show_bug.cgi?id=XXXXXX
where XXXXXX is the bug number as listed above.

* ==================== KNOWN ISSUES ===================

* configure --enable-lto=yes is known to not work in all setups. See bug 469049. Workaround: Build without LTO.
|
From: John R. <jr...@bi...> - 2023-04-25 21:28:31
|
> I would think that, regardless of cache size, the *first* access to a line > causes a miss. The PowerPC and relatives have an instruction which force-allocates (if necessary) AND zeroes an entire cache line, so in some ways the "first" access is a Write which succeeds with no miss. |
From: Eliot M. <mo...@cs...> - 2023-04-25 20:41:57
|
On 4/25/2023 3:56 PM, Volker Dirr wrote: > Hallo, > > maybe I misunderstood, but it look like I don't understand tool=cachegrind correct (or there is a bug). > > I have a software. If it runs, then the task manager tells me that is use only 38 MB memory. > > Now i used cachegrind like this: > valgrind --tool=cachegrind --LL=2097152,16,64 ./fet-cl --inputfile=German-100_and_0.fet > --randomseeds10=10 --randomseeds11=11 --randomseeds12=12 --randomseeds20=20 --randomseeds21=21 > --randomseeds22=22 > > i got this report: > [...] > ==43032== LL misses: 8,283,261 ( 7,840,142 rd + 443,119 wr) > > > i doubled the LL cache by this: > valgrind --tool=cachegrind --LL=4194304,16,64 ./fet-cl --inputfile=German-100_and_0.fet > --randomseeds10=10 --randomseeds11=11 --randomseeds12=12 --randomseeds20=20 --randomseeds21=21 > --randomseeds22=22 > > and got this: > [...] > ==48663== LL misses: 230,426 ( 89,082 rd + 141,344 wr) > > > That sound ok (larger cache should reduce the misses). > > But I continued to double the cache again and again. Up to: > valgrind --tool=cachegrind --LL=268435456,16,64 ./fet-cl --inputfile=German-100_and_0.fet > --randomseeds10=10 --randomseeds11=11 --randomseeds12=12 --randomseeds20=20 --randomseeds21=21 > --randomseeds22=22 > > So much more memory then my program use at all. > But the misses never drop down to 0. They stay at: > [...] > ==6637== LL misses: 180,120 ( 41,252 rd + 138,868 wr) > > > I don't understand that. Shouldn't the misses drop down to 0 as soon as LL is >64MB (since my > software use only 38 MB)? (But i tried up to 256MB and it doesn't drop). > Is that a bug in valgrind or is there a bug in my logic in understanding the LL misses? > > Please let me know. I would think that, regardless of cache size, the *first* access to a line causes a miss. These are sometimes called mandatory misses. A large cache *will* eliminate *capacity* misses. HTH -- Eliot Moss |
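To make the point above concrete, here is a minimal sketch of a set-associative LRU cache, parameterised like `--LL=<size>,<assoc>,<line size>`. It is not Cachegrind's simulator; the function name `count_misses` and the access trace are made up for illustration. It shows that once the cache is larger than the working set, the miss count stops falling and stays at the number of distinct lines touched (the first-access misses, more commonly called compulsory or cold misses).

```python
from collections import OrderedDict


def count_misses(addresses, size, assoc, line_size):
    """Count misses in a set-associative LRU cache (size and line_size in bytes)."""
    num_sets = size // (assoc * line_size)
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for addr in addresses:
        line = addr // line_size
        ways = sets[line % num_sets]
        if line in ways:
            ways.move_to_end(line)        # hit: mark as most recently used
        else:
            misses += 1                   # miss: compulsory, capacity or conflict
            if len(ways) == assoc:
                ways.popitem(last=False)  # evict the least recently used line
            ways[line] = True
    return misses


# Hypothetical trace: sweep a 64 KiB buffer four times, one access every 8 bytes.
trace = [a for _ in range(4) for a in range(0, 64 * 1024, 8)]
distinct_lines = len({a // 64 for a in trace})

small = count_misses(trace, size=16 * 1024, assoc=16, line_size=64)
large = count_misses(trace, size=1024 * 1024, assoc=16, line_size=64)
huge = count_misses(trace, size=16 * 1024 * 1024, assoc=16, line_size=64)

print(distinct_lines, small, large, huge)
```

With this trace the 16 KiB cache misses 4096 times (every line is evicted before the next sweep), while the 1 MiB and 16 MiB caches both miss exactly 1024 times, once per distinct line. Read the same way, the 180,120 remaining LL misses in the report above would correspond to roughly 180,120 x 64 bytes, about 11.5 MB of distinct lines touched, which is plausible for a program using 38 MB, though that interpretation is an assumption.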
From: Volker D. <u6...@ti...> - 2023-04-25 19:57:12
|
Hello,

maybe I misunderstood something, but it looks like I don't understand --tool=cachegrind correctly (or there is a bug).

I have a program. When it runs, the task manager tells me that it uses only 38 MB of memory.

I ran cachegrind like this:
valgrind --tool=cachegrind --LL=2097152,16,64 ./fet-cl --inputfile=German-100_and_0.fet --randomseeds10=10 --randomseeds11=11 --randomseeds12=12 --randomseeds20=20 --randomseeds21=21 --randomseeds22=22

I got this report:
[...]
==43032== LL misses: 8,283,261 ( 7,840,142 rd + 443,119 wr)

I doubled the LL cache like this:
valgrind --tool=cachegrind --LL=4194304,16,64 ./fet-cl --inputfile=German-100_and_0.fet --randomseeds10=10 --randomseeds11=11 --randomseeds12=12 --randomseeds20=20 --randomseeds21=21 --randomseeds22=22

and got this:
[...]
==48663== LL misses: 230,426 ( 89,082 rd + 141,344 wr)

That sounds OK (a larger cache should reduce the misses).

But I continued to double the cache again and again, up to:
valgrind --tool=cachegrind --LL=268435456,16,64 ./fet-cl --inputfile=German-100_and_0.fet --randomseeds10=10 --randomseeds11=11 --randomseeds12=12 --randomseeds20=20 --randomseeds21=21 --randomseeds22=22

That is much more memory than my program uses at all. But the misses never drop down to 0. They stay at:
[...]
==6637== LL misses: 180,120 ( 41,252 rd + 138,868 wr)

I don't understand that. Shouldn't the misses drop down to 0 as soon as LL is >64MB (since my software uses only 38 MB)? (But I tried up to 256MB and they don't drop.)
Is that a bug in valgrind, or is there a bug in my understanding of the LL misses?

Please let me know.
Thank you!
|
From: Paul F. <pj...@wa...> - 2023-04-23 12:25:19
|
On 23-04-23 13:58, Paul Floyd wrote: > > > Still to come: Alpine/musl. It configures. It all builds. It's not great. == 619 tests, 102 stderr failures, 27 stdout failures, 1 stderrB failure, 2 stdoutB failures, 4 post failures == But that's good enough for me :-) A+ Paul |
From: Paul F. <pj...@wa...> - 2023-04-23 11:59:06
|
On 04/22/23 11:41 PM, Paul Floyd wrote: > > Nothing bad to report. I'll do one more test on Solaris 11.3 tomorrow. > Solaris 11.3 amd64 No python3, and I only have python3.4 (and no 'python3' metapackage either) on my machine that is long out of service contract. So I got some moans about that. No aligned_alloc ! The memalign wrapper also wasn't correct for an alignment of zero. I just fixed that. The changes should only affect non-Linux platforms. I'll try to do one more round of tests with these changes from git head. Still to come: Alpine/musl. A+ Paul |
From: Paul F. <pj...@wa...> - 2023-04-22 21:53:47
|
On 22-04-23 23:41, Paul Floyd wrote: > > OpenIndiana 22.10 > > Everything builds. All the gdbserver tests hang. And I should have said the hangs aren't new. A+ Paul |
From: Paul F. <pj...@wa...> - 2023-04-22 21:41:29
|
Nothing bad to report. I'll do one more test on Solaris 11.3 tomorrow.

FreeBSD 13.1 amd64
No change

FreeBSD 13.2 amd64
I get a hang in drd/tests/pth_cancel_locked and a few more fails, most likely related to switching from clang 13 to 14.

FreeBSD 13.2 x86
As for amd64 but also a new hang in drd/tests/swapcontext

OpenIndiana 22.10
Everything builds. All the gdbserver tests hang.
== 832 tests, 82 stderr failures, 19 stdout failures, 15 stderrB failures, 17 stdoutB failures, 6 post failures ==

macOS 10.13
Everything builds but beyond that pretty grim.
== 706 tests, 322 stderr failures, 91 stdout failures, 0 stderrB failures, 0 stdoutB failures, 32 post failures ==

A+
Paul
|
From: Mark W. <ma...@kl...> - 2023-04-22 01:47:41
|
An RC2 tarball for 3.21.0 is now available at
https://sourceware.org/pub/valgrind/valgrind-3.21.0.RC2.tar.bz2
(md5sum = f33407fdffbfa78f5014781cc92297cf)
(sha1sum = c520ee0c28d9e20d28aa25d05ce2525c39a69135)
https://sourceware.org/pub/valgrind/valgrind-3.21.0.RC2.tar.bz2.asc

Please give it a try in configurations that are important for you and report any problems you have, either on this mailing list, or (preferably) via our bug tracker at
https://bugs.kde.org/enter_bug.cgi?product=valgrind

Please check the NEWS entry below for new features that could use some extra testing. Note that there has also been a dhat extension which hasn't yet been added to NEWS. There is now a client request for DHAT to mark memory to be histogrammed:
https://bugs.kde.org/464103
https://snapshots.sourceware.org/valgrind/trunk/latest/html/dh-manual.html#dh-access-counts

If nothing critical emerges, a final release will happen on Friday 28 April.

* ==================== CORE CHANGES ===================

* When GDB is used to debug a program running under valgrind using the valgrind gdbserver, GDB will automatically load some python code provided in valgrind defining GDB front end commands corresponding to the valgrind monitor commands. These GDB front end commands accept the same format as the monitor commands directly sent to the Valgrind gdbserver. These GDB front end commands provide a better integration in the GDB command line interface, so as to use for example GDB auto-completion, command specific help, searching for a command or command help matching a regexp, ... For relevant monitor commands, GDB will evaluate arguments to make the use of monitor commands easier. For example, instead of having to print the address of a variable to pass it to a subsequent monitor command, the GDB front end command will evaluate the address argument. It is for example possible to do:
    (gdb) memcheck who_points_at &some_struct sizeof(some_struct)
  instead of:
    (gdb) p &some_struct
    $2 = (some_struct_type *) 0x1130a0 <some_struct>
    (gdb) p sizeof(some_struct)
    $3 = 40
    (gdb) monitor who_points_at 0x1130a0 40

* The vgdb utility now supports extended-remote protocol when invoked with --multi. In this mode the GDB run command is supported, which means you don't need to run gdb and valgrind from different terminals. So for example to start your program in gdb and run it under valgrind you can do:
    $ gdb prog
    (gdb) set remote exec-file prog
    (gdb) set sysroot /
    (gdb) target extended-remote | vgdb --multi
    (gdb) start

* The behaviour of realloc with a size of zero can now be changed for tools that intercept malloc. Those tools are memcheck, helgrind, drd, massif and dhat. Realloc implementations generally do one of two things:
  - free the memory like free() and return NULL (GNU libc and ptmalloc).
  - either free the memory and then allocate a minimum sized block, or just return the original pointer. Return NULL if the allocation of the minimum sized block fails (jemalloc, musl, snmalloc, Solaris, macOS).
  When Valgrind is configured and built it will try to match the OS and libc behaviour. However if you are using a non-default library to replace malloc and family (e.g., musl on a glibc Linux or tcmalloc on FreeBSD) then you can use a command line option to change the behaviour of Valgrind:
    --realloc-zero-bytes-frees=yes|no [yes on Linux glibc, no otherwise]

* ================== PLATFORM CHANGES =================

* Make the address space limit on FreeBSD amd64 128Gbytes (the same as Linux and Solaris, it was 32Gbytes)

* ==================== TOOL CHANGES ===================

* Memcheck:
  - When doing a delta leak_search, it is now possible to only output the new loss records compared to the previous leak search.
This is available in the memcheck monitor command 'leak_search' by specifying the "new" keyword, or in your program by using the client request VALGRIND_DO_NEW_LEAK_CHECK. Whenever a "delta" leak search is done (i.e. when specifying "new", "increased" or "changed" in the monitor command), the new loss records have a "new" marker.
  - Valgrind now contains python code that defines GDB memcheck front end monitor commands. See CORE CHANGES.
  - Memcheck now checks for the use of realloc with a size of zero. This is non-portable and a source of errors. If memcheck detects such a usage it will generate an error
      realloc() with size 0
    followed by the usual callstacks. A switch has been added to allow this to be turned off:
      --show-realloc-size-zero=yes|no [yes]

* Helgrind:
  - The option --history-backtrace-size=<number> configures the number of entries to record in the stack traces of "old" accesses. Previously, this number was hardcoded to 8.
  - Valgrind now contains python code that defines GDB helgrind front end monitor commands. See CORE CHANGES.

* Cachegrind:
  - `--cache-sim=no` is now the default. The cache simulation is old and unlikely to match any real modern machine. This means only the `Ir` events are gathered by default, but that is by far the most useful event.
  - `cg_annotate`, `cg_diff`, and `cg_merge` have been rewritten in Python. As a result, they all have more flexible command line argument handling, e.g. supporting `--show-percs` and `--no-show-percs` forms as well as the existing `--show-percs=yes` and `--show-percs=no`.
  - `cg_annotate` has some functional changes.
    - It's much faster, e.g. 3-4x on common cases.
    - It now supports diffing (with `--diff`, `--mod-filename`, and `--mod-funcname`) and merging (by passing multiple data files).
    - It now provides more information at the file and function level. There are now "File:function" and "Function:file" sections. These are very useful for programs that use inlining a lot.
- Support for user-annotated files and the `-I`/`--include` option has been removed, because it was of little use and blocked other improvements. - The `--auto` option is renamed `--annotate`, though the old `--auto=yes`/`--auto=no` forms are still supported. - `cg_diff` and `cg_merge` are now deprecated, because `cg_annotate` now does a better job of diffing and merging. - The Cachegrind output file format has changed very slightly, but in ways nobody is likely to notice. * Callgrind: - Valgrind now contains python code that defines GDB callgrind front end monitor commands. See CORE CHANGES. * Massif: - Valgrind now contains python code that defines GDB massif front end monitor commands. See CORE CHANGES. |
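The NEWS entry above says the rewritten scripts accept `--show-percs` and `--no-show-percs` as well as the older `--show-percs=yes` and `--show-percs=no`. One way to get that behaviour with Python's standard `argparse` is sketched below. This is only an illustration, not the actual `cg_annotate` source; the helper `add_yes_no_flag`, the `parse` function and the program name `annotate-sketch` are hypothetical.

```python
import argparse


def add_yes_no_flag(parser, name, default):
    """Register --NAME, --no-NAME, --NAME=yes and --NAME=no for one boolean."""
    dest = name.replace("-", "_")
    # Option strings containing "=" are matched literally by argparse, because
    # an exact match is tried before any "key=value" splitting takes place.
    for flag, action in [
        (f"--{name}", "store_true"),
        (f"--no-{name}", "store_false"),
        (f"--{name}=yes", "store_true"),
        (f"--{name}=no", "store_false"),
    ]:
        parser.add_argument(flag, dest=dest, action=action, default=default)


def parse(argv):
    # allow_abbrev=False keeps prefixes such as "--show" from being accepted.
    parser = argparse.ArgumentParser(prog="annotate-sketch", allow_abbrev=False)
    add_yes_no_flag(parser, "show-percs", default=True)
    parser.add_argument("data_files", nargs="+")
    return parser.parse_args(argv)


if __name__ == "__main__":
    print(parse(["--no-show-percs", "cachegrind.out.12345"]))
```

All four spellings write to the same destination, so the last one given on the command line wins, and passing several data files mirrors the merging use described above.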
From: Mark W. <ma...@kl...> - 2023-04-20 20:16:42
|
On Wed, Apr 19, 2023 at 11:46:34AM +0200, folkert wrote:
> > > The 2 calls it does are:
> > >
> > > print_char:
> > >     movb (%esi), %al
> > >     movb %al, buffer
> > >     movl $4, %eax
> > >     movl $1, %ebx
> > >     movl $buffer, %ecx
> > >     movl $1, %edx
> > >     int $0x80
> > >     ret
> > >
> > > exit:
> > >     movl $1, %eax
> > >     movl $0, %ebx
> > >     int $0x80
> >
> > Valgrind can't run just any executable binary. It has quite a lot of hard
> > coded limitations that correspond (mostly) to what compilers and link
> > editors will produce. So if you use assembler and use opcodes not normally
> > generated by compilers then it won't work.
> ...
> > So int 0x80 results in a decode error.
> >
> > Can you use syscall?
>
> That solves the problem.

Glad that resolved it. I also didn't know int 0x80 worked on amd64 as a syscall (but not under valgrind). See also https://bugs.kde.org/show_bug.cgi?id=342988

Cheers,
Mark
|
From: folkert <fo...@va...> - 2023-04-19 09:46:48
|
> > The 2 calls it does are:
> >
> > print_char:
> >     movb (%esi), %al
> >     movb %al, buffer
> >     movl $4, %eax
> >     movl $1, %ebx
> >     movl $buffer, %ecx
> >     movl $1, %edx
> >     int $0x80
> >     ret
> >
> > exit:
> >     movl $1, %eax
> >     movl $0, %ebx
> >     int $0x80
>
> Valgrind can't run just any executable binary. It has quite a lot of hard
> coded limitations that correspond (mostly) to what compilers and link
> editors will produce. So if you use assembler and use opcodes not normally
> generated by compilers then it won't work.
...
> So int 0x80 results in a decode error.
>
> Can you use syscall?

That solves the problem. Thanks!
|
From: Floyd, P. <pj...@wa...> - 2023-04-18 16:34:19
|
On 18/04/2023 17:46, folkert wrote:
> The 2 calls it does are:
>
> print_char:
>     movb (%esi), %al
>     movb %al, buffer
>     movl $4, %eax
>     movl $1, %ebx
>     movl $buffer, %ecx
>     movl $1, %edx
>     int $0x80
>     ret
>
> exit:
>     movl $1, %eax
>     movl $0, %ebx
>     int $0x80

Valgrind can't run just any executable binary. It has quite a lot of hard coded limitations that correspond (mostly) to what compilers and link editors will produce. So if you use assembler and use opcodes not normally generated by compilers then it won't work.

The code that handles this is

   case 0xCD: /* INT imm8 */
      d64 = getUChar(delta); delta++;

      /* Handle int $0xD2 (Solaris fasttrap syscalls). */
      if (d64 == 0xD2) {
         jmp_lit(dres, Ijk_Sys_int210, guest_RIP_bbstart + delta);
         vassert(dres->whatNext == Dis_StopHere);
         DIP("int $0xD2\n");
         return delta;
      }
      goto decode_failure;

So int 0x80 results in a decode error.

Can you use syscall?

A+
Paul
|
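Switching from `int $0x80` to `syscall` is not a one-instruction change: on Linux the two entry points use different system call numbers and different argument registers. The sketch below shows the mapping for the two calls in this thread (write and exit). The table and the helper `to_syscall_abi` are illustrative names only; the numbers and register orders are those of the Linux i386 and x86-64 ABIs.

```python
# i386 (int $0x80) number -> (name, x86-64 syscall number) for the two calls used here.
I386_TO_X86_64 = {4: ("write", 1), 1: ("exit", 60)}

# Argument registers, in order, for each entry convention.
INT80_ARG_REGS = ["ebx", "ecx", "edx", "esi", "edi", "ebp"]
SYSCALL_ARG_REGS = ["rdi", "rsi", "rdx", "r10", "r8", "r9"]


def to_syscall_abi(regs):
    """Map an int $0x80 register setup onto the x86-64 syscall convention."""
    name, number = I386_TO_X86_64[regs["eax"]]
    out = {"rax": number}  # the call number goes in rax instead of eax
    for old, new in zip(INT80_ARG_REGS, SYSCALL_ARG_REGS):
        if old in regs:
            out[new] = regs[old]
    return name, out


# print_char: write(1, buffer, 1)
print(to_syscall_abi({"eax": 4, "ebx": 1, "ecx": "buffer", "edx": 1}))
# exit: exit(0)
print(to_syscall_abi({"eax": 1, "ebx": 0}))
```

Keep in mind that the `syscall` instruction itself clobbers rcx and r11, so generated code that keeps live values in those registers has to save them first.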
From: Eliot M. <mo...@cs...> - 2023-04-18 15:50:31
|
On 4/18/2023 10:51 AM, folkert wrote: > Hi, > > I wrote a compiler for brainfuck to x86. > The result is quite fast but I was curious if I could tune it even more. > So I ran it in callgrind but this resulted in: > > folkert@snsv ~/Projects/bf-compiler (master)$ valgrind --tool=callgrind ./test > ==77043== Callgrind, a call-graph generating cache profiler > ==77043== Copyright (C) 2002-2017, and GNU GPL'd, by Josef Weidendorfer et al. > ==77043== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info > ==77043== Command: ./test > ==77043== > ==77043== For interactive control, run 'callgrind_control -h'. > vex amd64->IR: unhandled instruction bytes: 0xCD 0x80 0xC3 0x67 0x80 0x3E 0x0 0x74 0x5 0x83 > vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0 > vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE > vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0 > ==77043== valgrind: Unrecognised instruction at address 0x40274e. > ==77043== at 0x40274E: ??? (in /home/folkert/Projects/bf-compiler/test) > ==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test) > ==77043== Your program just tried to execute an instruction that Valgrind > ==77043== did not recognise. There are two possible reasons for this. > ==77043== 1. Your program has a bug and erroneously jumped to a non-code > ==77043== location. If you are running Memcheck and you just saw a > ==77043== warning about a bad jump, it's probably your program's fault. > ==77043== 2. The instruction is legitimate but Valgrind doesn't handle it, > ==77043== i.e. it's Valgrind's fault. If you think this is the case or > ==77043== you are not sure, please let us know and we'll try to fix it. > ==77043== Either way, Valgrind will now raise a SIGILL signal which will > ==77043== probably kill your program. > ==77043== > ==77043== Process terminating with default action of signal 4 (SIGILL) > ==77043== Illegal opcode at address 0x40274E > ==77043== at 0x40274E: ??? 
(in /home/folkert/Projects/bf-compiler/test) > ==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test) > ==77043== > ==77043== Events : Ir > ==77043== Collected : 28836 > ==77043== > ==77043== I refs: 28,836 > Illegal instruction (core dumped) > > If you're curious what is going wrong here, the source assembly and the > x86 binary can be retrieved from > https://vanheusden.com/permshare/callgrind-error.tar.xz > > Oh and if you would like to assemble the assembly yourself: > > as -g mandelbrot.s > ld -g a.out -o test > > ./test then results in the mandelbrot-fractal. Using an online disassembler, I found that the initial bytes decode to int 0x80, which (under Linux) is a system call. Maybe you're making a system call that valgrind does not recognize? One would need to know register contents to go further with that. Btw, naming a program "test" is not necessarily a wonderful idea if the current directory happens to be on your path, since "test" is a program often used by scripts. Cheers - Eliot Moss |
From: folkert <fo...@va...> - 2023-04-18 15:46:41
|
> > I wrote a compiler for brainfuck to x86.
> > The result is quite fast but I was curious if I could tune it even more.
> > So I ran it in callgrind but this resulted in:
...
> > ==77043== Process terminating with default action of signal 4 (SIGILL)
> > ==77043== Illegal opcode at address 0x40274E
> > ==77043== at 0x40274E: ??? (in /home/folkert/Projects/bf-compiler/test)
> > ==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test)
...
> > If you're curious what is going wrong here, the source assembly and the
> > x86 binary can be retrieved from
> > https://vanheusden.com/permshare/callgrind-error.tar.xz
...
> Using an online disassembler, I found that the initial bytes decode to
> int 0x80, which (under Linux) is a system call. Maybe you're making a
> system call that valgrind does not recognize? One would need to know
> register contents to go further with that.

The 2 calls it does are:

print_char:
    movb (%esi), %al
    movb %al, buffer
    movl $4, %eax
    movl $1, %ebx
    movl $buffer, %ecx
    movl $1, %edx
    int $0x80
    ret

exit:
    movl $1, %eax
    movl $0, %ebx
    int $0x80

When the program is run directly from the command line, it runs fine. So that's not the problem.
|
From: folkert <fo...@va...> - 2023-04-18 15:12:47
|
Hi,

I wrote a compiler for brainfuck to x86.
The result is quite fast but I was curious if I could tune it even more.
So I ran it in callgrind but this resulted in:

folkert@snsv ~/Projects/bf-compiler (master)$ valgrind --tool=callgrind ./test
==77043== Callgrind, a call-graph generating cache profiler
==77043== Copyright (C) 2002-2017, and GNU GPL'd, by Josef Weidendorfer et al.
==77043== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==77043== Command: ./test
==77043==
==77043== For interactive control, run 'callgrind_control -h'.
vex amd64->IR: unhandled instruction bytes: 0xCD 0x80 0xC3 0x67 0x80 0x3E 0x0 0x74 0x5 0x83
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==77043== valgrind: Unrecognised instruction at address 0x40274e.
==77043== at 0x40274E: ??? (in /home/folkert/Projects/bf-compiler/test)
==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test)
==77043== Your program just tried to execute an instruction that Valgrind
==77043== did not recognise. There are two possible reasons for this.
==77043== 1. Your program has a bug and erroneously jumped to a non-code
==77043== location. If you are running Memcheck and you just saw a
==77043== warning about a bad jump, it's probably your program's fault.
==77043== 2. The instruction is legitimate but Valgrind doesn't handle it,
==77043== i.e. it's Valgrind's fault. If you think this is the case or
==77043== you are not sure, please let us know and we'll try to fix it.
==77043== Either way, Valgrind will now raise a SIGILL signal which will
==77043== probably kill your program.
==77043==
==77043== Process terminating with default action of signal 4 (SIGILL)
==77043== Illegal opcode at address 0x40274E
==77043== at 0x40274E: ??? (in /home/folkert/Projects/bf-compiler/test)
==77043== by 0x4020EE: ??? (in /home/folkert/Projects/bf-compiler/test)
==77043==
==77043== Events : Ir
==77043== Collected : 28836
==77043==
==77043== I refs: 28,836
Illegal instruction (core dumped)

If you're curious what is going wrong here, the source assembly and the x86 binary can be retrieved from
https://vanheusden.com/permshare/callgrind-error.tar.xz

Oh and if you would like to assemble the assembly yourself:

as -g mandelbrot.s
ld -g a.out -o test

./test then results in the mandelbrot-fractal.

Regards,
Folkert van Heusden
|
From: Nicholas N. <n.n...@gm...> - 2023-04-17 07:20:13
|
I am planning to also remove the `-I`/`--include` option from cg_annotate, for much the same reasons that I removed user annotated files: it's an option that made sense in the very early days of cg_annotate, but is of little or no use today, and it's getting in the way of some other changes I want to make. Nick On Tue, 4 Apr 2023 at 15:52, Nicholas Nethercote <n.n...@gm...> wrote: > There were no objections, and I have now removed user annotations from > `cg_annotate`. > > Nick > > On Wed, 29 Mar 2023 at 09:03, Nicholas Nethercote <n.n...@gm...> > wrote: > >> Hi, >> >> I recently rewrote `cg_annotate`, `cg_diff`, and `cg_merge` in Python. >> The old versions were written in Perl, Perl, and C, respectively. The new >> versions are much nicer and easier to modify, and I have various ideas for >> improving `cg_annotate`. This email is about one of those ideas. >> >> A typical way to invoke `cg_annotate` is like this: >> >> > cg_annotate cachegrind.out.12345 >> >> This implies `--auto=yes`, which requests line-by-line "auto-annotation" >> of source files. I.e. `cg_annotate` will automatically annotate all files >> in the profile that meet the significance threshold. >> >> It's also possible to do something like this: >> >> > cg_annotate --auto=no cachegrind.out.12345 a.c b.c >> >> Which instead requests "user annotation" of the files `a.c` and `b.c`. >> >> My thesis is that auto-annotation suffices in practice for all reasonable >> use cases, and that user annotation is unnecessary and can be removed. >> >> When I first wrote `cg_annotate` in 2002, only user annotation was >> implemented. Shortly after, I added the `--auto={yes,no}` option. Since >> then I've never used user annotation, and I suspect nobody else has either. >> User annotation is ok when dealing with tiny programs, but as soon as you >> are profiling a program with more than a handful of source files it becomes >> impractical. 
>> >> The only possible use cases I can think of for user annotation are as >> follows. >> >> - If you want to see a particular file(s) annotated but you don't >> want to see any others, then you can use user annotation in combination >> with `--auto=no`. But it's trivial to search through the output for the >> particular file, so this doesn't seem important. >> - If the path to a file is somehow really messed up in the debug >> info, it might be possible that auto-annotation would fail to find it, but >> user annotation could find it, possibly in combination with `-I`. But this >> seems unlikely. Some basic testing shows that gcc, clang and rustc all >> default to using full paths in debug info. gcc supports >> `-fdebug-prefix-map` but that seems to mostly be used for changing full >> paths to relative paths, which will still work fine. >> >> Removing user annotation would (a) simplify the code and docs, and (b) >> enable the possibility of moving the merge functionality from `cg_merge` >> into `cg_annotate`, by allowing the user to specify multiple cachegrind.out >> files as input. >> >> So: is anybody using user annotation? Does anybody see any problems with >> this proposal? >> >> Thanks. >> >> Nick >> > |