|
From: Julian S. <js...@ac...> - 2020-05-26 08:38:41
|
You can't easily avoid this problem, because it occurs in a system library,
not in your own code:

==306851== valgrind: Unrecognised instruction at address 0x6ddf581.
==306851==    at 0x6DDF581: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)

Possibly your least-worst option is to talk with the people who
built/installed OpenMPI on the machine, to see if you can get a build that
doesn't use AVX512 instructions.

J |
|
From: Patrick B. <Pat...@le...> - 2020-05-26 08:26:26
|
Hi Tom,

I'm a new user of Valgrind. I need it to check a large MPI code, so I
downloaded version 3.15, but even though the hardware and software are 2 to 3
years old, valgrind doesn't work for me. Neither gcc7, nor OpenMPI, nor my
application (even the small test) was built with any specific options.

If this unsupported instruction is the problem (I do not know what an EVEX
prefix is, sorry), how can I avoid it so that I can use valgrind? I was just
thinking that 3.16 could solve my problem...

Patrick

On 26/05/2020 at 10:12, Tom Hughes wrote:
> On 26/05/2020 09:06, Patrick Bégou wrote:
>
>> valgrind-3.16.0.RC2 doesn't work for me (as previous version on this
>> server).
>
> Are you saying that it fails on a binary that worked before?
>
>> vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F
>> 0x5 0x25 0xA8 0x18 0x0
>> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
>> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
>> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
>> ==306850== valgrind: Unrecognised instruction at address 0x6ddf581.
>> vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F
>> 0x5 0x25 0xA8 0x18 0x0
>> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
>> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
>> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
>
> Because this is an instruction with an EVEX prefix that is not
> supported by any version of valgrind ever so I don't see how this
> binary can have worked with the previous version of valgrind.
>
> I suspect that you have in fact recompiled the program with
> a different compiler or different optimization settings since
> the time when it worked?
>
> Tom
> |
|
From: Tom H. <to...@co...> - 2020-05-26 08:12:57
|
On 26/05/2020 09:06, Patrick Bégou wrote:

> valgrind-3.16.0.RC2 doesn't work for me (as previous version on this
> server).

Are you saying that it fails on a binary that worked before?

> vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5
> 0x25 0xA8 0x18 0x0
> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
> ==306850== valgrind: Unrecognised instruction at address 0x6ddf581.
> vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5
> 0x25 0xA8 0x18 0x0
> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0

Because this is an instruction with an EVEX prefix that is not
supported by any version of valgrind ever so I don't see how this
binary can have worked with the previous version of valgrind.

I suspect that you have in fact recompiled the program with
a different compiler or different optimization settings since
the time when it worked?

Tom

-- 
Tom Hughes (to...@co...)
http://compton.nu/ |
|
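Editor's note: since the thread asks what an EVEX prefix is, here is a minimal sketch in C of how the leading bytes from the log (0x62 0xF1 0xFD 0x08 0x6F ...) can be picked apart. The struct and function names are made up for illustration and are not part of Valgrind or VEX; the field layout follows the original AVX-512 encoding as I understand it, and the sketch ignores the case where the vector-length bits are reused for rounding control.

```c
#include <stddef.h>
#include <stdint.h>

/* Selected fields of a 4-byte EVEX prefix (0x62 plus three payload bytes).
   Hypothetical helper type, for illustration only. */
struct evex_fields {
    unsigned map;      /* opcode map: 1 = 0F, 2 = 0F38, 3 = 0F3A */
    unsigned w;        /* EVEX.W bit */
    unsigned pp;       /* implied legacy prefix: 0 = none, 1 = 66, 2 = F3, 3 = F2 */
    unsigned vec_bits; /* vector length from EVEX.L'L: 128, 256, 512; 0 if reserved */
    unsigned opcode;   /* opcode byte that follows the prefix */
};

/* Returns 1 and fills *out if buf looks like it starts with an EVEX prefix,
   otherwise 0. Assumes 64-bit code, where a leading 0x62 is not BOUND. */
int evex_decode(const uint8_t *buf, size_t len, struct evex_fields *out)
{
    if (len < 5 || buf[0] != 0x62)
        return 0;

    uint8_t p0 = buf[1], p1 = buf[2], p2 = buf[3];
    if ((p1 & 0x04) == 0)              /* this bit is fixed to 1 in EVEX */
        return 0;

    unsigned ll = (p2 >> 5) & 0x03;    /* EVEX.L'L */
    out->map = p0 & 0x03;
    out->w = (p1 >> 7) & 1;
    out->pp = p1 & 0x03;
    out->vec_bits = (ll == 3) ? 0 : (128u << ll);
    out->opcode = buf[4];
    return 1;
}
```

For the bytes in the log this yields map 0F, an implied 66 prefix, W=1, a 128-bit vector length and opcode 0x6F, which as far as I can tell corresponds to a VMOVDQA64 load on an xmm register, that is, an AVX-512 (F plus VL) instruction. That is consistent with Tom's diagnosis and with Julian's suggestion to get an OpenMPI build without AVX512.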
From: Patrick B. <Pat...@le...> - 2020-05-26 08:07:02
|
On 19/05/2020 at 14:32, Julian Seward wrote:
>
> Greetings.
>
> A first release candidate for 3.16.0 is available at
> https://sourceware.org/pub/valgrind/valgrind-3.16.0.RC2.tar.bz2
> (md5 = 21ac87434ed32bcfe5ea86a0978440ba)
>
> Please give it a try on platforms that are important for you. If no
> serious issues are reported, the 3.16.0 final release will happen on
> 25 May, that is, next Monday.
>
> J

Hi all,

valgrind-3.16.0.RC2 doesn't work for me (like the previous version on this
server).

My Fortran test program (error-free, I think) is as simple as:

    PROGRAM reduce
      USE mpi
      IMPLICIT NONE
      INTEGER :: me, ncpus, ierr
      REAL :: buff, resu=0

      CALL MPI_INIT(ierr)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD,me,ierr)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD,ncpus,ierr)
      buff=1
      CALL MPI_ALLREDUCE(buff,resu,1,MPI_REAL,MPI_SUM,MPI_COMM_WORLD,ierr)
      if (me == 0 ) WRITE(6,'(a,i0,2(a,f14.6))') 'On ',me,' I have ',buff,' and got ',resu
      CALL MPI_FINALIZE(ierr)
    END PROGRAM reduce

Compilation with:

    mpifort reduce.F90 -o reduce

    mpifort --show
    /opt/GCC73/bin/gfortran -I/opt/openmpi-GCC73/v3.1.x-20181010/include -pthread -I/opt/openmpi-GCC73/v3.1.x-20181010/lib -Wl,-rpath -Wl,/opt/openmpi-GCC73/v3.1.x-20181010/lib -Wl,--enable-new-dtags -L/opt/openmpi-GCC73/v3.1.x-20181010/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi

OS is CentOS Linux release 7.7.1908 (Core)

Valgrind compiled with gcc7.3, configure options are:

    ./configure --enable-only64bit --with-mpicc=$(which mpicc) --prefix=/robin/data/begou/VALGRIND/valgrind-binaries

Hardware is:

    Dell Poweredge R940
    4 x Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz (launched Q3 2017)
    1.5 TB of RAM

Compiler is gcc (GCC) 7.3.0 (January 2018)

OpenMPI is v3.1 from the git repo as of 2018/10/10 (because a patch was needed
at that time), compiled with gcc7.3.0. Configure options are:

    '--prefix=/opt/openmpi-GCC73/v3.1.x-20181010' '--enable-mpirun-prefix-by-default' '--disable-dlopen' '--enable-mca-no-build=openib' '--without-verbs' '--enable-mpi-cxx' '--without-slurm' '--enable-mpi-thread-multiple'

Error is:

[begou@grivola TESTS]$ valgrind --version
valgrind-3.16.0.RC2
[begou@grivola TESTS]$ mpirun -np 2 valgrind ./reduce
==306850== Memcheck, a memory error detector
==306850== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==306850== Using Valgrind-3.16.0.RC2 and LibVEX; rerun with -h for copyright info
==306850== Command: ./reduce
==306850==
==306851== Memcheck, a memory error detector
==306851== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==306851== Using Valgrind-3.16.0.RC2 and LibVEX; rerun with -h for copyright info
==306851== Command: ./reduce
==306851==
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5 0x25 0xA8 0x18 0x0
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==306850== valgrind: Unrecognised instruction at address 0x6ddf581.
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5 0x25 0xA8 0x18 0x0
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==306851== valgrind: Unrecognised instruction at address 0x6ddf581.
==306851==    at 0x6DDF581: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x6E01A78: mca_base_var_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x6DE3E39: opal_init_util (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x552ED60: ompi_mpi_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306851==    by 0x555F9ED: PMPI_Init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306851==    by 0x52ABBB7: PMPI_INIT (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi_mpifh.so.40.11.2)
==306851==    by 0x400CDD: MAIN__ (in /HA/sources/begou/TESTS/reduce)
==306851==    by 0x400E8C: main (in /HA/sources/begou/TESTS/reduce)
==306851== Your program just tried to execute an instruction that Valgrind
==306851== did not recognise. There are two possible reasons for this.
==306851== 1. Your program has a bug and erroneously jumped to a non-code
==306851==    location. If you are running Memcheck and you just saw a
==306851==    warning about a bad jump, it's probably your program's fault.
==306851== 2. The instruction is legitimate but Valgrind doesn't handle it,
==306851==    i.e. it's Valgrind's fault. If you think this is the case or
==306851==    you are not sure, please let us know and we'll try to fix it.
==306851== Either way, Valgrind will now raise a SIGILL signal which will
==306851== probably kill your program.
==306850==    at 0x6DDF581: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x6E01A78: mca_base_var_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x6DE3E39: opal_init_util (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x552ED60: ompi_mpi_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306850==    by 0x555F9ED: PMPI_Init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306850==    by 0x52ABBB7: PMPI_INIT (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi_mpifh.so.40.11.2)
==306850==    by 0x400CDD: MAIN__ (in /HA/sources/begou/TESTS/reduce)
==306850==    by 0x400E8C: main (in /HA/sources/begou/TESTS/reduce)
==306850== Your program just tried to execute an instruction that Valgrind
==306850== did not recognise. There are two possible reasons for this.
==306850== 1. Your program has a bug and erroneously jumped to a non-code
==306850==    location. If you are running Memcheck and you just saw a
==306850==    warning about a bad jump, it's probably your program's fault.
==306850== 2. The instruction is legitimate but Valgrind doesn't handle it,
==306850==    i.e. it's Valgrind's fault. If you think this is the case or
==306850==    you are not sure, please let us know and we'll try to fix it.
==306850== Either way, Valgrind will now raise a SIGILL signal which will
==306850== probably kill your program.

Program received signal SIGILL: Illegal instruction.

Program received signal SIGILL: Illegal instruction.

Backtrace for this error:

Backtrace for this error:
#0 0x66ce3af in ???
#0 0x66ce3af in ???
#1 0x6ddf581 in ???
#2 0x6e01a78 in ???
#3 0x6de3e39 in ???
#1 0x6ddf581 in ???
#4 0x552ed60 in ???
#5 0x555f9ed in ???
#6 0x52abbb7 in ???
#2 0x6e01a78 in ???
#7 0x400cdd in ???
#8 0x400e8c in ???
#9 0x66ba504 in ???
#10 0x400c18 in ???
#3 0x6de3e39 in ???
#11 0xffffffffffffffff in ???
#4 0x552ed60 in ???
#5 0x555f9ed in ???
#6 0x52abbb7 in ???
#7 0x400cdd in ???
#8 0x400e8c in ???
==306851==
==306851== Process terminating with default action of signal 4 (SIGILL)
#9 0x66ba504 in ???
==306851==    at 0x648B4BB: raise (in /usr/lib64/libpthread-2.17.so)
==306851==    by 0x66CE3AF: ??? (in /usr/lib64/libc-2.17.so)
==306851==    by 0x6DDF580: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x6E01A78: mca_base_var_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x6DE3E39: opal_init_util (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x552ED60: ompi_mpi_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306851==    by 0x555F9ED: PMPI_Init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306851==    by 0x52ABBB7: PMPI_INIT (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi_mpifh.so.40.11.2)
==306851==    by 0x400CDD: MAIN__ (in /HA/sources/begou/TESTS/reduce)
==306851==    by 0x400E8C: main (in /HA/sources/begou/TESTS/reduce)
#10 0x400c18 in ???
#11 0xffffffffffffffff in ???
==306850==
==306850== Process terminating with default action of signal 4 (SIGILL)
==306850==    at 0x648B4BB: raise (in /usr/lib64/libpthread-2.17.so)
==306850==    by 0x66CE3AF: ??? (in /usr/lib64/libc-2.17.so)
==306850==    by 0x6DDF580: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x6E01A78: mca_base_var_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x6DE3E39: opal_init_util (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x552ED60: ompi_mpi_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306850==    by 0x555F9ED: PMPI_Init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306850==    by 0x52ABBB7: PMPI_INIT (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi_mpifh.so.40.11.2)
==306850==    by 0x400CDD: MAIN__ (in /HA/sources/begou/TESTS/reduce)
==306850==    by 0x400E8C: main (in /HA/sources/begou/TESTS/reduce)
==306851==
==306851== HEAP SUMMARY:
==306851==     in use at exit: 8,830 bytes in 65 blocks
==306851==   total heap usage: 123 allocs, 58 frees, 90,778 bytes allocated
==306851==
==306850==
==306850== HEAP SUMMARY:
==306850==     in use at exit: 8,830 bytes in 65 blocks
==306850==   total heap usage: 123 allocs, 58 frees, 90,778 bytes allocated
==306850==
==306851== LEAK SUMMARY:
==306851==    definitely lost: 0 bytes in 0 blocks
==306851==    indirectly lost: 0 bytes in 0 blocks
==306851==      possibly lost: 0 bytes in 0 blocks
==306851==    still reachable: 8,830 bytes in 65 blocks
==306851==         suppressed: 0 bytes in 0 blocks
==306851== Rerun with --leak-check=full to see details of leaked memory
==306851==
==306851== For lists of detected and suppressed errors, rerun with: -s
==306851== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==306850== LEAK SUMMARY:
==306850==    definitely lost: 0 bytes in 0 blocks
==306850==    indirectly lost: 0 bytes in 0 blocks
==306850==      possibly lost: 0 bytes in 0 blocks
==306850==    still reachable: 8,830 bytes in 65 blocks
==306850==         suppressed: 0 bytes in 0 blocks
==306850== Rerun with --leak-check=full to see details of leaked memory
==306850==
==306850== For lists of detected and suppressed errors, rerun with: -s
==306850== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node grivola exited on
signal 4 (Illegal instruction).

Thanks

Patrick |
|
From: Paul F. <pj...@wa...> - 2020-05-26 07:21:15
|
Hi

I'm running DHAT on what I consider to be a relatively small example.
Standalone, the executable runs in a bit under 10 minutes. Based on the CPU
time that we print after every 10% of progress, under DHAT the same
executable is going to take about 422 hours, about two and a half weeks.

Does anyone have any ideas what could be causing it to be so slow? Indeed, is
this the sort of slowdown that I should be expecting with DHAT?

The executable is intensive in both memory and floating point. Probably not
helping matters, the data structures that I want to look at are over 1kB in
size, so I tweaked HISTOGRAM_SIZE_LIMIT to bump it up to 2kB.

On the DHAT side, I have thought of trying to use some macro hackery to
inline the AVL comparator function calls. Otherwise I don't have much in the
way of other ideas, and DHAT doesn't have any command-line options to tweak
things.

A+
Paul |
|
From: John R. <jr...@bi...> - 2020-05-22 14:10:15
|
> What would help a lot is to have a VALGRIND request, like
> VALGRIND_DO_CLIENT_REQUEST_STMT, that we could use in our signal
> handler to turn off leak checking.
In valgrind-3.16 (to be released in 3 days; RC2 available today) see
valgrind --help-dyn-options
|
|
From: Philippe W. <phi...@sk...> - 2020-05-22 13:47:06
|
On Fri, 2020-05-22 at 15:22 +0300, Michael Widenius wrote:
> Hi!
> I have searched documentation, internet and header files like
> memcheck.h, but not found a solution:
>
> When running the MariaDB test suite under valgrind, we sometimes may
> get a core dump. In this case, the leaked memory report can be very
> long and will be totally useless.
>
> What would help a lot is to have a VALGRIND request, like
> VALGRIND_DO_CLIENT_REQUEST_STMT, that we could use in our signal
> handler to turn off leak checking.
>
> Is that possible and if not, is that something that could get
> implemented in the future?
> Is this something that anyone else has ever requested ?
>
> Regards,
> Monty
Hello,
The next version of valgrind is almost ready (Release Candidate was produced
a few days ago).
This release contains a feature to dynamically change many options.
You can obtain the list of dynamically changeable options doing:
valgrind --help-dyn-options
For memcheck, this gives the below help.
Based on this, you should be able to obtain what you need.
Hope this helps
Philippe
valgrind --help-dyn-options
Some command line settings are "dynamic", meaning they can be changed
while Valgrind is running, like this:
From the shell, using vgdb. Example:
$ vgdb "v.clo --trace-children=yes --child-silent-after-fork=no"
From a gdb attached to the valgrind gdbserver. Example:
(gdb) monitor v.clo --trace-children=yes --child-silent-after-fork=no
From your program, using a client request. Example:
#include <valgrind/valgrind.h>
VALGRIND_CLO_CHANGE("--trace-children=yes");
VALGRIND_CLO_CHANGE("--child-silent-after-fork=no");
dynamically changeable options:
-v --verbose -q --quiet -d --stats --vgdb=no --vgdb=yes --vgdb=full
--vgdb-poll --vgdb-error --vgdb-stop-at --error-markers --show-error-list -s
--show-below-main --time-stamp --trace-children --child-silent-after-fork
--trace-sched --trace-signals --trace-symtab --trace-cfi --debug-dump=syms
--debug-dump=line --debug-dump=frames --trace-redir --trace-syscalls
--sym-offsets --progress-interval --merge-recursive-frames
--vex-iropt-verbosity --suppressions --trace-flags --trace-notbelow
--trace-notabove --profile-flags --gen-suppressions=no
--gen-suppressions=yes --gen-suppressions=all --errors-for-leak-kinds
--show-leak-kinds --leak-check-heuristics --show-reachable
--show-possibly-lost --freelist-vol --freelist-big-blocks --leak-check=no
--leak-check=summary --leak-check=yes --leak-check=full --ignore-ranges
--ignore-range-below-sp --show-mismatched-frees
valgrind: Use --help for more information.
|
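Editor's note: applied to the original question, the client request from the help text above could be used along these lines. This is only a sketch: the handler and function names are hypothetical, the choice of signals is an assumption, and I have not verified how a crashing MariaDB test interacts with it. The fallback macro makes the code compile as a no-op where the Valgrind 3.16 headers are not installed.

```c
#include <signal.h>

#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef VALGRIND_CLO_CHANGE
/* Fallback when the headers are missing or older than 3.16: do nothing. */
#  define VALGRIND_CLO_CHANGE(option) ((void)0)
#endif

/* Hypothetical crash handler: ask Valgrind to skip the leak check, then
   restore the default action and re-raise so a core dump is still produced. */
void crash_handler(int sig)
{
    VALGRIND_CLO_CHANGE("--leak-check=no");
    signal(sig, SIG_DFL);
    raise(sig);
}

/* Install the handler for a few signals that usually indicate a crash.
   Returns 0 on success, -1 if any handler could not be installed. */
int install_crash_handlers(void)
{
    static const int sigs[] = { SIGSEGV, SIGABRT, SIGFPE, SIGILL };
    for (unsigned i = 0; i < sizeof sigs / sizeof sigs[0]; i++) {
        if (signal(sigs[i], crash_handler) == SIG_ERR)
            return -1;
    }
    return 0;
}
```

`--leak-check=no` appears in the list of dynamically changeable options above, which is why it is used here. Outside Valgrind the client request does nothing, so the handler can stay in production builds.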
|
From: Derrick M. <der...@gm...> - 2020-05-22 13:21:56
|
I am not too familiar with memcheck, but there are client requests to
enable/disable checks within a memory range using
VG_USERREQ__ENABLE_ADDR_ERROR_REPORTING_IN_RANGE and
VG_USERREQ__DISABLE_ADDR_ERROR_REPORTING_IN_RANGE.

On Fri, May 22, 2020 at 8:24 AM Michael Widenius <mic...@gm...> wrote:
>
> Hi!
> I have searched documentation, internet and header files like
> memcheck.h, but not found a solution:
>
> When running the MariaDB test suite under valgrind, we sometimes may
> get a core dump. In this case, the leaked memory report can be very
> long and will be totally useless.
>
> What would help a lot is to have a VALGRIND request, like
> VALGRIND_DO_CLIENT_REQUEST_STMT, that we could use in our signal
> handler to turn off leak checking.
>
> Is that possible and if not, is that something that could get
> implemented in the future?
> Is this something that anyone else has ever requested ?
>
> Regards,
> Monty

-- 
Derrick McKee
Phone: (703) 957-9362
Email: der...@gm... |
|
From: Michael W. <mic...@gm...> - 2020-05-22 12:22:49
|
Hi!
I have searched documentation, internet and header files like
memcheck.h, but not found a solution:

When running the MariaDB test suite under valgrind, we sometimes may
get a core dump. In this case, the leaked memory report can be very
long and will be totally useless.

What would help a lot is to have a VALGRIND request, like
VALGRIND_DO_CLIENT_REQUEST_STMT, that we could use in our signal
handler to turn off leak checking.

Is that possible and if not, is that something that could get
implemented in the future?
Is this something that anyone else has ever requested ?

Regards,
Monty |
|
From: James R. <jam...@gm...> - 2020-05-20 16:31:13
|
On Wed, May 20, 2020 at 5:04 PM Tom Hughes <to...@co...> wrote:

> On 20/05/2020 17:01, James Read wrote:
>> On Wed, May 20, 2020 at 2:31 PM Tom Hughes <to...@co...> wrote:
>>
>>> On 20/05/2020 14:23, James Read wrote:
>>>
>>>> I'm trying to use valgrind to track down a memory leak in my web
>>>> crawling application. The problem is my application runs just fine
>>>> without valgrind but when I run it under valgrind the program crashes
>>>> before it has a chance to crawl any websites. Any ideas why this
>>>> behaviour could happen?
>>>
>>> On the basis of the information supplied I'd say it was
>>> caused by excess neutron flux in the discombobulator.
>>>
>>> Seriously, if you want anybody to actually try and answer
>>> your question then you'll have to provide some actual
>>> information like, what exactly it says...
>>
>> A typical run of my program gives the following output:
>>
>> Redis server: :0
>> Mongo server: 127.0.0.1:27017
>> URL file: links/links-2
>> Max connections: 1000
>> Selected JUST CRAWLER MODE
>>
>> Parsed sites: 132 ^C
>> Crawler thread exiting.
>> Exiting.
>>
>> But with valgrind ./crawler -c I get the following output:
>>
>> ==415433== Memcheck, a memory error detector
>> ==415433== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
>> ==415433== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
>> ==415433== Command: ./crawler -c
>> ==415433==
>> Redis server: :0
>> Mongo server: 127.0.0.1:27017
>> URL file: links/links-2
>> Max connections: 1000
>> Selected JUST CRAWLER MODE
>> ==415433== Warning: ignored attempt to set SIGKILL handler in sigaction();
>> ==415433==          the SIGKILL signal is uncatchable
>> setrlimit() failed
>> ==415433==
>> ==415433== HEAP SUMMARY:
>> ==415433==     in use at exit: 37,773 bytes in 92 blocks
>> ==415433==   total heap usage: 6,112 allocs, 6,020 frees, 460,106 bytes allocated
>> ==415433==
>> ==415433== LEAK SUMMARY:
>> ==415433==    definitely lost: 0 bytes in 0 blocks
>> ==415433==    indirectly lost: 0 bytes in 0 blocks
>> ==415433==      possibly lost: 0 bytes in 0 blocks
>> ==415433==    still reachable: 37,773 bytes in 92 blocks
>> ==415433==         suppressed: 0 bytes in 0 blocks
>> ==415433== Rerun with --leak-check=full to see details of leaked memory
>> ==415433==
>> ==415433== For lists of detected and suppressed errors, rerun with: -s
>> ==415433== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
>>
>> As you can see, there is no "Parsed sites:" message; it just crashes and burns.
>
> I don't see any crash there, just a program that has chosen to exit.
>
> Does your code exit when setrlimit fails? What sort of limit is it
> trying to set? My guess is that it's trying to play with RLIMIT_NOFILE
> in a way that would encroach on valgrind's reserved descriptors so
> valgrind is refusing the request and your program is then choosing
> to exit rather than continue.

Good guess. Will comment out that code and try to run without.

James Read

> Tom
>
> --
> Tom Hughes (to...@co...)
> http://compton.nu/
> |
|
From: Tom H. <to...@co...> - 2020-05-20 16:05:12
|
On 20/05/2020 17:01, James Read wrote:
> On Wed, May 20, 2020 at 2:31 PM Tom Hughes <to...@co...> wrote:
>
>> On 20/05/2020 14:23, James Read wrote:
>>
>>> I'm trying to use valgrind to track down a memory leak in my web
>>> crawling application. The problem is my application runs just fine
>>> without valgrind but when I run it under valgrind the program crashes
>>> before it has a chance to crawl any websites. Any ideas why this
>>> behaviour could happen?
>>
>> On the basis of the information supplied I'd say it was
>> caused by excess neutron flux in the discombobulator.
>>
>> Seriously, if you want anybody to actually try and answer
>> your question then you'll have to provide some actual
>> information like, what exactly it says...
>
> A typical run of my program gives the following output:
>
> Redis server: :0
> Mongo server: 127.0.0.1:27017
> URL file: links/links-2
> Max connections: 1000
> Selected JUST CRAWLER MODE
>
> Parsed sites: 132 ^C
> Crawler thread exiting.
> Exiting.
>
> But with valgrind ./crawler -c I get the following output:
>
> ==415433== Memcheck, a memory error detector
> ==415433== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
> ==415433== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
> ==415433== Command: ./crawler -c
> ==415433==
> Redis server: :0
> Mongo server: 127.0.0.1:27017
> URL file: links/links-2
> Max connections: 1000
> Selected JUST CRAWLER MODE
> ==415433== Warning: ignored attempt to set SIGKILL handler in sigaction();
> ==415433==          the SIGKILL signal is uncatchable
> setrlimit() failed
> ==415433==
> ==415433== HEAP SUMMARY:
> ==415433==     in use at exit: 37,773 bytes in 92 blocks
> ==415433==   total heap usage: 6,112 allocs, 6,020 frees, 460,106 bytes allocated
> ==415433==
> ==415433== LEAK SUMMARY:
> ==415433==    definitely lost: 0 bytes in 0 blocks
> ==415433==    indirectly lost: 0 bytes in 0 blocks
> ==415433==      possibly lost: 0 bytes in 0 blocks
> ==415433==    still reachable: 37,773 bytes in 92 blocks
> ==415433==         suppressed: 0 bytes in 0 blocks
> ==415433== Rerun with --leak-check=full to see details of leaked memory
> ==415433==
> ==415433== For lists of detected and suppressed errors, rerun with: -s
> ==415433== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
>
> As you can see, there is no "Parsed sites:" message; it just crashes and burns.

I don't see any crash there, just a program that has chosen to exit.

Does your code exit when setrlimit fails? What sort of limit is it
trying to set? My guess is that it's trying to play with RLIMIT_NOFILE
in a way that would encroach on valgrind's reserved descriptors so
valgrind is refusing the request and your program is then choosing
to exit rather than continue.

Tom

-- 
Tom Hughes (to...@co...)
http://compton.nu/ |
|
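Editor's note: if Tom's guess about RLIMIT_NOFILE is right, one way for a program to cope is to treat a refused setrlimit() as non-fatal and carry on with whatever limit it already has. The following is a minimal sketch in C under that assumption; the function name is hypothetical and the crawler's actual code is not shown in the thread.

```c
#include <sys/resource.h>

/* Try to raise the soft RLIMIT_NOFILE to `wanted`, but never treat a refusal
   as fatal. Returns the soft limit in effect afterwards, or 0 if the limit
   cannot be queried at all. */
rlim_t raise_nofile_limit(rlim_t wanted)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return 0;
    if (lim.rlim_cur >= wanted)
        return lim.rlim_cur;           /* already large enough */

    struct rlimit req = lim;
    req.rlim_cur = wanted;
    if (lim.rlim_max != RLIM_INFINITY && req.rlim_cur > lim.rlim_max)
        req.rlim_cur = lim.rlim_max;   /* the soft limit cannot exceed the hard one */

    if (setrlimit(RLIMIT_NOFILE, &req) != 0)
        return lim.rlim_cur;           /* refused (for example under Valgrind): keep going */

    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return 0;
    return lim.rlim_cur;
}
```

The caller can then size its connection pool from the returned value instead of exiting, so a run under Valgrind proceeds with fewer connections rather than stopping at startup.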
From: James R. <jam...@gm...> - 2020-05-20 16:01:41
|
On Wed, May 20, 2020 at 2:31 PM Tom Hughes <to...@co...> wrote:

> On 20/05/2020 14:23, James Read wrote:
>
>> I'm trying to use valgrind to track down a memory leak in my web
>> crawling application. The problem is my application runs just fine
>> without valgrind but when I run it under valgrind the program crashes
>> before it has a chance to crawl any websites. Any ideas why this
>> behaviour could happen?
>
> On the basis of the information supplied I'd say it was
> caused by excess neutron flux in the discombobulator.
>
> Seriously, if you want anybody to actually try and answer
> your question then you'll have to provide some actual
> information like, what exactly it says...

A typical run of my program gives the following output:

Redis server: :0
Mongo server: 127.0.0.1:27017
URL file: links/links-2
Max connections: 1000
Selected JUST CRAWLER MODE

Parsed sites: 132 ^C
Crawler thread exiting.
Exiting.

But with valgrind ./crawler -c I get the following output:

==415433== Memcheck, a memory error detector
==415433== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==415433== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==415433== Command: ./crawler -c
==415433==
Redis server: :0
Mongo server: 127.0.0.1:27017
URL file: links/links-2
Max connections: 1000
Selected JUST CRAWLER MODE
==415433== Warning: ignored attempt to set SIGKILL handler in sigaction();
==415433==          the SIGKILL signal is uncatchable
setrlimit() failed
==415433==
==415433== HEAP SUMMARY:
==415433==     in use at exit: 37,773 bytes in 92 blocks
==415433==   total heap usage: 6,112 allocs, 6,020 frees, 460,106 bytes allocated
==415433==
==415433== LEAK SUMMARY:
==415433==    definitely lost: 0 bytes in 0 blocks
==415433==    indirectly lost: 0 bytes in 0 blocks
==415433==      possibly lost: 0 bytes in 0 blocks
==415433==    still reachable: 37,773 bytes in 92 blocks
==415433==         suppressed: 0 bytes in 0 blocks
==415433== Rerun with --leak-check=full to see details of leaked memory
==415433==
==415433== For lists of detected and suppressed errors, rerun with: -s
==415433== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

As you can see, there is no "Parsed sites:" message; it just crashes and burns.

James Read

> Tom
>
> --
> Tom Hughes (to...@co...)
> http://compton.nu/
> |
|
From: Tom H. <to...@co...> - 2020-05-20 13:32:08
|
On 20/05/2020 14:23, James Read wrote:

> I'm trying to use valgrind to track down a memory leak in my web
> crawling application. The problem is my application runs just fine
> without valgrind but when I run it under valgrind the program crashes
> before it has a chance to crawl any websites. Any ideas why this
> behaviour could happen?

On the basis of the information supplied I'd say it was
caused by excess neutron flux in the discombobulator.

Seriously, if you want anybody to actually try and answer
your question then you'll have to provide some actual
information like, what exactly it says...

Tom

-- 
Tom Hughes (to...@co...)
http://compton.nu/ |
|
From: James R. <jam...@gm...> - 2020-05-20 13:24:17
|
Hi,

I'm trying to use valgrind to track down a memory leak in my web
crawling application. The problem is my application runs just fine
without valgrind but when I run it under valgrind the program crashes
before it has a chance to crawl any websites. Any ideas why this
behaviour could happen?

thanks
James Read
|
From: Mark W. <ma...@kl...> - 2020-05-20 10:33:56
|
Hi,

On Tue, 2020-05-19 at 14:32 +0200, Julian Seward wrote:
> A first release candidate for 3.16.0 is available at
> https://sourceware.org/pub/valgrind/valgrind-3.16.0.RC2.tar.bz2
> (md5 = 21ac87434ed32bcfe5ea86a0978440ba)
>
> Please give it a try on platforms that are important for you. If no serious
> issues are reported, the 3.16.0 final release will happen on 25 May, that is,
> next Monday.

Looks good! In case people want binaries to test, I made Fedora Rawhide
test packages:
(aarch64, armv7hl, i686, ppc64le, s390x, x86_64)
https://bodhi.fedoraproject.org/updates/FEDORA-2020-31868cd970

And packages for stable Fedora and CentOS releases:
https://copr.fedorainfracloud.org/coprs/mjw/valgrind-3.16.0/
(Epel for CentOS 7, aarch64 and x86_64
 Epel for CentOS 8, aarch64 and x86_64
 Fedora 30, aarch64, i386 and x86_64
 Fedora 31, aarch64 and x86_64
 Fedora 32, aarch64 and x86_64)

Note to packagers who run make check and/or make regtest (which is
optional, but recommended to check the sanity of the binaries), this now
validates the docbookx xml documentation, so you'll need to have xmllint
(from libxml2) and the docbook-dtds catalog installed.

Cheers,

Mark
|
From: Julian S. <js...@ac...> - 2020-05-19 12:32:41
|
Greetings.

A first release candidate for 3.16.0 is available at
https://sourceware.org/pub/valgrind/valgrind-3.16.0.RC2.tar.bz2
(md5 = 21ac87434ed32bcfe5ea86a0978440ba)

Please give it a try on platforms that are important for you. If no serious
issues are reported, the 3.16.0 final release will happen on 25 May, that is,
next Monday.

J
|
From: Patrick B. <Pat...@le...> - 2020-05-18 19:11:27
|
On 18/05/2020 at 19:48, Julian Seward wrote:
>
>> Program received signal SIGILL: Illegal instruction.
>>
>> Backtrace for this error:
>> vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5
>> 0x25 0xA8 0x18 0x0
>> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
>> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
>> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
>> ==377969== valgrind: Unrecognised instruction at address 0xabf9581.
>> ==377969== at 0xABF9581: opal_pointer_array_construct (in
>> /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
>
> It sounds like there's an instruction in libopen-pal.so.40.10.3 that
> Valgrind doesn't like. What CPU does the machine have?
>
> J

Hi Julian,

This machine is a fat node with 4 Intel Xeon Gold 6148 (20 cores each):

vendor_id : GenuineIntel
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz
stepping : 4
microcode : 0x2000064
cpu MHz : 2400.000
cache size : 28160 KB
physical id : 3
siblings : 40
core id : 26
cpu cores : 20
apicid : 245
initial apicid : 245
fpu : yes
fpu_exception : yes
cpuid level : 22
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 cdp_l3 invpcid_single intel_ppin intel_pt ssbd mba ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke md_clear spec_ctrl intel_stibp flush_l1d
bogomips : 4806.68
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
|
From: Tom H. <to...@co...> - 2020-05-18 18:36:52
|
On 18/05/2020 18:48, Julian Seward wrote:
>
>> Program received signal SIGILL: Illegal instruction.
>>
>> Backtrace for this error:
>> vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5
>> 0x25 0xA8 0x18 0x0
>> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
>> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
>> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
>> ==377969== valgrind: Unrecognised instruction at address 0xabf9581.
>> ==377969== at 0xABF9581: opal_pointer_array_construct (in
>> /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
>
> It sounds like there's an instruction in libopen-pal.so.40.10.3 that
> Valgrind doesn't like. What CPU does the machine have?

0x62 is an EVEX prefix from the AVX512 extensions, so isn't
supported yet.

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/
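
To illustrate the point about the prefix, here is a minimal C++ sketch that checks whether a byte sequence such as the one Valgrind reported starts with the EVEX escape byte. The helper name starts_with_evex and the constants are made up for this example; nothing here is part of Valgrind or VEX. It assumes 64-bit code, where 0x62 is not a valid legacy opcode.

```cpp
#include <cstddef>
#include <cstdint>

// In 64-bit mode the byte 0x62 introduces the four-byte EVEX prefix
// used by AVX-512 instructions.
constexpr std::uint8_t kEvexEscape = 0x62;

// Hypothetical helper: true if the instruction bytes begin with an EVEX
// prefix (escape byte plus three payload bytes).
bool starts_with_evex(const std::uint8_t* bytes, std::size_t len) {
    return bytes != nullptr && len >= 4 && bytes[0] == kEvexEscape;
}

// The bytes reported as "unhandled instruction bytes" in this thread.
const std::uint8_t kReported[] = {0x62, 0xF1, 0xFD, 0x08, 0x6F,
                                  0x05, 0x25, 0xA8, 0x18, 0x00};
```

Calling starts_with_evex(kReported, sizeof kReported) returns true for the reported bytes, which matches the diagnosis that the instruction is an AVX-512 one.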
|
From: Julian S. <js...@ac...> - 2020-05-18 17:48:40
|
> Program received signal SIGILL: Illegal instruction.
>
> Backtrace for this error:
> vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5
> 0x25 0xA8 0x18 0x0
> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
> ==377969== valgrind: Unrecognised instruction at address 0xabf9581.
> ==377969== at 0xABF9581: opal_pointer_array_construct (in
> /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)

It sounds like there's an instruction in libopen-pal.so.40.10.3 that
Valgrind doesn't like. What CPU does the machine have?

J
|
From: Patrick B. <Pat...@le...> - 2020-05-18 16:06:09
|
Hi,

I'm new to valgrind. My goal is to investigate a possible memory problem
in a large parallel MPI+OpenMP code.

I've cloned Valgrind from git and built it with GCC7.3 and fortran 3.1
for mpicc (my application is built with the same environment). I'm using
these 2 options:
--enable-only64bit --with-mpicc=$(which mpicc)

"mpirun -np 8 my_application" works on my fat node (just a few processes
for the test; I use nearly 60GB of RAM out of more than 1TB). It fails
after some tens of iterations.

"mpirun -np 8 valgrind /bin/hostname" works too. So Valgrind seems to
work with MPI 3.1 compiled with GCC7.3.

But "mpirun -np 8 valgrind ./my_application" immediately fails with:

Program received signal SIGILL: Illegal instruction.

Backtrace for this error:
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5 0x25 0xA8 0x18 0x0
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==377969== valgrind: Unrecognised instruction at address 0xabf9581.
==377969== at 0xABF9581: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==377969== by 0xAC1BA78: mca_base_var_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==377969== by 0xABFDE39: opal_init_util (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==377969== by 0x911AD60: ompi_mpi_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==377969== by 0x914BB34: PMPI_Init_thread (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==377969== by 0x8E97C1F: MPI_INIT_THREAD (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi_mpifh.so.40.11.2)
==377969== by 0x543066: __mpi_m_MOD_init_mpi (mpi_m.f90:140)
==377969== by 0x411447: __yales2_m_MOD_init_yales2_env (yales2_m.f90:511)
==377969== by 0x411595: __yales2_m_MOD_run_yales2 (yales2_m.f90:378)
==377969== by 0x40B9E0: MAIN__ (3D_cylinder.f90:20)
==377969== by 0x40B9E0: main (3D_cylinder.f90:8)
==377969== Your program just tried to execute an instruction that Valgrind
==377969== did not recognise. There are two possible reasons for this.
==377969== 1. Your program has a bug and erroneously jumped to a non-code
==377969== location. If you are running Memcheck and you just saw a
==377969== warning about a bad jump, it's probably your program's fault.
==377969== 2. The instruction is legitimate but Valgrind doesn't handle it,
==377969== i.e. it's Valgrind's fault. If you think this is the case or
==377969== you are not sure, please let us know and we'll try to fix it.
==377969== Either way, Valgrind will now raise a SIGILL signal which will
==377969== probably kill your program.

Maybe I've missed something? I'm using the master branch. The
VALGRIND_3_16_BRANCH branch, which I also tested, does not build:

make: *** No rule to make target 'exp-sgcheck.supp', needed by 'default.supp'. Stop.

Thanks for your help

Patrick
|
From: John R. <jr...@bi...> - 2020-05-15 18:35:17
|
> Below is the error coming when running valgrind , Do not know how to tackle it
> valgrind --tool=memcheck ./lte_tr069
> ==6660==
> ==6660== Warning: Can't execute setuid/setgid/setcap executable: ./lte_tr069
> ==6660== Possible workaround: remove --trace-children=yes, if in effect
> ==6660==
> valgrind: ./lte_tr069: Permission denied

READ THE Warning MESSAGE. It says *EXACTLY* what the problem is.
./lte_tr069 has one of the attributes setuid, or setgid, or setcap.
If you're clever enough to set such an attribute then you should be
clever enough to deal with the complaint from valgrind/memcheck.
|
From: Kunal C. <atk...@gm...> - 2020-05-15 11:08:56
|
Hi Team,

1. I have created a simple program, ./a.out, and added a check
   if (RUNNING_ON_VALGRIND), but the same error still occurs.
2. I am trying to run ./a.out on the ARM board, as it is compiled for it;
   Valgrind is also running on that board.

Thanks
Kunal

On Fri, May 15, 2020 at 2:51 PM Paul FLOYD <pj...@wa...> wrote:
>
> > Message of 15/05/20 10:22
> > From: "Kunal Chauhan"
> >
> > ==6660== Warning: Can't execute setuid/setgid/setcap executable: ./lte_tr069
> > ==6660== Possible workaround: remove --trace-children=yes, if in effect
> > ==6660==
> > valgrind: ./lte_tr069: Permission denied
>
> Hi
>
> The problem is that your application (or something that your application
> is executing) is trying to switch user, group or capabilities. This is
> likely to interfere with the workings of Valgrind. As an example of this,
> Valgrind often needs to open new log files. This may fail if Valgrind has
> switched user/group/caps.
>
> This is more likely to happen if you are running Valgrind on a wrapper
> script and using the --trace-children=yes. For instance you might be on a
> system where utilities like 'basename' and 'dirname' are security
> hardened. If they are used in your wrapper this could result in the
> problem you are seeing.
>
> Alternatively, if you are not using a wrapper, then the best workaround is
> to have some special Valgrind handling in your executable. See
> http://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.clientreq
>
> So you could write some code like
>
> if (!RUNNING_ON_VALGRIND)
> {
>     // your setcap calls
> }
>
> A+
> Paul
>
> _______________________________________________
> Valgrind-users mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-users

--
*Thanks with Regards!*
*Kunal Chauhan*
*Mob:09813614826*
*Mob:08860397903*
*E-mail:atk...@gm...*
|
From: Paul F. <pj...@wa...> - 2020-05-15 09:19:52
|
> Message of 15/05/20 10:22
> From: "Kunal Chauhan"
> ==6660== Warning: Can't execute setuid/setgid/setcap executable: ./lte_tr069
> ==6660== Possible workaround: remove --trace-children=yes, if in effect
> ==6660==
> valgrind: ./lte_tr069: Permission denied

Hi

The problem is that your application (or something that your application
is executing) is trying to switch user, group or capabilities. This is
likely to interfere with the workings of Valgrind. As an example of this,
Valgrind often needs to open new log files. This may fail if Valgrind has
switched user/group/caps.

This is more likely to happen if you are running Valgrind on a wrapper
script and using the --trace-children=yes. For instance you might be on a
system where utilities like 'basename' and 'dirname' are security
hardened. If they are used in your wrapper this could result in the
problem you are seeing.

Alternatively, if you are not using a wrapper, then the best workaround is
to have some special Valgrind handling in your executable. See
http://valgrind.org/docs/manual/manual-core-adv.html#manual-core-adv.clientreq

So you could write some code like

if (!RUNNING_ON_VALGRIND)
{
    // your setcap calls
}

A+
Paul
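
A slightly fuller C++ sketch of the RUNNING_ON_VALGRIND check suggested above. RUNNING_ON_VALGRIND is the client-request macro from valgrind/valgrind.h; everything else (change_privileges, maybe_change_privileges, the privileges_changed flag) is a hypothetical placeholder for the application's own code. The fallback definition when the header is missing is an assumption made only so that the sketch builds on machines without Valgrind's development files. Note that this can only help when the program changes privileges itself at run time; it does nothing if the executable file carries a setuid, setgid or setcap attribute.

```cpp
// Use the real macro when Valgrind's header is installed; otherwise fall
// back to "never under Valgrind" (an assumption for this sketch only).
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef RUNNING_ON_VALGRIND
#  define RUNNING_ON_VALGRIND 0
#endif

// Hypothetical placeholder for the application's real setuid/setcap calls.
static bool privileges_changed = false;
void change_privileges() { privileges_changed = true; }

// Skips the privilege change when the process runs under Valgrind.
// Returns true if the change was actually performed.
bool maybe_change_privileges() {
    if (!RUNNING_ON_VALGRIND) {
        change_privileges();
    }
    return privileges_changed;
}
```

Run natively, maybe_change_privileges() performs the change; run under Valgrind with the header available, it skips it, so Valgrind keeps the identity it started with.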
|
From: Kunal C. <atk...@gm...> - 2020-05-15 08:21:12
|
Hi Team,

Below is the error I get when running valgrind. I do not know how to
tackle it.

.............................
valgrind --tool=memcheck ./lte_tr069
==6660==
==6660== Warning: Can't execute setuid/setgid/setcap executable: ./lte_tr069
==6660== Possible workaround: remove --trace-children=yes, if in effect
==6660==
valgrind: ./lte_tr069: Permission denied

--
*Thanks with Regards!*
*Kunal Chauhan*
*Mob:09813614826*
*Mob:08860397903*
*E-mail:atk...@gm...*
|
From: John R. <jr...@bi...> - 2020-05-14 14:37:08
|
On 5/14/20, Dalon Work wrote:
> I have an application with two threads on linux which I was testing
> under valgrind. We have a strange memory behavior that is leading to
> an internal assert terminating the application, hence the
> investigation. After running the application, I have many, many
> errors, (invalid reads and writes), which all follow a similar
> pattern:
>
> Invalid read/write of size 4:
> at thread1_func()
> at thread1_func1()
> ...
The rule for dealing with output from memcheck is:
Fix the first complaint. Do not ignore it; FIX IT.
Many times the command-line option --track-origins=yes provides helpful clues.
> Address 0x##### is ##### bytes inside a block of size #### free'd
> free()
> myAllocator::~myAllocator()
> __run_exit_handlers
> exit
> terminate_handler_sp
> thread2_func()
> thread2_func1()
> ...
>
> The call stack from the free() stack is obviously from the second
> thread (NOT the one that started the error, oddly enough), while the
> invalid read is from the first thread.
Based on this evidence, the most likely scenario is thread2 decided to
(or was forced to) terminate, and started running exit handlers
for the whole process. Meanwhile, thread1 continues running (thread1
has received no news that it should stop) and references blocks that
have been free()d by the termination handlers invoked by thread2.
So the termination handler should first rendezvous with all threads
before beginning to shut down the process.
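
The rendezvous described above could look roughly like this in C++. It is only a sketch under assumptions: SharedState, run_with_rendezvous and the stop flag are invented names, not taken from the application in question, and the shared block is owned by a local object rather than freed from an exit handler. The point is the order of operations: signal the other thread, wait for it to finish, and only then free the memory it was using.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <thread>
#include <vector>

// Hypothetical stand-in for the block that the allocator frees at exit.
struct SharedState {
    std::vector<int> data = std::vector<int>(1024, 0);
};

// Runs one worker against the shared block for about run_ms milliseconds,
// then shuts down in the safe order. Returns the number of writes done.
std::size_t run_with_rendezvous(int run_ms) {
    auto state = std::make_unique<SharedState>();
    std::atomic<bool> stop{false};
    std::size_t writes = 0;

    // "thread1": keeps touching the shared block until told to stop.
    std::thread worker([&] {
        do {
            state->data[writes % state->data.size()] += 1;
            ++writes;
        } while (!stop.load(std::memory_order_acquire));
    });

    std::this_thread::sleep_for(std::chrono::milliseconds(run_ms));

    // "thread2" decides to terminate: rendezvous first, free afterwards.
    stop.store(true, std::memory_order_release);
    worker.join();   // after this the worker can no longer touch the block
    state.reset();   // safe: no other thread references the memory
    assert(writes > 0);
    return writes;
}
```

If state.reset() were moved before worker.join(), the worker could still be writing into freed memory, which is the kind of access memcheck reports as an invalid read or write inside a free'd block.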
|