|
From: Julian S. <js...@ac...> - 2020-05-19 12:32:41
|
Greetings.

A first release candidate for 3.16.0 is available at
https://sourceware.org/pub/valgrind/valgrind-3.16.0.RC2.tar.bz2
(md5 = 21ac87434ed32bcfe5ea86a0978440ba)

Please give it a try on platforms that are important for you. If no serious
issues are reported, the 3.16.0 final release will happen on 25 May, that is,
next Monday.

J |
|
From: Mark W. <ma...@kl...> - 2020-05-20 10:33:56
|
Hi,

On Tue, 2020-05-19 at 14:32 +0200, Julian Seward wrote:
> A first release candidate for 3.16.0 is available at
> https://sourceware.org/pub/valgrind/valgrind-3.16.0.RC2.tar.bz2
> (md5 = 21ac87434ed32bcfe5ea86a0978440ba)
>
> Please give it a try on platforms that are important for you. If no serious
> issues are reported, the 3.16.0 final release will happen on 25 May, that is,
> next Monday.

Looks good!

In case people want binaries to test, I made Fedora Rawhide test packages
(aarch64, armv7hl, i686, ppc64le, s390x, x86_64):
https://bodhi.fedoraproject.org/updates/FEDORA-2020-31868cd970

And packages for stable Fedora and CentOS releases:
https://copr.fedorainfracloud.org/coprs/mjw/valgrind-3.16.0/
(Epel for CentOS 7, aarch64 and x86_64
 Epel for CentOS 8, aarch64 and x86_64
 Fedora 30, aarch64, i386 and x86_64
 Fedora 31, aarch64 and x86_64
 Fedora 32, aarch64 and x86_64)

Note to packagers who run make check and/or make regtest (which is
optional, but recommended to check the sanity of the binaries): this
now validates the docbookx xml documentation, so you'll need to have
xmllint (from libxml2) and the docbook-dtds catalog installed.

Cheers,

Mark |
|
From: Patrick B. <Pat...@le...> - 2020-05-26 08:07:02
|
On 19/05/2020 at 14:32, Julian Seward wrote:
>
> Greetings.
>
> A first release candidate for 3.16.0 is available at
> https://sourceware.org/pub/valgrind/valgrind-3.16.0.RC2.tar.bz2
> (md5 = 21ac87434ed32bcfe5ea86a0978440ba)
>
> Please give it a try on platforms that are important for you. If no
> serious issues are reported, the 3.16.0 final release will happen on
> 25 May, that is, next Monday.
>
> J

Hi all,

valgrind-3.16.0.RC2 doesn't work for me (as previous version on this server).

My Fortran test program (error-free, I think) is as simple as:

PROGRAM reduce
    USE mpi
    IMPLICIT NONE

    INTEGER :: me, ncpus, ierr
    REAL :: buff, resu=0

    CALL MPI_INIT(ierr)
    CALL MPI_COMM_RANK(MPI_COMM_WORLD,me,ierr)
    CALL MPI_COMM_SIZE(MPI_COMM_WORLD,ncpus,ierr)

    buff=1
    CALL MPI_ALLREDUCE(buff,resu,1,MPI_REAL,MPI_SUM,MPI_COMM_WORLD,ierr)

    if (me == 0 ) WRITE(6,'(a,i0,2(a,f14.6))') 'On ',me,' I have ',buff,' and got ',resu

    CALL MPI_FINALIZE(ierr)
END PROGRAM reduce

Compilation with:

mpifort reduce.F90 -o reduce

mpifort --show
/opt/GCC73/bin/gfortran -I/opt/openmpi-GCC73/v3.1.x-20181010/include -pthread -I/opt/openmpi-GCC73/v3.1.x-20181010/lib -Wl,-rpath -Wl,/opt/openmpi-GCC73/v3.1.x-20181010/lib -Wl,--enable-new-dtags -L/opt/openmpi-GCC73/v3.1.x-20181010/lib -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi

OS is CentOS Linux release 7.7.1908 (Core)

Valgrind compiled with gcc7.3, configure options are:
./configure --enable-only64bit --with-mpicc=$(which mpicc) --prefix=/robin/data/begou/VALGRIND/valgrind-binaries

Hardware is:
Dell Poweredge R940
4 x Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz (launched Q3 2017)
1.5 TB of RAM

Compiler is gcc (GCC) 7.3.0 (January 2018)

OpenMPI is v3.1, from the git repo 2018/10/10 (because a patch was needed at this time), compiled with gcc7.3.0. Configure options are:
--prefix=/opt/openmpi-GCC73/v3.1.x-20181010
--enable-mpirun-prefix-by-default
--disable-dlopen
--enable-mca-no-build=openib
--without-verbs
--enable-mpi-cxx
--without-slurm
--enable-mpi-thread-multiple

Error is:

[begou@grivola TESTS]$valgrind --version
valgrind-3.16.0.RC2
[begou@grivola TESTS]$mpirun -np 2 valgrind ./reduce
==306850== Memcheck, a memory error detector
==306850== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==306850== Using Valgrind-3.16.0.RC2 and LibVEX; rerun with -h for copyright info
==306850== Command: ./reduce
==306850==
==306851== Memcheck, a memory error detector
==306851== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==306851== Using Valgrind-3.16.0.RC2 and LibVEX; rerun with -h for copyright info
==306851== Command: ./reduce
==306851==
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5 0x25 0xA8 0x18 0x0
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==306850== valgrind: Unrecognised instruction at address 0x6ddf581.
vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5 0x25 0xA8 0x18 0x0
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==306851== valgrind: Unrecognised instruction at address 0x6ddf581.
==306851==    at 0x6DDF581: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x6E01A78: mca_base_var_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x6DE3E39: opal_init_util (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x552ED60: ompi_mpi_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306851==    by 0x555F9ED: PMPI_Init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306851==    by 0x52ABBB7: PMPI_INIT (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi_mpifh.so.40.11.2)
==306851==    by 0x400CDD: MAIN__ (in /HA/sources/begou/TESTS/reduce)
==306851==    by 0x400E8C: main (in /HA/sources/begou/TESTS/reduce)
==306851== Your program just tried to execute an instruction that Valgrind
==306851== did not recognise. There are two possible reasons for this.
==306851== 1. Your program has a bug and erroneously jumped to a non-code
==306851==    location. If you are running Memcheck and you just saw a
==306851==    warning about a bad jump, it's probably your program's fault.
==306851== 2. The instruction is legitimate but Valgrind doesn't handle it,
==306851==    i.e. it's Valgrind's fault. If you think this is the case or
==306851==    you are not sure, please let us know and we'll try to fix it.
==306851== Either way, Valgrind will now raise a SIGILL signal which will
==306851== probably kill your program.
==306850==    at 0x6DDF581: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x6E01A78: mca_base_var_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x6DE3E39: opal_init_util (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x552ED60: ompi_mpi_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306850==    by 0x555F9ED: PMPI_Init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306850==    by 0x52ABBB7: PMPI_INIT (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi_mpifh.so.40.11.2)
==306850==    by 0x400CDD: MAIN__ (in /HA/sources/begou/TESTS/reduce)
==306850==    by 0x400E8C: main (in /HA/sources/begou/TESTS/reduce)
==306850== Your program just tried to execute an instruction that Valgrind
==306850== did not recognise. There are two possible reasons for this.
==306850== 1. Your program has a bug and erroneously jumped to a non-code
==306850==    location. If you are running Memcheck and you just saw a
==306850==    warning about a bad jump, it's probably your program's fault.
==306850== 2. The instruction is legitimate but Valgrind doesn't handle it,
==306850==    i.e. it's Valgrind's fault. If you think this is the case or
==306850==    you are not sure, please let us know and we'll try to fix it.
==306850== Either way, Valgrind will now raise a SIGILL signal which will
==306850== probably kill your program.

Program received signal SIGILL: Illegal instruction.

Program received signal SIGILL: Illegal instruction.

Backtrace for this error:

Backtrace for this error:
#0 0x66ce3af in ???
#0 0x66ce3af in ???
#1 0x6ddf581 in ???
#2 0x6e01a78 in ???
#3 0x6de3e39 in ???
#1 0x6ddf581 in ???
#4 0x552ed60 in ???
#5 0x555f9ed in ???
#6 0x52abbb7 in ???
#2 0x6e01a78 in ???
#7 0x400cdd in ???
#8 0x400e8c in ???
#9 0x66ba504 in ???
#10 0x400c18 in ???
#3 0x6de3e39 in ???
#11 0xffffffffffffffff in ???
#4 0x552ed60 in ???
#5 0x555f9ed in ???
#6 0x52abbb7 in ???
#7 0x400cdd in ???
#8 0x400e8c in ???
==306851==
==306851== Process terminating with default action of signal 4 (SIGILL)
#9 0x66ba504 in ???
==306851==    at 0x648B4BB: raise (in /usr/lib64/libpthread-2.17.so)
==306851==    by 0x66CE3AF: ??? (in /usr/lib64/libc-2.17.so)
==306851==    by 0x6DDF580: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x6E01A78: mca_base_var_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x6DE3E39: opal_init_util (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306851==    by 0x552ED60: ompi_mpi_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306851==    by 0x555F9ED: PMPI_Init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306851==    by 0x52ABBB7: PMPI_INIT (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi_mpifh.so.40.11.2)
==306851==    by 0x400CDD: MAIN__ (in /HA/sources/begou/TESTS/reduce)
==306851==    by 0x400E8C: main (in /HA/sources/begou/TESTS/reduce)
#10 0x400c18 in ???
#11 0xffffffffffffffff in ???
==306850==
==306850== Process terminating with default action of signal 4 (SIGILL)
==306850==    at 0x648B4BB: raise (in /usr/lib64/libpthread-2.17.so)
==306850==    by 0x66CE3AF: ??? (in /usr/lib64/libc-2.17.so)
==306850==    by 0x6DDF580: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x6E01A78: mca_base_var_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x6DE3E39: opal_init_util (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)
==306850==    by 0x552ED60: ompi_mpi_init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306850==    by 0x555F9ED: PMPI_Init (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi.so.40.10.3)
==306850==    by 0x52ABBB7: PMPI_INIT (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libmpi_mpifh.so.40.11.2)
==306850==    by 0x400CDD: MAIN__ (in /HA/sources/begou/TESTS/reduce)
==306850==    by 0x400E8C: main (in /HA/sources/begou/TESTS/reduce)
==306851==
==306851== HEAP SUMMARY:
==306851==    in use at exit: 8,830 bytes in 65 blocks
==306851==    total heap usage: 123 allocs, 58 frees, 90,778 bytes allocated
==306851==
==306850==
==306850== HEAP SUMMARY:
==306850==    in use at exit: 8,830 bytes in 65 blocks
==306850==    total heap usage: 123 allocs, 58 frees, 90,778 bytes allocated
==306850==
==306851== LEAK SUMMARY:
==306851==    definitely lost: 0 bytes in 0 blocks
==306851==    indirectly lost: 0 bytes in 0 blocks
==306851==    possibly lost: 0 bytes in 0 blocks
==306851==    still reachable: 8,830 bytes in 65 blocks
==306851==    suppressed: 0 bytes in 0 blocks
==306851== Rerun with --leak-check=full to see details of leaked memory
==306851==
==306851== For lists of detected and suppressed errors, rerun with: -s
==306851== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==306850== LEAK SUMMARY:
==306850==    definitely lost: 0 bytes in 0 blocks
==306850==    indirectly lost: 0 bytes in 0 blocks
==306850==    possibly lost: 0 bytes in 0 blocks
==306850==    still reachable: 8,830 bytes in 65 blocks
==306850==    suppressed: 0 bytes in 0 blocks
==306850== Rerun with --leak-check=full to see details of leaked memory
==306850==
==306850== For lists of detected and suppressed errors, rerun with: -s
==306850== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
--------------------------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code. Per user-direction, the job has been aborted.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 0 on node grivola exited on signal 4 (Illegal instruction).

Thanks

Patrick |
|
From: Tom H. <to...@co...> - 2020-05-26 08:12:57
|
On 26/05/2020 09:06, Patrick Bégou wrote:

> valgrind-3.16.0.RC2 doesn't work for me (as previous version on this
> server).

Are you saying that it fails on a binary that worked before?

> vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5
> 0x25 0xA8 0x18 0x0
> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
> ==306850== valgrind: Unrecognised instruction at address 0x6ddf581.
> vex amd64->IR: unhandled instruction bytes: 0x62 0xF1 0xFD 0x8 0x6F 0x5
> 0x25 0xA8 0x18 0x0
> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0

Because this is an instruction with an EVEX prefix that is not
supported by any version of valgrind ever so I don't see how this
binary can have worked with the previous version of valgrind.

I suspect that you have in fact recompiled the program with
a different compiler or different optimization settings since
the time when it worked?

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/ |
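To make the "EVEX prefix" remark concrete: in 64-bit mode a leading 0x62 byte introduces an EVEX prefix, and the next three bytes carry its fields. The sketch below pulls a few of those fields out of the bytes from the log (0xF1 0xFD 0x8). The bit positions are my reading of the EVEX layout in the Intel manuals, and `decode_evex` is a name invented for this example, so treat it as an illustration and not as how Valgrind decodes anything.

```shell
#!/bin/sh
# decode_evex P0 P1 P2
# Decode a few fields from the three payload bytes that follow the 0x62
# EVEX escape byte. Arguments are hex literals such as 0xF1.
# Hypothetical helper; bit positions are an assumption taken from the
# Intel manual's description of the EVEX prefix.
decode_evex() (
    p0=$(($1)); p1=$(($2)); p2=$(($3))
    mm=$((p0 & 0x3))          # opcode map: 1 = 0F, 2 = 0F38, 3 = 0F3A
    w=$(((p1 >> 7) & 0x1))    # EVEX.W (operand-size / element-width bit)
    pp=$((p1 & 0x3))          # implied legacy prefix: 0 none, 1 = 66, 2 = F3, 3 = F2
    ll=$(((p2 >> 5) & 0x3))   # vector length: 0 = 128, 1 = 256, 2 = 512 bits
    aaa=$((p2 & 0x7))         # opmask register number (0 = no masking)
    echo "map=$mm W=$w pp=$pp LL=$ll mask=k$aaa"
)

# Bytes 2..4 of the unhandled instruction reported in the log above
decode_evex 0xF1 0xFD 0x8
```

With the following opcode byte 0x6F, those fields (map 0F, 66 prefix, W=1, 128-bit length) appear to correspond to `vmovdqa64` with an xmm destination, that is, an AVX-512 encoding even though no 512-bit register is involved.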
|
From: Patrick B. <Pat...@le...> - 2020-05-26 08:26:26
|
Hi Tom,

I'm a new user of Valgrind. I needed it to check a large MPI code, so I downloaded version 3.15, but even though the hardware and software are 2 to 3 years old, valgrind doesn't work for me.
Neither gcc7, nor OpenMPI, nor my application (even the small test) used any specific options when they were built.

If this unsupported instruction is the problem (I do not know what an EVEX prefix is, sorry), how can I avoid it so that I can use valgrind?

I was just thinking that 3.16 could solve my problem...

Patrick |
|
From: Julian S. <js...@ac...> - 2020-05-26 08:38:41
|
You can't easily avoid this problem, because it occurs in a system library,
not in your own code:

==306851== valgrind: Unrecognised instruction at address 0x6ddf581.
==306851==    at 0x6DDF581: opal_pointer_array_construct (in /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3)

Possibly your least-worst option is to talk with the people who
built/installed OpenMPI on the machine, to see if you can get a build that
doesn't use AVX512 instructions.

J |
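One rough way to tell whether a given library build contains AVX-512 code is to scan its disassembly. The sketch below is a heuristic only: it counts lines that mention zmm registers, opmask registers, or a handful of move mnemonics that exist only in AVX-512, so it can miss other instructions. The function name `count_avx512_lines` is made up for this example, and `objdump` is assumed to come from GNU binutils.

```shell
#!/bin/sh
# count_avx512_lines
# Read objdump-style (AT&T syntax) disassembly on stdin and print how many
# lines look like AVX-512 code. Heuristic and not exhaustive: it only knows
# about %zmm registers, %k opmask registers and a few EVEX-only mnemonics.
count_avx512_lines() {
    grep -cE '%zmm[0-9]+|%k[0-7]|vmovdqa(32|64)|vmovdqu(8|16|32|64)' || true
}

# Intended use on the library named in the trace above:
#   objdump -d /opt/openmpi-GCC73/v3.1.x-20181010/lib/libopen-pal.so.40.10.3 \
#       | count_avx512_lines
```

A result of 0 does not prove the library is safe to run under Valgrind, but a non-zero count is a strong hint that it was compiled with AVX-512 enabled.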
|
From: Tom H. <to...@co...> - 2020-05-26 08:46:20
|
Sorry, I misunderstood what you meant by "as previous version" there.

I thought you meant the previous version worked but you actually
meant that it failed.

As Julian says there is no easy fix - you have a library installed
that has been compiled to assume certain instructions are available
that are not in fact available under valgrind at the moment.

Tom

> _______________________________________________
> Valgrind-users mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-users

--
Tom Hughes (to...@co...)
http://compton.nu/ |
|
From: Patrick B. <Pat...@le...> - 2020-05-26 09:07:54
|
Thanks all for these clarifications.
I deployed OpenMPI myself, so I have to build it again with AVX512 optimizations disabled. This level of optimization is used in all our CFD codes and libraries, as it improves overall code performance.

Patrick |
|
From: Tom H. <to...@co...> - 2020-05-26 09:23:37
|
That's correct as AVX512 is not currently supported in valgrind, so
you will need a version that doesn't use that for valgrind use.

Progress on adding AVX512 support is being tracked here:

https://bugs.kde.org/show_bug.cgi?id=383010

Tom

--
Tom Hughes (to...@co...)
http://compton.nu/ |
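As a sketch of what such a second, Valgrind-only OpenMPI build might look like: the function below only prints a configure command so it can be reviewed before running. The configure options are the ones quoted earlier in the thread; everything else is an assumption. The install prefix is a hypothetical location, `-O2` is a placeholder optimization level, and `-mno-avx512f` is the GCC option that, as far as I know, turns off AVX-512 code generation. The thread does not say where the AVX-512 code came from, so if the flags are set elsewhere (for example `-march=native` in the environment), they would need to be changed there instead.

```shell
#!/bin/sh
# valgrind_openmpi_configure PREFIX
# Print (do not run) a configure command for an OpenMPI build without
# AVX-512 code generation. Hypothetical helper: PREFIX, -O2 and the use of
# -mno-avx512f are assumptions, not something confirmed in the thread.
valgrind_openmpi_configure() {
    printf '%s\n' "./configure CFLAGS='-O2 -mno-avx512f' --prefix=$1 \
--enable-mpirun-prefix-by-default --disable-dlopen \
--enable-mca-no-build=openib --without-verbs --enable-mpi-cxx \
--without-slurm --enable-mpi-thread-multiple"
}

# Example with a made-up prefix that keeps the production build untouched
valgrind_openmpi_configure /opt/openmpi-GCC73/v3.1.x-noavx512
```

Installing under a separate prefix keeps the optimized build available for production runs, with the slower build used only when running under Valgrind.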