From: Tia N. <new...@gm...> - 2023-08-02 15:27:32
|
hi Mark,

Thanks for your reply! First, sorry, but I'm cutting and pasting your response from the list archive below, since I didn't receive an email with your reply.

As for your question about the tool I'm building: I'm building a C code tracing tool for educational purposes, using valgrind as the backend. The resulting tool will be similar to Python Tutor, but with some differences. We also have an assembly code tracing tool that uses valgrind as the backend, which we are close to completing. It doesn't work for ARM due to what I think is a vex bug in the ARM version (https://bugs.kde.org/show_bug.cgi?id=460951), but it works for x86 architectures.

The StackBlock and GlobalBlock structs, and the interface for getting them, are very useful for the tool I'm building. I really like the interfaces you were about to remove, but I'd like to add a TyEnt * entry to StackBlock and GlobalBlock to get detailed type information, so that I can interpret and print out the bytes of variables (and, in some cases, the memory they point to) in terms of each variable's specific C type. I believe TyEnt has all the information I would need, but perhaps I am mistaken. What are your thoughts?

Thanks,
Tia

Hi Tia,

On Tue, 2023-08-01 at 08:56 -0500, Tia Newhall wrote:
> I'm building a valgrind tool, and as part of its functionality I need to get (and ultimately print out) local and global variable values based on their C types. However, I do not see a way to get detailed type information for locals and globals via the tool public interface.
>
> The stack and global structs exported via VG_(di_get_global_blocks_from_dihandle) and VG_(di_get_stack_blocks_at_ip) give me back structs that include the variable name, the address, the total size in bytes, and whether it is an array or not, but this is not sufficient for my purposes. For example, if the variable is an array of 16 bytes I have no idea whether it is an array of char, short, int, unsigned int, int *, etc.; if it is a struct or union I have no idea what the field types are, their names, or their offsets and sizes from the base address; if it is an array of structs or unions I have no idea whether there is padding between elements; and with enums and typedefs I'd just be out of luck. Even for non-array base types, I don't know whether the value is signed or unsigned (e.g. if a 1-byte variable's value is 0xa1, is it -95 (char) or 161 (unsigned char)?), and a Word-sized value could be a pointer or not, in which case I would want to display the pointer's value in hex, but if it is a long long I'd want to display it as a signed decimal.
>
> Since the code in coregrind/m_debuginfo/ is parsing .debug to get the correct and detailed type information, offsets, sizes, field names, etc., I'd like to get that info from coregrind for globals and locals through the public tool interface: it looks like the TyEnt struct has what I need.

Good that you wrote about this, because I was just about to commit the proposed patch from https://bugs.kde.org/show_bug.cgi?id=472512 ("Remove Stack and Global Blocks from debuginfo handling"). The VG_(di_get_stack_blocks_at_ip) and VG_(di_get_global_blocks_from_dihandle) functions were only used by the exp-sgcheck tool. Since that tool was removed a couple of years back, this code hasn't been used or tested. Let's remove it to reduce the complexity of dealing with debuginfo. This code confused me till I realized it isn't actually used (and was last changed in 2008). So I think it is best to just remove it so it doesn't confuse others. But... now it seems you do want to use it.

> First, is there a tool interface to this detailed type information about variables that I am missing (like the info in TyEnt structs), and if so, can someone please point me to it?
>
> If not (and I think this is the case), I would have to add something new to the public tool interface to get the information I need, adding or modifying code in coregrind/m_debuginfo/ to do it. I can build my own custom version of valgrind with the functionality I need, but this is obviously not ideal for keeping up with new version releases.
>
> Is there developer interest in expanding the valgrind public tool interface to export the kind of detailed type information that I need for my tool? If so, I'd be happy to discuss with someone the best way to design and implement it, and to help work on its implementation.

I think you would have to create a new interface. Unless you believe the current one is still useful. What does your tool do precisely?

Cheers,

Mark |
From: Mark W. <ma...@kl...> - 2023-08-01 14:31:26
|
Hi Tia,

On Tue, 2023-08-01 at 08:56 -0500, Tia Newhall wrote:
> I'm building a valgrind tool, and as part of its functionality I need to get (and ultimately print out) local and global variable values based on their C types. However, I do not see a way to get detailed type information for locals and globals via the tool public interface.
>
> The stack and global structs exported via VG_(di_get_global_blocks_from_dihandle) and VG_(di_get_stack_blocks_at_ip) give me back structs that include the variable name, the address, the total size in bytes, and whether it is an array or not, but this is not sufficient for my purposes. For example, if the variable is an array of 16 bytes I have no idea whether it is an array of char, short, int, unsigned int, int *, etc.; if it is a struct or union I have no idea what the field types are, their names, or their offsets and sizes from the base address; if it is an array of structs or unions I have no idea whether there is padding between elements; and with enums and typedefs I'd just be out of luck. Even for non-array base types, I don't know whether the value is signed or unsigned (e.g. if a 1-byte variable's value is 0xa1, is it -95 (char) or 161 (unsigned char)?), and a Word-sized value could be a pointer or not, in which case I would want to display the pointer's value in hex, but if it is a long long I'd want to display it as a signed decimal.
>
> Since the code in coregrind/m_debuginfo/ is parsing .debug to get the correct and detailed type information, offsets, sizes, field names, etc., I'd like to get that info from coregrind for globals and locals through the public tool interface: it looks like the TyEnt struct has what I need.

Good that you wrote about this, because I was just about to commit the proposed patch from https://bugs.kde.org/show_bug.cgi?id=472512 ("Remove Stack and Global Blocks from debuginfo handling"). The VG_(di_get_stack_blocks_at_ip) and VG_(di_get_global_blocks_from_dihandle) functions were only used by the exp-sgcheck tool. Since that tool was removed a couple of years back, this code hasn't been used or tested. Let's remove it to reduce the complexity of dealing with debuginfo. This code confused me till I realized it isn't actually used (and was last changed in 2008). So I think it is best to just remove it so it doesn't confuse others. But... now it seems you do want to use it.

> First, is there a tool interface to this detailed type information about variables that I am missing (like the info in TyEnt structs), and if so, can someone please point me to it?
>
> If not (and I think this is the case), I would have to add something new to the public tool interface to get the information I need, adding or modifying code in coregrind/m_debuginfo/ to do it. I can build my own custom version of valgrind with the functionality I need, but this is obviously not ideal for keeping up with new version releases.
>
> Is there developer interest in expanding the valgrind public tool interface to export the kind of detailed type information that I need for my tool? If so, I'd be happy to discuss with someone the best way to design and implement it, and to help work on its implementation.

I think you would have to create a new interface. Unless you believe the current one is still useful. What does your tool do precisely?

Cheers,

Mark |
From: Tia N. <ne...@cs...> - 2023-08-01 14:15:00
|
hi,

I'm building a valgrind tool, and as part of its functionality I need to get (and ultimately print out) local and global variable values based on their C types. However, I do not see a way to get detailed type information for locals and globals via the tool public interface.

The stack and global structs exported via VG_(di_get_global_blocks_from_dihandle) and VG_(di_get_stack_blocks_at_ip) give me back structs that include the variable name, the address, the total size in bytes, and whether it is an array or not, but this is not sufficient for my purposes. For example, if the variable is an array of 16 bytes I have no idea whether it is an array of char, short, int, unsigned int, int *, etc.; if it is a struct or union I have no idea what the field types are, their names, or their offsets and sizes from the base address; if it is an array of structs or unions I have no idea whether there is padding between elements; and with enums and typedefs I'd just be out of luck. Even for non-array base types, I don't know whether the value is signed or unsigned (e.g. if a 1-byte variable's value is 0xa1, is it -95 (char) or 161 (unsigned char)?), and a Word-sized value could be a pointer or not, in which case I would want to display the pointer's value in hex, but if it is a long long I'd want to display it as a signed decimal.

Since the code in coregrind/m_debuginfo/ is parsing .debug to get the correct and detailed type information, offsets, sizes, field names, etc., I'd like to get that info from coregrind for globals and locals through the public tool interface: it looks like the TyEnt struct has what I need.

First, is there a tool interface to this detailed type information about variables that I am missing (like the info in TyEnt structs), and if so, can someone please point me to it?

If not (and I think this is the case), I would have to add something new to the public tool interface to get the information I need, adding or modifying code in coregrind/m_debuginfo/ to do it. I can build my own custom version of valgrind with the functionality I need, but this is obviously not ideal for keeping up with new version releases.

Is there developer interest in expanding the valgrind public tool interface to export the kind of detailed type information that I need for my tool? If so, I'd be happy to discuss with someone the best way to design and implement it, and to help work on its implementation.

Thanks,
Tia

-------------------------------
Tia Newhall
Professor, Computer Science Dept.
Swarthmore College
www.cs.swarthmore.edu/~newhall
pronouns: she/her |
From: Petr P. <pet...@da...> - 2023-07-25 19:55:29
|
On 17. Jul 23 15:05, Jojo R wrote:
> Hi,
>
> Sorry for the late reply; I have been pushing the progress of the valgrind RVV implementation 😄 We finished the first version and tested it with the full RVV intrinsics spec.
>
> For real projects and developers, we implemented the first usable, fully functional RVV valgrind with the dirty-call method, and we will experiment with and optimize the RVV implementation toward the ideal RVV design.
>
> Back to the RVV RFC: we are happy to share our thinking on the design; see the attachment for more details :)

This is a good summary. As mentioned in another part of the thread, I think that in the long run it will indeed be necessary to implement the approach described as "RVV to variable-length IR". I hope to help with making sure it can work for Arm SVE too.

I guess if initial experiments show that this option is hard and will take time to implement, then it could make sense in the short term for the RISC-V port to go with the "RVV to dirty helper" implementation.

Thanks,
Petr |
From: Nicholas N. <n.n...@gm...> - 2023-07-19 22:21:21
|
On Thu, 20 Jul 2023 at 00:50, John Reiser <jr...@bi...> wrote:
> RTFM. It's DOCUMENTED!! https://valgrind.org/info/platforms.html

John, please refrain from this kind of aggressive language. Stuart asked a reasonable question in good faith, and doesn't deserve a response with that tone.

Nick |
From: Stuart F. <smf...@nt...> - 2023-07-19 19:05:54
|
Thanks for all the replies. I use LFS/BLFS for my systems, so given the feedback I think I will build an additional system on my Ryzen and build with -march=x86-64, which from what I have understood will allow valgrind to work. Please correct me if I am wrong. |
From: John R. <jr...@bi...> - 2023-07-19 14:49:03
|
> I am trying to find which of my systems will run valgrind. I know it will not run on my AMD FX-8370 and AMD FX-4350 systems. Does anyone know if it should run on my AMD Ryzen 5 5600X (see failure below)?
>
> I have access to an Intel Core i7 laptop (Haswell); would I stand a better chance with that? I am reluctant to move my whole project to the laptop if there is no chance of Valgrind working there too.
>
> ==5096== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
> ==5096== Command: QtWeather -s moira2
> ==5096==
> vex amd64->IR: unhandled instruction bytes: 0xC4 0xE2 0x7D 0xDC 0xC9 0x48 0x39 0xD1 0x73 0x37
> vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR:   VEX=1 VEX.L=1 VEX.nVVVV=0x0 ESC=0F38
> vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0
> ==5096== valgrind: Unrecognised instruction at address 0x5ee6282.
> ==5096==    at 0x5EE6282: aeshash256_ge32(long long __vector(4), unsigned char const*, unsigned long) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)

RTFM. It's DOCUMENTED!! https://valgrind.org/info/platforms.html

  AMD64/Linux: up to and including AVX2. This is the primary development target and tends to be well supported.

So: Intel Haswell: yes. If "grep aes /proc/cpuinfo" is not empty, then NO, unless you tell the compiler and distro-supplied libraries to avoid aes. Also search for 'aes' in "$ info gcc". |
From: Tom H. <to...@co...> - 2023-07-19 14:15:40
|
That depends how you define support. I use it on a Ryzen all the time, but not with code compiled to target all the AMD-specific extensions, which we do not currently have support for.

Tom

On 19/07/2023 12:39, Stuart Foster via Valgrind-users wrote:
> I am trying to find which of my systems will run valgrind. I know it will not run on my AMD FX-8370 and AMD FX-4350 systems. Does anyone know if it should run on my AMD Ryzen 5 5600X (see failure below)?
>
> I have access to an Intel Core i7 laptop (Haswell); would I stand a better chance with that? I am reluctant to move my whole project to the laptop if there is no chance of Valgrind working there too.
>
> ==5096== Memcheck, a memory error detector
> ==5096== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
> ==5096== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
> ==5096== Command: QtWeather -s moira2
> ==5096==
> vex amd64->IR: unhandled instruction bytes: 0xC4 0xE2 0x7D 0xDC 0xC9 0x48 0x39 0xD1 0x73 0x37
> vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR:   VEX=1 VEX.L=1 VEX.nVVVV=0x0 ESC=0F38
> vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0
> ==5096== valgrind: Unrecognised instruction at address 0x5ee6282.
> ==5096==    at 0x5EE6282: aeshash256_ge32(long long __vector(4), unsigned char const*, unsigned long) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x5FEBFD1: QFactoryLoaderPrivate::updateSinglePath(QString const&) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x5FE8403: QFactoryLoader::update() (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x5FE8906: QFactoryLoader::QFactoryLoader(char const*, QString const&, Qt::CaseSensitivity) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x54DC115: QPlatformIntegrationFactory::keys(QString const&) (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x54A1A36: init_platform(QString const&, QString const&, QString const&, int&, char**) (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x54A58DF: QGuiApplicationPrivate::createPlatformIntegration() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x54A6517: QGuiApplicationPrivate::createEventDispatcher() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x5F64804: QCoreApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x54A9979: QGuiApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x4C28708: QApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Widgets.so.6.4.0)
> ==5096==    by 0x124098: main (in /usr/bin/QtWeather)
> ==5096== Your program just tried to execute an instruction that Valgrind
> ==5096== did not recognise. There are two possible reasons for this.
> ==5096== 1. Your program has a bug and erroneously jumped to a non-code
> ==5096==    location. If you are running Memcheck and you just saw a
> ==5096==    warning about a bad jump, it's probably your program's fault.
> ==5096== 2. The instruction is legitimate but Valgrind doesn't handle it,
> ==5096==    i.e. it's Valgrind's fault. If you think this is the case or
> ==5096==    you are not sure, please let us know and we'll try to fix it.
> ==5096== Either way, Valgrind will now raise a SIGILL signal which will
> ==5096== probably kill your program.
> ==5096==
> ...
>
> Thanks
>
> _______________________________________________
> Valgrind-users mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-users

--
Tom Hughes (to...@co...) http://compton.nu/ |
From: Simon S. <sim...@gn...> - 2023-07-19 11:48:37
|
The issue is _very_ likely not about the processor:

==5096== valgrind: Unrecognised instruction at address 0x5ee6282.
==5096==    at 0x5EE6282: aeshash256_ge32(long long __vector(4), unsigned char const*, unsigned long) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)

This library uses an unknown instruction, and it will likely do so on other processors, too. The only solution is to use a Qt library that doesn't use this. Depending on how it is configured/built, you may be able to disable use of this aes function altogether; if not, you _may_ be able to specify via CXXFLAGS/CFLAGS not to optimize for a specific CPU.

Side note: all my projects work on valgrind if I only compile "normally"; as soon as I use -march/-mtune, GCC generates calls to faster but CPU-specific instructions that valgrind does not support yet.

Simon

Am 19.07.2023 um 13:39 schrieb Stuart Foster via Valgrind-users:
> I am trying to find which of my systems will run valgrind. I know it will not run on my AMD FX-8370 and AMD FX-4350 systems. Does anyone know if it should run on my AMD Ryzen 5 5600X (see failure below)?
>
> I have access to an Intel Core i7 laptop (Haswell); would I stand a better chance with that? I am reluctant to move my whole project to the laptop if there is no chance of Valgrind working there too.
>
> ==5096== Memcheck, a memory error detector
> ==5096== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
> ==5096== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
> ==5096== Command: QtWeather -s moira2
> ==5096==
> vex amd64->IR: unhandled instruction bytes: 0xC4 0xE2 0x7D 0xDC 0xC9 0x48 0x39 0xD1 0x73 0x37
> vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR:   VEX=1 VEX.L=1 VEX.nVVVV=0x0 ESC=0F38
> vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0
> ==5096== valgrind: Unrecognised instruction at address 0x5ee6282.
> ==5096==    at 0x5EE6282: aeshash256_ge32(long long __vector(4), unsigned char const*, unsigned long) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x5FEBFD1: QFactoryLoaderPrivate::updateSinglePath(QString const&) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x5FE8403: QFactoryLoader::update() (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x5FE8906: QFactoryLoader::QFactoryLoader(char const*, QString const&, Qt::CaseSensitivity) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x54DC115: QPlatformIntegrationFactory::keys(QString const&) (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x54A1A36: init_platform(QString const&, QString const&, QString const&, int&, char**) (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x54A58DF: QGuiApplicationPrivate::createPlatformIntegration() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x54A6517: QGuiApplicationPrivate::createEventDispatcher() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x5F64804: QCoreApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
> ==5096==    by 0x54A9979: QGuiApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
> ==5096==    by 0x4C28708: QApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Widgets.so.6.4.0)
> ==5096==    by 0x124098: main (in /usr/bin/QtWeather)
> ==5096== Your program just tried to execute an instruction that Valgrind
> ==5096== did not recognise. There are two possible reasons for this.
> ==5096== 1. Your program has a bug and erroneously jumped to a non-code
> ==5096==    location. If you are running Memcheck and you just saw a
> ==5096==    warning about a bad jump, it's probably your program's fault.
> ==5096== 2. The instruction is legitimate but Valgrind doesn't handle it,
> ==5096==    i.e. it's Valgrind's fault. If you think this is the case or
> ==5096==    you are not sure, please let us know and we'll try to fix it.
> ==5096== Either way, Valgrind will now raise a SIGILL signal which will
> ==5096== probably kill your program.
> ==5096==
> ...
>
> Thanks
>
> _______________________________________________
> Valgrind-users mailing list
> Val...@li...
> https://lists.sourceforge.net/lists/listinfo/valgrind-users |
From: Stuart F. <smf...@nt...> - 2023-07-19 11:39:43
|
I am trying to find which of my systems will run valgrind. I know it will not run on my AMD FX-8370 and AMD FX-4350 systems. Does anyone know if it should run on my AMD Ryzen 5 5600X (see failure below)?

I have access to an Intel Core i7 laptop (Haswell); would I stand a better chance with that? I am reluctant to move my whole project to the laptop if there is no chance of Valgrind working there too.

==5096== Memcheck, a memory error detector
==5096== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==5096== Using Valgrind-3.21.0 and LibVEX; rerun with -h for copyright info
==5096== Command: QtWeather -s moira2
==5096==
vex amd64->IR: unhandled instruction bytes: 0xC4 0xE2 0x7D 0xDC 0xC9 0x48 0x39 0xD1 0x73 0x37
vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR:   VEX=1 VEX.L=1 VEX.nVVVV=0x0 ESC=0F38
vex amd64->IR:   PFX.66=1 PFX.F2=0 PFX.F3=0
==5096== valgrind: Unrecognised instruction at address 0x5ee6282.
==5096==    at 0x5EE6282: aeshash256_ge32(long long __vector(4), unsigned char const*, unsigned long) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x5FEBFD1: QFactoryLoaderPrivate::updateSinglePath(QString const&) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x5FE8403: QFactoryLoader::update() (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x5FE8906: QFactoryLoader::QFactoryLoader(char const*, QString const&, Qt::CaseSensitivity) (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x54DC115: QPlatformIntegrationFactory::keys(QString const&) (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x54A1A36: init_platform(QString const&, QString const&, QString const&, int&, char**) (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x54A58DF: QGuiApplicationPrivate::createPlatformIntegration() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x54A6517: QGuiApplicationPrivate::createEventDispatcher() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x5F64804: QCoreApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Core.so.6.4.0)
==5096==    by 0x54A9979: QGuiApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Gui.so.6.4.0)
==5096==    by 0x4C28708: QApplicationPrivate::init() (in /opt/qt-6.4.0/lib/libQt6Widgets.so.6.4.0)
==5096==    by 0x124098: main (in /usr/bin/QtWeather)
==5096== Your program just tried to execute an instruction that Valgrind
==5096== did not recognise. There are two possible reasons for this.
==5096== 1. Your program has a bug and erroneously jumped to a non-code
==5096==    location. If you are running Memcheck and you just saw a
==5096==    warning about a bad jump, it's probably your program's fault.
==5096== 2. The instruction is legitimate but Valgrind doesn't handle it,
==5096==    i.e. it's Valgrind's fault. If you think this is the case or
==5096==    you are not sure, please let us know and we'll try to fix it.
==5096== Either way, Valgrind will now raise a SIGILL signal which will
==5096== probably kill your program.
==5096==
...

Thanks |
From: Wu, F. <fe...@in...> - 2023-07-19 01:25:15
|
On 7/19/2023 3:08 AM, Petr Pavlu wrote:
> On 11. Jul 23 19:28, Wu, Fei wrote:
>> On 7/11/2023 4:50 AM, Petr Pavlu wrote:
>>> On 6. Jul 23 20:39, Wu, Fei wrote:
>>>> [...]
>>>>
>>>> This approach will introduce a bunch of new vlen Vector IRs, especially the arithmetic IRs such as vadd; my goal is a good solution which takes reasonable time to reach usable status, yet is still able to evolve and is generic enough for other vector ISAs. Any comments?
>
> This personally looks to me like the right direction. Supporting scalable vector extensions in Valgrind as a first-class citizen would be my preferred choice. I think it is something that will be needed to handle Arm SVE and RISC-V RVV well. On the other hand, it is likely the most complex approach and could take time to iron out.
>
>>> Could you please share a repository with your changes or send them to me as patches? I have a few questions but I think it might be easier for me first to see the actual code.
>>
>> Please see the attachment. It's a very raw version to just verify the idea; mask is not added but is expected to be done as mentioned above. It's based on commit 71272b2529 on your branch; patch 0013 is the key.
>
> Thanks for sharing this code. The previous discussions and this series introduce a new concept of translating client code per some CPU state. That is something I spent most time thinking about.
>
> I can see it is indeed necessary for RVV. In particular, this "versioning" of translations allows the Valgrind IR to statically express the element type of each vector operation, i.e. that it is an operation on I32, F64, ... An alternative would be to try to express the type dynamically in the IR. That should still be somewhat manageable in the toIR frontend, but I have a hard time seeing how it would work for the instrumentation and codegen.
>
> The versioning should work well for RVV translations because my expectation is that most RVV loops will consist of a call to vsetvli (with a static vtype), followed by some actual vector operations. Such a block then requires only one translation.
>
> This is however true only if translations are versioned just per vtype, without vl. If I understood correctly, the patches version them per vl too, but it isn't clear to me conceptually if this is really necessary.

Yes, this series does version vl. It helps in situations such as the one in the last patch: it can break a large vl into multiple small-vl operations, in case the backend doesn't have a register allocation algorithm for LMUL>1.

> For instance, I think VAdd8 could look as follows: VAdd8(<len>, <in1>, <in2>, <flags?>) where <len> is something such as IRExpr_Get(OFFB_VL, Ity_I64).
>
> Another problem which I noticed is that blocks containing no RVV instructions are also versioned. Consider the following:
>   while (true) {
>     // (1) some RVV code which can set vtype to different values
>     // (2) a large chunk of non-RVV code
>   }
>
> The code in (2) will currently have multiple identical translations, one for each residue left in vtype by (1).

Yes, indeed. This is one place to optimize.

> In general, I think the concept of allowing translations per some CPU state could be useful in other cases and for other architectures too. For RISC-V, it could be beneficial for floating-point operations. My expectation is that regular RISC-V FP code will have instructions encoded with rm=DYN and always executed with frm=RNE. The current approach is that the toIR frontend generates IR which reads the rounding mode from frm and remaps it to Valgrind's representation. The codegen then does the opposite. The idea here is that the frontend would know the actual rounding mode and could create IR which has this mode directly, for instance, AddF64(Irrm_NEAREST, <in1>, <in2>). The codegen then doesn't need to know how to handle any dynamic rounding modes, as they become static.
>
> I plan to look further into this series. Specifically, I'd like to have a stab at adding some basic support for Arm SVE to get a better understanding of whether this is generic enough.

Great, I will add more RVV support if it proves to be the right direction, and thank you for the review.

Thanks,
Fei.

> Thanks,
> Petr |
From: Petr P. <pet...@da...> - 2023-07-18 19:26:03
|
On 11. Jul 23 19:28, Wu, Fei wrote: > On 7/11/2023 4:50 AM, Petr Pavlu wrote: > > On 6. Jul 23 20:39, Wu, Fei wrote: > >> [...] > >> > >> This approach will introduce a bunch of new vlen Vector IRs, especially > >> the arithmetic IRs such as vadd, my goal is for a good solution which > >> takes reasonable time to reach usable status, yet still be able to > >> evolve and generic enough for other vector ISA. Any comments? This personally looks to me like the right direction. Supporting scalable vector extensions in Valgrind as a first-class citizen would be my preferred choice. I think it is something that will be needed to handle Arm SVE and RISC-V RVV well. On the other hand, it is likely the most complex approach and could take time to iron out. > > Could you please share a repository with your changes or send them to me > > as patches? I have a few questions but I think it might be easier for me > > first to see the actual code. > > > Please see attachment. It's a very raw version to just verify the idea, > mask is not added but expected to be done as mentioned above, it's based > on commit 71272b2529 on your branch, patch 0013 is the key. Thanks for sharing this code. The previous discussions and this series introduce a new concept of translating client code per some CPU state. That is something I spent the most time thinking about. I can see it is indeed necessary for RVV. In particular, this "versioning" of translations allows the Valgrind IR to statically express the element type of each vector operation, i.e. that it is an operation on I32, F64, ... An alternative would be to try to express the type dynamically in IR. That should still be somewhat manageable in the toIR frontend but I have a hard time seeing how it would work for the instrumentation and codegen. The versioning should work well for RVV translations because my expectation is that most RVV loops will consist of a call to vsetvli (with a static vtype), followed by some actual vector operations. 
Such a block then requires only one translation. This is however true only if translations are versioned just per vtype, without vl. If I understood correctly, the patches version them per vl too but it isn't clear to me conceptually if this is really necessary. For instance, I think VAdd8 could look as follows: VAdd8(<len>, <in1>, <in2>, <flags?>) where <len> is something like IRExpr_Get(OFFB_VL, Ity_I64). Another problem which I noticed is that blocks containing no RVV instructions are also versioned. Consider the following: while (true) { // (1) some RVV code which can set vtype to different values // (2) a large chunk of non-RVV code } The code in (2) will currently have multiple identical translations for each residue left in vtype by (1). In general, I think the concept of allowing translations per some CPU state could be useful in other cases and for other architectures too. For RISC-V, it could be beneficial for floating-point operations. My expectation is that regular RISC-V FP code will have instructions with encoded rm=DYN and will always be executed with frm=RNE. The current approach is that the toIR frontend generates an IR which reads the rounding mode from frm and remaps it to Valgrind's representation. The codegen then does the opposite. The idea here is that the frontend would know the actual rounding mode and could create IR which has this mode directly, for instance, AddF64(Irrm_NEAREST, <in1>, <in2>). The codegen then doesn't need to know how to handle any dynamic rounding modes as they become static. I plan to look further into this series. Specifically, I'd like to have a stab at adding some basic support for Arm SVE to get a better understanding of whether this is generic enough. Thanks, Petr |
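The alternative sketched in the email above — passing the element count as a runtime operand read from the guest state, as VAdd8(<len>, ...) with <len> = IRExpr_Get(OFFB_VL, Ity_I64) — can be modelled with a plain-C toy. This is not VEX code; the struct layout and all names here are invented for illustration:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy model (not VEX code): VAdd8(<len>, <in1>, <in2>) where <len>
 * is fetched from the guest state at runtime, as IRExpr_Get(OFFB_VL,
 * Ity_I64) would do, so one translation can serve any vl.  The
 * layout below is invented for illustration. */
struct guest_state {
    uint64_t vl;      /* stands in for the field at OFFB_VL */
    uint8_t  v1[64];  /* toy vector registers */
    uint8_t  v2[64];
    uint8_t  v3[64];
};

static void vadd8_dynlen(struct guest_state *st)
{
    uint64_t len = st->vl;               /* the Get(OFFB_VL) operand */
    for (uint64_t i = 0; i < len; i++)   /* only vl elements change  */
        st->v3[i] = (uint8_t)(st->v1[i] + st->v2[i]);
}
```

Because the length is an operand rather than part of the translation key, the same code services every vl value, which is exactly why versioning on vl would become unnecessary under this scheme.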
From: Wu, F. <fe...@in...> - 2023-07-18 01:44:56
|
On 7/11/2023 7:28 PM, Wu, Fei wrote: > On 7/11/2023 4:50 AM, Petr Pavlu wrote: >> On 6. Jul 23 20:39, Wu, Fei wrote: >>> On 5/29/2023 11:29 AM, Wu, Fei wrote: >>>> On 5/28/2023 1:06 AM, Petr Pavlu wrote: >>>>> On 21. Apr 23 17:25, Jojo R wrote: >>>>>> We consider to add RVV/Vector [1] feature in valgrind, there are some >>>>>> challenges. >>>>>> RVV like ARM's SVE [2] programming model, it's scalable/VLA, that means the >>>>>> vector length is agnostic. >>>>>> ARM's SVE is not supported in valgrind :( >>>>>> >>>>>> There are three major issues in implementing RVV instruction set in Valgrind >>>>>> as following: >>>>>> >>>>>> 1. Scalable vector register width VLENB >>>>>> 2. Runtime changing property of LMUL and SEW >>>>>> 3. Lack of proper VEX IR to represent all vector operations >>>>>> >>>>>> We propose applicable methods to solve 1 and 2. As for 3, we explore several >>>>>> possible but maybe imperfect approaches to handle different cases. >>>>>> >>> I did a very basic prototype for vlen Vector-IR, particularly on RISC-V >>> Vector (RVV): >>> >>> * Define new iops such as Iop_VAdd8/16/32/64, the difference from >>> existing SIMD version is that no element number is specified like >>> Iop_Add8x32 >>> >>> * Define new IR type Ity_VLen along side existing types such as Ity_I64, >>> Ity_V256 >>> >>> * Define new class HRcVecVLen in HRegClass for vlen vector registers >>> The real length is embedded in both IROp and IRType for vlen ops/types, >>> it's runtime-decided and already known when handling insn such as vadd, >>> this leads to more flexibility, e.g. backend can issue extra vsetvl if >>> necessary. >>> >>> With the above, RVV instruction in the guest can be passed from >>> frontend, to memcheck, to the backend, and generate the final RVV insn >>> during host isel, a very basic testcase has been tested. >>> >>> Now here comes to the complexities: >>> >>> 1. RVV has the concept of LMUL, which groups multiple (or partial) >>> vector registers, e.g. 
when LMUL==2, v2 means the real v2+v3. This >>> complicates the register allocation. >>> >>> 2. RVV uses the "implicit" v0 for mask, its content must be loaded to >>> the exact "v0" register instead of any other ones if host isel wants to >>> leverage RVV insn, this implicitness in ISA requires more explicitness >>> in Valgrind implementation. >>> >>> For #1 LMUL, a new register allocation algorithm for it can be added, >>> and it will be great if someone is willing to try it, I'm not sure how >>> much effort it will take. The other way is splitting it into multiple >>> ops which only takes one vector register, taking vadd for example, 2 >>> vadd will run with LMUL=1 for one vadd with LMUL=2, this is still okay >>> for the widening insn, most of the arithmetic insns can be covered in >>> this way. The exception could be register gather insn vrgather, which we >>> can consult other ways for it, e.g. scalar or helper. >>> >>> For #2 v0 mask, one way is to handle the mask in the very beginning at >>> guest_riscv64_toIR.c, similar to what AVX port does: >>> >>> a) Read the whole dest register without mask >>> b) Generate unmasked result by running op without mask >>> c) Applying mask to a,b and generate the final dest >>> >>> by doing this, insn with mask is converted to non-mask ones, although >>> more insns are generated but the performance should be acceptable. There >>> are still exceptions, e.g. vadc (Add-with-Carry), v0 is not used as mask >>> but as carry, but just as mentioned above, it's okay to use other ways >>> for a few insns. Eventually, we can pass v0 mask down to the backend if >>> it's proved a better solution. >>> >>> This approach will introduce a bunch of new vlen Vector IRs, especially >>> the arithmetic IRs such as vadd, my goal is for a good solution which >>> takes reasonable time to reach usable status, yet still be able to >>> evolve and generic enough for other vector ISA. Any comments? 
>> >> Could you please share a repository with your changes or send them to me >> as patches? I have a few questions but I think it might be easier for me >> first to see the actual code. >> > Please see attachment. It's a very raw version to just verify the idea, > mask is not added but expected to be done as mentioned above, it's based > on commit 71272b2529 on your branch, patch 0013 is the key. > Hi Petr, Have you taken a look? Any comments? Thanks, Fei. > btw, I will setup a repository but it takes a few days to pass the > internal process. > > Thanks, > Fei. > >> Thanks, >> Petr |
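The a)/b)/c) mask lowering quoted in the thread above can be sketched in plain C as follows. This is illustrative only; the packed one-bit-per-element v0 layout and the function name are assumptions, not code from the patches:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative sketch of the unmask-then-merge lowering: a) read the
 * old dest, b) compute the unmasked result, c) select per v0 bit.
 * The packed one-bit-per-element mask layout is an assumption. */
static void vadd8_masked(uint8_t *dst, const uint8_t *a, const uint8_t *b,
                         const uint8_t *v0, size_t vl)
{
    for (size_t i = 0; i < vl; i++) {
        uint8_t old      = dst[i];                  /* a) old dest        */
        uint8_t unmasked = (uint8_t)(a[i] + b[i]);  /* b) unmasked result */
        int active = (v0[i / 8] >> (i % 8)) & 1;
        dst[i] = active ? unmasked : old;           /* c) apply the mask  */
    }
}
```

This turns a masked op into an unmasked op plus a per-element select, so the backend never needs a masked instruction at all — at the cost of a few extra operations per element.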
From: Pavankumar S V <pav...@gm...> - 2023-07-12 13:27:03
|
Hello, I’m working on a multithreaded embedded application running on a Linux platform. It has an infinite 'for' loop to keep the main thread alive. Each iteration of this loop takes a different amount of time to execute; in some iterations it takes too much time, and there are spikes in the execution time now and then. I’m trying to improve the performance (getting a consistent execution time) by figuring out the reason for the spikes in execution time. So, I decided to explore profilers to understand which functions are taking too much time to execute. I tried gprof, strace, perf, etc., but none of them gave me the expected profiling report. Question 1: My expectation from profilers: I want to see the time consumed by each (user-space) function of my application. Many of these functions invoke system calls, so I want to know the time consumed by each system call and who is invoking those time-consuming system calls. Is this possible with callgrind? I have followed these steps to generate profiling data from callgrind: 1. I limit the infinite 'for' loop to a few thousand iterations and return from the main() function so that the callgrind output gets generated. 2. I compiled the program with these compiler flags: -O0 -g -fno-inline-functions 3. I run my application with this command: valgrind --tool=callgrind -q --collect-systime=yes --trace-children=yes taskset 0x1 application_name 4. Around 150 callgrind.out.X files are generated, with different values for ‘X’. 5. I take the callgrind.out.X file with the lowest value of X, assuming that this has the profiling data of the main thread. (When I checked the other files, they did not have the main() function in their profiled data.) 6. I open the output file with kcachegrind: kcachegrind callgrind.out.X After checking, the points below made me doubt the correctness of the profiling data: · There is a function called inside the 'for' loop in my application which I know takes a lot of time (it issues ioctl() calls every time, and I confirmed with testing that it takes too much time). But the callgrind output file shows that it takes very little time to execute. · Also, I added test code (a ‘for’ loop that spins for a while every time it gets called and consumes a significant amount of time) to one function. I confirmed with gprof that this function (after adding the test code) consumes a lot of time. But according to callgrind, this function takes very little time. Question 2: Please let me know where I'm going wrong, or whether I should do anything more to get correct profiling data from callgrind. Question 3: Why are so many callgrind.out.X files generated? How do I identify which file is for the main() thread? How do I get only one output file generated, like gprof? Thank you Best Regards, Pavankumar S V |
From: Wu, F. <fe...@in...> - 2023-07-11 11:29:25
|
On 7/11/2023 4:50 AM, Petr Pavlu wrote: > On 6. Jul 23 20:39, Wu, Fei wrote: >> On 5/29/2023 11:29 AM, Wu, Fei wrote: >>> On 5/28/2023 1:06 AM, Petr Pavlu wrote: >>>> On 21. Apr 23 17:25, Jojo R wrote: >>>>> We consider to add RVV/Vector [1] feature in valgrind, there are some >>>>> challenges. >>>>> RVV like ARM's SVE [2] programming model, it's scalable/VLA, that means the >>>>> vector length is agnostic. >>>>> ARM's SVE is not supported in valgrind :( >>>>> >>>>> There are three major issues in implementing RVV instruction set in Valgrind >>>>> as following: >>>>> >>>>> 1. Scalable vector register width VLENB >>>>> 2. Runtime changing property of LMUL and SEW >>>>> 3. Lack of proper VEX IR to represent all vector operations >>>>> >>>>> We propose applicable methods to solve 1 and 2. As for 3, we explore several >>>>> possible but maybe imperfect approaches to handle different cases. >>>>> >> I did a very basic prototype for vlen Vector-IR, particularly on RISC-V >> Vector (RVV): >> >> * Define new iops such as Iop_VAdd8/16/32/64, the difference from >> existing SIMD version is that no element number is specified like >> Iop_Add8x32 >> >> * Define new IR type Ity_VLen along side existing types such as Ity_I64, >> Ity_V256 >> >> * Define new class HRcVecVLen in HRegClass for vlen vector registers >> The real length is embedded in both IROp and IRType for vlen ops/types, >> it's runtime-decided and already known when handling insn such as vadd, >> this leads to more flexibility, e.g. backend can issue extra vsetvl if >> necessary. >> >> With the above, RVV instruction in the guest can be passed from >> frontend, to memcheck, to the backend, and generate the final RVV insn >> during host isel, a very basic testcase has been tested. >> >> Now here comes to the complexities: >> >> 1. RVV has the concept of LMUL, which groups multiple (or partial) >> vector registers, e.g. when LMUL==2, v2 means the real v2+v3. This >> complicates the register allocation. >> >> 2. 
RVV uses the "implicit" v0 for mask, its content must be loaded to >> the exact "v0" register instead of any other ones if host isel wants to >> leverage RVV insn, this implicitness in ISA requires more explicitness >> in Valgrind implementation. >> >> For #1 LMUL, a new register allocation algorithm for it can be added, >> and it will be great if someone is willing to try it, I'm not sure how >> much effort it will take. The other way is splitting it into multiple >> ops which only takes one vector register, taking vadd for example, 2 >> vadd will run with LMUL=1 for one vadd with LMUL=2, this is still okay >> for the widening insn, most of the arithmetic insns can be covered in >> this way. The exception could be register gather insn vrgather, which we >> can consult other ways for it, e.g. scalar or helper. >> >> For #2 v0 mask, one way is to handle the mask in the very beginning at >> guest_riscv64_toIR.c, similar to what AVX port does: >> >> a) Read the whole dest register without mask >> b) Generate unmasked result by running op without mask >> c) Applying mask to a,b and generate the final dest >> >> by doing this, insn with mask is converted to non-mask ones, although >> more insns are generated but the performance should be acceptable. There >> are still exceptions, e.g. vadc (Add-with-Carry), v0 is not used as mask >> but as carry, but just as mentioned above, it's okay to use other ways >> for a few insns. Eventually, we can pass v0 mask down to the backend if >> it's proved a better solution. >> >> This approach will introduce a bunch of new vlen Vector IRs, especially >> the arithmetic IRs such as vadd, my goal is for a good solution which >> takes reasonable time to reach usable status, yet still be able to >> evolve and generic enough for other vector ISA. Any comments? > > Could you please share a repository with your changes or send them to me > as patches? I have a few questions but I think it might be easier for me > first to see the actual code. 
> Please see the attachment. It's a very raw version, just to verify the idea; masking is not added but is expected to be done as mentioned above. It's based on commit 71272b2529 on your branch; patch 0013 is the key. BTW, I will set up a repository, but it takes a few days to pass the internal process. Thanks, Fei. > Thanks, > Petr |
From: Petr P. <pet...@da...> - 2023-07-10 21:06:01
|
On 6. Jul 23 20:39, Wu, Fei wrote: > On 5/29/2023 11:29 AM, Wu, Fei wrote: > > On 5/28/2023 1:06 AM, Petr Pavlu wrote: > >> On 21. Apr 23 17:25, Jojo R wrote: > >>> We consider to add RVV/Vector [1] feature in valgrind, there are some > >>> challenges. > >>> RVV like ARM's SVE [2] programming model, it's scalable/VLA, that means the > >>> vector length is agnostic. > >>> ARM's SVE is not supported in valgrind :( > >>> > >>> There are three major issues in implementing RVV instruction set in Valgrind > >>> as following: > >>> > >>> 1. Scalable vector register width VLENB > >>> 2. Runtime changing property of LMUL and SEW > >>> 3. Lack of proper VEX IR to represent all vector operations > >>> > >>> We propose applicable methods to solve 1 and 2. As for 3, we explore several > >>> possible but maybe imperfect approaches to handle different cases. > >>> > I did a very basic prototype for vlen Vector-IR, particularly on RISC-V > Vector (RVV): > > * Define new iops such as Iop_VAdd8/16/32/64, the difference from > existing SIMD version is that no element number is specified like > Iop_Add8x32 > > * Define new IR type Ity_VLen along side existing types such as Ity_I64, > Ity_V256 > > * Define new class HRcVecVLen in HRegClass for vlen vector registers > The real length is embedded in both IROp and IRType for vlen ops/types, > it's runtime-decided and already known when handling insn such as vadd, > this leads to more flexibility, e.g. backend can issue extra vsetvl if > necessary. > > With the above, RVV instruction in the guest can be passed from > frontend, to memcheck, to the backend, and generate the final RVV insn > during host isel, a very basic testcase has been tested. > > Now here comes to the complexities: > > 1. RVV has the concept of LMUL, which groups multiple (or partial) > vector registers, e.g. when LMUL==2, v2 means the real v2+v3. This > complicates the register allocation. > > 2. 
RVV uses the "implicit" v0 for mask, its content must be loaded to > the exact "v0" register instead of any other ones if host isel wants to > leverage RVV insn, this implicitness in ISA requires more explicitness > in Valgrind implementation. > > For #1 LMUL, a new register allocation algorithm for it can be added, > and it will be great if someone is willing to try it, I'm not sure how > much effort it will take. The other way is splitting it into multiple > ops which only takes one vector register, taking vadd for example, 2 > vadd will run with LMUL=1 for one vadd with LMUL=2, this is still okay > for the widening insn, most of the arithmetic insns can be covered in > this way. The exception could be register gather insn vrgather, which we > can consult other ways for it, e.g. scalar or helper. > > For #2 v0 mask, one way is to handle the mask in the very beginning at > guest_riscv64_toIR.c, similar to what AVX port does: > > a) Read the whole dest register without mask > b) Generate unmasked result by running op without mask > c) Applying mask to a,b and generate the final dest > > by doing this, insn with mask is converted to non-mask ones, although > more insns are generated but the performance should be acceptable. There > are still exceptions, e.g. vadc (Add-with-Carry), v0 is not used as mask > but as carry, but just as mentioned above, it's okay to use other ways > for a few insns. Eventually, we can pass v0 mask down to the backend if > it's proved a better solution. > > This approach will introduce a bunch of new vlen Vector IRs, especially > the arithmetic IRs such as vadd, my goal is for a good solution which > takes reasonable time to reach usable status, yet still be able to > evolve and generic enough for other vector ISA. Any comments? Could you please share a repository with your changes or send them to me as patches? I have a few questions but I think it might be easier for me first to see the actual code. Thanks, Petr |
From: Wu, F. <fe...@in...> - 2023-07-06 12:40:15
|
On 5/29/2023 11:29 AM, Wu, Fei wrote: > On 5/28/2023 1:06 AM, Petr Pavlu wrote: >> On 21. Apr 23 17:25, Jojo R wrote: >>> We consider to add RVV/Vector [1] feature in valgrind, there are some >>> challenges. >>> RVV like ARM's SVE [2] programming model, it's scalable/VLA, that means the >>> vector length is agnostic. >>> ARM's SVE is not supported in valgrind :( >>> >>> There are three major issues in implementing RVV instruction set in Valgrind >>> as following: >>> >>> 1. Scalable vector register width VLENB >>> 2. Runtime changing property of LMUL and SEW >>> 3. Lack of proper VEX IR to represent all vector operations >>> >>> We propose applicable methods to solve 1 and 2. As for 3, we explore several >>> possible but maybe imperfect approaches to handle different cases. >>> I did a very basic prototype for vlen Vector-IR, particularly on RISC-V Vector (RVV): * Define new iops such as Iop_VAdd8/16/32/64; the difference from the existing SIMD versions is that no element count is specified, unlike Iop_Add8x32 * Define new IR type Ity_VLen alongside existing types such as Ity_I64, Ity_V256 * Define new class HRcVecVLen in HRegClass for vlen vector registers The real length is embedded in both IROp and IRType for vlen ops/types; it is decided at runtime and already known when handling an insn such as vadd. This leads to more flexibility, e.g. the backend can issue an extra vsetvl if necessary. With the above, an RVV instruction in the guest can be passed from the frontend, through memcheck, to the backend, which generates the final RVV insn during host isel; a very basic testcase has been tested. Now here come the complexities: 1. RVV has the concept of LMUL, which groups multiple (or partial) vector registers, e.g. when LMUL==2, v2 means the real v2+v3. This complicates the register allocation. 2. 
RVV uses the "implicit" v0 for the mask: its contents must be loaded into the exact "v0" register, not any other one, if host isel wants to leverage RVV insns. This implicitness in the ISA requires more explicitness in the Valgrind implementation. For #1 LMUL, a new register allocation algorithm for it can be added, and it will be great if someone is willing to try it; I'm not sure how much effort it will take. The other way is splitting it into multiple ops which each take only one vector register. Taking vadd for example, two vadds run with LMUL=1 in place of one vadd with LMUL=2. This still works for the widening insns, and most of the arithmetic insns can be covered in this way. The exception could be the register gather insn vrgather, for which we can resort to other approaches, e.g. scalar code or a helper. For #2, the v0 mask, one way is to handle the mask at the very beginning in guest_riscv64_toIR.c, similar to what the AVX port does: a) Read the whole dest register without the mask b) Generate the unmasked result by running the op without the mask c) Apply the mask to a) and b) and generate the final dest By doing this, insns with a mask are converted to non-masked ones; although more insns are generated, the performance should be acceptable. There are still exceptions, e.g. vadc (Add-with-Carry), where v0 is used not as a mask but as the carry, but just as mentioned above, it's okay to use other approaches for a few insns. Eventually, we can pass the v0 mask down to the backend if that proves to be a better solution. This approach will introduce a bunch of new vlen Vector IRs, especially arithmetic IRs such as vadd. My goal is a good solution which takes a reasonable time to reach usable status, yet is still able to evolve and is generic enough for other vector ISAs. Any comments? Best Regards, Fei. >>> We start from 1. As each guest register should be described in VEXGuestState >>> struct, the vector registers with scalable width of VLENB can be added into >>> VEXGuestState as arrays using an allowable maximum length like 2048/4096. 
>> >> Size of VexGuestRISCV64State is currently 592 bytes. Adding these large >> vector registers will bump it by 32*2048/8=8192 bytes. >> > Yes, that's the reason in my RFC patches the vlen is set to 128, that's > the largest room for vector in current design. > >> The baseblock layout in VEX is: the guest state, two equal sized areas >> for shadow state and then a spill area. The RISC-V port accesses the >> baseblock in generated code via x8/s0. The register is set to the >> address of the baseblock+2048 (file >> coregrind/m_dispatch/dispatch-riscv64-linux.S). The extra offset is >> a small optimization to utilize the fact that load/store instructions in >> RVI have a signed offset in range [-2048,2047]. The end result is that >> it is possible to access the baseblock data using only a single >> instruction. >> > Nice design. > >> Adding the new vector registers will cause that more instructions will >> be necessary. For instance, accessing any shadow guest state would >> naively require a sequence of LUI+ADDI+LOAD/STORE. >> >> I suspect this could affect performance quite a bit and might need some >> optimizing. >> > Yes, can we separate the vector registers from the other ones, is it > able to use two baseblocks? Or we can do some experiments to measure the > overhead. > >>> >>> The actual available access range can be determined at Valgrind startup time >>> by querying the CPU for its vector capability or some suitable setup steps. >> >> Something to consider is that the virtual CPU provided by Valgrind does >> not necessarily need to match the host CPU. For instance, VEX could >> hardcode that its vector registers are only 128 bits in size. >> >> I was originally hoping that this is how support for the V extension >> could be added, but the LMUL grouping looks to break this model. >> > Originally I had the same idea, but 128 vlen hardware cannot run the > software built for larger vlen, e.g. 
clang has option > -riscv-v-vector-bits-min, if it's set to 256, then it assumes the > underlying hardware has at least 256 vlen. > >>> >>> >>> To solve problem 2, we are inspired by already-proven techniques in QEMU, >>> where translation blocks are broken up when certain critical CSRs are set. >>> Because the guest code to IR translation relies on the precise value of >>> LMUL/SEW and they may change within a basic block, we can break up the basic >>> block each time encountering a vsetvl{i} instruction and return to the >>> scheduler to execute the translated code and update LMUL/SEW. Accordingly, >>> translation cache management should be refactored to detect the changing of >>> LMUL/SEW to invalidate outdated code cache. Without losing the generality, >>> the LMUL/SEW should be encoded into an ULong flag such that other >>> architectures can leverage this flag to store their arch-dependent >>> information. The TTentry struct should also take the flag into account no >>> matter insertion or deletion. By doing this, the flag carries the newest >>> LMUL/SEW throughout the simulation and can be passed to disassemble >>> functions using the VEXArchInfo struct such that we can get the real and >>> newest value of LMUL and SEW to facilitate our translation. >>> >>> Also, some architecture-related code should be taken care of. Like >>> m_dispatch part, disp_cp_xindir function looks up code cache using hardcoded >>> assembly by checking the requested guest state IP and translation cache >>> entry address with no more constraints. Many other modules should be checked >>> to ensure the in-time update of LMUL/SEW is instantly visible to essential >>> parts in Valgrind. >>> >>> >>> The last remaining big issue is 3, which we introduce some ad-hoc approaches >>> to deal with. We summarize these approaches into three types as following: >>> >>> 1. Break down a vector instruction to scalar VEX IR ops. >>> 2. Break down a vector instruction to fixed-length VEX IR ops. >>> 3. 
Use dirty helpers to realize vector instructions. >> >> I would also look at adding new VEX IR ops for scalable vector >> instructions. In particular, if it could be shown that RVV and SVE can >> use same new ops then it could make a good argument for adding them. >> >> Perhaps interesting is if such new scalable vector ops could also >> represent fixed operations on other architectures, but that is just me >> thinking out loud. >> > It's a good idea to consolidate all vector/simd together, the challenge > is to verify its feasibility and to speedup the adaption progress, as > it's supposed to take more efforts and longer time. Is there anyone with > knowledge or experience of other ISA such as avx/sve on valgrind can > share the pain and gain, or we can do some quick prototype? > > Thanks, > Fei. > >>> [...] >>> In summary, it is far to reach a truly applicable solution in adding vector >>> extensions in Valgrind. We need to do detailed and comprehensive estimations >>> on different vector instruction categories. >>> >>> Any feedback is welcome in github [3] also. >>> >>> >>> [1] https://github.com/riscv/riscv-v-spec >>> >>> [2] https://community.arm.com/arm-research/b/articles/posts/the-arm-scalable-vector-extension-sve >>> >>> [3] https://github.com/petrpavlu/valgrind-riscv64/issues/17 >> >> Sorry for not being more helpful at this point. As mentioned in the >> GitHub issue, I still need to get myself more familiar with RVV and how >> Valgrind handles vector instructions. >> >> Thanks, >> Petr >> >> >> >> _______________________________________________ >> Valgrind-developers mailing list >> Val...@li... >> https://lists.sourceforge.net/lists/listinfo/valgrind-developers > > > > _______________________________________________ > Valgrind-developers mailing list > Val...@li... > https://lists.sourceforge.net/lists/listinfo/valgrind-developers |
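The baseblock addressing optimization discussed in the thread above — the dispatcher sets x8 to baseblock+2048 so that a single RVI load/store, whose signed 12-bit immediate covers [-2048, 2047], can reach the guest state — can be checked with a small sketch. The constants mirror the description in the thread; the function itself is illustrative:

```c
#include <assert.h>

/* Sketch of the addressing trick described above: with
 * x8 = baseblock + 2048, an RVI load/store immediate in
 * [-2048, 2047] reaches baseblock offsets 0..4095 in a single
 * instruction; anything beyond needs an extra LUI+ADDI. */
enum { BIAS = 2048, IMM_MIN = -2048, IMM_MAX = 2047 };

static int one_insn_reachable(long baseblock_offset)
{
    long imm = baseblock_offset - BIAS; /* immediate relative to x8 */
    return imm >= IMM_MIN && imm <= IMM_MAX;
}
```

With 32 vector registers of 2048 bits each added to the guest state (32*2048/8 = 8192 bytes, per the size estimate in the thread), most shadow-state offsets fall outside this 4096-byte window, which is the performance concern raised above.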
From: Tom H. <to...@co...> - 2023-07-03 11:36:54
|
On 03/07/2023 10:42, Daniel Fishman wrote: > Thanks for the pointer. I just commented out the line for the mentioned syscall > number from valgrind's syscall table, and this workaround was enough to solve > the problem. Since the custom syscall doesn't modify its parameters and doesn't > seem to write anything in user space, it seems that writing a wrapper > for it is not > strictly necessary - or very useful for that matter, since in any case > it won't be > possible to submit a valgrind patch for the problem. Well pread will be reading user memory so the wrapper would be checking that the memory it was given was valid, and that the file descriptor argument is valid. Not doing that won't break anything of course, it just means you may not detect some problems in your program. > Beyond this problem, maybe it could be useful if upon encountering an impossible > problem (the one when valgrind writes: "valgrind: the 'impossible' happened"), > valgrind will send a user to read the file README_MISSING_SYSCALL_OR_IOCTL > in addition to telling him to read FAQ. Had I been aware of this file > before, I would have known how to solve the problem myself. Well sure, but the chances that a random SEGV in valgrind are caused by a syscall issue are probably less than 1% so doing that would mostly just be completely misleading. Tom -- Tom Hughes (to...@co...) http://compton.nu/ |
From: Daniel F. <qua...@gm...> - 2023-07-03 09:47:41
|
On Mon, Jul 3, 2023 at 12:31 AM John Reiser <jr...@bi...> wrote: > Please show the complete output of "uname -a". > In the linux git source code repository > url = git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git > there is no git tag "v3.10.0"; only "v3.10" and "v3.10.1", "v3.10.2", etc. Yes, this distribution is based on a modified kernel and not on the official one. The full name from uname is 3.10.0-1150.15.2cpx86_64 > After checking out the code via "git checkout v3.10.1", then "grep -sr setprocns ." > shows that the linux source code has no lines that contain "setprocns". > So there is an error in part of your claim. Indeed, it seems that this syscall is specific to this distribution - upon encountering the problem I went straight to the distribution's kernel source to check the syscall table and didn't realize that it may be different from the vanilla kernel. After comparing the distribution's syscall table to the original 3.10.x sources, I see that the distribution ported some of the syscalls from later kernels (the highest syscall on 3.10.x is 350, while the distribution has syscalls with higher numbers - which are not even consecutive), and added this one new custom syscall. This explains why valgrind 3.10.0 works on this distribution - it just doesn't implement wrappers for syscalls higher than 355, and therefore there is no collision with a custom syscall which hijacks a syscall number that in later kernel versions is used for a different syscall (preadv2). > 2. The valgrind syscall table is in coregrind/m_syswrap/syswrap-x86-linux.c > and syswrap-amd64-linux.c. Therefore, modify the source code of valgrind > to call utsname() during initialization, and alter the table > static SyscallTableEntry syscall_table[] = { ... }; > accordingly. Probably 'static' must be removed. Also, 'const' should > be removed if necessary. [Why isn't the table 'const' in the first place?] Thanks for the pointer. 
I just commented out the line for the mentioned syscall number from valgrind's syscall table, and this workaround was enough to solve the problem. Since the custom syscall doesn't modify its parameters and doesn't seem to write anything in user space, it seems that writing a wrapper for it is not strictly necessary - or very useful for that matter, since in any case it won't be possible to submit a valgrind patch for the problem. Beyond this problem, maybe it could be useful if upon encountering an impossible problem (the one when valgrind writes: "valgrind: the 'impossible' happened"), valgrind will send a user to read the file README_MISSING_SYSCALL_OR_IOCTL in addition to telling him to read FAQ. Had I been aware of this file before, I would have known how to solve the problem myself. |
From: John R. <jr...@bi...> - 2023-07-02 21:30:48
> On a machine that has an old linux kernel, when valgrind 3.21.0 runs an
> executable that contains a call to syscall 378 - valgrind fails after
> being killed by a fatal signal.
>
> The kernel on the machine is 3.10.0 x86_64 (the system is based on RedHat 5,
> I think), libc 2.17, and the executable itself is 32 bit.

Please show the complete output of "uname -a". In the linux git source code repository
    url = git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
there is no git tag "v3.10.0"; only "v3.10" and "v3.10.1", "v3.10.2", etc.

> On this particular kernel, syscall 378 happens to be mapped to setprocns,

After checking out the code via "git checkout v3.10.1", then "grep -sr setprocns ." shows that the linux source code has no lines that contain "setprocns". So there is an error in part of your claim.

> valgrind thinks that an executable is trying to execute syscall preadv2, which
> is indeed mapped to 378 on newer kernels, but doesn't exist in linux 3.10.0

It is correct that "grep -sr preadv2 ." shows no matching lines in linux v3.10.1. However, "cd arch/x86; grep -sr 378 ." shows no syscall numbered 378. Please verify what you are talking about. "cd arch/x86; grep -sr setns ." does show these syscalls with a related name:

    ./syscalls/syscall_64.tbl:308   common  setns   sys_setns
    ./syscalls/syscall_32.tbl:346   i386    setns   sys_setns

> Older versions of valgrind (for example, valgrind 3.10.0) don't have this
> problem, and succeed in executing the same executable on this machine.

Therefore one solution is to use a version of valgrind that is contemporaneous with your kernel. Check the list of versions and dates for both linux and valgrind, and find the best match.

> What can I do to fix the problem? Unfortunately I am stuck with having to
> use such an old system, and therefore using a newer kernel is not an option.

1. Double-check the versions that you claim, and prove them against the official sources.

2. The valgrind syscall table is in coregrind/m_syswrap/syswrap-x86-linux.c and syswrap-amd64-linux.c. Therefore, modify the source code of valgrind to call utsname() during initialization, and alter the table
       static SyscallTableEntry syscall_table[] = { ... };
   accordingly. Probably 'static' must be removed. Also, 'const' should be removed if necessary. [Why isn't the table 'const' in the first place?]

Note that any change in syscall numbers creates a giant incompatibility for any app that is built using "the other" assignments. No old apps can be relied upon to run on newer systems, and no new apps can be relied upon to run on older systems. That's a disaster, and it is *NOT* "waiting to happen"; it has already happened.
From: Daniel F. <qua...@gm...> - 2023-07-02 11:40:10
Hello,

On a machine that has an old linux kernel, when valgrind 3.21.0 runs an executable that contains a call to syscall 378, valgrind fails after being killed by a fatal signal.

The kernel on the machine is 3.10.0 x86_64 (the system is based on RedHat 5, I think), libc 2.17, and the executable itself is 32 bit. On this particular kernel, syscall 378 happens to be mapped to setprocns, while valgrind thinks that the executable is trying to execute syscall preadv2, which is indeed mapped to 378 on newer kernels, but doesn't exist in linux 3.10.0.

Older versions of valgrind (for example, valgrind 3.10.0) don't have this problem, and succeed in executing the same executable on this machine. According to the release notes it seems that this platform is supported. What can I do to fix the problem? Unfortunately I am stuck with having to use such an old system, and therefore using a newer kernel is not an option.

A test program and valgrind's report with the fatal signal are attached. gcc 6.3.0 was used for compilation (both the executable and valgrind).
From: Simon S. <sim...@gn...> - 2023-06-29 18:25:45
Am 29.06.2023 um 18:19 schrieb Mark Wielaard:
> Hi Simon,
>
> On Thu, Jun 29, 2023 at 05:46:59PM +0200, Simon Sobisch wrote:
>> Am 29.06.2023 um 15:10 schrieb John Reiser:
>>>> Running valgrind on GnuCOBOL errors out with
>>>>
>>>> vex amd64->IR: unhandled instruction bytes:
>>>>    0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F
>>>> vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
>>>> vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
>>>> vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
>>>> valgrind: Unrecognised instruction at address 0x4e75f20.
>>>>    at 0x4E75F20: cob_string_init (strings.c:742)
>>>
>>>> 132 (gdb) disassemble /s
>>>> 133 Dump of assembler code for function cob_string_init:
>>>> 134 ../../libcob/strings.c:
>>>> 135 741     {
>>>> 136 742         string_dst_copy = *dst;
>>>> 137    => 0x0000000004e75f20 <+0>:  vmovdqu64 (%rdi),%xmm0
>>>
>>>> Is there anything I can do to still run the application
>>>> with valgrind or do I need to wait for a hotfix?
>
> vmovdqu64 is part of AVX512, see this bug:
> https://bugs.kde.org/show_bug.cgi?id=valgrind-avx512
> (yes, it has been reported so many times that it has its own alias)

Whoa! Thanks for pointing this out (I had not found that on the user list, but that is likely because of the exact instruction I searched for).

So the workaround seems to be to compile the sources with

    gcc -march=native -mno-avx512f -mno-avx512dq -mno-avx512cd -mno-avx512bw -mno-avx512vl -mno-avx512ifma -mno-avx512vbmi -mno-avx512vbmi2 -mno-avx512vnni -mno-avx512bitalg -mno-avx5124fmaps -mno-avx5124vnniw -mno-avx5124vbmi -mno-avx512vpopcntdq

or use an -march that is "generic" and accept slower code when running outside of valgrind.

> There are patches, but the original submitter isn't working on it
> anymore. So we need someone to pick up the code and go through the
> feedback to get it integrated.

Hm, as far as I can see all the feedback is already considered, no? Sadly I'm not in a position to finish that, and I _guess_ there's no .patch file which I could apply directly to the last release, is there?

Thanks,
Simon
From: Mark W. <ma...@kl...> - 2023-06-29 16:19:36
Hi Simon,

On Thu, Jun 29, 2023 at 05:46:59PM +0200, Simon Sobisch wrote:
> Am 29.06.2023 um 15:10 schrieb John Reiser:
> >> Running valgrind on GnuCOBOL errors out with
> >>
> >> vex amd64->IR: unhandled instruction bytes:
> >>    0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F
> >> vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> >> vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
> >> vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
> >> valgrind: Unrecognised instruction at address 0x4e75f20.
> >>    at 0x4E75F20: cob_string_init (strings.c:742)
> >
> >> 132 (gdb) disassemble /s
> >> 133 Dump of assembler code for function cob_string_init:
> >> 134 ../../libcob/strings.c:
> >> 135 741     {
> >> 136 742         string_dst_copy = *dst;
> >> 137    => 0x0000000004e75f20 <+0>:  vmovdqu64 (%rdi),%xmm0
> >
> >> Is there anything I can do to still run the application
> >> with valgrind or do I need to wait for a hotfix?

vmovdqu64 is part of AVX512, see this bug:
https://bugs.kde.org/show_bug.cgi?id=valgrind-avx512
(yes, it has been reported so many times that it has its own alias)

There are patches, but the original submitter isn't working on it anymore. So we need someone to pick up the code and go through the feedback to get it integrated.

Thanks,
Mark
From: Simon S. <sim...@gn...> - 2023-06-29 15:47:13
Sorry, I should have made this explicit!

The error was initially seen with

    $> valgrind --version
    valgrind-3.20.0

which was then updated to

    $> valgrind --version
    valgrind-3.21.0

where the output below (100% identical to 3.20.0) came from.

Both valgrind and GnuCOBOL were compiled with gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4), GNU assembler version 2.30-117.el8, on Linux 4.18.0-348.el8.x86_64.

Simon

Am 29.06.2023 um 15:10 schrieb John Reiser:
>> Running valgrind on GnuCOBOL errors out with
>>
>> vex amd64->IR: unhandled instruction bytes:
>>    0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F
>> vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
>> vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
>> vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
>> valgrind: Unrecognised instruction at address 0x4e75f20.
>>    at 0x4E75F20: cob_string_init (strings.c:742)
>
>> 132 (gdb) disassemble /s
>> 133 Dump of assembler code for function cob_string_init:
>> 134 ../../libcob/strings.c:
>> 135 741     {
>> 136 742         string_dst_copy = *dst;
>> 137    => 0x0000000004e75f20 <+0>:  vmovdqu64 (%rdi),%xmm0
>
>> Is there anything I can do to still run the application with
>> valgrind or do I need to wait for a hotfix?
>
> As always: report the version of valgrind. Run "valgrind --version",
> then copy+paste the output here. The version is the #1 clue for any
> investigation.
From: John R. <jr...@bi...> - 2023-06-29 13:10:15
> Running valgrind on GnuCOBOL errors out with
>
> vex amd64->IR: unhandled instruction bytes:
>    0x62 0xF1 0xFE 0x8 0x6F 0x7 0x48 0xC7 0x5 0x6F
> vex amd64->IR:   REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR:   VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=NONE
> vex amd64->IR:   PFX.66=0 PFX.F2=0 PFX.F3=0
> valgrind: Unrecognised instruction at address 0x4e75f20.
>    at 0x4E75F20: cob_string_init (strings.c:742)

> 132 (gdb) disassemble /s
> 133 Dump of assembler code for function cob_string_init:
> 134 ../../libcob/strings.c:
> 135 741     {
> 136 742         string_dst_copy = *dst;
> 137    => 0x0000000004e75f20 <+0>:  vmovdqu64 (%rdi),%xmm0

> Is there anything I can do to still run the application with valgrind
> or do I need to wait for a hotfix?

As always: report the version of valgrind. Run "valgrind --version", then copy+paste the output here. The version is the #1 clue for any investigation.