|
From: Tobias W. <wid...@gm...> - 2014-09-09 20:39:44
|
Hello!
I am having issues using valgrind on my project. I am getting an "Illegal
opcode" error. This is the full invocation and the log:
132 tobbe@archosaurus ~/projects/blocks (git)-[master] % valgrind ./bin/blocks
:(
==21008== Memcheck, a memory error detector
==21008== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==21008== Using Valgrind-3.9.0 and LibVEX; rerun with -h for copyright info
==21008== Command: ./bin/blocks
==21008==
[22:36:34|script|VERBOSE]: Adding data/scripts/elephant.as for compilation.
[22:36:34|script|VERBOSE]: Adding data/scripts/general.as for compilation.
[22:36:34|script|VERBOSE]: Adding data/scripts/core/entity.as for compilation.
[22:36:34|script|VERBOSE]: Adding data/scripts/core/loglevel.as for compilation.
[22:36:34|script|VERBOSE]: Adding data/scripts/player.as for compilation.
[22:36:34|script|INFO]: Compiling scripts...
[22:36:34|script|INFO]: Compilation process over.
[22:36:34|script|VERBOSE]: Setting up script callbacks...
[22:36:34|script|VERBOSE]: Compilation process over.
[22:36:34|script|VERBOSE]: Done setting up callbacks.
[22:36:34|entity|INFO]: Loading entity definitions
[22:36:34|entity|INFO]: Found 2 entity definitions
[22:36:35|entity|VERBOSE]: Added entity type 'Elephant' to entity definitions
[22:36:35|entity|VERBOSE]: Added entity type 'Player' to entity definitions
[22:36:35|entity|INFO]: Entity definitions loaded
[22:36:35|server|INFO]: Server initialised and ready to go
[22:36:35|script|VERBOSE]: Verbose message from the script
[22:36:35|script|INFO]: Info message from the script
[22:36:35|script|WARNING]: Warning message from the script
[22:36:35|script|ERROR]: Error message from the script
[22:36:35|script|INFO]: Game started!
[22:36:37|client|INFO]: client connected to server
vex amd64->IR: unhandled instruction bytes: 0xF 0x1 0xD5 0x31 0xC0 0xC3 0x48 0x8D
vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
==21008== valgrind: Unrecognised instruction at address 0x5146eb8.
==21008== at 0x5146EB8: __lll_unlock_elision (in /usr/lib/libpthread-2.19.so)
==21008== by 0x442FC4: LocalClientConnectionListener::createClientConnection(LocalServerClientBridge*) (gthr-default.h:778)
==21008== by 0x4445B9: BlocksApplication::setupSinglePlayer() (blocksapp.cpp:104)
==21008== by 0x58ACA2E: fea::Application::run(int, char**) (in /usr/local/lib/libfea-structure.so)
==21008== by 0x44A455: main (main.cpp:6)
==21008== Your program just tried to execute an instruction that Valgrind
==21008== did not recognise. There are two possible reasons for this.
==21008== 1. Your program has a bug and erroneously jumped to a non-code
==21008== location. If you are running Memcheck and you just saw a
==21008== warning about a bad jump, it's probably your program's fault.
==21008== 2. The instruction is legitimate but Valgrind doesn't handle it,
==21008== i.e. it's Valgrind's fault. If you think this is the case or
==21008== you are not sure, please let us know and we'll try to fix it.
==21008== Either way, Valgrind will now raise a SIGILL signal which will
==21008== probably kill your program.
==21008==
==21008== Process terminating with default action of signal 4 (SIGILL): dumping core
==21008== Illegal opcode at address 0x5146EB8
==21008== at 0x5146EB8: __lll_unlock_elision (in /usr/lib/libpthread-2.19.so)
==21008== by 0x442FC4: LocalClientConnectionListener::createClientConnection(LocalServerClientBridge*) (gthr-default.h:778)
==21008== by 0x4445B9: BlocksApplication::setupSinglePlayer() (blocksapp.cpp:104)
==21008== by 0x58ACA2E: fea::Application::run(int, char**) (in /usr/local/lib/libfea-structure.so)
==21008== by 0x44A455: main (main.cpp:6)
==21008==
==21008== HEAP SUMMARY:
==21008== in use at exit: 31,353,847 bytes in 42,861 blocks
==21008== total heap usage: 120,444 allocs, 77,583 frees, 106,018,132 bytes allocated
==21008==
==21008== LEAK SUMMARY:
==21008== definitely lost: 394,020 bytes in 479 blocks
==21008== indirectly lost: 128 bytes in 1 blocks
==21008== possibly lost: 26,168,346 bytes in 39,958 blocks
==21008== still reachable: 4,791,353 bytes in 2,423 blocks
==21008== suppressed: 0 bytes in 0 blocks
==21008== Rerun with --leak-check=full to see details of leaked memory
==21008==
==21008== For counts of detected and suppressed errors, rerun with: -v
==21008== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 7 from 2)
[5] 21008 illegal hardware instruction valgrind ./bin/blocks
valgrind ./bin/blocks 7.13s user 0.18s system 99% cpu 7.371 total
I have tried googling this error but I am not finding much related to
it. Any ideas on how to fix it?
Thanks!
|
|
From: Julian S. <js...@ac...> - 2014-09-09 20:55:33
|
This is HTM lock elision stuff; 3.9.0 doesn't support it but
3.10.0.BETA2 might well do.

> I have tried googling on this error but I am not finding much related to
> it. Any ideas on how to fix it?

Try with http://www.valgrind.org/downloads/valgrind-3.10.0.BETA2.tar.bz2
instead.

J
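For readers who want to know whether their own CPU advertises the hardware transactional memory (TSX) features that lock elision relies on, the following is a minimal sketch, not something posted in the thread. It assumes an x86 build with GCC or Clang, whose `<cpuid.h>` provides `__get_cpuid_count`; the `TsxSupport` struct and the function names are hypothetical. The bit positions (HLE is bit 4 and RTM is bit 11 of EBX in CPUID leaf 7, sub-leaf 0) follow Intel's documentation.

```cpp
#include <cstdio>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>  // GCC/Clang helper for the CPUID instruction
#endif

// Hypothetical helper type: which TSX features CPUID reports.
struct TsxSupport {
    bool hle;  // Hardware Lock Elision
    bool rtm;  // Restricted Transactional Memory (xbegin/xend/xabort)
};

inline TsxSupport queryTsxSupport() {
    TsxSupport result{false, false};
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    // Leaf 7, sub-leaf 0 holds the structured extended feature flags.
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx)) {
        result.hle = (ebx & (1u << 4)) != 0;
        result.rtm = (ebx & (1u << 11)) != 0;
    }
#endif
    return result;  // stays {false, false} on non-x86 builds
}

inline void printTsxSupport() {
    const TsxSupport s = queryTsxSupport();
    std::printf("HLE: %s, RTM: %s\n", s.hle ? "yes" : "no", s.rtm ? "yes" : "no");
}
```

Calling `printTsxSupport()` natively and again under valgrind may give different answers, since a program run under valgrind sees the CPUID values valgrind chooses to present.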
|
From: Philippe W. <phi...@sk...> - 2014-09-09 20:57:39
|
On Tue, 2014-09-09 at 22:39 +0200, Tobias Widlund wrote:
> vex amd64->IR: unhandled instruction bytes: 0xF 0x1 0xD5 0x31 0xC0 0xC3 0x48 0x8D
> vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
> vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
> ==21008== valgrind: Unrecognised instruction at address 0x5146eb8.
> ==21008== at 0x5146EB8: __lll_unlock_elision (in /usr/lib/libpthread-2.19.so)

That looks to be the use of the new 'transaction' instructions
(xbegin etc).
Assuming this is the case; you need the last 3.10 valgrind
release (it is in beta status, see the announcement yesterday)
Download from:
http://www.valgrind.org/downloads/valgrind-3.10.0.BETA2.tar.bz2

Philippe
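The "unhandled instruction bytes" line can be matched against the TSX opcodes by hand. The sketch below is only an illustration: the function names are hypothetical and it recognises just the four TSX encodings documented by Intel (xend `0F 01 D5`, xtest `0F 01 D6`, xbegin `C7 F8`, xabort `C6 F8`), ignoring prefixes. Applied to the bytes from the log, it reports xend, which agrees with the later reply in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Returns the mnemonic of the TSX instruction at the start of `bytes`,
// or an empty string if the bytes are not one of the four known encodings.
// This is a hand-written table lookup, not a real disassembler.
inline std::string identifyTsxInstruction(const std::uint8_t* bytes, std::size_t length) {
    if (length >= 3 && bytes[0] == 0x0F && bytes[1] == 0x01) {
        if (bytes[2] == 0xD5) return "xend";
        if (bytes[2] == 0xD6) return "xtest";
    }
    if (length >= 2 && bytes[1] == 0xF8) {
        if (bytes[0] == 0xC7) return "xbegin";  // followed by a relative offset
        if (bytes[0] == 0xC6) return "xabort";  // followed by an 8-bit immediate
    }
    return "";
}

// The eight bytes valgrind printed in the log above.
inline std::string decodeBytesFromLog() {
    const std::uint8_t logged[] = {0x0F, 0x01, 0xD5, 0x31, 0xC0, 0xC3, 0x48, 0x8D};
    return identifyTsxInstruction(logged, sizeof logged);
}
```

The bytes after the first three (`31 C0` and `C3`) are the ordinary `xor eax, eax` and `ret` encodings, so the faulting instruction sits right at the end of the unlock routine.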
|
From: Tobias W. <wid...@gm...> - 2014-09-10 10:28:08
|
Thanks for the reply, I did download and build the valgrind version you
linked and the illegal instruction error went away, but instead I am
getting a segfault on the same place, as seen:
tobbe@archosaurus ~/projects/blocks (git)-[master] % valgrind ./bin/blocks
==7731== Memcheck, a memory error detector
==7731== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==7731== Using Valgrind-3.10.0.BETA2 and LibVEX; rerun with -h for copyright info
==7731== Command: ./bin/blocks
==7731==
[12:25:39|script|VERBOSE]: Adding data/scripts/elephant.as for compilation.
[12:25:39|script|VERBOSE]: Adding data/scripts/general.as for compilation.
[12:25:39|script|VERBOSE]: Adding data/scripts/core/entity.as for compilation.
[12:25:39|script|VERBOSE]: Adding data/scripts/core/loglevel.as for compilation.
[12:25:39|script|VERBOSE]: Adding data/scripts/player.as for compilation.
[12:25:39|script|INFO]: Compiling scripts...
[12:25:40|script|INFO]: Compilation process over.
[12:25:40|script|VERBOSE]: Setting up script callbacks...
[12:25:40|script|VERBOSE]: Compilation process over.
[12:25:40|script|VERBOSE]: Done setting up callbacks.
[12:25:40|entity|INFO]: Loading entity definitions
[12:25:40|entity|INFO]: Found 2 entity definitions
[12:25:40|entity|VERBOSE]: Added entity type 'Elephant' to entity definitions
[12:25:40|entity|VERBOSE]: Added entity type 'Player' to entity definitions
[12:25:40|entity|INFO]: Entity definitions loaded
[12:25:40|server|INFO]: Server initialised and ready to go
[12:25:40|script|VERBOSE]: Verbose message from the script
[12:25:40|script|INFO]: Info message from the script
[12:25:40|script|WARNING]: Warning message from the script
[12:25:40|script|ERROR]: Error message from the script
[12:25:40|script|INFO]: Game started!
[12:25:43|client|INFO]: client connected to server
==7731==
==7731== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==7731== General Protection Fault
==7731== at 0x5145EBB: __lll_unlock_elision (in /usr/lib/libpthread-2.19.so)
==7731== by 0x441D31: LocalClientConnectionListener::createClientConnection(LocalServerClientBridge*) (in /home/tobbe/projects/blocks/bin/blocks)
==7731== by 0x4432D6: BlocksApplication::setupSinglePlayer() (in /home/tobbe/projects/blocks/bin/blocks)
==7731== by 0x58ABA2E: fea::Application::run(int, char**) (in /usr/local/lib/libfea-structure.so)
==7731== by 0x4497F5: main (in /home/tobbe/projects/blocks/bin/blocks)
==7731==
==7731== HEAP SUMMARY:
==7731== in use at exit: 31,353,847 bytes in 42,861 blocks
==7731== total heap usage: 120,444 allocs, 77,583 frees, 106,018,132 bytes allocated
==7731==
==7731== LEAK SUMMARY:
==7731== definitely lost: 426,836 bytes in 480 blocks
==7731== indirectly lost: 128 bytes in 1 blocks
==7731== possibly lost: 26,135,530 bytes in 39,957 blocks
==7731== still reachable: 4,791,353 bytes in 2,423 blocks
==7731== suppressed: 0 bytes in 0 blocks
==7731== Rerun with --leak-check=full to see details of leaked memory
==7731==
==7731== For counts of detected and suppressed errors, rerun with: -v
==7731== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 6 from 1)
[5] 7731 segmentation fault valgrind ./bin/blocks
valgrind ./bin/blocks 7.64s user 0.11s system 99% cpu 7.760 total
This segfault does not happen without valgrind. Any ideas?
On 9 September 2014 22:57, Philippe Waroquiers <phi...@sk...> wrote:
> On Tue, 2014-09-09 at 22:39 +0200, Tobias Widlund wrote:
> > vex amd64->IR: unhandled instruction bytes: 0xF 0x1 0xD5 0x31 0xC0 0xC3 0x48 0x8D
> > vex amd64->IR: REX=0 REX.W=0 REX.R=0 REX.X=0 REX.B=0
> > vex amd64->IR: VEX=0 VEX.L=0 VEX.nVVVV=0x0 ESC=0F
> > vex amd64->IR: PFX.66=0 PFX.F2=0 PFX.F3=0
> > ==21008== valgrind: Unrecognised instruction at address 0x5146eb8.
> > ==21008== at 0x5146EB8: __lll_unlock_elision (in /usr/lib/libpthread-2.19.so)
>
> That looks to be the use of the new 'transaction' instructions
> (xbegin etc).
> Assuming this is the case; you need the last 3.10 valgrind
> release (it is in beta status, see the announcement yesterday)
> Download from:
> http://www.valgrind.org/downloads/valgrind-3.10.0.BETA2.tar.bz2
>
> Philippe
|
From: Mark W. <mj...@re...> - 2014-09-10 13:17:15
|
On Wed, 2014-09-10 at 12:27 +0200, Tobias Widlund wrote:
> Thanks for the reply, I did download and build the valgrind version you
> linked and the illegal instruction error went away, but instead I am
> getting a segfault on the same place, as seen:

OK, that is somewhat expected. valgrind 3.10.0[BETA] now knows about the
xend instruction, but sees it is used outside a transaction and so
generates a SIGSEGV to indicate bad usage of xend (instead of a SIGILL
to indicate it doesn't understand the instruction at all).

> tobbe@archosaurus ~/projects/blocks (git)-[master] % valgrind ./bin/blocks
> [...]
> ==7731== Process terminating with default action of signal 11 (SIGSEGV): dumping core
> ==7731== General Protection Fault
> ==7731== at 0x5145EBB: __lll_unlock_elision (in /usr/lib/libpthread-2.19.so)
> ==7731== by 0x441D31: LocalClientConnectionListener::createClientConnection(LocalServerClientBridge*) (in /home/tobbe/projects/blocks/bin/blocks)
> ==7731== by 0x4432D6: BlocksApplication::setupSinglePlayer() (in /home/tobbe/projects/blocks/bin/blocks)
> ==7731== by 0x58ABA2E: fea::Application::run(int, char**) (in /usr/local/lib/libfea-structure.so)
> ==7731== by 0x4497F5: main (in /home/tobbe/projects/blocks/bin/blocks)
> [...]
> [5] 7731 segmentation fault valgrind ./bin/blocks
> valgrind ./bin/blocks 7.64s user 0.11s system 99% cpu 7.760 total
>
> This segfault does not happen without valgrind. Any ideas?

So the difference might be that your native cpu doesn't support TSX, but
valgrind is emulating it anyway and sets the cpuid field that glibc
picks up and tries to use (valgrind -v might show you, grep for "Arch
and hwcaps"). This is kind of a bug in valgrind, it should manage the
cpuid bits a bit more intelligently
(https://bugs.kde.org/show_bug.cgi?id=324882).

But that might still mean there is a bug in either your program or
glibc. What seems to be happening is that a mutex is unlocked in your
code that isn't locked in the first place (glibc with TSX lock elision
is a bit aggressive and just xends a transaction that isn't there). So I
would first look at createClientConnection() and see if there is any
suspicious unlocking going on. Otherwise it might also be this glibc
bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16657
"Lock elision breaks pthread_mutex_destroy".

Cheers,

Mark
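Unlocking a `std::mutex` that the calling thread does not hold is undefined behaviour, which is why the mistake can go unnoticed natively and only surface under a different runtime. One way to hunt for "suspicious unlocking" is a debug-only wrapper that records the owning thread. The sketch below is an assumption-laden illustration, not code from the thread: `CheckedMutex` and `doubleUnlockIsCaught` are hypothetical names, and the wrapper throws instead of silently continuing.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <stdexcept>
#include <thread>

// Hypothetical debugging aid: a mutex that remembers which thread holds it
// and refuses an unlock() from any thread that is not the current owner.
class CheckedMutex {
public:
    void lock() {
        // Relocking by the owner would deadlock a plain std::mutex.
        assert(owner_.load() != std::this_thread::get_id());
        mutex_.lock();
        owner_.store(std::this_thread::get_id());
    }

    void unlock() {
        if (owner_.load() != std::this_thread::get_id()) {
            throw std::logic_error("unlock() on a mutex this thread does not hold");
        }
        owner_.store(std::thread::id());  // default id means "no owner"
        mutex_.unlock();
    }

private:
    std::mutex mutex_;
    std::atomic<std::thread::id> owner_{std::thread::id()};
};

// Lock once, unlock twice: the second unlock is reported, not ignored.
inline bool doubleUnlockIsCaught() {
    CheckedMutex mutex;
    mutex.lock();
    mutex.unlock();      // legitimate unlock
    try {
        mutex.unlock();  // the mutex is no longer held here
    } catch (const std::logic_error&) {
        return true;
    }
    return false;
}
```

Because `unlock()` can throw, this wrapper is best driven by explicit `lock()`/`unlock()` calls while debugging; an exception escaping from a `std::lock_guard` destructor would terminate the program instead of reporting the error.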
|
From: Tobias W. <wid...@gm...> - 2014-09-10 13:35:47
|
Thanks for your reply, with the information provided I was able to solve my
issue!
I inspected my code and as you said, there is a double unlocking of a
mutex. For reference for other people who might get this issue, I'll post
the problematic code below:
std::lock_guard<std::mutex> lock(mIncomingConnectionsMutex);
mIncomingConnections.push(clientConnection);
mIncomingConnectionsMutex.unlock();
Since std::lock_guard is an RAII type, there is no need to call unlock()
manually: its destructor unlocks the mutex anyway, so the manual call meant
the mutex was unlocked twice. I removed the third line and it worked just fine.
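A minimal sketch of the corrected pattern, under stated assumptions: only `mIncomingConnections`, `mIncomingConnectionsMutex` and `clientConnection` come from the snippet above, while the `ConnectionQueue` class, the `int` element type and the method names are hypothetical stand-ins for the real project types. The second method shows `std::unique_lock`, which is the standard wrapper to use when the mutex really does need to be released before the end of the scope.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <queue>

// Hypothetical container standing in for the listener's connection queue.
class ConnectionQueue {
public:
    void push(int clientConnection) {
        std::lock_guard<std::mutex> lock(mIncomingConnectionsMutex);
        mIncomingConnections.push(clientConnection);
    }  // the guard's destructor releases the mutex here, exactly once

    void pushThenWorkUnlocked(int clientConnection) {
        std::unique_lock<std::mutex> lock(mIncomingConnectionsMutex);
        mIncomingConnections.push(clientConnection);
        lock.unlock();  // release through the wrapper, never through the mutex
        // ... work that does not need the mutex; the destructor will not
        // unlock again because the wrapper knows it no longer owns the lock
    }

    std::size_t size() {
        std::lock_guard<std::mutex> lock(mIncomingConnectionsMutex);
        return mIncomingConnections.size();
    }

private:
    std::mutex mIncomingConnectionsMutex;
    std::queue<int> mIncomingConnections;
};

// Both variants leave the mutex unlocked afterwards, so size() can lock it.
inline void exampleUsage() {
    ConnectionQueue queue;
    queue.push(1);
    queue.pushThenWorkUnlocked(2);
    assert(queue.size() == 2);
}
```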
Thanks for your help :)
On 10 September 2014 15:17, Mark Wielaard <mj...@re...> wrote:
> On Wed, 2014-09-10 at 12:27 +0200, Tobias Widlund wrote:
> > Thanks for the reply, I did download and build the valgrind version you
> > linked and the illegal instruction error went away, but instead I am
> > getting a segfault on the same place, as seen:
>
> OK, that is somewhat expected. valgrind 3.10.0[BETA] now knows about the
> xend instruction, but sees it is used outside a transaction and so
> generates a SIGSEGV to indicate bad usage of xend (instead of a SIGILL
> to indicate it doesn't understand the instruction at all).
>
> > tobbe@archosaurus ~/projects/blocks (git)-[master] % valgrind
> ./bin/blocks
> > [...]
> > ==7731== Process terminating with default action of signal 11 (SIGSEGV):
> > dumping core
> > ==7731== General Protection Fault
> > ==7731== at 0x5145EBB: __lll_unlock_elision (in /usr/lib/
> > libpthread-2.19.so)
> > ==7731== by 0x441D31:
> >
> LocalClientConnectionListener::createClientConnection(LocalServerClientBridge*)
> > (in /home/tobbe/projects/blocks/bin/blocks)
> > ==7731== by 0x4432D6: BlocksApplication::setupSinglePlayer() (in
> > /home/tobbe/projects/blocks/bin/blocks)
> > ==7731== by 0x58ABA2E: fea::Application::run(int, char**) (in
> > /usr/local/lib/libfea-structure.so)
> > ==7731== by 0x4497F5: main (in /home/tobbe/projects/blocks/bin/blocks)
> > [...]
> > [5] 7731 segmentation fault valgrind ./bin/blocks
> > valgrind ./bin/blocks 7.64s user 0.11s system 99% cpu 7.760 total
> >
> > This segfault does not happen without valgrind. Any ideas?
>
> So the difference might be that your native cpu doesn't support TSX, but
> valgrind is emulating it anyway and sets the cpuid field that glibc
> picks up and tries to use (valgrind -v might show you, grep for "Arch
> and hwcaps"). This is kind of a bug in valgrind, it should manage the
> cpuid bits a bit more intelligently
> (https://bugs.kde.org/show_bug.cgi?id=324882).
>
> But that might still mean there is a bug in either your program or
> glibc. What seems to be happening is that a mutex is unlocked in your
> code that isn't locked in the first place (glibc with TSX lock elision
> is a bit aggressive and just xends a transaction that isn't there). So I
> would first look at createClientConnection() and see if there is any
> suspicious unlocking going on. Otherwise it might also be this glibc
> bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16657
> "Lock elision breaks pthread_mutex_destroy".
>
> Cheers,
>
> Mark
>
|