From: Michael A. <Mic...@fs...> - 2007-04-26 00:09:59
I tried to valgrind a test out of the linbox test suite on the following machine:

mabshoff@fsmath /tmp/Work/linbox-r2685/tests $ cat /proc/cpuinfo
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 6
model           : 4
model name      : AMD Athlon(tm) processor
stepping        : 2
cpu MHz         : 1205.160
cache size      : 256 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips        : 2411.39
clflush size    : 32

with valgrind 3.2.3 compiled from source. The failure is in a BLAS library called ATLAS, which has also been compiled on that box:

mabshoff@fsmath /tmp/Work/linbox-r2685/tests $ /usr/local/valgrind-3.2.3/bin/valgrind --leak-resolution=high --tool=memcheck ./test-rational-solver
==1548== Memcheck, a memory error detector.
==1548== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==1548== Using LibVEX rev 1732, a library for dynamic binary translation.
==1548== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==1548== Using valgrind-3.2.3, a dynamic binary instrumentation framework.
==1548== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==1548== For more details, rerun with: -v
==1548==
Testing Nonsingular Random Diagonal solve ... 0%vex x86->IR: unhandled instruction bytes: 0xF 0xD 0x0 0xF
==1548== valgrind: Unrecognised instruction at address 0x80A5315.
==1548== Your program just tried to execute an instruction that Valgrind
==1548== did not recognise. There are two possible reasons for this.
==1548== 1. Your program has a bug and erroneously jumped to a non-code
==1548==    location. If you are running Memcheck and you just saw a
==1548==    warning about a bad jump, it's probably your program's fault.
==1548== 2. The instruction is legitimate but Valgrind doesn't handle it,
==1548==    i.e. it's Valgrind's fault. If you think this is the case or
==1548==    you are not sure, please let us know and we'll try to fix it.
==1548== Either way, Valgrind will now raise a SIGILL signal which will
==1548== probably kill your program.
==1548==
==1548== Process terminating with default action of signal 4 (SIGILL)
==1548== Illegal opcode at address 0x80A5315
==1548==    at 0x80A5315: ATL_dJIK0x0x0NN0x0x0_aX_bX (in /tmp/Work/linbox-r2685/tests/test-rational-solver)
==1548==
==1548== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 9 from 1)
==1548== malloc/free: in use at exit: 256,702 bytes in 10,610 blocks.
==1548== malloc/free: 21,208 allocs, 10,598 frees, 303,027 bytes allocated.
==1548== For counts of detected errors, rerun with: -v
==1548== searching for pointers to 10,610 not-freed blocks.
==1548== checked 325,380 bytes.
==1548==
==1548== LEAK SUMMARY:
==1548==    definitely lost: 0 bytes in 0 blocks.
==1548==      possibly lost: 550 bytes in 2 blocks.
==1548==    still reachable: 256,152 bytes in 10,608 blocks.
==1548==         suppressed: 0 bytes in 0 blocks.
==1548== Rerun with --leak-check=full to see details of leaked memory.
Illegal instruction

I didn't see a warning about a bad jump, so which instruction is being executed? If it is something SSE/FP related, it is probably an issue with ATLAS not properly detecting the CPU and using illegal instructions.

Another quick question: is this topic more suited to the user list, or does it belong on devel?

Cheers,

Michael
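
For reference, the four reported bytes can be decoded by hand. The sketch below is only an illustration, not Valgrind output. It assumes that 0x0F 0x0D is the AMD 3DNow! prefetch opcode group, in which the reg field of the following ModRM byte selects PREFETCH (/0) or PREFETCHW (/1), and that the fourth byte (0x0F) already belongs to the next instruction. The helper names `split_modrm`, `describe` and `has_flag` are hypothetical, and the flags string is copied from the /proc/cpuinfo output above.

```python
# Bytes reported by Valgrind ("0xF 0xD 0x0 0xF"), written as integers.
reported = [0x0F, 0x0D, 0x00, 0x0F]

# Flags line copied from the /proc/cpuinfo output in the post.
flags = ("fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov "
         "pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow")


def split_modrm(byte):
    """Split an x86 ModRM byte into its (mod, reg, rm) bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def describe(insn_bytes):
    """Describe the instruction, assuming the 0F 0D prefetch group."""
    if insn_bytes[:2] != [0x0F, 0x0D]:
        return "not the 0F 0D prefetch group"
    mod, reg, rm = split_modrm(insn_bytes[2])
    # Assumption: /0 is PREFETCH, /1 is PREFETCHW, other values are reserved.
    name = {0: "prefetch", 1: "prefetchw"}.get(reg, "prefetch (reserved form)")
    regs32 = ["eax", "ecx", "edx", "ebx", None, None, "esi", "edi"]
    if mod == 0 and regs32[rm] is not None:
        # Simple register-indirect operand without SIB byte or displacement.
        return "%s [%s]" % (name, regs32[rm])
    return "%s <memory operand, mod=%d rm=%d>" % (name, mod, rm)


def has_flag(cpuinfo_flags, flag):
    """Check whether a flag appears in a /proc/cpuinfo flags string."""
    return flag in cpuinfo_flags.split()


print(describe(reported))
print(has_flag(flags, "3dnow"), has_flag(flags, "sse"))
```

Under these assumptions the bytes would read as a 3DNow! prefetch through eax rather than an SSE instruction, and the flags line lists 3dnow but not sse.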