|
From: santosh <ys...@gm...> - 2015-04-04 16:00:20
|
Hello,

I still get the error: disInstr(ppc): unhandled instruction: 0x10E40301
I thought Valgrind 3.10.1 has support for CPU: e500v2? NO?

Thanks,
Santosh
|
From: John R. <jr...@bi...> - 2015-04-05 03:36:05
|
> I still get the error: disInstr(ppc): unhandled instruction: 0x10E40301
> I thought Valgrind 3.10.1 has support for CPU: e500v2? NO?

0x10E40301 ==> "evldd r7,0(r4)". Developers are much more likely to recognize 'evldd'
than an instruction word in hex.

It is customary to give some documented basis for "I thought ...".

Just a little bit of searching the bug list:
    https://bugs.kde.org Product: valgrind Content: e500
yields this report from 2010-10-28 (4.5 years ago), last updated 2013-04-02 (two years ago):
    https://bugs.kde.org/show_bug.cgi?id=255494
which links to this attachment from a run of valgrind-3.6.0:
    https://bugs.kde.org/attachment.cgi?id=52939
which contains:
-----
disInstr(ppc): unhandled instruction: 0x10E40301
primary 4(0x4), secondary 769(0x301)
==19630== valgrind: Unrecognised instruction at address 0x4019510.
[[snip]]
==19630== Process terminating with default action of signal 4 (SIGILL): dumping core
==19630== Illegal opcode at address 0x4019510
==19630== at 0x4019510: memcpy (in /lib/ld-2.8.so)
==19630== by 0x40021BF: _dl_start_final (in /lib/ld-2.8.so)
==19630== by 0x40162C7: _start (in /lib/ld-2.8.so)
-----
So that code for memcpy has an optimization to fetch at least 64 bits at a time
in some circumstances, instead of only 32 bits.
Evidently valgrind-3.10.1 still does not implement this instruction.

Your best bet is to comment on bug 255494, requesting implementation.
You could help by supplying a list of missing opcodes which,
if implemented, would be sufficient for valgrind to handle all of memcpy.
Until then, avoid code that contains this optimization. Install
a more-generic version of /lib/ld-2.8.so that does not use
the specialized instructions.
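The "primary 4, secondary 769" values in the valgrind output and the "evldd r7,0(r4)" reading can be reproduced by splitting the instruction word into bit fields. What follows is a minimal sketch, assuming the SPE EVX-form layout (6-bit primary opcode, three 5-bit fields, 11-bit secondary opcode) and assuming that evldd scales its 5-bit immediate by 8 to form the displacement. The function name decode_evx is hypothetical and is not part of Valgrind.

```python
def decode_evx(word):
    """Split a 32-bit PowerPC instruction word into EVX-form fields.

    Assumed layout, most significant bit first:
    primary opcode (6 bits), RT (5), RA (5), UI (5), secondary opcode (11).
    """
    return {
        "primary": (word >> 26) & 0x3F,   # top 6 bits
        "rt": (word >> 21) & 0x1F,        # target register
        "ra": (word >> 16) & 0x1F,        # base address register
        "ui": (word >> 11) & 0x1F,        # unsigned immediate
        "secondary": word & 0x7FF,        # low 11 bits
    }


fields = decode_evx(0x10E40301)
# Assumption: for evldd the displacement is the immediate times 8 (doubleword).
displacement = fields["ui"] * 8
print("primary %d(0x%x), secondary %d(0x%x)" % (
    fields["primary"], fields["primary"],
    fields["secondary"], fields["secondary"]))
print("r%d,%d(r%d)" % (fields["rt"], displacement, fields["ra"]))
```

The same helper can be run over other failing instruction words to build the list of missing opcodes suggested above.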
|
From: santosh <ys...@gm...> - 2015-04-08 10:24:10
|
On Sun, Apr 5, 2015 at 9:05 AM, John Reiser <jr...@bi...> wrote:
>> I still get the error: disInstr(ppc): unhandled instruction: 0x10E40301
>> I thought Valgrind 3.10.1 has support for CPU: e500v2? NO?
>
> 0x10E40301 ==> "evldd r7,0(r4)". Developers are much more likely to recognize 'evldd'
> than an instruction word in hex.
>
> It is customary to give some documented basis for "I thought ...".

Sorry.

>
> Just a little bit of searching the bug list:
> https://bugs.kde.org Product: valgrind Content: e500
> yields this report from 2010-10-28 (4.5 years ago), last updated 2013-04-02 (two years ago):
> https://bugs.kde.org/show_bug.cgi?id=255494
> which links to this attachment from a run of valgrind-3.6.0:
> https://bugs.kde.org/attachment.cgi?id=52939

Thank you for pointing me toward the right place.

> which contains:
> -----
> disInstr(ppc): unhandled instruction: 0x10E40301
> primary 4(0x4), secondary 769(0x301)
> ==19630== valgrind: Unrecognised instruction at address 0x4019510.
> [[snip]]
> ==19630== Process terminating with default action of signal 4 (SIGILL): dumping core
> ==19630== Illegal opcode at address 0x4019510
> ==19630== at 0x4019510: memcpy (in /lib/ld-2.8.so)
> ==19630== by 0x40021BF: _dl_start_final (in /lib/ld-2.8.so)
> ==19630== by 0x40162C7: _start (in /lib/ld-2.8.so)
> -----
> So that code for memcpy has an optimization to fetch at least 64 bits at a time
> in some circumstances, instead of only 32 bits.
> Evidently valgrind-3.10.1 still does not implement this instruction.
>
> Your best bet is to comment on bug 255494, requesting implementation.
> You could help by supplying a list of missing opcodes which,
> if implemented, would be sufficient for valgrind to handle all of memcpy.
> Until then, avoid code that contains this optimization. Install
> a more-generic version of /lib/ld-2.8.so that does not use
> the specialized instructions.
|
From: Julian S. <js...@ac...> - 2015-04-09 11:09:09
|
On 08/04/15 12:23, santosh wrote:
>>> I still get the error: disInstr(ppc): unhandled instruction: 0x10E40301
>>> I thought Valgrind 3.10.1 has support for CPU: e500v2? NO?

A general comment about e500 support. We've had people asking about this
for several years now, and I think there are even some patches drifting
around. From my point of view, the reason these haven't been committed
is that they require proper integration with the existing POWER/PowerPC
front end (guest_ppc_toIR.c).

In particular, e500 has some instruction encodings which are interpreted
as Altivec instructions in the "standard" POWER/PowerPC world. That means
we can't simply commit e500 support as-is because it will completely break
the existing POWER/PowerPC targets. Instead, the front end needs to know
whether it is decoding for e500 or not.

If someone could create a patch which has proper e500 hwcaps detection and
uses that detection in the front end, and can verify that the patch really
doesn't break existing POWER/PowerPC targets, then I'd be much more inclined
to take it. That, as far as I am concerned, is really the limiting factor.

J
|
From: John R. <jr...@bi...> - 2015-04-10 13:53:17
|
On 04/09/2015 04:09 AM, Julian Seward wrote:
> On 08/04/15 12:23, santosh wrote:
>>>> I still get the error: disInstr(ppc): unhandled instruction: 0x10E40301
>>>> I thought Valgrind 3.10.1 has support for CPU: e500v2? NO?
>
> A general comment about e500 support. We've had people asking about this
> for several years now, and I think there are even some patches drifting
> around. From my point of view, the reason these haven't been committed
> is that they require proper integration with the existing POWER/PowerPC
> front end (guest_ppc_toIR.c).
>
> In particular, e500 has some instruction encodings which are interpreted
> as Altivec instructions in the "standard" POWER/PowerPC world. That means
> we can't simply commit e500 support as-is because it will completely break
> the existing POWER/PowerPC targets. Instead, the front end needs to know
> whether it is decoding for e500 or not.
>
> If someone could create a patch which has proper e500 hwcaps detection and
> uses that detection in the front end, and can verify that the patch really
> doesn't break existing POWER/PowerPC targets, then I'd be much more inclined
> to take it. That, as far as I am concerned, is really the limiting factor.
Traditionally the hardware manufacturer publishes such software, which uses
a combination of software-readable configuration registers and deliberate
SIGILL faults (for unimplemented instructions) to determine the actual CPU.
In practice the software would be distributed by Application Support
Engineers from the _marketing_ department of the chip manufacturer.
So ask the hardware vendor, who stands to get more money (or sooner)
by making it easier to use the CPU chips.
On Linux much of the information is encoded in the /proc/cpuinfo pseudo file.
So parse(read(open("/proc/cpuinfo", O_RDONLY))).
Valgrind could contribute a "last resort" option by implementing optional
command-line arguments similar to the -march= or -mcpu= parameters of gcc.
The motivated end user of valgrind should write a patch which implements
the /proc/cpuinfo or -mcpu= strategies. [Let this be a warning
to use hardware that is better supported than e500, such as MIPS, ARM, x86*!]
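The /proc/cpuinfo strategy can be sketched as follows. This is a minimal sketch, assuming the usual "key : value" line layout with blank lines between per-processor blocks; the field names differ from one architecture to the next, and the helper names parse_cpuinfo and read_cpuinfo are hypothetical, not existing valgrind code.

```python
def parse_cpuinfo(text):
    """Parse /proc/cpuinfo-style text into a list of dicts, one per block.

    Blocks are separated by blank lines; each line is split on the first
    colon, and lines without a colon are ignored.
    """
    blocks = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current processor block.
            if current:
                blocks.append(current)
                current = {}
            continue
        key, sep, value = line.partition(":")
        if sep:
            current[key.strip()] = value.strip()
    if current:
        blocks.append(current)
    return blocks


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read and parse the pseudo file (Linux only)."""
    with open(path) as handle:
        return parse_cpuinfo(handle.read())


sample = "processor : 0\nmodel name : example cpu\n\nprocessor : 1\nmodel name : example cpu\n"
for block in parse_cpuinfo(sample):
    print(block)
```

Which key identifies the CPU core would still have to be decided per architecture, since the kernel does not use one common field name.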
|
|
From: John R. <jr...@bi...> - 2015-04-10 22:37:29
|
> Parsing /proc/cpuinfo is a bad idea.

That might be true, but please say why you believe it.

> Julian again mentioned the correct
> method and that is to query the AT_HWCAP and AT_HWCAP2 flags in the AUXV.
>
> [bergner@makalu ~]$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
> AT_HWCAP: vsx arch_2_06 dfp ic_snoop smt mmu fpu altivec ppc64 ppc32
> AT_HWCAP2: tar isel ebb dscr htm arch_2_07
>
> That shows all the instruction categories that are supported in the
> cpu you're running on.

In my experience, relying on AT_HWCAP* (and especially its decoded representation)
is a bad idea. /proc/cpuinfo is *much* better. Here are three current examples:

On x86_64:
$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
AT_HWCAP: bfebfbf ### meaning ??
$ cat /proc/cpuinfo ### abbreviated :)
processor : 3
vendor_id : GenuineIntel
cpu family : 6
model : 42
model name : Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid

On ARM (raspberry pi2):
$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
AT_HWCAP: half thumb fastmult vfp edsp neon vfpv3 ### armv6 or armv7 ?? etc.
$ cat /proc/cpuinfo ### abbreviated :)
model name : ARMv7 Processor rev 5 (v7l)
Features : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm
CPU implementer : 0x41
CPU architecture: 7
CPU variant : 0x0
CPU part : 0xc07
CPU revision : 5

On MIPS (ASUS RT-N16):
$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
### empty output from grep: no AT_HWCAP at all
$ cat /proc/cpuinfo ### abbreviated :)
system type : Broadcom BCM4716 chip rev 1
cpu model : MIPS 74K V4.0
ASEs implemented : mips16 dsp

In those 3 cases /proc/cpuinfo is vastly superior to AT_HWCAP*.
Given that the vendor of e500v2 is stingy with expected software,
I expect that /proc/cpuinfo contains more and better information.
|
From: Peter B. <be...@vn...> - 2015-04-10 21:39:22
|
On Fri, 2015-04-10 at 06:53 -0700, John Reiser wrote:
> Traditionally the hardware manufacturer publishes such software, which uses
> a combination of software-readable configuration registers and deliberate
> SIGILL faults (for unimplemented instructions) to determine the actual CPU.
PowerPC doesn't have a software readable config register and Julian
showed that relying on SIGILL (or lack thereof) to determine whether
an instruction is supported or not won't work for e500, since the
same opcode is used for different instructions on different cpus.
> On Linux much of the information is encoded in the /proc/cpuinfo pseudo file.
> So parse(read(open("/proc/cpuinfo", O_RDONLY))).
Parsing /proc/cpuinfo is a bad idea. Julian again mentioned the correct
method and that is to query the AT_HWCAP and AT_HWCAP2 flags in the AUXV.
[bergner@makalu ~]$ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
AT_HWCAP: vsx arch_2_06 dfp ic_snoop smt mmu fpu altivec ppc64 ppc32
AT_HWCAP2: tar isel ebb dscr htm arch_2_07
That shows all the instruction categories that are supported in the
cpu you're running on. I thought we (IBM) had added some support for
querying the AT_HWCAP (AT_PLATFORM?) values...or maybe I'm just
remembering us (IBM) talking about it. Carl, what is the status of
using the AT_HWCAP values to enable/disable instruction categories?
> In practice the software would be distributed by Application Support
> Engineers from the _marketing_ department of the chip manufacturer.
> So ask the hardware vendor, ...
This is the real problem, since it seems this specific hardware vendor
doesn't engage with the community that often, and support for their
cpus suffers as a result. The same goes for gcc, glibc, binutils and gdb too.
Peter
|
|
From: John R. <jr...@bi...> - 2015-04-11 23:36:12
|
> On MIPS (ASUS RT-N16):
> $ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
> ### empty output from grep: no AT_HWCAP at all

... because the C library is uClibc, not glibc. Some digging shows that the AUX vector is:
0x00000010 AT_HWCAP  0x00000000
0x00000006 AT_PAGESZ 0x00001000
0x00000011 AT_CLKTCK 0x00000064
0x00000003 AT_PHDR   0x00400034
0x00000004 AT_PHENT  0x00000020
0x00000005 AT_PHNUM  0x00000008
0x00000007 AT_BASE   0x2aaa8000
0x00000008 AT_FLAGS  0x00000000
0x00000009 AT_ENTRY  0x00401740
0x0000000b AT_UID    0x00000000
0x0000000c AT_EUID   0x00000000
0x0000000d AT_GID    0x00000000
0x0000000e AT_EGID   0x00000000
0x00000017 AT_SECURE 0x00000000
0x00000000 AT_NULL   0x00000000

Still, AT_HWCAP is 0, which omits information such as support for mips16 and dsp
that is shown in /proc/cpuinfo below. The Linux kernel is 2.6.24 (dd-wrt + optware.)

>
> $ cat /proc/cpuinfo ### abbreviated :)
> system type : Broadcom BCM4716 chip rev 1
> cpu model : MIPS 74K V4.0
> ASEs implemented : mips16 dsp

From the viewpoint of the end user, a commandline override such as --cpu=...
has an advantage because it allows working around bugs in AT_HWCAP
and/or /proc/cpuinfo.
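The precedence implied by such an override can be sketched as follows. This is only an illustration under stated assumptions: the --cpu option is the hypothetical flag suggested above (valgrind 3.10.1 has no such option as far as this thread shows), and pick_cpu and its "detected" argument are made-up names standing in for whatever automatic detection is used.

```python
import argparse


def pick_cpu(argv, detected=None):
    """Return (cpu, source), letting a --cpu override win over detection.

    argv     : command-line arguments, without the program name
    detected : CPU name from automatic detection, or None if it failed
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--cpu", default=None)  # hypothetical option
    # Unknown arguments (the client program and its options) are left alone.
    args, _rest = parser.parse_known_args(argv)
    if args.cpu:
        return args.cpu, "command line"
    if detected:
        return detected, "auto-detected"
    return None, "unknown"


print(pick_cpu(["--cpu=e500v2", "./a.out"], detected="ppc32"))
print(pick_cpu(["./a.out"], detected="ppc32"))
```

Checking the override first means a wrong AT_HWCAP or /proc/cpuinfo value never has to be trusted when the user knows better.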
|
From: Peter B. <be...@vn...> - 2015-04-13 17:45:47
|
On Sat, 2015-04-11 at 16:36 -0700, John Reiser wrote:
> > On MIPS (ASUS RT-N16):
> > $ LD_SHOW_AUXV=1 /bin/true | grep AT_HWCAP
> > ### empty output from grep: no AT_HWCAP at all
>
> ... because the C library is uClibc, not glibc.

Yes, the above is a neat trick from glibc. On Linux, the AUXV is exported
from the kernel via /proc/<pid>/auxv or /proc/self/auxv, and it is also placed
on the stack above the topmost frame. We (IBM) also created a libauxv
library which can help with reading and querying the AUXV contents:

    https://github.com/Libauxv/libauxv

> Some digging shows that the AUX vector is:
> 0x00000010 AT_HWCAP 0x00000000
[snip]
> Still, AT_HWCAP is 0, which omits information such as support for mips16 and dsp
> that is shown in /proc/cpuinfo below. The Linux kernel is 2.6.24 (dd-wrt + optware.)

That seems like a kernel bug to me.

> From the viewpoint of the end user, a commandline override such as --cpu=...
> has an advantage because it allows working around bugs in AT_HWCAP
> and/or /proc/cpuinfo.

You'll get no argument from me on it being potentially useful to override
the automatically detected cpu value.

Peter
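Reading /proc/self/auxv directly, without glibc's LD_SHOW_AUXV or libauxv, can be sketched as follows. This is a minimal sketch, assuming the file is a sequence of (type, value) pairs of native-sized unsigned words terminated by an AT_NULL entry; the constants are the Linux values (AT_HWCAP is 16, matching the 0x10 in the dump quoted above, and AT_HWCAP2 is 26). The function name parse_auxv is hypothetical.

```python
import os
import struct

AT_NULL = 0
AT_HWCAP = 16
AT_HWCAP2 = 26


def parse_auxv(data, word_size=None):
    """Decode raw auxiliary-vector bytes into a {type: value} dict.

    word_size is 4 or 8 bytes; it defaults to the pointer size of the
    running Python interpreter, which matches /proc/self/auxv.
    """
    if word_size is None:
        word_size = struct.calcsize("P")
    fmt = "=" + {4: "II", 8: "QQ"}[word_size]   # native byte order
    pair_size = struct.calcsize(fmt)
    usable = len(data) - (len(data) % pair_size)  # drop any partial pair
    entries = {}
    for key, value in struct.iter_unpack(fmt, data[:usable]):
        if key == AT_NULL:
            break
        entries[key] = value
    return entries


if __name__ == "__main__" and os.path.exists("/proc/self/auxv"):
    with open("/proc/self/auxv", "rb") as handle:
        auxv = parse_auxv(handle.read())
    print("AT_HWCAP  = %#x" % auxv.get(AT_HWCAP, 0))
    print("AT_HWCAP2 = %#x" % auxv.get(AT_HWCAP2, 0))
```

The raw bit masks still have to be mapped to feature names per architecture, which is the decoding step that LD_SHOW_AUXV performs on glibc systems.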