From: Michael A. <Mic...@fs...> - 2007-04-06 16:07:49
Jason Martin wrote:
> Hi Michael,
>

Hello Jason, hello folks,

> Thanks for the bug report!
>
> That is indeed lahf/sahf which I do use in my assembly code. Note
> that the lahf/sahf instruction is not supported on some x86_64 CPUs.
> However, it IS supported on the Intel Core 2 CPUs. (Please note that
> the Intel "Core Duo" is not the same as the Intel "Core 2 Duo."
> Despite the similarity in the names of the processors, they have very
> different architectures.)
>

Ok.

> I'm not familiar with Valgrind, but I just looked it up and it appears
> to be a nice debugging package. My guess is that since lahf/sahf is
> not supported on Pentium4 and early Xeon processors, the Valgrind
> package is trapping for lahf/sahf and flagging it as an unsupported
> instruction even on processors where it is allowed. This seems like
> good practice to me because it ensures that the resulting binary will
> be more portable. So, I would not consider this a bug in Valgrind.
>

I would disagree, because those instructions will be used on CPUs that
support them. At the moment that means Intel Core 2 CPUs (and
potentially some of the high-end Opterons), but they are legal
instructions and Valgrind should handle them properly. If you use SSE3
on an older Pentium you will get failures, but that's not Valgrind's
fault.

> I chose the lahf/sahf pair to save and restore the Carry Flag because
> on the Core 2 architecture lahf/sahf can execute without stalling the
> execution pipeline... (other common methods of saving and restoring
> the CF state such as conditional moves, bit tests, and the add with
> carry instructions can create a false dependency with surrounding
> arithmetic instructions which adds latency to a critical loop).
>
> I hadn't realized that this would be a problem, but now that you've
> encountered it I'm re-thinking my design decision. I think I will go
> back and re-write that section of code so that by default it will
> produce a slower-but-more-portable instruction and then give users the
> option of compiling with the faster lahf/sahf if they want it.
>

Ok. I would always prefer performance over portability, since GMP
already has code for "lesser" CPUs.

> Thanks,
> jason
>
> On 4/6/07, Julian Seward <js...@ac...> wrote:
>>
>> > vex amd64->IR: unhandled instruction bytes: 0x9F 0x49 0x89 0xC9
>>
>> That looks like lahf/sahf. Maybe you can configure GMP not to use
>> those?
>> In the meantime please file a bug report as described at
>> http://www.valgrind.org/support/bug_reports.html
>> so this can be tracked properly.
>>
>> J
>>

Great. I would suggest that Jason write a short assembly program that
reproduces the problem. I am willing to do it myself, but the last time
I did assembly programming was in the heyday of the 68k, so I am a
little rusty. If Jason doesn't have time to file the report, I can take
the code from him, run Valgrind on it, and file the bug report myself.

One more thing: when I run configure for 3.3.0SVN on this box

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 15
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
stepping        : 6
cpu MHz         : 1596.000
cache size      : 4096 KB
physical id     : 0
siblings        : 2
core id         : 1
cpu cores       : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 10
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx lm constant_tsc pni monitor ds_cpl vmx est tm2 cx16 xtpr lahf_lm
bogomips        : 4789.13
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

I get

Primary build target: AMD64_LINUX
Secondary build target: X86_LINUX
Default supp files: xfree-3.supp xfree-4.supp glibc-2.5.supp

Does AMD64 also cover Intel's EM64T?

Cheers,

Michael
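
P.S. For anyone who wants to check whether their own box advertises lahf/sahf in 64-bit mode: on Linux the relevant entry in the "flags" line of /proc/cpuinfo is lahf_lm, which is present in the listing above. A minimal sketch of such a check follows. The function names cpu_flags and has_lahf_sahf_64 are made up for illustration, and the sample text is abbreviated from the listing above; this is not part of GMP or Valgrind.

```python
def cpu_flags(cpuinfo_text):
    """Return the set of flags from the first 'flags' line of
    /proc/cpuinfo-style text, or an empty set if there is none."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_lahf_sahf_64(cpuinfo_text):
    """True if the CPU reports lahf/sahf support in long (64-bit) mode."""
    return "lahf_lm" in cpu_flags(cpuinfo_text)


# Abbreviated copy of the cpuinfo output quoted in this message.
SAMPLE = """\
processor       : 1
vendor_id       : GenuineIntel
model name      : Intel(R) Core(TM)2 CPU 6600 @ 2.40GHz
flags           : fpu vme de pse tsc sse sse2 lm pni cx16 xtpr lahf_lm
bogomips        : 4789.13
"""

if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        # Not on Linux (or /proc unavailable): fall back to the sample.
        text = SAMPLE
    print("lahf/sahf in 64-bit mode:", has_lahf_sahf_64(text))
```

A build script could use a check like this to decide between the faster lahf/sahf path and the more portable one, though a configure-time CPUID test would be the more usual route.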