From: Erik S. <esc...@pe...> - 2011-10-29 16:43:20
When I use clang 3.1 (a recent snapshot) to translate e.g. the fabs intrinsic, acting on a single floating-point number, the generated x86 code looks like

    _Z4fabsf:                           # @_Z4fabsf
        movd    %xmm0, %eax
        andl    $2147483647, %eax       # imm = 0x7FFFFFFF
        movd    %eax, %xmm0
        ret

This is not optimal, since the value is moved from xmm0 to eax and back, which is not necessary. Instead of andl, I expect to see the andss instruction.

How do I go about having this corrected? Is this a problem in pocl, in clang, in llvm, or in the way one of these is used?

-erik

--
Erik Schnetter <esc...@pe...>
http://www.cct.lsu.edu/~eschnett/
AIM: eschnett247, Skype: eschnett, Google Talk: sch...@gm...