From: Erik S. <esc...@pe...> - 2011-10-29 16:43:20
When I use clang 3.1 (a recent snapshot) to translate e.g. the fabs
intrinsic, acting on a single floating point number, the generated x86 code
looks like

    _Z4fabsf:                            # @_Z4fabsf
            movd    %xmm0, %eax
            andl    $2147483647, %eax    # imm = 0x7FFFFFFF
            movd    %eax, %xmm0
            ret

This is not optimal, since the value is moved from xmm0 to eax and back,
which is not necessary. Instead of andl, I expect to see the andss
instruction.

How do I go about having this corrected? Is this a problem in pocl, in
clang, in llvm, or in the way one of these is used?

-erik

--
Erik Schnetter <esc...@pe...>
http://www.cct.lsu.edu/~eschnett/
AIM: eschnett247, Skype: eschnett, Google Talk: sch...@gm...
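For reference: a fabsf written in this bit-manipulation style typically
compiles to exactly the movd/andl/movd sequence above, because the AND is
performed on the integer view of the float. The sketch below is only
illustrative (the function name is made up); pocl's actual builtin may be
written differently.

    #include <stdint.h>
    #include <string.h>

    /* Illustrative sketch: clear the IEEE-754 sign bit through the
       integer representation of the float. */
    static inline float fabsf_bits(float x)
    {
        uint32_t bits;
        memcpy(&bits, &x, sizeof bits);  /* bit-cast float -> uint32_t */
        bits &= 0x7FFFFFFFu;             /* drop the sign bit */
        memcpy(&x, &bits, sizeof x);     /* bit-cast back to float */
        return x;
    }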
From: Pekka J. <pek...@tu...> - 2011-10-30 11:01:44
Hi Erik,

On 10/29/2011 07:43 PM, Erik Schnetter wrote:
> How do I go about having this corrected? Is this a problem in pocl, in
> clang, in llvm, or in the way one of these is used?

I'm not familiar with the SSE instruction extensions, but quick googling
didn't return 'andss' for single floats. E.g.:
http://en.wikipedia.org/wiki/X86_instruction_listings

I see this absf implementation uses bit manipulation to reset the sign bit
of the float word to return the absolute value. Thus, in case SSE does not
have an 'and', it has to go back to the x86 instruction set to perform the
and that resets the sign bit.

If SSE has a suitable 'and', it should be able to operate directly on the
xmm register, in which case it's an LLVM instruction selection issue. In
that case, overriding the implementation with inline assembly can
circumvent the issue. Of course, the preferred way is to add a proper
'andss' to the instruction patterns on the LLVM side, if such an
instruction is available.

--
--Pekka
From: Erik S. <esc...@pe...> - 2011-10-30 22:21:53
Pekka,

You are right; andss does not exist, but there is an andps instead.

-erik
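For comparison, keeping the value in the SSE register the whole time would
look roughly like the following. This is hand-written for illustration (the
constant-pool label is hypothetical), not actual compiler output:

    _Z4fabsf:                               # hypothetical, for comparison
            andps   .LCPI0_0(%rip), %xmm0   # .LCPI0_0: sign-bit mask
            ret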
From: Pekka J. <pek...@tu...> - 2011-10-31 07:25:51
On 10/31/2011 12:21 AM, Erik Schnetter wrote:
> You are right; andss does not exist, but there is an andps instead.

It seems to be a SIMD instruction that performs the 'and' on 4 single
precision floats, whereas you are performing it on a single one.

I can understand that LLVM cannot select it automatically, as in that case
it would clobber all the other floats in the SIMD register too, and (at
least when inlined) they can contain live data. Thus, if it selected it
automatically, it would have to "spill" the other parts of the SIMD
register before doing so, which is quite costly.

However, if you are sure using ANDPS here is faster, you can generate an
inline asm that has a safe 'all ones' mask for the rest of the fields,
right?

--
--Pekka
From: Erik S. <esc...@pe...> - 2011-10-31 14:24:15
Pekka,

Yes, such an andps instruction is possible. However, I am quite certain
(though I can't guarantee it) that the other vector elements of the
respective xmm register are unused. That is at least the calling convention
for x86; of course, I don't know whether llvm does anything clever within a
routine, but looking at the generated code, this does not seem to be the
case.

I've just figured out the extended asm syntax for this.

-erik
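A minimal sketch of what such an extended-asm fabs could look like,
following Pekka's suggestion of an all-ones mask for the other lanes. This
is an assumption about the shape of the code, not the code Erik actually
wrote; the function and mask names are made up.

    #include <stdint.h>

    typedef float v4sf __attribute__((vector_size(16)));

    static inline float fabsf_andps(float x)
    {
        /* Lane 0 clears the sign bit; the other lanes are all ones, so
           the AND leaves whatever happens to sit in them untouched. */
        static const union { uint32_t u[4]; v4sf v; } mask =
            { { 0x7FFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu } };
        float r = x;
        __asm__ ("andps %1, %0" : "+x" (r) : "xm" (mask.v));
        return r;
    }

Because the operands use the 'x' register constraint rather than naming a
fixed xmm register, the compiler picks the register itself and already
treats it as written by the asm; the explicit-clobber concern discussed
below comes up when the asm pins a specific register.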
From: Pekka J. <pek...@tu...> - 2011-10-31 14:47:44
On 10/31/2011 04:24 PM, Erik Schnetter wrote:
> However, I am quite certain (though I can't guarantee it) that the other
> vector elements of the respective xmm register are unused. That is at
> least the calling convention for x86;

Yes, that might be true for calls. However, with OpenCL C kernels we want
to inline functions aggressively. In that case your asm clobber list has to
include the whole xmm register. This means that the code that precedes the
inline asm block has to save the XMM register if it uses the other elements
before entering your inline asm block.

--
Pekka
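Concretely (a hypothetical sketch, not code from this thread): if the asm
pins a specific register such as %xmm7, that register has to appear in the
clobber list, so the surrounding, possibly inlined, kernel code knows that
all four lanes may be destroyed and can save them first.

    #include <stdint.h>

    typedef float v4sf __attribute__((vector_size(16)));

    static const union { uint32_t u[4]; v4sf v; } fabs_mask =
        { { 0x7FFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu } };

    static inline float fabsf_fixed_reg(float x)
    {
        float r;
        __asm__ ("movss  %1, %%xmm7\n\t"
                 "andps  %2, %%xmm7\n\t"
                 "movss  %%xmm7, %0"
                 : "=x" (r)
                 : "x" (x), "m" (fabs_mask.v)
                 : "xmm7");              /* whole xmm7 clobbered */
        return r;
    }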
From: Erik S. <esc...@pe...> - 2011-10-31 15:37:12
The aggressive inlining (without having to program compiler intrinsics or
perform header file gymnastics) is one of the most compelling features of
OpenCL (and of an LLVM-based implementation).

Yes, I'm using the "correct" clobber specifiers, as you suggest.

-erik