From: Pekka J. <pek...@tu...> - 2011-10-31 07:25:51
On 10/31/2011 12:21 AM, Erik Schnetter wrote:
> You are right; andss does not exist, but there is an andps instead.

It seems to be a SIMD instruction that performs a bitwise 'and' on 4 single-precision floats at once, and you are applying it to a single one. I can understand that LLVM cannot select it automatically: in that case it would clobber all the other floats in the SIMD register too, and (at least when inlined) they can contain live data. Thus, if it selected it automatically, it would have to "spill" the other parts of the SIMD register before doing that, which is quite costly.

However, if you are sure that using ANDPS here is faster, you can generate an inline asm that has a safe 'all ones' mask for the rest of the fields, right?

--
--Pekka
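
A minimal sketch of the 'all ones' mask idea, assuming GCC or Clang extended inline asm on an x86 target with SSE. The helper name and_lane0 is hypothetical and not something from pocl or LLVM; it only illustrates that ANDPS with 0xFFFFFFFF in lanes 1..3 leaves those lanes untouched while lane 0 gets the real mask.

```c
#include <stdint.h>
#include <string.h>
#include <xmmintrin.h>  /* SSE: __m128, _mm_set_ps, _mm_storeu_ps */

/* Hypothetical helper: apply mask0 to lane 0 only, keep lanes 1..3 as-is. */
static __m128 and_lane0(__m128 v, uint32_t mask0)
{
    /* Lane 0 carries the real mask; lanes 1..3 are all ones, so the
       bitwise AND returns their original bit patterns unchanged. */
    const uint32_t bits[4] = { mask0, 0xFFFFFFFFu, 0xFFFFFFFFu, 0xFFFFFFFFu };
    __m128 m;
    memcpy(&m, bits, sizeof m);  /* reinterpret the bits as 4 floats */

    /* AT&T operand order: andps src, dst  =>  dst = dst & src.
       "x" asks the compiler for an SSE register. */
    __asm__("andps %1, %0" : "+x"(v) : "x"(m));
    return v;
}
```

For example, calling and_lane0(v, 0x7FFFFFFF) clears the sign bit of the first float (a scalar fabs) and should leave the other three floats in the register as they were. Whether this is actually faster than what LLVM generates on its own would need to be measured.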