From: Bram R. <bra...@ph...> - 2006-02-03 16:26:10
Hi all,

Today I updated the cpfspd_fio.c module in the code repository. This module contains a speed-optimized memcpy function that uses assembly and intrinsics for CPU identification, data prefetching, and fast copying with the SSE instruction set. Until now, it has been available on the win32 platform only, not on cygwin. The assembly code has been streamlined for a few percent more performance (hence lower CPU load for the file access functions).

Furthermore, I prepared this code for gcc compilation. Unfortunately, it requires the -mpentium4 compiler flag. So even though the code checks the CPU at runtime to see whether the optimized path can be executed, I cannot compile it unless the processor is specified to the compiler. I am reluctant to specify the -mpentium4 flag because I don't know what the compiler will do with all the other code. I do not want to distribute a cpfspd library that fails on older (pre-Pentium 4) systems, and I do not have such systems to test on.

Therefore, the optimized memcpy code is still not activated on cygwin.

Any suggestions? Opinions? Hints for compiling without the processor flag?

Regards,
Bram.

--
A.K. (Bram) Riemens
Principal Scientist, DSP group, Philips Research
Office: WO-p-94, Postbox WO02
High Tech Campus 36 (WO), 5656 AE Eindhoven, The Netherlands
Tel: +31 40 27 43833, Fax: +31 40 27 44675
E-mail: bra...@ph...
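One possible direction for the question above, offered as a sketch rather than as the cpfspd implementation: recent GCC and Clang releases allow SSE code generation to be enabled for a single function through the `target` function attribute, and they provide `__builtin_cpu_supports` for the runtime check. The rest of the file is then compiled for the default processor, so no global processor flag is needed. The names `copy_sse` and `fast_memcpy` are hypothetical and do not come from cpfspd_fio.c; the sketch also assumes a compiler new enough to support both features and to allow including `<xmmintrin.h>` without `-msse`, and it omits prefetching.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <xmmintrin.h>
#define HAVE_SSE_PATH 1

/* Hypothetical helper: only this function is compiled with SSE enabled.
 * All other code in the file keeps the default target processor. */
__attribute__((target("sse")))
static void copy_sse(void *dst, const void *src, size_t n)
{
    unsigned char *d = (unsigned char *)dst;
    const unsigned char *s = (const unsigned char *)src;
    size_t i = 0;

    /* Copy 16 bytes per iteration using unaligned SSE loads and stores. */
    for (; i + 16 <= n; i += 16) {
        __m128 v = _mm_loadu_ps((const float *)(s + i));
        _mm_storeu_ps((float *)(d + i), v);
    }

    /* Copy the remaining 0..15 bytes with the standard routine. */
    memcpy(d + i, s + i, n - i);
}
#endif

/* Hypothetical dispatcher: takes the SSE path only if the running CPU
 * reports SSE support, and falls back to plain memcpy otherwise. */
void *fast_memcpy(void *dst, const void *src, size_t n)
{
    assert(n == 0 || (dst != NULL && src != NULL));
#ifdef HAVE_SSE_PATH
    if (__builtin_cpu_supports("sse")) {
        copy_sse(dst, src, n);
        return dst;
    }
#endif
    return memcpy(dst, src, n);
}
```

With such a layout, the file would be built without any processor option, and a call like `fast_memcpy(buf_out, buf_in, len)` would pick the path at runtime. On compilers that lack the `target` attribute, a comparable effect can be had by moving the SSE routine into its own source file and passing `-msse` for that file only.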