From: Bram R. <bra...@ph...> - 2006-02-03 16:55:20
Hi all,

Immediately after sending the previous email, the -mmmx and -msse compiler options caught my attention. Then Patrick called and suggested the same. I checked the gcc documentation, and it seems that they indeed serve our purpose.

Hence, I have enabled the speed optimization on cygwin. There is no longer any cpu load difference between the win32 and the cygwin platforms.

Patrick, thanks for your input!
Developers, some tests on old hardware would be appreciated... :-)

Regards,
Bram.

--
A.K. (Bram) Riemens
Principal Scientist, DSP group, Philips Research
Office: WO-p-94, Postbox WO02
High Tech Campus 36 (WO), 5656 AE Eindhoven, The Netherlands
Tel: +31 40 27 43833, Fax: +31 40 27 44675
E-mail: bra...@ph...


Bram Riemens
Sent by: pfs...@li...
03-02-2006 17:24

To: pfs...@li...
cc: Patrick Meuwissen/EHV/RESEARCH/PHILIPS@PHILIPS, O P Gangwal/EHV/RESEARCH/PHILIPS@PHILIPS
Subject: [Pfs...@sf...] Cpfspd speed optimization on cygwin
Classification:

Hi all,

Today, I updated the cpfspd_fio.c module in the code repository. This code contains a speed-optimized memcpy function that uses assembly and intrinsics for cpu identification, data prefetching and fast copying with the SSE instruction set. Until now, this has been available on the win32 platform only, not on cygwin.

The assembly code has been streamlined for a few percent more performance (hence lower cpu load for file access functions).

Furthermore, I prepared this code for compilation with gcc. Unfortunately, it requires the -mpentium4 compiler flag. So, even though the code checks at runtime whether the cpu can execute the optimized path, I cannot compile it unless the processor is specified to the compiler.

I am reluctant to specify the -mpentium4 flag because I don't know what the compiler will do with all the other code... I do not want to distribute a cpfspd library that fails on older (pre-pentium4) systems. And I do not have such systems to test on...

Therefore, the optimized memcpy code is still not activated on cygwin.

Any suggestions? Opinions? Hints to compile without the processor flag?

Regards,
Bram.
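
For readers who want to see the pattern discussed in this thread, here is a minimal sketch in C. It is not the code in cpfspd_fio.c. It assumes gcc (or clang) on x86, with SSE enabled at compile time (for example with -msse, which defines __SSE__). The names fast_copy and cpu_has_sse are hypothetical. The runtime check uses __builtin_cpu_supports, a GCC builtin that postdates this thread; the original code does its own cpu identification in assembly.

```c
#include <stddef.h>
#include <string.h>

#if defined(__SSE__)
#include <xmmintrin.h>   /* SSE intrinsics, usable when SSE is enabled (e.g. -msse) */
#endif

/* Hypothetical helper: nonzero if the running cpu reports SSE support. */
static int cpu_has_sse(void)
{
#if defined(__SSE__) && defined(__GNUC__) && \
    (defined(__i386__) || defined(__x86_64__))
    return __builtin_cpu_supports("sse");
#else
    return 0;            /* unknown platform: use the portable path */
#endif
}

/* Hypothetical name: copy n bytes, using SSE only if the cpu supports it. */
void *fast_copy(void *dst, const void *src, size_t n)
{
#if defined(__SSE__)
    if (cpu_has_sse()) {
        unsigned char *d = (unsigned char *)dst;
        const unsigned char *s = (const unsigned char *)src;

        while (n >= 16) {
            if (n > 64) {
                /* Hint that data 64 bytes ahead will be read soon. */
                _mm_prefetch((const char *)(s + 64), _MM_HINT_NTA);
            }
            /* Unaligned 16-byte load and store. */
            _mm_storeu_ps((float *)d, _mm_loadu_ps((const float *)s));
            d += 16;
            s += 16;
            n -= 16;
        }
        memcpy(d, s, n);  /* remaining tail of fewer than 16 bytes */
        return dst;
    }
#endif
    return memcpy(dst, src, n);   /* fallback for cpus without SSE */
}
```

The point of the sketch is that the instruction set is enabled for compilation, while the decision to execute the SSE path is still made at runtime. Whether the compiler also emits SSE instructions elsewhere in a file built this way depends on the compiler version and options, so testing on older hardware remains useful.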