From: Robert S. <ro...@xb...> - 2004-02-22 00:12:32
Gwenole,

I think I spoke too soon. When I compiled the binary last time, I somehow got caught up in an old directory, since I tend to keep old source floating around and just rename the folders. When I realized what I had done, I went to fix it. So I did the usual steps, and this is what I get with gcc 3.2.3:

g++ -I../kpx_cpu/include -I../kpx_cpu/src -DUSE_JIT -I../include -I.
g++ -DHAVE_CONFIG_H -D_REENTRANT
g++ -DDATADIR=\"/usr/local/share/SheepShaver\" -g -O2
g++ -I/usr/X11R6/include -I/usr/include/gtk-1.2 -I/usr/include/glib-1.2
g++ -I/usr/lib/glib/include -I/usr/X11R6/include -fomit-frame-pointer
g++ -mpreferred-stack-boundary=2 -falign-functions=0 -mmmx -msse
g++ -finline-limit=10000 -fno-reorder-blocks -fno-optimize-sibling-calls
g++ -c ../kpx_cpu/src/cpu/ppc/ppc-dyngen-ops.cpp -o obj/ppc-dyngen-ops.o
In file included from ../kpx_cpu/src/cpu/ppc/ppc-dyngen-ops.cpp:1493:
/usr/lib/gcc-lib/i486-slackware-linux/3.2.3/include/xmmintrin.h: In function `void _mm_stream_pi(vector int*, vector int)':
/usr/lib/gcc-lib/i486-slackware-linux/3.2.3/include/xmmintrin.h:1036: cannot convert `vector int*' to `long long unsigned int*' for argument `1' to `void __builtin_ia32_movntq(long long unsigned int*, long long unsigned int)'
make: *** [obj/ppc-dyngen-ops.o] Error 1
root@slackrules:/home/rob/SheepShaver/src/Unix#

-----Original Message-----
From: bas...@li... [mailto:bas...@li...] On Behalf Of Gwenole Beauchesne
Sent: Friday, February 20, 2004 12:57 PM
To: bas...@li...
Subject: [B2-devel] Optimizations to AltiVec code now in CVS

Hi,

I have just committed some tentative SSE & MMX optimizations for AltiVec emulation. You need a gcc recent enough to benefit, e.g. gcc 3.2.2 with the respective intrinsics in <mmintrin.h> & <xmmintrin.h>.

This is generic code that can be improved even more. Still, the 1.8 GHz Opteron here performs at around 783 MegaFlops on AltiVec Fractal Carbon, i.e. exactly half the performance of my PBG4/400.
I really would like to reach 1+ GigaFlops in emulation. ;-) However, I doubt this will be reached with the current "JIT1" dynamic translation engine. Obviously, we can't translate all AltiVec instructions to SSE code, simply because that's not practical. This is especially true of the saturating variants, as we have to update the SAT bit and there is no extra flag in x86 land to tell us that a value saturated. As a result, we only have approx. 30 native code templates for key VMX instructions, e.g. those used in AltiVec Fractal Carbon.

I hope I haven't broken anything.

One possible next step is to use the run-time assembler I wrote last year for the new JIT infrastructure. That way, we can remove some useless instructions in the process. I don't know yet what you can expect next, but at least the CPU emulation is now in an interesting shape. ;-)

Bye,
Gwenole.

_______________________________________________
basilisk-devel mailing list
bas...@li...
https://lists.sourceforge.net/lists/listinfo/basilisk-devel
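
For readers following the remark about saturating variants: the sketch below is only an illustration of why the SAT bit costs extra work, not code from SheepShaver. It models a saturating unsigned byte add in portable C++. The names `Vec16u8`, `SatResult` and `add_u8_saturating` are hypothetical. The idea is that x86 offers a wrapping byte add (paddb) and a saturating one (paddusb), but neither reports that clamping happened, so an emulator would have to compute both results and compare them to recover a SAT flag.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical 16-lane byte vector, standing in for one AltiVec/SSE register.
using Vec16u8 = std::array<std::uint8_t, 16>;

// Hypothetical result type: the clamped lanes plus an emulated SAT flag.
struct SatResult {
    Vec16u8 value;
    bool sat;
};

// Saturating unsigned byte add that also reports whether any lane clamped.
// Scalar model only; a real JIT template would use vector instructions.
SatResult add_u8_saturating(const Vec16u8& a, const Vec16u8& b) {
    SatResult r{};  // value lanes zeroed, sat = false
    for (std::size_t i = 0; i < a.size(); ++i) {
        const unsigned wide = static_cast<unsigned>(a[i]) + static_cast<unsigned>(b[i]);
        // What a wrapping add (paddb-like) would produce.
        const std::uint8_t wrapped = static_cast<std::uint8_t>(wide);
        // What a saturating add (paddusb-like) would produce.
        const std::uint8_t clamped = (wide > 0xFFu) ? std::uint8_t{0xFF} : wrapped;
        r.value[i] = clamped;
        // No hardware flag says "this lane saturated", so compare the two results.
        if (clamped != wrapped) {
            r.sat = true;
        }
    }
    return r;
}
```

Adding 100 + 100 in every lane yields 200 with `sat` left false; changing one lane to 200 + 100 clamps that lane to 255 and sets `sat`. The extra add and compare per operation is the kind of overhead that makes a direct translation of saturating instructions less attractive.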