[Algorithms] Optimized math with template expressions
From: Melax, S. <me...@ea...> - 2002-04-01 20:51:23
Has anyone out there made a serious effort to utilize template expressions (template metaprogramming) to improve performance for production code? There's an introductory article at:

http://www.flipcode.com/tutorials/tut_fastmath.shtml

Going from theory to practice, there are many issues to consider...

The general approach is useful for saving register space with a "generic fpu" but seems less useful with aligned 4D data on a processor that has 128-bit registers. For unaligned data (3D vecs and mats) there clearly is a performance boost (as is shown in the above reference). But if you don't have the latest VC6 service packs, it backfires and things run slower. The latest version of VC6 (with the patches and service packs) works fine. Later versions of the compiler (such as the one used to compile for a particular console) also work. But even without template expressions, this later compiler seems to be more aggressive with its optimizations anyway. It reorders any non-dependent floating point operations, which leaves less room for performance improvements from the template-expression technique.

Previous work on template expressions focuses only on reducing temporary storage to improve performance. Has anyone considered using template metaprogramming to reduce stalls in other ways? For example, taking an expression such as:

    SomeType a,b,c,d,r;
    r = a*b*c*d;

and turning it into:

    r = (a*b) * (c*d)

which, as you likely know already, runs a few cycles faster. In particular, 3 cycles on a PC, if SomeType is a 1x1 matrix (float). But no compiler could automatically make that rearrangement since it would violate the language standard - even if the user didn't mind.

All the research that I've seen on template expressions on the web is dated year 2000 or earlier. Is this still an active area of research? Or is it pointless?

Most of the research on template metaprogramming has been done using GCC. Yet I haven't had any success with that compiler so far (ps2 or pc).
Does anyone know how to get GCC working with this? Which version is required?

Yes, I know that you can just edit the code to optimize it. But humour me anyway. I have this romantic (but perhaps impractical) notion in my head that wants me to optimize code without touching it, and possibly even optimize code that hasn't been written yet.

Disclaimer: Note that I am not here to suggest that people try template expressions or any other metaprogramming technique. You're probably better off just concentrating on making your specific game fast and fun, even if that forces you to write a bit of ugly looking code :-)

I was only wondering if anyone out there has any practical experience with template expressions that they are willing to share. Feel free to email direct.

Thanks
stan
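
For concreteness, here is a minimal sketch of the rearrangement asked about above, where r = a*b*c*d is evaluated as (a*b) * (c*d). It assumes SomeType is a thin wrapper around a float; the names Scalar, Chain and Balanced are hypothetical and not taken from any library or from the article linked above. operator* only records the factors, and the multiplies happen on assignment, split into two independent halves. Whether this actually saves cycles depends on the compiler and target, so treat it as an illustration rather than a measured optimization.

```cpp
#include <cstddef>

// Hypothetical names throughout (Balanced, Chain, Scalar).

// Multiplies Len consecutive floats as a balanced tree. The left and
// right halves do not depend on each other, so their multiplies can
// overlap instead of forming one serial chain.
template <std::size_t Len>
struct Balanced {
    static float eval(const float* f) {
        return Balanced<Len / 2>::eval(f) *
               Balanced<Len - Len / 2>::eval(f + Len / 2);
    }
};

// A single factor needs no multiply.
template <>
struct Balanced<1> {
    static float eval(const float* f) { return f[0]; }
};

// A deferred product of N factors. Nothing is multiplied until
// value() is called.
template <std::size_t N>
struct Chain {
    float f[N];
    float value() const { return Balanced<N>::eval(f); }
};

// Stand-in for SomeType in the 1x1 (float) case.
struct Scalar {
    float v;
    Scalar(float x = 0.0f) : v(x) {}

    // Assigning a chain is the point where evaluation happens.
    template <std::size_t N>
    Scalar& operator=(const Chain<N>& c) {
        v = c.value();
        return *this;
    }
};

// a * b starts a chain of two factors.
inline Chain<2> operator*(const Scalar& a, const Scalar& b) {
    Chain<2> c;
    c.f[0] = a.v;
    c.f[1] = b.v;
    return c;
}

// chain * s appends one more factor without multiplying yet.
template <std::size_t N>
Chain<N + 1> operator*(const Chain<N>& lhs, const Scalar& s) {
    Chain<N + 1> out;
    for (std::size_t i = 0; i < N; ++i) out.f[i] = lhs.f[i];
    out.f[N] = s.v;
    return out;
}
```

With these definitions, the unmodified line r = a*b*c*d; builds a Chain<4> and the assignment computes (a*b) * (c*d). Because floating point multiplication is not associative, the result can differ from the left-to-right product in the last bits, which is exactly why a conforming compiler will not do this on its own.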