From: Matthias S. <op...@vr...> - 2007-09-29 09:49:10
Hi,

> > Hard to say. We are using AQTime (nice tool by the way) these days
> > to find out what is going on.
>
> But their pricing sucks, charging 3 times more for a floating license
> is ridiculous.

I wouldn't say Intel's pricing is much better, but since the base price
of AQTime isn't that high.

> > Especially the inlines are very surprising
> > (read frustrating). Most of them are completely ignored EVEN if you
> > use __forceinline. AFAIK inline is something like a hint for a
> > compiler.
>
> inline and __inline in itself yes. But usually there must be a way of
> forcing the compiler to inline. Otherwise the compiler is unusable
> for anything that comes even close to high performance. It's a tool
> so it should never ever have a big impact on the design as long as
> the design is within the language specification, especially if other
> tools prove that it is possible. And it should always listen to the
> programmer who hopefully knows what he/she is doing ;-)

Totally agreed. But, can you still remember the times when the
"tuning-community" discovered the register keyword within C?

> > > > > Especially using templates in basic math classes like
> > > > > Vec3f/Pnt3f
> > > > > and so on is probably the worst thing one can do since some
> > > > > functions in OpenSG (like operator+ for Pnt3f) don't get
> > > > > inlined. Not to mention: you lose some chances for
> > > > > optimizations (SIMD for example could be pretty difficult to
> > > > > use in templates when you want to use the same code for
> > > > > Vec2f/Vec3f/Vec4f).
> > >
> > > This is only half an argument as you can (and always have to)
> > > specialise the SIMD part as there are platforms which do not
> > > support it, e.g. ia64 or ppc platforms. Furthermore in order to
> > > use SIMD for non 128 bit structures you have to
> > > specialise/reorganise on an even higher level like having a
> > > specialised MFVec2f which can handle two two-value vectors
> > > combined.
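To make the point about specialising only the SIMD part concrete, here is a minimal sketch. It assumes a hypothetical class VecSketch (not OpenSG's actual Vec3f/Vec4f) and uses SSE intrinsics from <xmmintrin.h> only where the compiler reports SSE support; on other platforms the generic template is used unchanged. The macro SKETCH_HAS_SSE is likewise an invented name.

```cpp
#include <cstddef>

// Assumption: these predefined macros are used to detect SSE support.
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical stand-in for a templated math class; not OpenSG code.
template <typename T, std::size_t N>
struct VecSketch {
    T v[N];
};

// Generic element-wise addition shared by every element type and size.
template <typename T, std::size_t N>
inline VecSketch<T, N> operator+(const VecSketch<T, N>& a,
                                 const VecSketch<T, N>& b) {
    VecSketch<T, N> r;
    for (std::size_t i = 0; i < N; ++i) {
        r.v[i] = a.v[i] + b.v[i];
    }
    return r;
}

#if SKETCH_HAS_SSE
// Explicit specialisation for four floats (exactly 128 bits).
// Unaligned loads/stores are used so no alignment is assumed.
template <>
inline VecSketch<float, 4> operator+(const VecSketch<float, 4>& a,
                                     const VecSketch<float, 4>& b) {
    VecSketch<float, 4> r;
    _mm_storeu_ps(r.v, _mm_add_ps(_mm_loadu_ps(a.v), _mm_loadu_ps(b.v)));
    return r;
}
#endif
```

Only the 4-float case is rewritten; VecSketch<float, 2> and VecSketch<float, 3> keep using the generic loop, which is where a higher-level reorganisation (packing two 2-value vectors together) would have to come in.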
> >
> > Of course you can do template specializations but well then you are
> > going to rewrite most of them anyway :(
>
> not really, comparing the size of a completely (e.g. type, size and
> simd/altivec) unrolled implementation against a templated/specialised
> one I would still say the templated version is smaller. And more
> important if you have to fix a bug you don't have to trace it through
> n unrolled versions.
>
> > Implementing the mentioned
> > specialized 2D case is only useful for very few people I think.
>
> Not unlikely, except if you do a lot of texture coordinate
> computation on the cpu.

Not a common scenario I suppose.

> > > > > I finally got used to using my own math classes and do a simple
> > > > > reinterpret cast on values stored in multifields - got me
> > > > > quite some speedup since the compiler does correct inlining.
> > > >
> > > > Do you have some benchmark results you can share? If there is a
> > > > significant impact we need to change things...
> > >
> > > Before we start changing code and design, I would like to verify
> > > if this is a MSVC only problem. If it is I'm tempted to say bad
> > > luck, if they still don't know how to build a proper compiler. I
> > > mean they seem to have managed to add ipo and pgo. Anyway usually
> > > there should be a way to force the compiler to inline functions,
> > > something like __forceinline. So before changing code I would go
> > > for that.
> >
> > Sorry, doesn't work.
>
> Can anyone supply some concrete examples, e.g. which function and the
> calling place. I really would like to verify it.

We will provide one next week.

Regards
Matthias

-- 
+---------------------+----------------------------+
| VREC GmbH           |                            |
| Matthias Stiller    |                            |
| Robert-Bosch-Str. 7 | tel: +49 6151 4921034      |
| 64293 Darmstadt     | web: http://www.vrec.de    |
| Germany             | mail: ms...@vr...          |
+---------------------+----------------------------+
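For anyone wanting to build such a test case in the meantime, a minimal sketch of a function plus calling place might look like the following. Everything here is hypothetical (Pnt3Sketch, sumPoints and the SKETCH_FORCE_INLINE macro are invented for illustration and are not OpenSG code); the macro simply maps to __forceinline on MSVC and to the always_inline attribute on GCC-compatible compilers.

```cpp
#include <cstddef>

// Hypothetical portability macro for requesting forced inlining.
#if defined(_MSC_VER)
#define SKETCH_FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define SKETCH_FORCE_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_FORCE_INLINE inline  // plain hint on unknown compilers
#endif

// Hypothetical stand-in for a 3D point class; not OpenSG's Pnt3f.
struct Pnt3Sketch {
    float x, y, z;
};

// The small operator one would expect to be inlined at every call site.
SKETCH_FORCE_INLINE Pnt3Sketch operator+(const Pnt3Sketch& a,
                                         const Pnt3Sketch& b) {
    return Pnt3Sketch{a.x + b.x, a.y + b.y, a.z + b.z};
}

// The "calling place": a loop whose body should contain no call to operator+
// in the generated code if inlining worked.
Pnt3Sketch sumPoints(const Pnt3Sketch* pts, std::size_t n) {
    Pnt3Sketch acc{0.0f, 0.0f, 0.0f};
    for (std::size_t i = 0; i < n; ++i) {
        acc = acc + pts[i];
    }
    return acc;
}
```

Whether the operator was really inlined can only be checked in the generated code, e.g. by looking at an assembly listing of sumPoints (the /FA option on MSVC, -S on GCC). As far as I know, MSVC can also report a __forceinline function that it did not inline via warning C4714, which might help narrow down the cases discussed above.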