#122 Super slow reference add algorithms

1.3-dev
open
nobody
Signal (12)
5
2009-09-24
2009-09-24
Anonymous
No

The Add functions are extremely slow when the reference algorithm is used (on a CPU without SSE2). For example, fwsAdd_8u_Sfs is almost 50 times slower than a typical simple C implementation.
Suspected reason:
fwsAdd_8u_Sfs calls OPT_LEVEL::fe< DEF_ADD::ADD::C1::Add_8u<0> > (data, pSrc1, pSrc2, pDst, len), which eventually ends up in
ISV AddI( const TS1 * s1, CH cs1, const TS2 * s2, CH cs2, TD * d, CH cd, int scale ), which calls
FW_REF::Limits<TD>::Sat( FW_REF::Scale( (s1[0] + s2[0]), scale) );
Unfortunately, Scale() calls pow() to apply the power-of-two scale factor. It appears that the integer sum is first converted to floating point, pow() is evaluated, and the result is then converted back to an integer. This process is very slow.
Calling pow() is entirely unnecessary when scaling by a power of two, because integer shifts can do the same job. I believe the other Add functions have a similar problem.
I strongly suggest that someone fix this, so that applications do not run too slowly on CPUs that are already old and slow.
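
To illustrate the kind of fix meant here, below is a minimal sketch of a shift-based add with scaling and saturation for 8-bit unsigned data. It is not the library's code: the names AddScaleSat8u and AddRef8u are hypothetical, and the rounding (round half up on a right shift) is an assumption that may differ from what FW_REF::Scale() actually does. A positive scale is taken to mean division by 2^scale, a negative scale multiplication by 2^(-scale).

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper (not part of the library): add two 8-bit values,
// scale the sum by 2^(-scale) using integer shifts only, then saturate
// to the range 0..255.
// Assumption: round half up when scaling down; the real reference code
// may use a different rounding mode.
inline std::uint8_t AddScaleSat8u(std::uint8_t a, std::uint8_t b, int scale)
{
    int sum = static_cast<int>(a) + static_cast<int>(b);  // 0..510

    if (scale > 0) {
        if (scale > 16) return 0;                // sum is far below 2^(scale-1)
        sum = (sum + (1 << (scale - 1))) >> scale;
    } else if (scale < 0) {
        if (scale < -8) return sum ? 255 : 0;    // any nonzero sum saturates
        sum <<= -scale;                          // at most 510 << 8, fits in int
    }
    return static_cast<std::uint8_t>(std::min(sum, 255));
}

// Hypothetical reference loop over two source buffers of length len.
inline void AddRef8u(const std::uint8_t* s1, const std::uint8_t* s2,
                     std::uint8_t* d, int len, int scale)
{
    for (int i = 0; i < len; ++i)
        d[i] = AddScaleSat8u(s1[i], s2[i], scale);
}
```

Under these assumptions the inner loop contains only an add, a shift and a compare per element, with no int-to-float conversion and no call to pow(). Since scale is constant for the whole call, the branch on its sign could also be hoisted out of the loop.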
