IT++ / Discussion / Open Discussion: ldexp in pow2 implementation

andy_panov - 2013-01-31

Why don't we use ldexp in the pow2 implementation? It should be a faster and
more accurate solution than std::pow.

Last edit: andy_panov 2013-01-31
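
A minimal sketch of the accuracy half of this claim, assuming nothing about the actual IT++ pow2 code (the helper names below are hypothetical). ldexp(1.0, k) only sets the binary exponent, so the result is exact whenever 2^k is representable as a double. The sketch says nothing about speed, which has to be measured.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of IT++): 2^k computed with ldexp.
// ldexp(1.0, k) scales 1.0 by 2^k, so no rounding is involved as long
// as the result is representable as a double.
inline double pow2_ldexp(int k)
{
  return std::ldexp(1.0, k);
}

// Compares pow2_ldexp against an exact integer shift for 0 <= k <= 62.
inline bool pow2_ldexp_matches_shift(int k)
{
  assert(k >= 0 && k <= 62);
  const std::uint64_t exact = std::uint64_t(1) << k;
  return pow2_ldexp(k) == static_cast<double>(exact);
}
```

Calling pow2_ldexp_matches_shift(k) for every k from 0 to 62 is a quick way to check the exactness on a given platform.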


Bogdan Cristea - 2013-02-01

Please provide some results (especially ones proving that your implementation is faster).


andy_panov - 2013-02-01

I've tried to run the following code on my machine with MSVC:

~~~~~~~~~~~~~~~~~~~~
#include <cmath>
#include <vector>
#include <iostream>
#include "itpp/base/timing.h"

using namespace itpp;

int main()
{
  double res;

  // Time 500 x 100000 calls to pow(2.0, k), with k cycling through 0..50
  itpp::tic();
  std::vector<double> pow_out(500);
  for (int j = 0; j < 500; ++j)
  {
    res = 0.0;
    for (int i = 0, k = 0; i < 100000; ++i, ++k)
    {
      if (k > 50) k = 0;
      res += pow(2.0, k);
    }
    pow_out[j] = res;
  }
  std::cout << "res=" << pow_out[499] << std::endl;
  itpp::toc_print();

  // Same loop with ldexp(1.0, k)
  itpp::tic();
  std::vector<double> ldexp_out(500);
  for (int j = 0; j < 500; ++j)
  {
    res = 0.0;
    for (int i = 0, k = 0; i < 100000; ++i, ++k)
    {
      if (k > 50) k = 0;
      res += ldexp(1.0, k);
    }
    ldexp_out[j] = res;
  }
  std::cout << "res=" << ldexp_out[499] << std::endl;
  itpp::toc_print();
  return 0;
}
~~~~~~~~~~~~~~~~~~~~

and was quite surprised by the results:

res=4.41353e+018
Elapsed time = 0.858002 seconds
res=4.41353e+018
Elapsed time = 2.6364 seconds

It means that several multiplications of doubles are 3 times faster than just adjusting the exponent of a double!

Clearly, I was wrong in my initial statement.
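
As a sanity check on the printed res value, the inner-loop sum can be reproduced exactly with integer arithmetic. The sketch below assumes only the loop structure of the benchmark; the function name is hypothetical. k cycles through 0..50 (51 values), so 100000 iterations give 1960 full cycles plus 40 extra terms, i.e. 1960 * (2^51 - 1) + (2^40 - 1), which is approximately 4.41353e18 and agrees with both printed results.

```cpp
#include <cstdint>

// Hypothetical helper: exact value of one inner-loop sum from the benchmark,
// accumulated in a 64-bit unsigned integer instead of a double.
inline std::uint64_t expected_benchmark_sum()
{
  std::uint64_t sum = 0;
  for (int i = 0, k = 0; i < 100000; ++i, ++k) {
    if (k > 50) k = 0;              // same wrap-around as in the benchmark
    sum += std::uint64_t(1) << k;   // exact 2^k
  }
  return sum;
}
```

The benchmark accumulates in a double, so low-order bits are rounded away along the way, but the six printed digits are unaffected.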

andy_panov - 2013-02-01

PS: Someone in the following thread http://stackoverflow.com/questions/7720668/fast-multiplication-division-by-2-for-floats-and-doubles-c-c indicates that VC11 vectorizes loops over doubles using SSE2, so others may obtain the opposite results with compilers that still use the FPU for such code.


Stephan Ludwig - 2013-02-01

Hi andy_panov,

I am not an expert in this topic at all, but I found the thread you mentioned as well and was a bit confused. I have nothing substantial to say on that topic, but I can provide some measurements and a small remark.
Hmmm, I think you either misinterpreted your results or swapped the lines of your output:

Tell me if I am wrong, but according to your code above, the first two lines of output should be for pow (fast) and the last two for ldexp (slow).

Anyway, here are my results for comparison (Ubuntu 11.04 natty 64 bit with g++ (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2).
I swapped the tic (toc) lines with the memory allocation (std::cout) lines, because we do not want to measure those effects, do we?

Elapsed time = 5.0531 seconds
pow res=4.41353e+18
Elapsed time = 0.77639 seconds
ldexp res=4.41353e+18

/Stephan
PS: code for reference, compiled with
g++ itpp-config --cflags ldexp_test.cpp -o ldexp_test itpp-config --libs

ldexp_test.cpp
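
The attached ldexp_test.cpp is not reproduced in the thread. The following is only a rough sketch of the reordering described above, under the assumption that the goal is to time nothing but the arithmetic loop. It uses std::chrono in place of itpp::tic()/toc_print() to stay free of the IT++ dependency, and the function name is hypothetical.

```cpp
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs the ldexp loop and returns the elapsed seconds.
// The output vector is allocated by the caller before the clock starts, and
// printing is left to the caller after the clock stops.
inline double time_ldexp_loop(std::vector<double>& out)
{
  const auto start = std::chrono::steady_clock::now();
  for (std::size_t j = 0; j < out.size(); ++j) {
    double res = 0.0;
    for (int i = 0, k = 0; i < 100000; ++i, ++k) {
      if (k > 50) k = 0;
      res += std::ldexp(1.0, k);
    }
    out[j] = res;
  }
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

A caller would create std::vector<double> out(500), call time_ldexp_loop(out), and only then print out[499] and the returned time.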

- Stephan Ludwig - 2013-02-01
  
  The compile command, of course, requires the usual backticks before/after each itpp-config --<option>.
  
  The results above were for an Intel(R) Core(TM) i3 CPU M 350 @ 2.27GHz.
  
  Further results on an i7-2600 @ 3.4GHz, Ubuntu 11.10 oneiric 64 bit with g++ (Ubuntu/Linaro 4.6.1-9ubuntu3) 4.6.1:
  Elapsed time = 2.91351 seconds
  pow res=4.41353e+18
  Elapsed time = 0.496346 seconds
  ldexp res=4.41353e+18
  

andy_panov - 2013-02-01

Hi Stephan,

I still get the following results with your ldexp_test.cpp (msvc, vc11):

Elapsed time = 0.889201 seconds
pow res=4.41353e+018
Elapsed time = 2.6364 seconds
ldexp res=4.41353e+018

Allocation should not affect the whole picture much since it is done only once.

I do not have an explanation. I am confused by it as well.

Last edit: andy_panov 2013-02-01


andy_panov - 2013-02-01

The MSVC run-time uses a divide-and-conquer algorithm based on multiplications to compute integer powers (about log2 N multiplications are used). I do not know what happens inside the Microsoft implementation of ldexp.
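
For readers unfamiliar with the technique, here is a minimal sketch of divide-and-conquer exponentiation (exponentiation by squaring). It is illustrative only and is not the MSVC run-time source; the function name is hypothetical.

```cpp
// Hypothetical helper: computes base^n for a non-negative integer n
// using a number of multiplications proportional to log2(n).
inline double pow_by_squaring(double base, unsigned n)
{
  double result = 1.0;
  while (n != 0) {
    if (n & 1u) result *= base;  // current exponent bit is set
    base *= base;                // base, base^2, base^4, ...
    n >>= 1;                     // move to the next exponent bit
  }
  return result;
}
```

For an exponent of 50 the loop runs six times, which is consistent with pow being cheap for the small integer exponents used in the benchmark.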


andy_panov - 2013-02-02

I feel we need more testing. Can anyone run the benchmark with Stephan's modifications and report the results? Both Linux and Windows results would be highly appreciated.


Bogdan Cristea - 2013-02-02

These are my results (openSUSE 12.2, x86_64 with ACML)
pow
Elapsed time = 6.87597 seconds
res=4.41353e+18
ldexp
Elapsed time = 0.98343 seconds
res=4.41353e+18


Bogdan Cristea - 2013-02-02

And on Windows 8 x86_64 (Visual Studio 2010, ACML, Release mode, x64)
pow
Elapsed time = 0.352236 seconds
res=4.41353e+18
ldexp
Elapsed time = 1.3519 seconds
res=4.41353e+18

Although in Debug mode I get faster times with ldexp.


andy_panov - 2013-02-02

Hi Bogdan,

Thank you for the results. I also noticed that ldexp is faster in Debug mode. Based on the test results (thank you, Stephan!), Linux users should benefit from switching to ldexp, but we should stick with the current implementation on Windows. I can implement this in a compiler-dependent way, with #ifdefs inside the pow2 implementation. I can proceed and provide a patch if no one has any objections.

Andy
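
A rough sketch of what such a compiler-dependent switch could look like. The function name pow2_int and the use of _MSC_VER as the switch are assumptions made for illustration; this is not the patch that was proposed for IT++.

```cpp
#include <cmath>

// Hypothetical sketch of a compiler-dependent 2^k for integer k.
inline double pow2_int(int k)
{
#if defined(_MSC_VER)
  // The MSVC measurements in this thread were faster with pow.
  return std::pow(2.0, k);
#else
  // The g++ measurements in this thread were faster with ldexp.
  return std::ldexp(1.0, k);
#endif
}
```

Keying on the compiler rather than the operating system matches the measurements above, which differ between MSVC and g++ builds.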
