GFLOPs changes with the number of iterations in a for loop

2014-02-25
  • Daniel Estrela
    2014-02-25

    Hi,

    I am running some tests with ViennaCL and decided to measure the GFLOPs of an SpMV routine.
    I use the benchmark-utils.hpp that comes with the examples.
    The code snippet below illustrates what I'm trying to do:

    viennacl::copy(std_vectorx, vcl_vectorx);
    viennacl::copy(std_matrixM, vcl_matrixM);
    
    timer.start();
    for (int i = 0; i < LOOPS; ++i)
        vcl_vectory = viennacl::linalg::prod(vcl_matrixM, vcl_vectorx);
    time_kernel = timer.get() / static_cast<double>(LOOPS);
    printOps(static_cast<double>(vcl_matrixM.nnz()) * 2.0, time_kernel);
    

    Here, vcl_matrixM is a sparse matrix in COO format and timer is a Timer object.
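
    The operation count passed to printOps can be spelled out as a small helper. This is only a sketch: spmv_gflops is a hypothetical name, not something from ViennaCL or benchmark-utils.hpp, and it assumes the time is given in seconds and that each stored nonzero costs one multiply and one add.

```cpp
#include <cassert>

// Hypothetical helper (not part of ViennaCL): GFLOPs of one sparse
// matrix-vector product, counting one multiply and one add per stored nonzero.
double spmv_gflops(double nnz, double seconds_per_product)
{
  assert(seconds_per_product > 0.0);
  const double flops = 2.0 * nnz;            // 2 operations per nonzero
  return flops / seconds_per_product / 1e9;  // operations per second, in units of 10^9
}
```

    For example, a matrix with one million nonzeros multiplied in one millisecond comes out at 2 GFLOPs. The reported figure is only as good as the time that goes into it.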

    The issue is that when the LOOPS constant is 1 the result is about 27 GFLOPs, but as I increase the number of iterations the figure changes considerably: 10 iterations give 67 GFLOPs, 100 iterations give 156 GFLOPs, 1000 iterations give 356 GFLOPs, and so on.

    I really don't know what could be happening here.

    Thanks in advance,
    Daniel Estrela

     
  • Karl Rupp
    2014-02-25

    Hi Daniel,

    All operations are asynchronous on the GPU, so your for-loop only enqueues the necessary kernels but does not wait for their completion. The timer therefore measures the time to enqueue the kernels, not the time to run them. You need to use

    viennacl::backend::finish();
    

    to wait for kernel execution completion before taking the timings. Have a look at the other benchmarks in examples/benchmarks, this is also used there.
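
    A minimal sketch of such a timing loop follows. The helper seconds_per_iteration is hypothetical (it is not part of ViennaCL or benchmark-utils.hpp) and uses std::chrono instead of the Timer class. It takes the enqueue and wait steps as callables, and it also waits once before starting the clock so that earlier work such as pending copies is not counted.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of ViennaCL): average wall-clock seconds per
// iteration of an asynchronously enqueued operation.
//   enqueue: submits one unit of work and may return before it has run
//   finish:  blocks until all previously submitted work has completed
template <typename Enqueue, typename Finish>
double seconds_per_iteration(int loops, Enqueue enqueue, Finish finish)
{
  assert(loops > 0);
  finish();  // drain earlier work (e.g. pending copies) before starting the clock
  const auto start = std::chrono::steady_clock::now();
  for (int i = 0; i < loops; ++i)
    enqueue();
  finish();  // wait for the enqueued work itself, not just its submission
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count() / loops;
}
```

    With the names from the original snippet, this might be called as

    time_kernel = seconds_per_iteration(LOOPS,
        [&] { vcl_vectory = viennacl::linalg::prod(vcl_matrixM, vcl_vectorx); },
        [] { viennacl::backend::finish(); });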

    Hope this helps :-)

    Best regards,
    Karli

     
  • Daniel Estrela
    2014-02-25

    Thank you Karli,

    I used viennacl::backend::finish() as you mentioned and everything worked as expected.

    Best regards,
    Daniel Estrela