I'm using profilers to analyze program execution.
I'd like to quickly run ViennaCL, and I see the /benchmarks folder.
When I run ./dense_blas-bench-opencl, I get a bunch of GFLOPS figures for sGEMV-N, sGEMV-T, sGEMM-NN, ...
Instead of getting all those results together, can I run just one configuration (for example, only the GFLOP/s figure for sGEMV-N)?
Or, instead of the /benchmarks folder, is there another location that I can quickly evaluate?
Thanks
Last edit: Michael Huang 2016-04-05
Hi Michael,
For benchmarking dense BLAS routines, there are indeed only the dense_blas-bench-X executables, where X is "cpu", "opencl", or "cuda". If you want to evaluate just one of those routines, copy the source file and delete everything except the one routine you are interested in. It's probably not the super-elegant solution you are looking for, but it shouldn't take a lot of time :-)
Best regards,
Karli
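To illustrate what such a trimmed-down copy might look like, here is a minimal sketch that times a single GEMV-style routine and reports one GFLOP/s figure. It is not taken from the ViennaCL benchmark sources: `gemv_n`, `gemv_gflops`, and `bench_gemv_n` are hypothetical names, and the kernel is a plain CPU loop standing in for the ViennaCL call you would keep from the copied file.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the one routine you keep: y = A * x (row-major A).
// In a copy of the benchmark source, the retained ViennaCL call would go here.
void gemv_n(const std::vector<float>& A, const std::vector<float>& x,
            std::vector<float>& y, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    float acc = 0.0f;
    for (std::size_t j = 0; j < n; ++j) acc += A[i * n + j] * x[j];
    y[i] = acc;
  }
}

// An n x n matrix-vector product costs about 2*n*n floating-point operations.
double gemv_gflops(std::size_t n, double seconds_per_call) {
  return 2.0 * static_cast<double>(n) * static_cast<double>(n)
         / seconds_per_call / 1e9;
}

// Time only this one routine and return its GFLOP/s (or -1.0 on a bad result).
double bench_gemv_n(std::size_t n, int repetitions) {
  assert(n > 0 && repetitions > 0);
  std::vector<float> A(n * n, 1.0f), x(n, 1.0f), y(n, 0.0f);

  gemv_n(A, x, y, n);  // warm-up run, excluded from the timing

  auto start = std::chrono::steady_clock::now();
  for (int r = 0; r < repetitions; ++r) gemv_n(A, x, y, n);
  std::chrono::duration<double> elapsed =
      std::chrono::steady_clock::now() - start;

  // Read the result so the timed work cannot be optimized away.
  if (y[0] != static_cast<float>(n)) return -1.0;

  return gemv_gflops(n, elapsed.count() / repetitions);
}
```

Adding a `main` that prints, say, `bench_gemv_n(2048, 10)` gives a program with a single result line, which is easier to follow in a profiler than the full benchmark. When timing a GPU backend, you would also need to wait for the device to finish before stopping the clock.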