From: Nikos P. <ni...@gm...> - 2014-07-15 08:04:02
Hi,

I have two problems with the OpenCL dense matrix product benchmark example (blas3.cpp):

1. On my GPU (ATI 6950), I get 285 GFLOPS in single precision. Isn't that a little low, considering that my GPU has a theoretical maximum of 2 TFLOPS? Can I do anything to boost performance?

2. I cannot make the matrices larger than 6000x6000 (BLAS3_MATRIX_SIZE). Every time I do so, the program crashes.

My development environment is Win8.1 64-bit with the AMD APP 2.9 SDK installed. The benchmark was built with version 1.5.2 using Visual Studio 2013 and MinGW gcc 4.8.2. Both compilers produced similar results.

Best regards,
Nikos Papahristou