From: Andrew P. <ap...@ou...> - 2016-09-14 20:30:04
|
Hello, I've been getting a CL_OUT_OF_RESOURCES error when I try to do (something like) the following with an OpenCL context in a unit test:

viennacl::matrix<double, viennacl::row_major> mxA;
viennacl::vector<double> vecB;
// add some data to both mxA and vecB
vecB = viennacl::linalg::prod(mxA, vecB);

This seems right, and everything works when using an OpenMP context, but when I try to read the data off of the GPU (within an OpenCL context) using backend::memory_read, I get the CL_OUT_OF_RESOURCES error. If I don't make the backend::memory_read call, that test will pass, but my next unit test, a matrix * matrix test, will fail. Does the vector product or memory_read seem wrong to you? Thanks, Andy |
From: Karl R. <ru...@iu...> - 2016-09-08 18:44:31
|
Hi Charles, lu_factorize factors the matrix A into a lower triangular matrix L (with unit diagonal) and an upper triangular matrix U. The values in A are overwritten with these values. If you want to obtain the inverse, you have to call viennacl::linalg::lu_substitute(vcl_A, vcl_B); where vcl_B is the unit matrix. The inverse will be then stored in vcl_B. Best regards, Karli On 09/08/2016 07:45 PM, Charles Determan wrote: > I am trying to calculate the inverse of a matrix taking the advice from > a previous post > (https://sourceforge.net/p/viennacl/discussion/1143678/thread/ba394d35/) > suggesting the use of LU factorization. So I do the following: > > I have vcl_A matrix > > viennacl::vector<T> vcl_lu_rhs(vcl_A.size1()); > > // solution of a full system right into the load vector vcl_rhs: > viennacl::linalg::lu_factorize(vcl_A); > viennacl::linalg::lu_substitute(vcl_A, vcl_lu_rhs); > > std::cout << "matrix A" << std::endl; > std::cout << vcl_A << std::endl; > > std::cout << "vector" << std::endl; > std::cout << vcl_lu_rhs << std::endl; > > However, neither of these outputs is remotely close to the output I > expect to see for the inverse of a matrix. In R the output would be: > >># mat = vcl_A >> mat > [,1] [,2] [,3] [,4] > [1,] -1.0099356 0.19566691 0.47349181 2.2673060 > [2,] -0.7398383 0.81302435 0.34390506 0.4029221 > [3,] 1.0020811 -0.06548085 -0.09373213 0.3257177 > [4,] 1.1549178 0.87441621 1.53483119 -0.5862660 >> solve(mat) > [,1] [,2] [,3] [,4] > [1,] -0.09714492 0.004242602 0.8125165 0.07863873 > [2,] -0.45207626 1.583809306 0.8987234 -0.16052995 > [3,] 0.46074427 -0.886299611 -0.9398235 0.65059443 > [4,] 0.34057494 0.050298145 0.4806311 -0.08698459 > > but with the above viennacl code I see: > > matrix A > [4,4]((-1.00994,0.195667,0.473492,2.26731),(0.73256,0.669687,-0.00295601,-1.25802),(-0.992223,0.192126,0.376645,2.81709),(-1.14356,1.63983,5.52547,-11.4963)) > vector > [4](-0,0,0,-0) > > Did I miss something here? 
> > Thanks, > Charles > > > ------------------------------------------------------------------------------ > > > > _______________________________________________ > ViennaCL-devel mailing list > Vie...@li... > https://lists.sourceforge.net/lists/listinfo/viennacl-devel > |
From: Charles D. <cde...@gm...> - 2016-09-08 17:45:28
|
I am trying to calculate the inverse of a matrix taking the advice from a previous post ( https://sourceforge.net/p/viennacl/discussion/1143678/thread/ba394d35/) suggesting the use of LU factorization. So I do the following: I have vcl_A matrix viennacl::vector<T> vcl_lu_rhs(vcl_A.size1()); // solution of a full system right into the load vector vcl_rhs: viennacl::linalg::lu_factorize(vcl_A); viennacl::linalg::lu_substitute(vcl_A, vcl_lu_rhs); std::cout << "matrix A" << std::endl; std::cout << vcl_A << std::endl; std::cout << "vector" << std::endl; std::cout << vcl_lu_rhs << std::endl; However, neither of these outputs is remotely close to the output I expect to see for the inverse of a matrix. In R the output would be: ># mat = vcl_A > mat [,1] [,2] [,3] [,4] [1,] -1.0099356 0.19566691 0.47349181 2.2673060 [2,] -0.7398383 0.81302435 0.34390506 0.4029221 [3,] 1.0020811 -0.06548085 -0.09373213 0.3257177 [4,] 1.1549178 0.87441621 1.53483119 -0.5862660 > solve(mat) [,1] [,2] [,3] [,4] [1,] -0.09714492 0.004242602 0.8125165 0.07863873 [2,] -0.45207626 1.583809306 0.8987234 -0.16052995 [3,] 0.46074427 -0.886299611 -0.9398235 0.65059443 [4,] 0.34057494 0.050298145 0.4806311 -0.08698459 but with the above viennacl code I see: matrix A [4,4]((-1.00994,0.195667,0.473492,2.26731),(0.73256,0.669687,-0.00295601,-1.25802),(-0.992223,0.192126,0.376645,2.81709),(-1.14356,1.63983,5.52547,-11.4963)) vector [4](-0,0,0,-0) Did I miss something here? Thanks, Charles |
From: Charles D. <cde...@gm...> - 2016-08-18 13:38:12
|
Karl, I have solved the problem. It was how I was pulling the command queue. Previously I was pulling from the context via ocl: cl_command_queue queue = viennacl::ocl::current_context().get_queue().handle().get(); Instead, I needed to pull directly from the matrix object: cl_command_queue queue = vcl_A.handle().opencl_handle().context().get_queue().handle().get(); The program appears to run without problem now. Regards, Charles On Wed, Aug 17, 2016 at 3:18 PM, Charles Determan <cde...@gm...> wrote: > Ok, thanks, I will try asking those with clBLAS (I haven't started messing > with clMAGMA yet until I can get the clBLAS interface working). I will > report back if I manage to solve the problem. > > Regards, > Charles > > On Wed, Aug 17, 2016 at 3:16 PM, Karl Rupp <ru...@iu...> wrote: > >> >> Running again and printing the cl_mem objects (from >>> opencl_handle().get()) it appears that the relevant addresses (last >>> three lines) are being used in multiple areas such as 'Writing data'. >>> >> >> this may be the case, yes. >> >> trying dgemm >>> 0xa84df00 >>> 0xa8b4c40 >>> 0xa8b5990 >>> >> >> These are the correct handles. I don't know why clMAGMA complains. >> >> Best regards, >> Karli >> >> > |

From: Charles D. <cde...@gm...> - 2016-08-17 20:18:26
|
Ok, thanks, I will try asking those with clBLAS (I haven't started messing with clMAGMA yet until I can get the clBLAS interface working). I will report back if I manage to solve the problem. Regards, Charles On Wed, Aug 17, 2016 at 3:16 PM, Karl Rupp <ru...@iu...> wrote: > > Running again and printing the cl_mem objects (from >> opencl_handle().get()) it appears that the relevant addresses (last >> three lines) are being used in multiple areas such as 'Writing data'. >> > > this may be the case, yes. > > trying dgemm >> 0xa84df00 >> 0xa8b4c40 >> 0xa8b5990 >> > > These are the correct handles. I don't know why clMAGMA complains. > > Best regards, > Karli > > |
From: Karl R. <ru...@iu...> - 2016-08-17 20:16:53
|
> Running again and printing the cl_mem objects (from > opencl_handle().get()) it appears that the relevant addresses (last > three lines) are being used in multiple areas such as 'Writing data'. this may be the case, yes. > trying dgemm > 0xa84df00 > 0xa8b4c40 > 0xa8b5990 These are the correct handles. I don't know why clMAGMA complains. Best regards, Karli |
From: Karl R. <ru...@iu...> - 2016-08-17 20:14:58
|
Hi Charles, > Just adding my opinion here as I have been following this thread. Would > it be possible to have both the .so library and header only options > available or is it a strictly 'this-or-that' scenario? a header-only version should be possible, but exposes the user to all kinds of compiler flags. One example: In order to provide AVX-enabled code, the user has to pass the respective compilation flags when using a header-only model. Many users don't want that, or may not even be in the position to do that (e.g. if ViennaCL is part of a larger software stack). Ideally, ViennaCL contains AVX- and non-AVX code in the same binary, selecting the appropriate code path based on the actual CPU features *available* rather than relying on the user passing the correct optimization flags. Best regards, Karli |
From: Charles D. <cde...@gm...> - 2016-08-17 20:14:38
|
Thanks for explanation Karl, Running again and printing the cl_mem objects (from opencl_handle().get()) it appears that the relevant addresses (last three lines) are being used in multiple areas such as 'Writing data'. ViennaCL: Initializing context no. 1 ViennaCL: Initializing new ViennaCL context. ViennaCL: Setting all devices for context... ViennaCL: Getting platform... ViennaCL: Querying devices available at current platform. ViennaCL: Found 1 devices. ViennaCL: Creating device object (CTOR with cl_device_id) ViennaCL: Creating device object (Copy CTOR) ViennaCL: Number of devices for context: 1 ViennaCL: Creating device object (Copy CTOR) ViennaCL: Initialization of new ViennaCL context done. ViennaCL: Creating device object (Copy CTOR) ViennaCL: Creating device object (Copy CTOR) ViennaCL: Adding new queue for device 0x354a910 to context 0x9e13440 ViennaCL: Context no. 1 initialized with 1 devices ViennaCL: Device id: 0x354a910 ViennaCL: Creating memory of size 131072 for context 0x9e13440 (unsafe, returning cl_mem directly) ViennaCL: Adding program 'double_matrix_row' with source to context 0x9e13440 ViennaCL: Creating kernel object (full CTOR): assign_cpu ViennaCL: Creating kernel object (full CTOR): ambm_cpu_cpu ViennaCL: Creating kernel object (full CTOR): vec_mul ViennaCL: Creating kernel object (full CTOR): ambm_m_cpu_cpu ViennaCL: Creating kernel object (full CTOR): ambm_m_gpu_cpu ViennaCL: Creating kernel object (full CTOR): am_cpu ViennaCL: Creating kernel object (full CTOR): element_op ViennaCL: Creating kernel object (full CTOR): ambm_gpu_cpu ViennaCL: Creating kernel object (full CTOR): ambm_m_gpu_gpu ViennaCL: Creating kernel object (full CTOR): am_gpu ViennaCL: Creating kernel object (full CTOR): trans_vec_mul ViennaCL: Creating kernel object (full CTOR): ambm_gpu_gpu ViennaCL: Creating kernel object (full CTOR): diagonal_assign_cpu ViennaCL: Creating kernel object (full CTOR): ambm_cpu_gpu ViennaCL: Creating kernel object (full CTOR): 
ambm_m_cpu_gpu ViennaCL: Stored program 'double_matrix_row' in context 0x9e13440 ViennaCL: There is/are 1 program(s) ViennaCL: Getting program 'double_matrix_row' from context 0x9e13440 ViennaCL: There are 1 programs ViennaCL: Setting handle kernel argument 0xa84df00 at pos 0 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 1 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 2 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 3 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 4 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 5 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 6 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 7 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 8 for kernel assign_cpu ViennaCL: Setting double precision kernel argument 0 at pos 9 for kernel assign_cpu ViennaCL: Getting const queue for device GeForce GTX 970 in context 0x9e13440 ViennaCL: Current queue id 0 ViennaCL: Queue handle 0xa84d890 ViennaCL: Starting 1D-kernel 'assign_cpu'... ViennaCL: Global work size: '16384'... ViennaCL: Local work size: '128'... ViennaCL: Kernel assign_cpu finished with status 0! 
ViennaCL: Getting queue for device GeForce GTX 970 in context 0x9e13440 ViennaCL: Current queue id 0 Writing data (131072 bytes, offset 0) to OpenCL buffer 0xa84df00 with queue 0xa84d890 from 0xa9b0e10 ViennaCL: Getting queue for device GeForce GTX 970 in context 0x9e13440 ViennaCL: Current queue id 0 ViennaCL: Creating memory of size 131072 for context 0x9e13440 (unsafe, returning cl_mem directly) ViennaCL: Getting program 'double_matrix_row' from context 0x9e13440 ViennaCL: There are 1 programs ViennaCL: Setting handle kernel argument 0xa8b4c40 at pos 0 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 1 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 2 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 3 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 4 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 5 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 6 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 7 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 8 for kernel assign_cpu ViennaCL: Setting double precision kernel argument 0 at pos 9 for kernel assign_cpu ViennaCL: Getting const queue for device GeForce GTX 970 in context 0x9e13440 ViennaCL: Current queue id 0 ViennaCL: Queue handle 0xa84d890 ViennaCL: Starting 1D-kernel 'assign_cpu'... ViennaCL: Global work size: '16384'... ViennaCL: Local work size: '128'... ViennaCL: Kernel assign_cpu finished with status 0! 
ViennaCL: Getting queue for device GeForce GTX 970 in context 0x9e13440 ViennaCL: Current queue id 0 Writing data (131072 bytes, offset 0) to OpenCL buffer 0xa8b4c40 with queue 0xa84d890 from 0xa9b0e10 ViennaCL: Getting queue for device GeForce GTX 970 in context 0x9e13440 ViennaCL: Current queue id 0 ViennaCL: Creating memory of size 131072 for context 0x9e13440 (unsafe, returning cl_mem directly) ViennaCL: Getting program 'double_matrix_row' from context 0x9e13440 ViennaCL: There are 1 programs ViennaCL: Setting handle kernel argument 0xa8b5990 at pos 0 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 1 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 2 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 3 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 4 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 5 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 6 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 7 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 8 for kernel assign_cpu ViennaCL: Setting double precision kernel argument 0 at pos 9 for kernel assign_cpu ViennaCL: Getting const queue for device GeForce GTX 970 in context 0x9e13440 ViennaCL: Current queue id 0 ViennaCL: Queue handle 0xa84d890 ViennaCL: Starting 1D-kernel 'assign_cpu'... ViennaCL: Global work size: '16384'... ViennaCL: Local work size: '128'... ViennaCL: Kernel assign_cpu finished with status 0! 
ViennaCL: Getting queue for device GeForce GTX 970 in context 0x9e13440 ViennaCL: Current queue id 0 Writing data (131072 bytes, offset 0) to OpenCL buffer 0xa8b5990 with queue 0xa84d890 from 0xa9b0e10 ViennaCL: Getting queue for device GeForce GTX 970 in context 0x9e13440 ViennaCL: Current queue id 0 ViennaCL: Getting current_context with id 0 ViennaCL: Getting queue for device GeForce GTX 970 in context 0x6bd5340 ViennaCL: Current queue id 0 trying dgemm 0xa84df00 0xa8b4c40 0xa8b5990 On Wed, Aug 17, 2016 at 3:04 PM, Karl Rupp <ru...@iu...> wrote: > Hi Charles, > > There is a fair amount of output, hopefully something here provides a >> clue that you can understand. >> > > Ok, so let me explain the relevant messages: > > ViennaCL: Setting handle kernel argument 0xbac6690 at pos 0 for kernel >> assign_cpu >> > (...) > >> ViennaCL: Setting handle kernel argument 0xbb2d3d0 at pos 0 for kernel >> assign_cpu >> > (...) > >> ViennaCL: Setting handle kernel argument 0xbb2e120 at pos 0 for kernel >> assign_cpu >> > > The buffers 0xbac6690, 0xbb2d3d0, and 0xbb2e120 (of type cl_mem) are the > relevant matrix buffers for clMAGMA. These are the ones you should pass the > the GEMM routines. You can verify that by printing the values of 'bufA' and > the like. > > (Of course the buffer addresses change in each run) > > Best regards, > Karli > > |
From: Karl R. <ru...@iu...> - 2016-08-17 20:05:00
|
Hi Charles, > There is a fair amount of output, hopefully something here provides a > clue that you can understand. Ok, so let me explain the relevant messages: > ViennaCL: Setting handle kernel argument 0xbac6690 at pos 0 for kernel > assign_cpu (...) > ViennaCL: Setting handle kernel argument 0xbb2d3d0 at pos 0 for kernel > assign_cpu (...) > ViennaCL: Setting handle kernel argument 0xbb2e120 at pos 0 for kernel > assign_cpu The buffers 0xbac6690, 0xbb2d3d0, and 0xbb2e120 (of type cl_mem) are the relevant matrix buffers for clMAGMA. These are the ones you should pass to the GEMM routines. You can verify that by printing the values of 'bufA' and the like. (Of course the buffer addresses change in each run) Best regards, Karli |
From: Charles D. <cde...@gm...> - 2016-08-17 19:56:46
|
Just adding my opinion here as I have been following this thread. Would it be possible to have both the .so library and header only options available or is it a strictly 'this-or-that' scenario? Regards, Charles On Wed, Aug 17, 2016 at 2:53 PM, Karl Rupp <ru...@iu...> wrote: > Hi Dmitriy, > > > We could (and probably should?) add such a convenience header file > > at the expense of increased compilation times (and reduced > > encapsulation of source code against compiler issues). > > > > > > +1 on single header! :) > > thanks for the feedback: > https://github.com/viennacl/viennacl-dev/issues/196 > > > > Ultimately, this all boils down to fighting limitations of the > > current header-only source code distribution model. > > > > > > FWIW, if our opinion matters, actually, header-only is one of the things > > we like very much. It means we don't have to redistribute any > > executables, everything already is included in our jars, everything that > > we use and need (and only it) is already generated for us by javacpp. > > This is one of the most valuable features about ViennaCL in my opinion. > > It is very hard to get customers to install yet-another libX.so on their > > clusters. > > I agree that additional libraries on clusters can be tricky at times... > > > > But header-only, template-based code solves > > > > (1) we include everything we need in jar (no extra infra requirement) > > (2) we include only that we actually support/use (lightweight, slim > > application size requirement) > > > > these are very valuable for flink/spark type of applications. Which is > > what we are. > > > > I know that you have plans to generate a .so lib with apparently > > non-object API, but for apache mahout the OAA api with header-only > > requirement is super optimal. (at least I have a high hope you won't > > _force_ us to redistribute an .so(s) in the future releases :) ) > > Will a static library suffice for your purposes? 
I'm not an expert on > releasing .jar packages, but I'd expect that a static library could > offer similar advantages to a header-only approach. > > Best regards, > Karli > > ------------------------------------------------------------ > ------------------ > _______________________________________________ > ViennaCL-devel mailing list > Vie...@li... > https://lists.sourceforge.net/lists/listinfo/viennacl-devel > |
From: Charles D. <cde...@gm...> - 2016-08-17 19:54:52
|
There is a fair amount of output, hopefully something here provides a clue that you can understand. ViennaCL: Initializing context no. 1 ViennaCL: Initializing new ViennaCL context. ViennaCL: Setting all devices for context... ViennaCL: Getting platform... ViennaCL: Querying devices available at current platform. ViennaCL: Found 1 devices. ViennaCL: Creating device object (CTOR with cl_device_id) ViennaCL: Creating device object (Copy CTOR) ViennaCL: Number of devices for context: 1 ViennaCL: Creating device object (Copy CTOR) ViennaCL: Initialization of new ViennaCL context done. ViennaCL: Creating device object (Copy CTOR) ViennaCL: Creating device object (Copy CTOR) ViennaCL: Adding new queue for device 0x47c2910 to context 0xb08a9d0 ViennaCL: Context no. 1 initialized with 1 devices ViennaCL: Device id: 0x47c2910 ViennaCL: Creating memory of size 131072 for context 0xb08a9d0 (unsafe, returning cl_mem directly) ViennaCL: Adding program 'double_matrix_row' with source to context 0xb08a9d0 ViennaCL: Creating kernel object (full CTOR): assign_cpu ViennaCL: Creating kernel object (full CTOR): ambm_cpu_cpu ViennaCL: Creating kernel object (full CTOR): vec_mul ViennaCL: Creating kernel object (full CTOR): ambm_m_cpu_cpu ViennaCL: Creating kernel object (full CTOR): ambm_m_gpu_cpu ViennaCL: Creating kernel object (full CTOR): am_cpu ViennaCL: Creating kernel object (full CTOR): element_op ViennaCL: Creating kernel object (full CTOR): ambm_gpu_cpu ViennaCL: Creating kernel object (full CTOR): ambm_m_gpu_gpu ViennaCL: Creating kernel object (full CTOR): am_gpu ViennaCL: Creating kernel object (full CTOR): trans_vec_mul ViennaCL: Creating kernel object (full CTOR): ambm_gpu_gpu ViennaCL: Creating kernel object (full CTOR): diagonal_assign_cpu ViennaCL: Creating kernel object (full CTOR): ambm_cpu_gpu ViennaCL: Creating kernel object (full CTOR): ambm_m_cpu_gpu ViennaCL: Stored program 'double_matrix_row' in context 0xb08a9d0 ViennaCL: There is/are 1 program(s) ViennaCL: 
Getting program 'double_matrix_row' from context 0xb08a9d0 ViennaCL: There are 1 programs ViennaCL: Setting handle kernel argument 0xbac6690 at pos 0 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 1 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 2 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 3 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 4 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 5 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 6 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 7 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 8 for kernel assign_cpu ViennaCL: Setting double precision kernel argument 0 at pos 9 for kernel assign_cpu ViennaCL: Getting const queue for device GeForce GTX 970 in context 0xb08a9d0 ViennaCL: Current queue id 0 ViennaCL: Queue handle 0xbac6020 ViennaCL: Starting 1D-kernel 'assign_cpu'... ViennaCL: Global work size: '16384'... ViennaCL: Local work size: '128'... ViennaCL: Kernel assign_cpu finished with status 0! 
ViennaCL: Getting queue for device GeForce GTX 970 in context 0xb08a9d0 ViennaCL: Current queue id 0 Writing data (131072 bytes, offset 0) to OpenCL buffer 0xbac6690 with queue 0xbac6020 from 0xbc295a0 ViennaCL: Getting queue for device GeForce GTX 970 in context 0xb08a9d0 ViennaCL: Current queue id 0 ViennaCL: Creating memory of size 131072 for context 0xb08a9d0 (unsafe, returning cl_mem directly) ViennaCL: Getting program 'double_matrix_row' from context 0xb08a9d0 ViennaCL: There are 1 programs ViennaCL: Setting handle kernel argument 0xbb2d3d0 at pos 0 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 1 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 2 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 3 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 4 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 5 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 6 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 7 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 8 for kernel assign_cpu ViennaCL: Setting double precision kernel argument 0 at pos 9 for kernel assign_cpu ViennaCL: Getting const queue for device GeForce GTX 970 in context 0xb08a9d0 ViennaCL: Current queue id 0 ViennaCL: Queue handle 0xbac6020 ViennaCL: Starting 1D-kernel 'assign_cpu'... ViennaCL: Global work size: '16384'... ViennaCL: Local work size: '128'... ViennaCL: Kernel assign_cpu finished with status 0! 
ViennaCL: Getting queue for device GeForce GTX 970 in context 0xb08a9d0 ViennaCL: Current queue id 0 Writing data (131072 bytes, offset 0) to OpenCL buffer 0xbb2d3d0 with queue 0xbac6020 from 0xbc295a0 ViennaCL: Getting queue for device GeForce GTX 970 in context 0xb08a9d0 ViennaCL: Current queue id 0 ViennaCL: Creating memory of size 131072 for context 0xb08a9d0 (unsafe, returning cl_mem directly) ViennaCL: Getting program 'double_matrix_row' from context 0xb08a9d0 ViennaCL: There are 1 programs ViennaCL: Setting handle kernel argument 0xbb2e120 at pos 0 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 1 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 0 at pos 2 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 3 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 1 at pos 4 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 5 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 6 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 7 for kernel assign_cpu ViennaCL: Setting unsigned int kernel argument 128 at pos 8 for kernel assign_cpu ViennaCL: Setting double precision kernel argument 0 at pos 9 for kernel assign_cpu ViennaCL: Getting const queue for device GeForce GTX 970 in context 0xb08a9d0 ViennaCL: Current queue id 0 ViennaCL: Queue handle 0xbac6020 ViennaCL: Starting 1D-kernel 'assign_cpu'... ViennaCL: Global work size: '16384'... ViennaCL: Local work size: '128'... ViennaCL: Kernel assign_cpu finished with status 0! 
ViennaCL: Getting queue for device GeForce GTX 970 in context 0xb08a9d0 ViennaCL: Current queue id 0 Writing data (131072 bytes, offset 0) to OpenCL buffer 0xbb2e120 with queue 0xbac6020 from 0xbc295a0 ViennaCL: Getting queue for device GeForce GTX 970 in context 0xb08a9d0 ViennaCL: Current queue id 0 ViennaCL: Getting current_context with id 0 ViennaCL: Getting queue for device GeForce GTX 970 in context 0x7ab22f0 ViennaCL: Current queue id 0 OpenCL error -38 on line 281 of /home/cdeterman/Downloads/clBLAS/src/library/blas/xgemm.cc R: /home/cdeterman/Downloads/clBLAS/src/library/blas/xgemm.cc:281: void enqueueGemmKernel(cl_command_queue, cl_kernel, void**, size_t*, unsigned int, const size_t*, const size_t*, cl_uint, _cl_event* const*, _cl_event**): Assertion `false' failed. Regards, Charles On Wed, Aug 17, 2016 at 2:45 PM, Karl Rupp <ru...@iu...> wrote: > Hi, > > I actually did use the relevant internal_size1/2 calls, just being >> little lazy in email, didn't realize there was a plain internal_size >> method. I have been modifying the clBLAS source to troubleshoot on what >> argument the CL_CHECK call is crashing on. It is crashing on the first >> kernel argument which is defined just before this call >> >> gemmKernelArgs[ 0] = &A; >> >> Where 'A' is referring to the buffer of the first matrix which I have >> passed in as >> >> cl_mem bufA = A.handle().opencl_handle().get() >> >> This is now where I am currently stuck as to why it believes that the >> buffer I pull from the viennacl matrix is invalid. >> > > Can you recompile your code with -DVIENNACL_DEBUG_ALL and send the output? > Alternatively, > #define VIENNACL_DEBUG_ALL 1 > at the very beginning of your source file. This will print information > about everything that happens in the background. > > Best regards, > Karli > > > |
From: Karl R. <ru...@iu...> - 2016-08-17 19:53:40
|
Hi Dmitriy, > We could (and probably should?) add such a convenience header file > at the expense of increased compilation times (and reduced > encapsulation of source code against compiler issues). > > > +1 on single header! :) thanks for the feedback: https://github.com/viennacl/viennacl-dev/issues/196 > Ultimately, this all boils down to fighting limitations of the > current header-only source code distribution model. > > > FWIW, if our opinion matters, actually, header-only is one of the things > we like very much. It means we don't have to redistribute any > executables, everything already is included in our jars, everything that > we use and need (and only it) is already generated for us by javacpp. > This is one of the most valuable features about ViennaCL in my opinion. > It is very hard to get customers to install yet-another libX.so on their > clusters. I agree that additional libraries on clusters can be tricky at times... > But header-only, template-based code solves > > (1) we include everything we need in jar (no extra infra requirement) > (2) we include only that we actually support/use (lightweight, slim > application size requirement) > > these are very valuable for flink/spark type of applications. Which is > what we are. > > I know that you have plans to generate a .so lib with apparently > non-object API, but for apache mahout the OAA api with header-only > requirement is super optimal. (at least I have a high hope you won't > _force_ us to redistribute an .so(s) in the future releases :) ) Will a static library suffice for your purposes? I'm not an expert on releasing .jar packages, but I'd expect that a static library could offer similar advantages to an header-only approach. Best regards, Karli |
From: Dmitriy L. <dl...@gm...> - 2016-08-17 19:53:14
|
PS also since it is being built as part of our build, we also get control over supported HW options and platforms. I.e., we may generate not only per-platform support, but also further specialize for things like +AVX2 instruction set optimized openMP version (which anecdotally runs about ~2 times faster for me on moderate matrix sizes, meaning for the bigger sizes the gain is much more significant). On Wed, Aug 17, 2016 at 12:44 PM, Dmitriy Lyubimov <dl...@gm...> wrote: > > > On Wed, Aug 17, 2016 at 2:50 AM, Karl Rupp <ru...@iu...> wrote: > >> Hi Andy and Dmitriy, >> >> We could (and probably should?) add such a convenience header file at the >> expense of increased compilation times (and reduced encapsulation of source >> code against compiler issues). > > > +1 on single header! :) > > Ultimately, this all boils down to fighting limitations of the current >> header-only source code distribution model. >> > > FWIW, if our opinion matters, actually, header-only is one of the things > we like very much. It means we don't have to redistribute any executables, > everything already is included in our jars, everything that we use and need > (and only it) is already generated for us by javacpp. This is one of the > most valuable features about ViennaCL in my opinion. It is very hard to get > customers to install yet-another libX.so on their clusters. > > But header-only, template-based code solves > > (1) we include everything we need in jar (no extra infra requirement) > (2) we include only that we actually support/use (lightweight, slim > application size requirement) > > these are very valuable for flink/spark type of applications. Which is > what we are. > > I know that you have plans to generate a .so lib with apparently > non-object API, but for apache mahout the OAA api with header-only > requirement is super optimal. 
(at least I have a high hope you won't > _force_ us to redistribute an .so(s) in the future releases :) ) > > > -Dmitriy > > >> Best regards, >> Karli >> >> >> >> >>> >>> On Tue, Aug 16, 2016 at 11:16 AM, Dmitriy Lyubimov <dl...@gm... >>> <mailto:dl...@gm...>> wrote: >>> >>> Karl, >>> >>> i can independently confirm the problem with prod_impl instantiation >>> over expression of compressed times base_matrix into matrix type. >>> >>> I understand there are tests examples but something goes wrong with >>> the straightforward code. >>> >>> We are compiling for open cl and open mp at the same time. >>> >>> >>> On Mon, Aug 8, 2016 at 11:03 AM, Andrew Palumbo <ap...@ou... >>> <mailto:ap...@ou...>> wrote: >>> >>> Hi Karli, >>> >>> >>> I've mocked up in C++ the method that I'm trying to use from >>> java. Aside from adding some values, it looks very similar to >>> the code that you have below. >>> >>> >>> I'm getting the same compiler error hat I was getting through >>> javacpp/JNI: >>> >>> >>> >>> sparseDenseMmul.cpp:85:103: required from here >>> /usr/include/viennacl/matrix.hpp:2247:36: error: no matching >>> function for call to >>> ‘prod_impl(const viennacl::compressed_matrix<double>&, const >>> viennacl::matrix_base<double, long unsigned int, long int>&, >>> viennacl::matrix_base<double, long unsigned int, long int>&)’ >>> viennacl::linalg::prod_impl(proxy.lhs(), proxy.rhs(), >>> lhs); >>> >>> ^ >>> In file included from /usr/include/viennacl/matrix.hpp:28:0, >>> from >>> /usr/include/viennacl/linalg/sparse_matrix_operations.hpp:28, >>> from >>> /usr/include/viennacl/compressed_matrix.hpp:31, >>> from sparseDenseMmul.cpp:7: >>> /usr/include/viennacl/linalg/matrix_operations.hpp:438:10: >>> note: candidate: >>> template<class NumericT> void viennacl::linalg::prod_impl(co >>> nst >>> viennacl::matrix_base<T>&, const viennacl::vector_base<T>&, >>> viennacl::vector_base<T>&) >>> void prod_impl(const matrix_base<NumericT> & mat, >>> >>> >>> The code is below, and I've 
attached both the >>> "sparseDenseMmul.cpp" file and the full compilation error output >>> (very long, probably not useful) >>> >>> >>> Thanks very much, >>> >>> >>> Andy >>> >>> >>> >>> >>> >>> Attached as "sparseDenseMmul.cpp": >>> >>> >>> #include <iostream> >>> // not using openMP for this mockup >>> // #define VIENNACL_WITH_OPENMP 1 >>> // ViennaCL includes >>> #include "viennacl/forwards.h" >>> #include "viennacl/compressed_matrix.hpp" >>> #include "viennacl/linalg/prod.hpp" >>> #include "viennacl/backend/memory.hpp" >>> #include "viennacl/matrix.hpp" >>> #include "viennacl/detail/matrix_def.hpp" >>> #include "viennacl/tools/random.hpp" >>> #include "viennacl/context.hpp" >>> #include "viennacl/linalg/host_based/sp >>> arse_matrix_operations.hpp" >>> >>> >>> // C_dense_matrix = A_compressed_matrix %*% B_dense_matrix. >>> >>> // compile line w/o OpenMP: g++ sparseDenseMmul.cpp >>> -I/usr/include/viennacl/ -o sparseDenseMmul >>> >>> >>> >>> int main() >>> { >>> // trying to recreate javacpp wrapper functionalliy as closely >>> as possible >>> // so not using typedef, unsigned ints, etc, and defining >>> templates as doubles >>> // creating buffers as int/double arrays and then setting >>> pointers to them. >>> // (not 100% sure that this is how javacpp passes pointers but >>> should be close.) >>> >>> >>> //typedef double ScalarType; >>> >>> // in acuallity, we cast `int`s from jni/javacpp. >>> unsigned int m = 10; >>> unsigned int n = 10; >>> unsigned long s = 5; >>> >>> unsigned int NNz_A = 12; >>> >>> >>> // allocate buffers and set pointers (similarly to javacpp) >>> // using ints (not unsigned ints) here from jni/javacpp. >>> int A_row_jumpers[m + 1] = {0, 0, 1, 2, 4, 5, 6, 7, 9, 11, 12}; >>> int *A_row_ptr = A_row_jumpers; >>> >>> // using ints (not unsigned ints) here from jni/javacpp. 
>>> int A_col_idxs[NNz_A] = {4, 0, 2, 3, 2, 4, 0, 4, 3, 0, 3, 0}; >>> int *A_col_ptr = A_col_idxs; >>> >>> double A_values[NNz_A] = {0.4065367203992265, >>> 0.04957158909682802, 0.3708618354358446, >>> 0.5205586068847993, 0.6963900565931678, >>> 0.8330915529787706, 0.32839112750638844, >>> 0.4265801782090245, 0.7856168903297948, >>> 0.14733066454561583, 0.9501663495824946, >>> 0.9710498974366047}; >>> double* A_values_ptr = A_values; >>> >>> >>> // using double values in Mahout setting template directlyfor >>> our compressed_matrix, A >>> viennacl::compressed_matrix<double> A_compressed_matrix(m, s); >>> >>> // set the ptrs for A >>> A_compressed_matrix.set(A_row_ptr, A_col_ptr, A_values_ptr, m, >>> s, NNz_A); >>> >>> // B is dense s so we only need s x n values. >>> double B_values[s * n] = {0}; >>> >>> // add some random data to B: >>> viennacl::tools::uniform_random_numbers<double> randomNumber; >>> for (int i = 0; i< s * n; i++) { >>> B_values[i] = randomNumber(); >>> } >>> >>> double* B_values_ptr = B_values; >>> >>> >>> // for our row_major dense_matrix, B can set the double values >>> in the construcor >>> // this is currently the constructor that we're using through >>> scala/javacpp. >>> const viennacl::matrix<double,viennacl::row_major> >>> B_dense_matrix(B_values_ptr, >>> viennacl::MAIN_MEMORY, s, n); >>> >>> >>> // perform multiplication and inside of a compressed_matrix >>> constructor >>> viennacl::matrix<double> >>> C_dense_matrix(viennacl::linalg::prod(A_compressed_matrix , >>> B_dense_matrix)); >>> >>> >>> // print out matrix >>> std::cout << "ViennaCL: " << C_dense_matrix << std::endl; >>> >>> >>> // just exit with success for now if there are no runtime >>> errors. >>> >>> return EXIT_SUCCESS; >>> } >>> >>> >>> ------------------------------------------------------------ >>> ------------ >>> *From:* Karl Rupp <ru...@iu... >>> <mailto:ru...@iu...>> >>> *Sent:* Sunday, August 7, 2016 2:20:26 PM >>> *To:* Andrew Palumbo; vie...@li... 
>>> <mailto:vie...@li...> >>> *Subject:* Re: [ViennaCL-devel] compressed_matrix %*% matrix_Base >>> >>> >>> Hi Andy, >>> >>> the relevant tests for sparse matrices times dense matrices are >>> in >>> tests/spmdm.cpp. In particular, I recreated a test case based on >>> your >>> description and couldn't find any issues: >>> >>> viennacl::compressed_matrix<NumericT> compressed_A; >>> viennacl::matrix<NumericT, FactorLayoutT> B1(std_A.size(), >>> cols_rhs); >>> viennacl::matrix_base<NumericT> B1_ref(B1); >>> viennacl::matrix_base<NumericT> >>> C2(viennacl::linalg::prod(compressed_A, B1_ref)); >>> >>> compiles cleanly. Could you please provide a code snippet >>> demonstrating >>> the problem you are encountering? >>> >>> Thanks and best regards, >>> Karli >>> >>> >>> >>> On 08/05/2016 09:04 PM, Andrew Palumbo wrote: >>> > Hi Karl, >>> > >>> > >>> > I've been trying to implement tests for: >>> > >>> > >>> > matrix_base<double> C = compressed_matrix<double> A %*% >>> > >>> > matrix_base<double,row_major> B. >>> > >>> > >>> > I cant find in the code or the documentation any constructor >>> for >>> > matrix_base<T>( >>> > >>> > matrix_expression<const viennacl::compressed_matrix<T>, const >>> > viennacl::matrix_base<T>, viennacl::op_prod>) >>> > >>> > ie. a mixed expression of compressed_matrix and matrix_base >>> > >>> > and get a compilation error when I try to instantiate a: >>> > >>> > matrix_base<double>(matrix_expression<const >>> > viennacl::compressed_matrix<double>, const >>> > viennacl::matrix_base<double>, >>> > viennacl::op_prod>) >>> > >>> > Is there a transformation that I need to do from this >>> > >>> > matrix_expression<compressed_matrix<double>, >>> matrix_base<double>, >>> > op_prod> >>> > >>> > to something else so that I may be able to initialize a >>> matrix_base (or >>> > possibly even a compressed_matrix) from it? >>> > >>> > The compilation error that i get is below. 
>>> > >>> > Thanks, >>> > >>> > Andy >>> > >>> >>> >>> ------------------------------------------------------------ >>> ------------------ >>> What NetFlow Analyzer can do for you? Monitors network bandwidth >>> and traffic >>> patterns at an interface-level. Reveals which users, apps, and >>> protocols are >>> consuming the most bandwidth. Provides multi-vendor support for >>> NetFlow, >>> J-Flow, sFlow and other flows. Make informed decisions using >>> capacity >>> planning reports. http://sdm.link/zohodev2dev >>> _______________________________________________ >>> ViennaCL-devel mailing list >>> Vie...@li... >>> <mailto:Vie...@li...> >>> https://lists.sourceforge.net/lists/listinfo/viennacl-devel >>> <https://lists.sourceforge.net/lists/listinfo/viennacl-devel> >>> >>> >>> >>> >> > |
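[Editor's note] Dmitriy's point about controlling hardware-specific builds can be illustrated with a compile line. The flags and file names below are assumptions on my part (the thread does not give the exact command); they show how one might build an OpenMP backend specialized for AVX2 when compiling a JavaCPP-generated wrapper.

```shell
# Illustrative only: file names and paths are placeholders.
# -fopenmp together with -DVIENNACL_WITH_OPENMP enables the OpenMP backend;
# -mavx2 lets the compiler emit AVX2 vector instructions.
g++ -O3 -fopenmp -mavx2 -DVIENNACL_WITH_OPENMP \
    -I/path/to/viennacl-headers \
    -shared -fPIC jniViennaCL.cpp -o libjniViennaCL.so
```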
From: Karl R. <ru...@iu...> - 2016-08-17 19:45:56
|
Hi, > I actually did use the relevant internal_size1/2 calls, just being > little lazy in email, didn't realize there was a plain internal_size > method. I have been modifying the clBLAS source to troubleshoot on what > argument the CL_CHECK call is crashing on. It is crashing on the first > kernel argument which is defined just before this call > > gemmKernelArgs[ 0] = &A; > > Where 'A' is referring to the buffer of the first matrix which I have > passed in as > > cl_mem bufA = A.handle().opencl_handle().get() > > This is now where I am currently stuck as to why it believes that the > buffer I pull from the viennacl matrix is invalid. Can you recompile your code with -DVIENNACL_DEBUG_ALL and send the output? Alternatively, #define VIENNACL_DEBUG_ALL 1 at the very beginning of your source file. This will print information about everything that happens in the background. Best regards, Karli |
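[Editor's note] Karl's suggestion boils down to making the macro visible before the first ViennaCL include; defining it afterwards has no effect, because the debug hooks are compiled into the headers at preprocessing time. A minimal sketch (the particular headers included are just examples):

```cpp
// Must appear before the first ViennaCL include, since the debug
// macros are evaluated inside the headers themselves. Equivalently,
// pass -DVIENNACL_DEBUG_ALL on the compiler command line.
#define VIENNACL_DEBUG_ALL 1

#include "viennacl/matrix.hpp"
#include "viennacl/linalg/prod.hpp"
// ... rest of the translation unit; ViennaCL now prints information
// about kernel launches, memory transfers, etc. as the program runs.
```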
From: Dmitriy L. <dl...@gm...> - 2016-08-17 19:44:30
|
On Wed, Aug 17, 2016 at 2:50 AM, Karl Rupp <ru...@iu...> wrote: > Hi Andy and Dmitriy, > > We could (and probably should?) add such a convenience header file at the > expense of increased compilation times (and reduced encapsulation of source > code against compiler issues). +1 on single header! :) Ultimately, this all boils down to fighting limitations of the current > header-only source code distribution model. > FWIW, if our opinion matters, actually, header-only is one of the things we like very much. It means we don't have to redistribute any executables, everything already is included in our jars, everything that we use and need (and only it) is already generated for us by javacpp. This is one of the most valuable features about ViennaCL in my opinion. It is very hard to get customers to install yet-another libX.so on their clusters. But header-only, template-based code solves (1) we include everything we need in jar (no extra infra requirement) (2) we include only that we actually support/use (lightweight, slim application size requirement) these are very valuable for flink/spark type of applications. Which is what we are. I know that you have plans to generate a .so lib with apparently non-object API, but for apache mahout the OAA api with header-only requirement is super optimal. (at least I have a high hope you won't _force_ us to redistribute an .so(s) in the future releases :) ) -Dmitriy > Best regards, > Karli > > > > >> >> On Tue, Aug 16, 2016 at 11:16 AM, Dmitriy Lyubimov <dl...@gm... >> <mailto:dl...@gm...>> wrote: >> >> Karl, >> >> i can independently confirm the problem with prod_impl instantiation >> over expression of compressed times base_matrix into matrix type. >> >> I understand there are tests examples but something goes wrong with >> the straightforward code. >> >> We are compiling for open cl and open mp at the same time. >> >> >> On Mon, Aug 8, 2016 at 11:03 AM, Andrew Palumbo <ap...@ou... 
>> <mailto:ap...@ou...>> wrote: >> >> Hi Karli, >> >> >> I've mocked up in C++ the method that I'm trying to use from >> java. Aside from adding some values, it looks very similar to >> the code that you have below. >> >> >> I'm getting the same compiler error hat I was getting through >> javacpp/JNI: >> >> >> >> sparseDenseMmul.cpp:85:103: required from here >> /usr/include/viennacl/matrix.hpp:2247:36: error: no matching >> function for call to >> ‘prod_impl(const viennacl::compressed_matrix<double>&, const >> viennacl::matrix_base<double, long unsigned int, long int>&, >> viennacl::matrix_base<double, long unsigned int, long int>&)’ >> viennacl::linalg::prod_impl(proxy.lhs(), proxy.rhs(), >> lhs); >> >> ^ >> In file included from /usr/include/viennacl/matrix.hpp:28:0, >> from >> /usr/include/viennacl/linalg/sparse_matrix_operations.hpp:28, >> from >> /usr/include/viennacl/compressed_matrix.hpp:31, >> from sparseDenseMmul.cpp:7: >> /usr/include/viennacl/linalg/matrix_operations.hpp:438:10: >> note: candidate: >> template<class NumericT> void viennacl::linalg::prod_impl(co >> nst >> viennacl::matrix_base<T>&, const viennacl::vector_base<T>&, >> viennacl::vector_base<T>&) >> void prod_impl(const matrix_base<NumericT> & mat, >> >> >> The code is below, and I've attached both the >> "sparseDenseMmul.cpp" file and the full compilation error output >> (very long, probably not useful) >> >> >> Thanks very much, >> >> >> Andy >> >> >> >> >> >> Attached as "sparseDenseMmul.cpp": >> >> >> #include <iostream> >> // not using openMP for this mockup >> // #define VIENNACL_WITH_OPENMP 1 >> // ViennaCL includes >> #include "viennacl/forwards.h" >> #include "viennacl/compressed_matrix.hpp" >> #include "viennacl/linalg/prod.hpp" >> #include "viennacl/backend/memory.hpp" >> #include "viennacl/matrix.hpp" >> #include "viennacl/detail/matrix_def.hpp" >> #include "viennacl/tools/random.hpp" >> #include "viennacl/context.hpp" >> #include "viennacl/linalg/host_based/sp >> 
arse_matrix_operations.hpp" >> >> >> // C_dense_matrix = A_compressed_matrix %*% B_dense_matrix. >> >> // compile line w/o OpenMP: g++ sparseDenseMmul.cpp >> -I/usr/include/viennacl/ -o sparseDenseMmul >> >> >> >> int main() >> { >> // trying to recreate javacpp wrapper functionalliy as closely >> as possible >> // so not using typedef, unsigned ints, etc, and defining >> templates as doubles >> // creating buffers as int/double arrays and then setting >> pointers to them. >> // (not 100% sure that this is how javacpp passes pointers but >> should be close.) >> >> >> //typedef double ScalarType; >> >> // in acuallity, we cast `int`s from jni/javacpp. >> unsigned int m = 10; >> unsigned int n = 10; >> unsigned long s = 5; >> >> unsigned int NNz_A = 12; >> >> >> // allocate buffers and set pointers (similarly to javacpp) >> // using ints (not unsigned ints) here from jni/javacpp. >> int A_row_jumpers[m + 1] = {0, 0, 1, 2, 4, 5, 6, 7, 9, 11, 12}; >> int *A_row_ptr = A_row_jumpers; >> >> // using ints (not unsigned ints) here from jni/javacpp. >> int A_col_idxs[NNz_A] = {4, 0, 2, 3, 2, 4, 0, 4, 3, 0, 3, 0}; >> int *A_col_ptr = A_col_idxs; >> >> double A_values[NNz_A] = {0.4065367203992265, >> 0.04957158909682802, 0.3708618354358446, >> 0.5205586068847993, 0.6963900565931678, >> 0.8330915529787706, 0.32839112750638844, >> 0.4265801782090245, 0.7856168903297948, >> 0.14733066454561583, 0.9501663495824946, >> 0.9710498974366047}; >> double* A_values_ptr = A_values; >> >> >> // using double values in Mahout setting template directlyfor >> our compressed_matrix, A >> viennacl::compressed_matrix<double> A_compressed_matrix(m, s); >> >> // set the ptrs for A >> A_compressed_matrix.set(A_row_ptr, A_col_ptr, A_values_ptr, m, >> s, NNz_A); >> >> // B is dense s so we only need s x n values. 
>> double B_values[s * n] = {0}; >> >> // add some random data to B: >> viennacl::tools::uniform_random_numbers<double> randomNumber; >> for (int i = 0; i< s * n; i++) { >> B_values[i] = randomNumber(); >> } >> >> double* B_values_ptr = B_values; >> >> >> // for our row_major dense_matrix, B can set the double values >> in the construcor >> // this is currently the constructor that we're using through >> scala/javacpp. >> const viennacl::matrix<double,viennacl::row_major> >> B_dense_matrix(B_values_ptr, >> viennacl::MAIN_MEMORY, s, n); >> >> >> // perform multiplication and inside of a compressed_matrix >> constructor >> viennacl::matrix<double> >> C_dense_matrix(viennacl::linalg::prod(A_compressed_matrix , >> B_dense_matrix)); >> >> >> // print out matrix >> std::cout << "ViennaCL: " << C_dense_matrix << std::endl; >> >> >> // just exit with success for now if there are no runtime >> errors. >> >> return EXIT_SUCCESS; >> } >> >> >> ------------------------------------------------------------ >> ------------ >> *From:* Karl Rupp <ru...@iu... >> <mailto:ru...@iu...>> >> *Sent:* Sunday, August 7, 2016 2:20:26 PM >> *To:* Andrew Palumbo; vie...@li... >> <mailto:vie...@li...> >> *Subject:* Re: [ViennaCL-devel] compressed_matrix %*% matrix_Base >> >> >> Hi Andy, >> >> the relevant tests for sparse matrices times dense matrices are in >> tests/spmdm.cpp. In particular, I recreated a test case based on >> your >> description and couldn't find any issues: >> >> viennacl::compressed_matrix<NumericT> compressed_A; >> viennacl::matrix<NumericT, FactorLayoutT> B1(std_A.size(), >> cols_rhs); >> viennacl::matrix_base<NumericT> B1_ref(B1); >> viennacl::matrix_base<NumericT> >> C2(viennacl::linalg::prod(compressed_A, B1_ref)); >> >> compiles cleanly. Could you please provide a code snippet >> demonstrating >> the problem you are encountering? 
>> >> Thanks and best regards, >> Karli >> >> >> >> On 08/05/2016 09:04 PM, Andrew Palumbo wrote: >> > Hi Karl, >> > >> > >> > I've been trying to implement tests for: >> > >> > >> > matrix_base<double> C = compressed_matrix<double> A %*% >> > >> > matrix_base<double,row_major> B. >> > >> > >> > I cant find in the code or the documentation any constructor for >> > matrix_base<T>( >> > >> > matrix_expression<const viennacl::compressed_matrix<T>, const >> > viennacl::matrix_base<T>, viennacl::op_prod>) >> > >> > ie. a mixed expression of compressed_matrix and matrix_base >> > >> > and get a compilation error when I try to instantiate a: >> > >> > matrix_base<double>(matrix_expression<const >> > viennacl::compressed_matrix<double>, const >> > viennacl::matrix_base<double>, >> > viennacl::op_prod>) >> > >> > Is there a transformation that I need to do from this >> > >> > matrix_expression<compressed_matrix<double>, >> matrix_base<double>, >> > op_prod> >> > >> > to something else so that I may be able to initialize a >> matrix_base (or >> > possibly even a compressed_matrix) from it? >> > >> > The compilation error that i get is below. >> > >> > Thanks, >> > >> > Andy >> > >> >> >> ------------------------------------------------------------ >> ------------------ >> What NetFlow Analyzer can do for you? Monitors network bandwidth >> and traffic >> patterns at an interface-level. Reveals which users, apps, and >> protocols are >> consuming the most bandwidth. Provides multi-vendor support for >> NetFlow, >> J-Flow, sFlow and other flows. Make informed decisions using >> capacity >> planning reports. http://sdm.link/zohodev2dev >> _______________________________________________ >> ViennaCL-devel mailing list >> Vie...@li... >> <mailto:Vie...@li...> >> https://lists.sourceforge.net/lists/listinfo/viennacl-devel >> <https://lists.sourceforge.net/lists/listinfo/viennacl-devel> >> >> >> >> > |
From: Charles D. <cde...@gm...> - 2016-08-17 13:23:17
|
Karl, I actually did use the relevant internal_size1/2 calls, just being little lazy in email, didn't realize there was a plain internal_size method. I have been modifying the clBLAS source to troubleshoot on what argument the CL_CHECK call is crashing on. It is crashing on the first kernel argument which is defined just before this call gemmKernelArgs[ 0] = &A; Where 'A' is referring to the buffer of the first matrix which I have passed in as cl_mem bufA = A.handle().opencl_handle().get() This is now where I am currently stuck as to why it believes that the buffer I pull from the viennacl matrix is invalid. Regards, Charles On Wed, Aug 17, 2016 at 4:00 AM, Karl Rupp <ru...@iu...> wrote: > Hi, > > I have tried now with the regular .size() calls and setting the leading >> dimensions with .internal_size(). Same error. >> > > This should be either .internal_size1() or .internal_size2(). > For dense matrices .internal_size() equals .internal_size1() * > internal_size2(). > > > The error is reported on line 274 of the clBLAS/src/libary/blas/xgemm.cc >> file. It appears this loop is where it is causing the problem. >> >> for (unsigned int i = 0; i < numKernelArgs; i++) { >> CL_CHECK( clSetKernelArg( clKernel, i, kernelArgSizes[i], >> kernelArgs[i]) ) >> } >> > > I can't tell what is going wrong here based on the description provided. > > Best regards, > Karli > > > >> On Tue, Aug 16, 2016 at 4:33 AM, Karl Rupp <ru...@iu... 
>> <mailto:ru...@iu...>> wrote: >> >> Hi, >> >> >> >> On 08/15/2016 08:56 PM, Charles Determan wrote: >> >> Karl, >> >> I have the OpenCL backend enabled and I have tried: >> >> cl_mem bufA = A.handle().opencl_handle().get() >> cl_mem bufB = B.handle().opencl_handle().get() >> cl_mem bufC = C.handle().opencl_handle().get() >> >> cl_command_queue queue = >> viennacl::ocl::current_context().get_queue().handle().get(); >> >> err = clblasDgemm(clblasRowMajor, clblasNoTrans, clblasNoTrans, >> A.internal_size2(), >> B.internal_size1(), A.internal_size1(), >> >> >> Are you sure these sizes are correct? I'd expect that you use >> .size1() and .size2() here, and use the respective .internal_size1() >> and .internal_size2() for the leading dimensions (lda, ldb, ldc). >> >> Best regards, >> Karli >> >> >> alpha, bufA, 0, lda, >> bufB, 0, ldb, beta, >> bufC, 0, ldc, >> 1, >> &queue, >> 0, NULL, 0); >> >> >> Although this compiles it results in the error - >> CL_INVALID_MEM_OBJECT >> >> >> Where (which line) do you get the error? >> >> Best regards, >> Karli >> >> Not sure if you have any other thoughts or if I should try >> asking clBLAS >> developers. >> >> Thanks again, >> Charles >> >> On Mon, Aug 15, 2016 at 1:30 PM, Karl Rupp >> <ru...@iu... <mailto:ru...@iu...> >> <mailto:ru...@iu... <mailto:ru...@iu...>>> >> wrote: >> >> Hi Charles, >> >> I am trying to verify my interface with clBLAS before >> going >> completely >> in to clMAGMA. However, I keep getting an OpenCL error >> -38 which >> corresponds to invalid memory (CL_INVALID_MEM_OBJECT) >> when trying a >> clblasDgemm call. This must be referring to the opencl >> memory >> handles I >> am passing in. The fields generally accepts memory >> buffers (cl_mem) >> objects. I have tried passing both >> A.handle.opencl_handle() and >> A.handle.opencl_handle().get() in those fields but get >> the same >> error. >> >> >> These should be A.handle.opencl_handle().get() >> Mind the parantheses after 'handle'. 
>> >> Also, you will get the error if you don't enable the OpenCL >> backends, or if you enabled the CUDA backend as well (as >> CUDA will >> be the default then). >> >> Best regards, >> Karli >> >> >> I will continue to poke around (maybe I need to use >> internal_size >> numbers) but thought I would ask you about this. >> >> Any insight? >> >> Thanks, >> >> Charles >> >> On Fri, Aug 12, 2016 at 3:21 PM, Charles Determan >> <cde...@gm... <mailto:cde...@gm...> >> <mailto:cde...@gm... <mailto:cde...@gm...>> >> <mailto:cde...@gm... >> <mailto:cde...@gm...> <mailto:cde...@gm... >> <mailto:cde...@gm...>>>> >> wrote: >> >> Thanks Karl, >> >> One followup question, what distinguishes handle(), >> handle1(), and >> handle2()? Do they refer to different buffers? >> >> Regards, >> Charles >> >> On Fri, Aug 12, 2016 at 3:13 PM, Karl Rupp >> <ru...@iu... <mailto:ru...@iu...> >> <mailto:ru...@iu... <mailto:ru...@iu...>> >> <mailto:ru...@iu... >> <mailto:ru...@iu...> >> <mailto:ru...@iu... >> <mailto:ru...@iu...>>>> wrote: >> >> Hi Charles, >> >> call .handle()/.handle1()/.handle2() to get the >> abstract >> memory >> buffers, and call .opencl_handle() on them to >> get the cl_mem >> handles: >> >> A.handle().opencl_handle() >> >> Similarly, the command queue is obtained with >> viennacl::ocl::get_queue().handle().get() >> >> Unfortunately it's not explicitly written in the >> manual :-/ >> >> Best regards, >> Karli >> >> >> On 08/12/2016 09:39 PM, Charles Determan wrote: >> >> I also would need to access the command >> queue handle >> (cl_command_queue) >> object to pass to clBLAS and clMAGMA >> functions. Is >> this easily >> accessible as well? >> >> Thanks, >> Charles >> >> On Fri, Aug 12, 2016 at 11:45 AM, Charles >> Determan >> <cde...@gm... >> <mailto:cde...@gm...> >> <mailto:cde...@gm... >> <mailto:cde...@gm...>> <mailto:cde...@gm... >> <mailto:cde...@gm...> >> <mailto:cde...@gm... >> <mailto:cde...@gm...>>> >> <mailto:cde...@gm... 
>> <mailto:cde...@gm...> >> <mailto:cde...@gm... >> <mailto:cde...@gm...>> >> <mailto:cde...@gm... >> <mailto:cde...@gm...> >> <mailto:cde...@gm... >> <mailto:cde...@gm...>>>>> wrote: >> >> Thanks Karl, >> >> I have been looking through the docs and >> I can't >> find an >> example for >> how to pull the OpenCL handles from a >> matrix. I >> saw a >> couple I >> think from a context but not sure that >> is what I >> need. >> Is this in >> the documentation somewhere? The closest >> I >> could fine >> is this page >> >> (http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html>> >> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html>>> >> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html>> >> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html>>>>). >> >> Regards, >> Charles >> >> On Wed, Aug 10, 2016 at 12:09 PM, >> <ru...@iu... <mailto:ru...@iu...> >> <mailto:ru...@iu... <mailto:ru...@iu...>> >> <mailto:ru...@iu... >> <mailto:ru...@iu...> >> <mailto:ru...@iu... >> <mailto:ru...@iu...>>> >> <mailto:ru...@iu... >> <mailto:ru...@iu...> >> <mailto:ru...@iu... >> <mailto:ru...@iu...>> >> >> <mailto:ru...@iu... >> <mailto:ru...@iu...> >> <mailto:ru...@iu... 
>> <mailto:ru...@iu...>>>>> wrote: >> >> Hi Charles, >> >> >> I have recently expressed some >> interest >> in different >> factorizations such as >> QR and SVD. I am aware that >> these or >> currently >> experimental >> within >> ViennaCL. Until such a time >> that these >> factorizations are >> fully supported >> (I hope to contribute but the >> algorithms are >> quite complex) >> would it be >> feasible to interface with a >> library like >> clMAGMA? I'm not >> sure of any >> other library offhand that does >> implement these >> methods. I >> thought perhaps >> VexCL but I couldn't find >> anything to that >> effect in the >> documentation. >> >> >> Sure, you can always grab the OpenCL >> handles >> from >> the matrices >> and plug that into clMAGMA. >> I don't think there is any value in >> ViennaCL >> wrapping the >> clMAGMA interfaces, though. >> >> Best regards, >> Karli >> >> >> >> >> >> >> >> >> >> >> >> > |
From: Karl R. <ru...@iu...> - 2016-08-17 09:50:15
|
Hi Andy and Dmitriy, apologies for the late reply and thanks for narrowing down the problem. I could reproduce the problem on my machine and can confirm that the issue is due to the order of include files. Specifically, Andy's code compiles if #include "viennacl/compressed_matrix.hpp" is included *after* #include "viennacl/matrix.hpp" I still need to figure out how to fix this (the order of includes should never matter!) and will let you know as soon as a fix is pushed to the repository. > > Karl, what is recommended set of hpp files to be included? This question has - unfortunately - not a simple answer. I recommend to start with the includes in the respective examples and go from there. More specifically, consider the following basic set of includes: Types: viennacl/range.hpp - index ranges viennacl/slice.hpp - index slices viennacl/vector.hpp - vectors viennacl/vector_proxy.hpp - subvector support viennacl/matrix.hpp - dense matrices viennacl/matrix_proxy.hpp - submatrices of dense matrices viennacl/compressed_matrix.hpp - CSR sparse matrices (CSR recommended) Operations: viennacl/linalg/prod.hpp - Matrix-vector and matrix-matrix products viennacl/linalg/inner_prod.hpp - Inner products (vectors) viennacl/linalg/norm_X.hpp - vector and matrix norms (X in "1", "2", "inf", or "frobenius") Algorithms (solvers, factorizations, etc.) Pick the correspondingly named header file in viennacl/linalg These headers should almost always suffice. > and more specifically, what is already included transitively, so we > don't include it again? Is there any general rule/convention about it? Ideally, all you should ever have to worry about is a single header file, e.g. something like #include "viennacl/viennacl.hpp" We could (and probably should?) add such a convenience header file at the expense of increased compilation times (and reduced encapsulation of source code against compiler issues). 
Ultimately, this all boils down to fighting limitations of the current header-only source code distribution model. Best regards, Karli > > > On Tue, Aug 16, 2016 at 11:16 AM, Dmitriy Lyubimov <dl...@gm... > <mailto:dl...@gm...>> wrote: > > Karl, > > i can independently confirm the problem with prod_impl instantiation > over expression of compressed times base_matrix into matrix type. > > I understand there are tests examples but something goes wrong with > the straightforward code. > > We are compiling for open cl and open mp at the same time. > > > On Mon, Aug 8, 2016 at 11:03 AM, Andrew Palumbo <ap...@ou... > <mailto:ap...@ou...>> wrote: > > Hi Karli, > > > I've mocked up in C++ the method that I'm trying to use from > java. Aside from adding some values, it looks very similar to > the code that you have below. > > > I'm getting the same compiler error hat I was getting through > javacpp/JNI: > > > > sparseDenseMmul.cpp:85:103: required from here > /usr/include/viennacl/matrix.hpp:2247:36: error: no matching > function for call to > ‘prod_impl(const viennacl::compressed_matrix<double>&, const > viennacl::matrix_base<double, long unsigned int, long int>&, > viennacl::matrix_base<double, long unsigned int, long int>&)’ > viennacl::linalg::prod_impl(proxy.lhs(), proxy.rhs(), > lhs); > > ^ > In file included from /usr/include/viennacl/matrix.hpp:28:0, > from > /usr/include/viennacl/linalg/sparse_matrix_operations.hpp:28, > from > /usr/include/viennacl/compressed_matrix.hpp:31, > from sparseDenseMmul.cpp:7: > /usr/include/viennacl/linalg/matrix_operations.hpp:438:10: > note: candidate: > template<class NumericT> void viennacl::linalg::prod_impl(const > viennacl::matrix_base<T>&, const viennacl::vector_base<T>&, > viennacl::vector_base<T>&) > void prod_impl(const matrix_base<NumericT> & mat, > > > The code is below, and I've attached both the > "sparseDenseMmul.cpp" file and the full compilation error output > (very long, probably not useful) > > > Thanks very much, > 
> > Andy > > > > > > Attached as "sparseDenseMmul.cpp": > > > #include <iostream> > // not using openMP for this mockup > // #define VIENNACL_WITH_OPENMP 1 > // ViennaCL includes > #include "viennacl/forwards.h" > #include "viennacl/compressed_matrix.hpp" > #include "viennacl/linalg/prod.hpp" > #include "viennacl/backend/memory.hpp" > #include "viennacl/matrix.hpp" > #include "viennacl/detail/matrix_def.hpp" > #include "viennacl/tools/random.hpp" > #include "viennacl/context.hpp" > #include "viennacl/linalg/host_based/sparse_matrix_operations.hpp" > > > // C_dense_matrix = A_compressed_matrix %*% B_dense_matrix. > > // compile line w/o OpenMP: g++ sparseDenseMmul.cpp > -I/usr/include/viennacl/ -o sparseDenseMmul > > > > int main() > { > // trying to recreate javacpp wrapper functionalliy as closely > as possible > // so not using typedef, unsigned ints, etc, and defining > templates as doubles > // creating buffers as int/double arrays and then setting > pointers to them. > // (not 100% sure that this is how javacpp passes pointers but > should be close.) > > > //typedef double ScalarType; > > // in acuallity, we cast `int`s from jni/javacpp. > unsigned int m = 10; > unsigned int n = 10; > unsigned long s = 5; > > unsigned int NNz_A = 12; > > > // allocate buffers and set pointers (similarly to javacpp) > // using ints (not unsigned ints) here from jni/javacpp. > int A_row_jumpers[m + 1] = {0, 0, 1, 2, 4, 5, 6, 7, 9, 11, 12}; > int *A_row_ptr = A_row_jumpers; > > // using ints (not unsigned ints) here from jni/javacpp. 
> int A_col_idxs[NNz_A] = {4, 0, 2, 3, 2, 4, 0, 4, 3, 0, 3, 0}; > int *A_col_ptr = A_col_idxs; > > double A_values[NNz_A] = {0.4065367203992265, > 0.04957158909682802, 0.3708618354358446, > 0.5205586068847993, 0.6963900565931678, > 0.8330915529787706, 0.32839112750638844, > 0.4265801782090245, 0.7856168903297948, > 0.14733066454561583, 0.9501663495824946, > 0.9710498974366047}; > double* A_values_ptr = A_values; > > > // using double values in Mahout setting template directlyfor > our compressed_matrix, A > viennacl::compressed_matrix<double> A_compressed_matrix(m, s); > > // set the ptrs for A > A_compressed_matrix.set(A_row_ptr, A_col_ptr, A_values_ptr, m, > s, NNz_A); > > // B is dense s so we only need s x n values. > double B_values[s * n] = {0}; > > // add some random data to B: > viennacl::tools::uniform_random_numbers<double> randomNumber; > for (int i = 0; i< s * n; i++) { > B_values[i] = randomNumber(); > } > > double* B_values_ptr = B_values; > > > // for our row_major dense_matrix, B can set the double values > in the construcor > // this is currently the constructor that we're using through > scala/javacpp. > const viennacl::matrix<double,viennacl::row_major> > B_dense_matrix(B_values_ptr, > viennacl::MAIN_MEMORY, s, n); > > > // perform multiplication and inside of a compressed_matrix > constructor > viennacl::matrix<double> > C_dense_matrix(viennacl::linalg::prod(A_compressed_matrix , > B_dense_matrix)); > > > // print out matrix > std::cout << "ViennaCL: " << C_dense_matrix << std::endl; > > > // just exit with success for now if there are no runtime > errors. > > return EXIT_SUCCESS; > } > > > ------------------------------------------------------------------------ > *From:* Karl Rupp <ru...@iu... > <mailto:ru...@iu...>> > *Sent:* Sunday, August 7, 2016 2:20:26 PM > *To:* Andrew Palumbo; vie...@li... 
> <mailto:vie...@li...> > *Subject:* Re: [ViennaCL-devel] compressed_matrix %*% matrix_Base > > Hi Andy, > > the relevant tests for sparse matrices times dense matrices are in > tests/spmdm.cpp. In particular, I recreated a test case based on > your > description and couldn't find any issues: > > viennacl::compressed_matrix<NumericT> compressed_A; > viennacl::matrix<NumericT, FactorLayoutT> B1(std_A.size(), > cols_rhs); > viennacl::matrix_base<NumericT> B1_ref(B1); > viennacl::matrix_base<NumericT> > C2(viennacl::linalg::prod(compressed_A, B1_ref)); > > compiles cleanly. Could you please provide a code snippet > demonstrating > the problem you are encountering? > > Thanks and best regards, > Karli > > > > On 08/05/2016 09:04 PM, Andrew Palumbo wrote: > > Hi Karl, > > > > > > I've been trying to implement tests for: > > > > > > matrix_base<double> C = compressed_matrix<double> A %*% > > > > matrix_base<double,row_major> B. > > > > > > I cant find in the code or the documentation any constructor for > > matrix_base<T>( > > > > matrix_expression<const viennacl::compressed_matrix<T>, const > > viennacl::matrix_base<T>, viennacl::op_prod>) > > > > ie. a mixed expression of compressed_matrix and matrix_base > > > > and get a compilation error when I try to instantiate a: > > > > matrix_base<double>(matrix_expression<const > > viennacl::compressed_matrix<double>, const > > viennacl::matrix_base<double>, > > viennacl::op_prod>) > > > > Is there a transformation that I need to do from this > > > > matrix_expression<compressed_matrix<double>, matrix_base<double>, > > op_prod> > > > > to something else so that I may be able to initialize a matrix_base (or > > possibly even a compressed_matrix) from it? > > > > The compilation error that i get is below. > > > > Thanks, > > > > Andy > > > > > ------------------------------------------------------------------------------ > What NetFlow Analyzer can do for you? 
_______________________________________________ > ViennaCL-devel mailing list > Vie...@li... > https://lists.sourceforge.net/lists/listinfo/viennacl-devel > > > |
From: Karl R. <ru...@iu...> - 2016-08-17 09:01:00
|
Hi, > I have tried now with the regular .size() calls and setting the leading > dimensions with .internal_size(). Same error. This should be either .internal_size1() or .internal_size2(). For dense matrices .internal_size() equals .internal_size1() * internal_size2(). > The error is reported on line 274 of the clBLAS/src/libary/blas/xgemm.cc > file. It appears this loop is where it is causing the problem. > > for (unsigned int i = 0; i < numKernelArgs; i++) { > CL_CHECK( clSetKernelArg( clKernel, i, kernelArgSizes[i], > kernelArgs[i]) ) > } I can't tell what is going wrong here based on the description provided. Best regards, Karli > > On Tue, Aug 16, 2016 at 4:33 AM, Karl Rupp <ru...@iu... > <mailto:ru...@iu...>> wrote: > > Hi, > > > > On 08/15/2016 08:56 PM, Charles Determan wrote: > > Karl, > > I have the OpenCL backend enabled and I have tried: > > cl_mem bufA = A.handle().opencl_handle().get() > cl_mem bufB = B.handle().opencl_handle().get() > cl_mem bufC = C.handle().opencl_handle().get() > > cl_command_queue queue = > viennacl::ocl::current_context().get_queue().handle().get(); > > err = clblasDgemm(clblasRowMajor, clblasNoTrans, clblasNoTrans, > A.internal_size2(), > B.internal_size1(), A.internal_size1(), > > > Are you sure these sizes are correct? I'd expect that you use > .size1() and .size2() here, and use the respective .internal_size1() > and .internal_size2() for the leading dimensions (lda, ldb, ldc). > > Best regards, > Karli > > > alpha, bufA, 0, lda, > bufB, 0, ldb, beta, > bufC, 0, ldc, > 1, > &queue, > 0, NULL, 0); > > > Although this compiles it results in the error - > CL_INVALID_MEM_OBJECT > > > Where (which line) do you get the error? > > Best regards, > Karli > > Not sure if you have any other thoughts or if I should try > asking clBLAS > developers. > > Thanks again, > Charles > > On Mon, Aug 15, 2016 at 1:30 PM, Karl Rupp > <ru...@iu... <mailto:ru...@iu...> > <mailto:ru...@iu... 
<mailto:ru...@iu...>>> > wrote: > > Hi Charles, > > I am trying to verify my interface with clBLAS before going > completely > in to clMAGMA. However, I keep getting an OpenCL error > -38 which > corresponds to invalid memory (CL_INVALID_MEM_OBJECT) > when trying a > clblasDgemm call. This must be referring to the opencl > memory > handles I > am passing in. The fields generally accepts memory > buffers (cl_mem) > objects. I have tried passing both > A.handle.opencl_handle() and > A.handle.opencl_handle().get() in those fields but get > the same > error. > > > These should be A.handle.opencl_handle().get() > Mind the parantheses after 'handle'. > > Also, you will get the error if you don't enable the OpenCL > backends, or if you enabled the CUDA backend as well (as > CUDA will > be the default then). > > Best regards, > Karli > > > I will continue to poke around (maybe I need to use > internal_size > numbers) but thought I would ask you about this. > > Any insight? > > Thanks, > > Charles > > On Fri, Aug 12, 2016 at 3:21 PM, Charles Determan > <cde...@gm... <mailto:cde...@gm...> > <mailto:cde...@gm... <mailto:cde...@gm...>> > <mailto:cde...@gm... > <mailto:cde...@gm...> <mailto:cde...@gm... > <mailto:cde...@gm...>>>> > wrote: > > Thanks Karl, > > One followup question, what distinguishes handle(), > handle1(), and > handle2()? Do they refer to different buffers? > > Regards, > Charles > > On Fri, Aug 12, 2016 at 3:13 PM, Karl Rupp > <ru...@iu... <mailto:ru...@iu...> > <mailto:ru...@iu... <mailto:ru...@iu...>> > <mailto:ru...@iu... > <mailto:ru...@iu...> > <mailto:ru...@iu... 
> <mailto:ru...@iu...>>>> wrote: > > Hi Charles, > > call .handle()/.handle1()/.handle2() to get the > abstract > memory > buffers, and call .opencl_handle() on them to > get the cl_mem > handles: > > A.handle().opencl_handle() > > Similarly, the command queue is obtained with > viennacl::ocl::get_queue().handle().get() > > Unfortunately it's not explicitly written in the > manual :-/ > > Best regards, > Karli > > > On 08/12/2016 09:39 PM, Charles Determan wrote: > > I also would need to access the command > queue handle > (cl_command_queue) > object to pass to clBLAS and clMAGMA > functions. Is > this easily > accessible as well? > > Thanks, > Charles > > On Fri, Aug 12, 2016 at 11:45 AM, Charles > Determan > <cde...@gm... > <mailto:cde...@gm...> > <mailto:cde...@gm... > <mailto:cde...@gm...>> <mailto:cde...@gm... > <mailto:cde...@gm...> > <mailto:cde...@gm... > <mailto:cde...@gm...>>> > <mailto:cde...@gm... > <mailto:cde...@gm...> > <mailto:cde...@gm... > <mailto:cde...@gm...>> > <mailto:cde...@gm... > <mailto:cde...@gm...> > <mailto:cde...@gm... > <mailto:cde...@gm...>>>>> wrote: > > Thanks Karl, > > I have been looking through the docs and > I can't > find an > example for > how to pull the OpenCL handles from a > matrix. I > saw a > couple I > think from a context but not sure that > is what I > need. > Is this in > the documentation somewhere? 
The closest I > could fine > is this page > > (http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html> > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html>> > > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html> > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html>>> > > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html> > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html>> > > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html> > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html>>>>). > > Regards, > Charles > > On Wed, Aug 10, 2016 at 12:09 PM, > <ru...@iu... <mailto:ru...@iu...> > <mailto:ru...@iu... <mailto:ru...@iu...>> > <mailto:ru...@iu... > <mailto:ru...@iu...> > <mailto:ru...@iu... > <mailto:ru...@iu...>>> > <mailto:ru...@iu... > <mailto:ru...@iu...> > <mailto:ru...@iu... > <mailto:ru...@iu...>> > > <mailto:ru...@iu... > <mailto:ru...@iu...> > <mailto:ru...@iu... > <mailto:ru...@iu...>>>>> wrote: > > Hi Charles, > > > I have recently expressed some > interest > in different > factorizations such as > QR and SVD. I am aware that > these or > currently > experimental > within > ViennaCL. Until such a time > that these > factorizations are > fully supported > (I hope to contribute but the > algorithms are > quite complex) > would it be > feasible to interface with a > library like > clMAGMA? I'm not > sure of any > other library offhand that does > implement these > methods. 
I > thought perhaps > VexCL but I couldn't find > anything to that > effect in the > documentation. > > > Sure, you can always grab the OpenCL > handles > from > the matrices > and plug that into clMAGMA. > I don't think there is any value in > ViennaCL > wrapping the > clMAGMA interfaces, though. > > Best regards, > Karli > > > > > > > > > > > |
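Editor's note: the handle-extraction recipe is scattered across the quoting above, so here it is in one place. This is a fragment collected from Karl's replies, not a compilable example, and the compressed_matrix comment is an assumption not confirmed in this thread:

```cpp
// Raw OpenCL handles from ViennaCL objects, per Karl's replies:
cl_mem buf = A.handle().opencl_handle().get();   // data buffer of matrix A

cl_command_queue queue =
    viennacl::ocl::get_queue().handle().get();   // active command queue

// Assumption: for a compressed_matrix, handle1() and handle2() expose the
// row-jumper and column-index buffers, while handle() holds the values;
// a dense matrix uses only handle().
```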
From: Dmitriy L. <dl...@gm...> - 2016-08-16 20:33:51
|
OK, so the problem seems to narrow down to which include files are included, and in what order. In particular, sparse operations are broken iff both compressed_matrix.hpp and prod.hpp are included (and maybe forwards.h). Strangely, it doesn't break dense template operations, but it is confirmed: including linalg/prod.hpp and other redundant files breaks the sparse operations. Karl, what is the recommended set of hpp files to include? And more specifically, what is already included transitively, so we don't include it again? Is there any general rule/convention about this? On Tue, Aug 16, 2016 at 11:16 AM, Dmitriy Lyubimov <dl...@gm...> wrote: > Karl, > > i can independently confirm the problem with prod_impl instantiation over > expression of compressed times base_matrix into matrix type. > > I understand there are tests examples but something goes wrong with the > straightforward code. > > We are compiling for open cl and open mp at the same time. > > > On Mon, Aug 8, 2016 at 11:03 AM, Andrew Palumbo <ap...@ou...> > wrote: > >> Hi Karli, >> >> >> I've mocked up in C++ the method that I'm trying to use from java. Aside >> from adding some values, it looks very similar to the code that you have >> below.
>> >> >> I'm getting the same compiler error hat I was getting through javacpp/JNI: >> >> >> >> sparseDenseMmul.cpp:85:103: required from here >> /usr/include/viennacl/matrix.hpp:2247:36: error: no matching >> function for call to >> ‘prod_impl(const viennacl::compressed_matrix<double>&, const >> viennacl::matrix_base<double, long unsigned int, long int>&, >> viennacl::matrix_base<double, long unsigned int, long int>&)’ >> viennacl::linalg::prod_impl(proxy.lhs(), proxy.rhs(), lhs); >> >> ^ >> In file included from /usr/include/viennacl/matrix.hpp:28:0, >> from /usr/include/viennacl/linalg/s >> parse_matrix_operations.hpp:28, >> from /usr/include/viennacl/compressed_matrix.hpp:31, >> from sparseDenseMmul.cpp:7: >> /usr/include/viennacl/linalg/matrix_operations.hpp:438:10: note: >> candidate: >> template<class NumericT> void viennacl::linalg::prod_impl(const >> viennacl::matrix_base<T>&, const viennacl::vector_base<T>&, >> viennacl::vector_base<T>&) >> void prod_impl(const matrix_base<NumericT> & mat, >> >> >> The code is below, and I've attached both the "sparseDenseMmul.cpp" file >> and the full compilation error output (very long, probably not useful) >> >> >> Thanks very much, >> >> >> Andy >> >> >> >> >> >> Attached as "sparseDenseMmul.cpp": >> >> >> #include <iostream> >> // not using openMP for this mockup >> // #define VIENNACL_WITH_OPENMP 1 >> // ViennaCL includes >> #include "viennacl/forwards.h" >> #include "viennacl/compressed_matrix.hpp" >> #include "viennacl/linalg/prod.hpp" >> #include "viennacl/backend/memory.hpp" >> #include "viennacl/matrix.hpp" >> #include "viennacl/detail/matrix_def.hpp" >> #include "viennacl/tools/random.hpp" >> #include "viennacl/context.hpp" >> #include "viennacl/linalg/host_based/sparse_matrix_operations.hpp" >> >> >> // C_dense_matrix = A_compressed_matrix %*% B_dense_matrix. 
>> >> // compile line w/o OpenMP: g++ sparseDenseMmul.cpp >> -I/usr/include/viennacl/ -o sparseDenseMmul >> >> >> >> int main() >> { >> // trying to recreate javacpp wrapper functionalliy as closely as >> possible >> // so not using typedef, unsigned ints, etc, and defining templates as >> doubles >> // creating buffers as int/double arrays and then setting pointers to >> them. >> // (not 100% sure that this is how javacpp passes pointers but should >> be close.) >> >> >> //typedef double ScalarType; >> >> // in acuallity, we cast `int`s from jni/javacpp. >> unsigned int m = 10; >> unsigned int n = 10; >> unsigned long s = 5; >> >> unsigned int NNz_A = 12; >> >> >> // allocate buffers and set pointers (similarly to javacpp) >> // using ints (not unsigned ints) here from jni/javacpp. >> int A_row_jumpers[m + 1] = {0, 0, 1, 2, 4, 5, 6, 7, 9, 11, 12}; >> int *A_row_ptr = A_row_jumpers; >> >> // using ints (not unsigned ints) here from jni/javacpp. >> int A_col_idxs[NNz_A] = {4, 0, 2, 3, 2, 4, 0, 4, 3, 0, 3, 0}; >> int *A_col_ptr = A_col_idxs; >> >> double A_values[NNz_A] = {0.4065367203992265, 0.04957158909682802, >> 0.3708618354358446, >> 0.5205586068847993, 0.6963900565931678, 0.8330915529787706, >> 0.32839112750638844, >> 0.4265801782090245, 0.7856168903297948, >> 0.14733066454561583, 0.9501663495824946, >> 0.9710498974366047}; >> double* A_values_ptr = A_values; >> >> >> // using double values in Mahout setting template directlyfor our >> compressed_matrix, A >> viennacl::compressed_matrix<double> A_compressed_matrix(m, s); >> >> // set the ptrs for A >> A_compressed_matrix.set(A_row_ptr, A_col_ptr, A_values_ptr, m, s, >> NNz_A); >> >> // B is dense s so we only need s x n values. 
>> double B_values[s * n] = {0}; >> >> // add some random data to B: >> viennacl::tools::uniform_random_numbers<double> randomNumber; >> for (int i = 0; i< s * n; i++) { >> B_values[i] = randomNumber(); >> } >> >> double* B_values_ptr = B_values; >> >> >> // for our row_major dense_matrix, B can set the double values in the >> construcor >> // this is currently the constructor that we're using through >> scala/javacpp. >> const viennacl::matrix<double,viennacl::row_major> >> B_dense_matrix(B_values_ptr, >> viennacl::MAIN_MEMORY, s, n); >> >> >> // perform multiplication and inside of a compressed_matrix constructor >> viennacl::matrix<double> C_dense_matrix(viennacl::linalg::prod(A_compressed_matrix >> , B_dense_matrix)); >> >> >> // print out matrix >> std::cout << "ViennaCL: " << C_dense_matrix << std::endl; >> >> >> // just exit with success for now if there are no runtime errors. >> >> return EXIT_SUCCESS; >> } >> >> >> ------------------------------ >> *From:* Karl Rupp <ru...@iu...> >> *Sent:* Sunday, August 7, 2016 2:20:26 PM >> *To:* Andrew Palumbo; vie...@li... >> *Subject:* Re: [ViennaCL-devel] compressed_matrix %*% matrix_Base >> >> Hi Andy, >> >> the relevant tests for sparse matrices times dense matrices are in >> tests/spmdm.cpp. In particular, I recreated a test case based on your >> description and couldn't find any issues: >> >> viennacl::compressed_matrix<NumericT> compressed_A; >> viennacl::matrix<NumericT, FactorLayoutT> B1(std_A.size(), cols_rhs); >> viennacl::matrix_base<NumericT> B1_ref(B1); >> viennacl::matrix_base<NumericT> >> C2(viennacl::linalg::prod(compressed_A, B1_ref)); >> >> compiles cleanly. Could you please provide a code snippet demonstrating >> the problem you are encountering? 
>> >> Thanks and best regards, >> Karli >> >> >> >> On 08/05/2016 09:04 PM, Andrew Palumbo wrote: >> > Hi Karl, >> > >> > >> > I've been trying to implement tests for: >> > >> > >> > matrix_base<double> C = compressed_matrix<double> A %*% >> > >> > matrix_base<double,row_major> B. >> > >> > >> > I cant find in the code or the documentation any constructor for >> > matrix_base<T>( >> > >> > matrix_expression<const viennacl::compressed_matrix<T>, const >> > viennacl::matrix_base<T>, viennacl::op_prod>) >> > >> > ie. a mixed expression of compressed_matrix and matrix_base >> > >> > and get a compilation error when I try to instantiate a: >> > >> > matrix_base<double>(matrix_expression<const >> > viennacl::compressed_matrix<double>, const >> > viennacl::matrix_base<double>, >> > viennacl::op_prod>) >> > >> > Is there a transformation that I need to do from this >> > >> > matrix_expression<compressed_matrix<double>, matrix_base<double>, >> > op_prod> >> > >> > to something else so that I may be able to initialize a matrix_base (or >> > possibly even a compressed_matrix) from it? >> > >> > The compilation error that i get is below. >> > >> > Thanks, >> > >> > Andy >> > >> >> >> ------------------------------------------------------------ >> ------------------ >> What NetFlow Analyzer can do for you? Monitors network bandwidth and >> traffic >> patterns at an interface-level. Reveals which users, apps, and protocols >> are >> consuming the most bandwidth. Provides multi-vendor support for NetFlow, >> J-Flow, sFlow and other flows. Make informed decisions using capacity >> planning reports. http://sdm.link/zohodev2dev >> _______________________________________________ >> ViennaCL-devel mailing list >> Vie...@li... >> https://lists.sourceforge.net/lists/listinfo/viennacl-devel >> >> > |
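Editor's note on the include question: ViennaCL's own spmdm test (the tests/spmdm.cpp Karl mentions above) compiles the same sparse-times-dense product, so its headers hint at a minimal set. The list below is an educated guess inferred from that, not official guidance:

```cpp
// Probable minimal include set for C = prod(compressed_matrix, matrix);
// an assumption inferred from tests/spmdm.cpp, not verified here.
// Backend macros (VIENNACL_WITH_OPENMP, VIENNACL_WITH_OPENCL) must be
// defined before the first ViennaCL header is included.
#include "viennacl/compressed_matrix.hpp"  // sparse type
#include "viennacl/matrix.hpp"             // dense matrix / matrix_base
#include "viennacl/linalg/prod.hpp"        // prod() dispatch
// Backend-specific headers such as
// linalg/host_based/sparse_matrix_operations.hpp should not be needed
// directly; prod.hpp is expected to pull in the prod_impl overloads for
// the enabled backends.
```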
From: Dmitriy L. <dl...@gm...> - 2016-08-16 18:16:55
|
Karl, i can independently confirm the problem with prod_impl instantiation over expression of compressed times base_matrix into matrix type. I understand there are tests examples but something goes wrong with the straightforward code. We are compiling for open cl and open mp at the same time. On Mon, Aug 8, 2016 at 11:03 AM, Andrew Palumbo <ap...@ou...> wrote: > Hi Karli, > > > I've mocked up in C++ the method that I'm trying to use from java. Aside > from adding some values, it looks very similar to the code that you have > below. > > > I'm getting the same compiler error hat I was getting through javacpp/JNI: > > > > sparseDenseMmul.cpp:85:103: required from here > /usr/include/viennacl/matrix.hpp:2247:36: error: no matching function > for call to > ‘prod_impl(const viennacl::compressed_matrix<double>&, const > viennacl::matrix_base<double, long unsigned int, long int>&, > viennacl::matrix_base<double, long unsigned int, long int>&)’ > viennacl::linalg::prod_impl(proxy.lhs(), proxy.rhs(), lhs); > > ^ > In file included from /usr/include/viennacl/matrix.hpp:28:0, > from /usr/include/viennacl/linalg/ > sparse_matrix_operations.hpp:28, > from /usr/include/viennacl/compressed_matrix.hpp:31, > from sparseDenseMmul.cpp:7: > /usr/include/viennacl/linalg/matrix_operations.hpp:438:10: note: > candidate: > template<class NumericT> void viennacl::linalg::prod_impl(const > viennacl::matrix_base<T>&, const viennacl::vector_base<T>&, > viennacl::vector_base<T>&) > void prod_impl(const matrix_base<NumericT> & mat, > > > The code is below, and I've attached both the "sparseDenseMmul.cpp" file > and the full compilation error output (very long, probably not useful) > > > Thanks very much, > > > Andy > > > > > > Attached as "sparseDenseMmul.cpp": > > > #include <iostream> > // not using openMP for this mockup > // #define VIENNACL_WITH_OPENMP 1 > // ViennaCL includes > #include "viennacl/forwards.h" > #include "viennacl/compressed_matrix.hpp" > #include "viennacl/linalg/prod.hpp" 
> #include "viennacl/backend/memory.hpp" > #include "viennacl/matrix.hpp" > #include "viennacl/detail/matrix_def.hpp" > #include "viennacl/tools/random.hpp" > #include "viennacl/context.hpp" > #include "viennacl/linalg/host_based/sparse_matrix_operations.hpp" > > > // C_dense_matrix = A_compressed_matrix %*% B_dense_matrix. > > // compile line w/o OpenMP: g++ sparseDenseMmul.cpp > -I/usr/include/viennacl/ -o sparseDenseMmul > > > > int main() > { > // trying to recreate javacpp wrapper functionalliy as closely as > possible > // so not using typedef, unsigned ints, etc, and defining templates as > doubles > // creating buffers as int/double arrays and then setting pointers to > them. > // (not 100% sure that this is how javacpp passes pointers but should be > close.) > > > //typedef double ScalarType; > > // in acuallity, we cast `int`s from jni/javacpp. > unsigned int m = 10; > unsigned int n = 10; > unsigned long s = 5; > > unsigned int NNz_A = 12; > > > // allocate buffers and set pointers (similarly to javacpp) > // using ints (not unsigned ints) here from jni/javacpp. > int A_row_jumpers[m + 1] = {0, 0, 1, 2, 4, 5, 6, 7, 9, 11, 12}; > int *A_row_ptr = A_row_jumpers; > > // using ints (not unsigned ints) here from jni/javacpp. > int A_col_idxs[NNz_A] = {4, 0, 2, 3, 2, 4, 0, 4, 3, 0, 3, 0}; > int *A_col_ptr = A_col_idxs; > > double A_values[NNz_A] = {0.4065367203992265, 0.04957158909682802, > 0.3708618354358446, > 0.5205586068847993, 0.6963900565931678, 0.8330915529787706, > 0.32839112750638844, > 0.4265801782090245, 0.7856168903297948, 0.14733066454561583, > 0.9501663495824946, > 0.9710498974366047}; > double* A_values_ptr = A_values; > > > // using double values in Mahout setting template directlyfor our > compressed_matrix, A > viennacl::compressed_matrix<double> A_compressed_matrix(m, s); > > // set the ptrs for A > A_compressed_matrix.set(A_row_ptr, A_col_ptr, A_values_ptr, m, s, > NNz_A); > > // B is dense s so we only need s x n values. 
> double B_values[s * n] = {0}; > > // add some random data to B: > viennacl::tools::uniform_random_numbers<double> randomNumber; > for (int i = 0; i< s * n; i++) { > B_values[i] = randomNumber(); > } > > double* B_values_ptr = B_values; > > > // for our row_major dense_matrix, B can set the double values in the > construcor > // this is currently the constructor that we're using through > scala/javacpp. > const viennacl::matrix<double,viennacl::row_major> > B_dense_matrix(B_values_ptr, viennacl::MAIN_MEMORY, > s, n); > > > // perform multiplication and inside of a compressed_matrix constructor > viennacl::matrix<double> C_dense_matrix(viennacl:: > linalg::prod(A_compressed_matrix , B_dense_matrix)); > > > // print out matrix > std::cout << "ViennaCL: " << C_dense_matrix << std::endl; > > > // just exit with success for now if there are no runtime errors. > > return EXIT_SUCCESS; > } > > > ------------------------------ > *From:* Karl Rupp <ru...@iu...> > *Sent:* Sunday, August 7, 2016 2:20:26 PM > *To:* Andrew Palumbo; vie...@li... > *Subject:* Re: [ViennaCL-devel] compressed_matrix %*% matrix_Base > > Hi Andy, > > the relevant tests for sparse matrices times dense matrices are in > tests/spmdm.cpp. In particular, I recreated a test case based on your > description and couldn't find any issues: > > viennacl::compressed_matrix<NumericT> compressed_A; > viennacl::matrix<NumericT, FactorLayoutT> B1(std_A.size(), cols_rhs); > viennacl::matrix_base<NumericT> B1_ref(B1); > viennacl::matrix_base<NumericT> > C2(viennacl::linalg::prod(compressed_A, B1_ref)); > > compiles cleanly. Could you please provide a code snippet demonstrating > the problem you are encountering? > > Thanks and best regards, > Karli > > > > On 08/05/2016 09:04 PM, Andrew Palumbo wrote: > > Hi Karl, > > > > > > I've been trying to implement tests for: > > > > > > matrix_base<double> C = compressed_matrix<double> A %*% > > > > matrix_base<double,row_major> B. 
> > > > > > I cant find in the code or the documentation any constructor for > > matrix_base<T>( > > > > matrix_expression<const viennacl::compressed_matrix<T>, const > > viennacl::matrix_base<T>, viennacl::op_prod>) > > > > ie. a mixed expression of compressed_matrix and matrix_base > > > > and get a compilation error when I try to instantiate a: > > > > matrix_base<double>(matrix_expression<const > > viennacl::compressed_matrix<double>, const > > viennacl::matrix_base<double>, > > viennacl::op_prod>) > > > > Is there a transformation that I need to do from this > > > > matrix_expression<compressed_matrix<double>, matrix_base<double>, > > op_prod> > > > > to something else so that I may be able to initialize a matrix_base (or > > possibly even a compressed_matrix) from it? > > > > The compilation error that i get is below. > > > > Thanks, > > > > Andy > > > > > ------------------------------------------------------------ > ------------------ > What NetFlow Analyzer can do for you? Monitors network bandwidth and > traffic > patterns at an interface-level. Reveals which users, apps, and protocols > are > consuming the most bandwidth. Provides multi-vendor support for NetFlow, > J-Flow, sFlow and other flows. Make informed decisions using capacity > planning reports. http://sdm.link/zohodev2dev > _______________________________________________ > ViennaCL-devel mailing list > Vie...@li... > https://lists.sourceforge.net/lists/listinfo/viennacl-devel > > |
From: Charles D. <cde...@gm...> - 2016-08-16 12:29:43
|
Karl, I have tried now with the regular .size() calls and setting the leading dimensions with .internal_size(). Same error. The error is reported on line 274 of the clBLAS/src/libary/blas/xgemm.cc file. It appears this loop is where it is causing the problem. for (unsigned int i = 0; i < numKernelArgs; i++) { CL_CHECK( clSetKernelArg( clKernel, i, kernelArgSizes[i], kernelArgs[i]) ) } Regards, Charles On Tue, Aug 16, 2016 at 4:33 AM, Karl Rupp <ru...@iu...> wrote: > Hi, > > > > On 08/15/2016 08:56 PM, Charles Determan wrote: > >> Karl, >> >> I have the OpenCL backend enabled and I have tried: >> >> cl_mem bufA = A.handle().opencl_handle().get() >> cl_mem bufB = B.handle().opencl_handle().get() >> cl_mem bufC = C.handle().opencl_handle().get() >> >> cl_command_queue queue = >> viennacl::ocl::current_context().get_queue().handle().get(); >> >> err = clblasDgemm(clblasRowMajor, clblasNoTrans, clblasNoTrans, >> A.internal_size2(), >> B.internal_size1(), A.internal_size1(), >> > > Are you sure these sizes are correct? I'd expect that you use .size1() and > .size2() here, and use the respective .internal_size1() and > .internal_size2() for the leading dimensions (lda, ldb, ldc). > > Best regards, > Karli > > > alpha, bufA, 0, lda, >> bufB, 0, ldb, beta, >> bufC, 0, ldc, >> 1, >> &queue, >> 0, NULL, 0); >> >> >> Although this compiles it results in the error - CL_INVALID_MEM_OBJECT >> > > Where (which line) do you get the error? > > Best regards, > Karli > > Not sure if you have any other thoughts or if I should try asking clBLAS >> developers. >> >> Thanks again, >> Charles >> >> On Mon, Aug 15, 2016 at 1:30 PM, Karl Rupp <ru...@iu... >> <mailto:ru...@iu...>> wrote: >> >> Hi Charles, >> >> I am trying to verify my interface with clBLAS before going >> completely >> in to clMAGMA. However, I keep getting an OpenCL error -38 which >> corresponds to invalid memory (CL_INVALID_MEM_OBJECT) when trying >> a >> clblasDgemm call. 
This must be referring to the opencl memory >> handles I >> am passing in. The fields generally accepts memory buffers >> (cl_mem) >> objects. I have tried passing both A.handle.opencl_handle() and >> A.handle.opencl_handle().get() in those fields but get the same >> error. >> >> >> These should be A.handle.opencl_handle().get() >> Mind the parantheses after 'handle'. >> >> Also, you will get the error if you don't enable the OpenCL >> backends, or if you enabled the CUDA backend as well (as CUDA will >> be the default then). >> >> Best regards, >> Karli >> >> >> I will continue to poke around (maybe I need to use internal_size >> numbers) but thought I would ask you about this. >> >> Any insight? >> >> Thanks, >> >> Charles >> >> On Fri, Aug 12, 2016 at 3:21 PM, Charles Determan >> <cde...@gm... <mailto:cde...@gm...> >> <mailto:cde...@gm... <mailto:cde...@gm...>>> >> wrote: >> >> Thanks Karl, >> >> One followup question, what distinguishes handle(), >> handle1(), and >> handle2()? Do they refer to different buffers? >> >> Regards, >> Charles >> >> On Fri, Aug 12, 2016 at 3:13 PM, Karl Rupp >> <ru...@iu... <mailto:ru...@iu...> >> <mailto:ru...@iu... >> <mailto:ru...@iu...>>> wrote: >> >> Hi Charles, >> >> call .handle()/.handle1()/.handle2() to get the abstract >> memory >> buffers, and call .opencl_handle() on them to get the >> cl_mem >> handles: >> >> A.handle().opencl_handle() >> >> Similarly, the command queue is obtained with >> viennacl::ocl::get_queue().handle().get() >> >> Unfortunately it's not explicitly written in the manual >> :-/ >> >> Best regards, >> Karli >> >> >> On 08/12/2016 09:39 PM, Charles Determan wrote: >> >> I also would need to access the command queue handle >> (cl_command_queue) >> object to pass to clBLAS and clMAGMA functions. Is >> this easily >> accessible as well? >> >> Thanks, >> Charles >> >> On Fri, Aug 12, 2016 at 11:45 AM, Charles Determan >> <cde...@gm... >> <mailto:cde...@gm...> <mailto:cde...@gm... 
>> <mailto:cde...@gm...>> >> <mailto:cde...@gm... >> <mailto:cde...@gm...> >> <mailto:cde...@gm... >> <mailto:cde...@gm...>>>> wrote: >> >> Thanks Karl, >> >> I have been looking through the docs and I can't >> find an >> example for >> how to pull the OpenCL handles from a matrix. I >> saw a >> couple I >> think from a context but not sure that is what I >> need. >> Is this in >> the documentation somewhere? The closest I >> could fine >> is this page >> >> (http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html> >> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html>> >> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html> >> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html>>>). >> >> Regards, >> Charles >> >> On Wed, Aug 10, 2016 at 12:09 PM, >> <ru...@iu... <mailto:ru...@iu...> >> <mailto:ru...@iu... >> <mailto:ru...@iu...>> >> <mailto:ru...@iu... >> <mailto:ru...@iu...> >> >> <mailto:ru...@iu... >> <mailto:ru...@iu...>>>> wrote: >> >> Hi Charles, >> >> >> I have recently expressed some interest >> in different >> factorizations such as >> QR and SVD. I am aware that these or >> currently >> experimental >> within >> ViennaCL. Until such a time that these >> factorizations are >> fully supported >> (I hope to contribute but the algorithms >> are >> quite complex) >> would it be >> feasible to interface with a library like >> clMAGMA? I'm not >> sure of any >> other library offhand that does >> implement these >> methods. I >> thought perhaps >> VexCL but I couldn't find anything to that >> effect in the >> documentation. >> >> >> Sure, you can always grab the OpenCL handles >> from >> the matrices >> and plug that into clMAGMA. 
>> I don't think there is any value in ViennaCL >> wrapping the >> clMAGMA interfaces, though. >> >> Best regards, >> Karli >> >> >> >> >> >> >> >> >> >> > |
From: Karl R. <ru...@iu...> - 2016-08-16 09:33:42
|
Hi, On 08/15/2016 08:56 PM, Charles Determan wrote: > Karl, > > I have the OpenCL backend enabled and I have tried: > > cl_mem bufA = A.handle().opencl_handle().get() > cl_mem bufB = B.handle().opencl_handle().get() > cl_mem bufC = C.handle().opencl_handle().get() > > cl_command_queue queue = > viennacl::ocl::current_context().get_queue().handle().get(); > > err = clblasDgemm(clblasRowMajor, clblasNoTrans, clblasNoTrans, > A.internal_size2(), > B.internal_size1(), A.internal_size1(), Are you sure these sizes are correct? I'd expect that you use .size1() and .size2() here, and use the respective .internal_size1() and .internal_size2() for the leading dimensions (lda, ldb, ldc). Best regards, Karli > alpha, bufA, 0, lda, > bufB, 0, ldb, beta, > bufC, 0, ldc, > 1, > &queue, > 0, NULL, 0); > > > Although this compiles it results in the error - CL_INVALID_MEM_OBJECT Where (which line) do you get the error? Best regards, Karli > Not sure if you have any other thoughts or if I should try asking clBLAS > developers. > > Thanks again, > Charles > > On Mon, Aug 15, 2016 at 1:30 PM, Karl Rupp <ru...@iu... > <mailto:ru...@iu...>> wrote: > > Hi Charles, > > I am trying to verify my interface with clBLAS before going > completely > in to clMAGMA. However, I keep getting an OpenCL error -38 which > corresponds to invalid memory (CL_INVALID_MEM_OBJECT) when trying a > clblasDgemm call. This must be referring to the opencl memory > handles I > am passing in. The fields generally accepts memory buffers (cl_mem) > objects. I have tried passing both A.handle.opencl_handle() and > A.handle.opencl_handle().get() in those fields but get the same > error. > > > These should be A.handle.opencl_handle().get() > Mind the parantheses after 'handle'. > > Also, you will get the error if you don't enable the OpenCL > backends, or if you enabled the CUDA backend as well (as CUDA will > be the default then). 
> > Best regards, > Karli > > > I will continue to poke around (maybe I need to use internal_size > numbers) but thought I would ask you about this. > > Any insight? > > Thanks, > > Charles > > On Fri, Aug 12, 2016 at 3:21 PM, Charles Determan > <cde...@gm... <mailto:cde...@gm...> > <mailto:cde...@gm... <mailto:cde...@gm...>>> > wrote: > > Thanks Karl, > > One followup question, what distinguishes handle(), > handle1(), and > handle2()? Do they refer to different buffers? > > Regards, > Charles > > On Fri, Aug 12, 2016 at 3:13 PM, Karl Rupp > <ru...@iu... <mailto:ru...@iu...> > <mailto:ru...@iu... > <mailto:ru...@iu...>>> wrote: > > Hi Charles, > > call .handle()/.handle1()/.handle2() to get the abstract > memory > buffers, and call .opencl_handle() on them to get the cl_mem > handles: > > A.handle().opencl_handle() > > Similarly, the command queue is obtained with > viennacl::ocl::get_queue().handle().get() > > Unfortunately it's not explicitly written in the manual :-/ > > Best regards, > Karli > > > On 08/12/2016 09:39 PM, Charles Determan wrote: > > I also would need to access the command queue handle > (cl_command_queue) > object to pass to clBLAS and clMAGMA functions. Is > this easily > accessible as well? > > Thanks, > Charles > > On Fri, Aug 12, 2016 at 11:45 AM, Charles Determan > <cde...@gm... > <mailto:cde...@gm...> <mailto:cde...@gm... > <mailto:cde...@gm...>> > <mailto:cde...@gm... > <mailto:cde...@gm...> > <mailto:cde...@gm... > <mailto:cde...@gm...>>>> wrote: > > Thanks Karl, > > I have been looking through the docs and I can't > find an > example for > how to pull the OpenCL handles from a matrix. I > saw a > couple I > think from a context but not sure that is what I > need. > Is this in > the documentation somewhere? 
The closest I > could fine > is this page > > (http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html> > > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html>> > > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html> > > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html>>>). > > Regards, > Charles > > On Wed, Aug 10, 2016 at 12:09 PM, > <ru...@iu... <mailto:ru...@iu...> > <mailto:ru...@iu... > <mailto:ru...@iu...>> > <mailto:ru...@iu... > <mailto:ru...@iu...> > > <mailto:ru...@iu... > <mailto:ru...@iu...>>>> wrote: > > Hi Charles, > > > I have recently expressed some interest > in different > factorizations such as > QR and SVD. I am aware that these or > currently > experimental > within > ViennaCL. Until such a time that these > factorizations are > fully supported > (I hope to contribute but the algorithms are > quite complex) > would it be > feasible to interface with a library like > clMAGMA? I'm not > sure of any > other library offhand that does > implement these > methods. I > thought perhaps > VexCL but I couldn't find anything to that > effect in the > documentation. > > > Sure, you can always grab the OpenCL handles > from > the matrices > and plug that into clMAGMA. > I don't think there is any value in ViennaCL > wrapping the > clMAGMA interfaces, though. > > Best regards, > Karli > > > > > > > > > |
From: Charles D. <cde...@gm...> - 2016-08-15 18:56:13
|
Karl, I have the OpenCL backend enabled and I have tried: cl_mem bufA = A.handle().opencl_handle().get() cl_mem bufB = B.handle().opencl_handle().get() cl_mem bufC = C.handle().opencl_handle().get() cl_command_queue queue = viennacl::ocl::current_context().get_queue().handle().get(); err = clblasDgemm(clblasRowMajor, clblasNoTrans, clblasNoTrans, A.internal_size2(), B.internal_size1(), A.internal_size1(), alpha, bufA, 0, lda, bufB, 0, ldb, beta, bufC, 0, ldc, 1, &queue, 0, NULL, 0); Although this compiles it results in the error - CL_INVALID_MEM_OBJECT Not sure if you have any other thoughts or if I should try asking clBLAS developers. Thanks again, Charles On Mon, Aug 15, 2016 at 1:30 PM, Karl Rupp <ru...@iu...> wrote: > Hi Charles, > > I am trying to verify my interface with clBLAS before going completely >> in to clMAGMA. However, I keep getting an OpenCL error -38 which >> corresponds to invalid memory (CL_INVALID_MEM_OBJECT) when trying a >> clblasDgemm call. This must be referring to the opencl memory handles I >> am passing in. The fields generally accepts memory buffers (cl_mem) >> objects. I have tried passing both A.handle.opencl_handle() and >> A.handle.opencl_handle().get() in those fields but get the same error. >> > > These should be A.handle.opencl_handle().get() > Mind the parantheses after 'handle'. > > Also, you will get the error if you don't enable the OpenCL backends, or > if you enabled the CUDA backend as well (as CUDA will be the default then). > > Best regards, > Karli > > >> I will continue to poke around (maybe I need to use internal_size >> numbers) but thought I would ask you about this. >> >> Any insight? >> >> Thanks, >> >> Charles >> >> On Fri, Aug 12, 2016 at 3:21 PM, Charles Determan <cde...@gm... >> <mailto:cde...@gm...>> wrote: >> >> Thanks Karl, >> >> One followup question, what distinguishes handle(), handle1(), and >> handle2()? Do they refer to different buffers? 
>> >> Regards, >> Charles >> >> On Fri, Aug 12, 2016 at 3:13 PM, Karl Rupp <ru...@iu... >> <mailto:ru...@iu...>> wrote: >> >> Hi Charles, >> >> call .handle()/.handle1()/.handle2() to get the abstract memory >> buffers, and call .opencl_handle() on them to get the cl_mem >> handles: >> >> A.handle().opencl_handle() >> >> Similarly, the command queue is obtained with >> viennacl::ocl::get_queue().handle().get() >> >> Unfortunately it's not explicitly written in the manual :-/ >> >> Best regards, >> Karli >> >> >> On 08/12/2016 09:39 PM, Charles Determan wrote: >> >> I also would need to access the command queue handle >> (cl_command_queue) >> object to pass to clBLAS and clMAGMA functions. Is this >> easily >> accessible as well? >> >> Thanks, >> Charles >> >> On Fri, Aug 12, 2016 at 11:45 AM, Charles Determan >> <cde...@gm... <mailto:cde...@gm...> >> <mailto:cde...@gm... >> <mailto:cde...@gm...>>> wrote: >> >> Thanks Karl, >> >> I have been looking through the docs and I can't find an >> example for >> how to pull the OpenCL handles from a matrix. I saw a >> couple I >> think from a context but not sure that is what I need. >> Is this in >> the documentation somewhere? The closest I could fine >> is this page >> (http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html> >> <http://viennacl.sourceforge.net/doc/manual-memory.html >> <http://viennacl.sourceforge.net/doc/manual-memory.html>>). >> >> Regards, >> Charles >> >> On Wed, Aug 10, 2016 at 12:09 PM, <ru...@iu... >> <mailto:ru...@iu...> >> <mailto:ru...@iu... >> >> <mailto:ru...@iu...>>> wrote: >> >> Hi Charles, >> >> >> I have recently expressed some interest in >> different >> factorizations such as >> QR and SVD. I am aware that these or currently >> experimental >> within >> ViennaCL. 
Until such a time that these >> factorizations are >> fully supported >> (I hope to contribute but the algorithms are >> quite complex) >> would it be >> feasible to interface with a library like >> clMAGMA? I'm not >> sure of any >> other library offhand that does implement these >> methods. I >> thought perhaps >> VexCL but I couldn't find anything to that >> effect in the >> documentation. >> >> >> Sure, you can always grab the OpenCL handles from >> the matrices >> and plug that into clMAGMA. >> I don't think there is any value in ViennaCL >> wrapping the >> clMAGMA interfaces, though. >> >> Best regards, >> Karli >> >> >> >> >> >> >> >> > |
From: Karl R. <ru...@iu...> - 2016-08-15 18:31:53
|
Hi Charles, > One followup question, what distinguishes handle(), handle1(), and > handle2()? Do they refer to different buffers? .handle() is the only one you need for vectors and dense matrices. Sparse matrices are represented by multiple arrays, hence the different handleX() routines (with numbers X). Best regards, Karli > On Fri, Aug 12, 2016 at 3:13 PM, Karl Rupp <ru...@iu... > <mailto:ru...@iu...>> wrote: > > Hi Charles, > > call .handle()/.handle1()/.handle2() to get the abstract memory > buffers, and call .opencl_handle() on them to get the cl_mem handles: > > A.handle().opencl_handle() > > Similarly, the command queue is obtained with > viennacl::ocl::get_queue().handle().get() > > Unfortunately it's not explicitly written in the manual :-/ > > Best regards, > Karli > > > On 08/12/2016 09:39 PM, Charles Determan wrote: > > I also would need to access the command queue handle > (cl_command_queue) > object to pass to clBLAS and clMAGMA functions. Is this easily > accessible as well? > > Thanks, > Charles > > On Fri, Aug 12, 2016 at 11:45 AM, Charles Determan > <cde...@gm... <mailto:cde...@gm...> > <mailto:cde...@gm... <mailto:cde...@gm...>>> > wrote: > > Thanks Karl, > > I have been looking through the docs and I can't find an > example for > how to pull the OpenCL handles from a matrix. I saw a couple I > think from a context but not sure that is what I need. Is > this in > the documentation somewhere? The closest I could find is > this page > (http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html> > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html>>). > > Regards, > Charles > > On Wed, Aug 10, 2016 at 12:09 PM, <ru...@iu... > <mailto:ru...@iu...> > <mailto:ru...@iu... > <mailto:ru...@iu...>>> wrote: > > Hi Charles, > > > I have recently expressed some interest in different > factorizations such as > QR and SVD. 
I am aware that these or currently > experimental > within > ViennaCL. Until such a time that these > factorizations are > fully supported > (I hope to contribute but the algorithms are quite > complex) > would it be > feasible to interface with a library like clMAGMA? > I'm not > sure of any > other library offhand that does implement these > methods. I > thought perhaps > VexCL but I couldn't find anything to that effect in the > documentation. > > > Sure, you can always grab the OpenCL handles from the > matrices > and plug that into clMAGMA. > I don't think there is any value in ViennaCL wrapping the > clMAGMA interfaces, though. > > Best regards, > Karli > > > > > > |
From: Karl R. <ru...@iu...> - 2016-08-15 18:30:58
|
Hi Charles, > I am trying to verify my interface with clBLAS before going completely > in to clMAGMA. However, I keep getting an OpenCL error -38 which > corresponds to invalid memory (CL_INVALID_MEM_OBJECT) when trying a > clblasDgemm call. This must be referring to the opencl memory handles I > am passing in. The fields generally accepts memory buffers (cl_mem) > objects. I have tried passing both A.handle.opencl_handle() and > A.handle.opencl_handle().get() in those fields but get the same error. These should be A.handle().opencl_handle().get() Mind the parentheses after 'handle'. Also, you will get the error if you don't enable the OpenCL backends, or if you enabled the CUDA backend as well (as CUDA will be the default then). Best regards, Karli > > I will continue to poke around (maybe I need to use internal_size > numbers) but thought I would ask you about this. > > Any insight? > > Thanks, > > Charles > > On Fri, Aug 12, 2016 at 3:21 PM, Charles Determan <cde...@gm... > <mailto:cde...@gm...>> wrote: > > Thanks Karl, > > One followup question, what distinguishes handle(), handle1(), and > handle2()? Do they refer to different buffers? > > Regards, > Charles > > On Fri, Aug 12, 2016 at 3:13 PM, Karl Rupp <ru...@iu... > <mailto:ru...@iu...>> wrote: > > Hi Charles, > > call .handle()/.handle1()/.handle2() to get the abstract memory > buffers, and call .opencl_handle() on them to get the cl_mem > handles: > > A.handle().opencl_handle() > > Similarly, the command queue is obtained with > viennacl::ocl::get_queue().handle().get() > > Unfortunately it's not explicitly written in the manual :-/ > > Best regards, > Karli > > > On 08/12/2016 09:39 PM, Charles Determan wrote: > > I also would need to access the command queue handle > (cl_command_queue) > object to pass to clBLAS and clMAGMA functions. Is this easily > accessible as well? > > Thanks, > Charles > > On Fri, Aug 12, 2016 at 11:45 AM, Charles Determan > <cde...@gm... 
<mailto:cde...@gm...> > <mailto:cde...@gm... > <mailto:cde...@gm...>>> wrote: > > Thanks Karl, > > I have been looking through the docs and I can't find an > example for > how to pull the OpenCL handles from a matrix. I saw a > couple I > think from a context but not sure that is what I need. > Is this in > the documentation somewhere? The closest I could fine > is this page > (http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html> > <http://viennacl.sourceforge.net/doc/manual-memory.html > <http://viennacl.sourceforge.net/doc/manual-memory.html>>). > > Regards, > Charles > > On Wed, Aug 10, 2016 at 12:09 PM, <ru...@iu... > <mailto:ru...@iu...> > <mailto:ru...@iu... > <mailto:ru...@iu...>>> wrote: > > Hi Charles, > > > I have recently expressed some interest in different > factorizations such as > QR and SVD. I am aware that these or currently > experimental > within > ViennaCL. Until such a time that these > factorizations are > fully supported > (I hope to contribute but the algorithms are > quite complex) > would it be > feasible to interface with a library like > clMAGMA? I'm not > sure of any > other library offhand that does implement these > methods. I > thought perhaps > VexCL but I couldn't find anything to that > effect in the > documentation. > > > Sure, you can always grab the OpenCL handles from > the matrices > and plug that into clMAGMA. > I don't think there is any value in ViennaCL > wrapping the > clMAGMA interfaces, though. > > Best regards, > Karli > > > > > > > |