From: Philippe T. <phi...@gm...> - 2012-08-23 13:56:19
Hello everybody!

Browsing through the internals, I found that

    template <typename SCALARTYPE, typename F, unsigned int ALIGNMENT>
    void fast_copy(SCALARTYPE * cpu_matrix_begin,
                   SCALARTYPE * cpu_matrix_end,
                   matrix<SCALARTYPE, F, ALIGNMENT> & gpu_matrix)

might not have the intended behavior: it reallocates the internal gpu_buffer of gpu_matrix to a buffer of size (cpu_matrix_end - cpu_matrix_begin) * sizeof(SCALARTYPE) and then copies the data.

A user might, however, want to copy just a part of cpu_matrix to the GPU. Also, reallocating the matrix's buffer without changing its sizes seems a bit odd.

Wouldn't it be more intuitive for fast_copy to call clEnqueueWriteBuffer, and to add a constructor matrix(size1, size2, cpu_matrix_begin) that allocates the buffer with the CL_MEM_COPY_HOST_PTR flag?

Such a constructor would also cover a const-correctness case I've run into: creating a GPU matrix from CPU data without ever writing the data back. In that case the gpu_matrix is const, but it is not possible to call fast_copy on it!

Best regards,
Philippe
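To make the two suggestions concrete, here is a minimal sketch that runs without a GPU. Everything in it is an assumption for illustration: toy_matrix and its write() member are hypothetical names, not ViennaCL API, and a std::vector stands in for the OpenCL buffer. It only shows the intended semantics: a filling constructor that works for const objects, and a write that never reallocates.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a device matrix. A std::vector plays the role of
// the OpenCL buffer so that the sketch runs without a GPU.
template <typename T>
class toy_matrix {
public:
  // Allocate only; entries are value-initialized (zero for arithmetic types).
  toy_matrix(std::size_t rows, std::size_t cols)
    : rows_(rows), cols_(cols), buffer_(rows * cols) {}

  // Proposed constructor: allocate and fill in one step, in the spirit of
  // clCreateBuffer with CL_MEM_COPY_HOST_PTR. Reads rows * cols entries.
  toy_matrix(std::size_t rows, std::size_t cols, const T* host_begin)
    : rows_(rows), cols_(cols), buffer_(host_begin, host_begin + rows * cols) {}

  // Write into the existing buffer without reallocating, in the spirit of
  // clEnqueueWriteBuffer. 'offset' is counted in entries, not bytes.
  void write(const T* host_begin, const T* host_end, std::size_t offset = 0) {
    if (host_end < host_begin)
      throw std::invalid_argument("write: host_end precedes host_begin");
    const std::size_t count = static_cast<std::size_t>(host_end - host_begin);
    if (offset > buffer_.size() || count > buffer_.size() - offset)
      throw std::out_of_range("write: range exceeds the allocated buffer");
    std::copy(host_begin, host_end,
              buffer_.begin() + static_cast<std::ptrdiff_t>(offset));
  }

  std::size_t size1() const { return rows_; }
  std::size_t size2() const { return cols_; }
  std::size_t internal_size() const { return buffer_.size(); }

  // Read-only element access (row-major), usable on const objects.
  const T& operator()(std::size_t i, std::size_t j) const {
    return buffer_[i * cols_ + j];
  }

private:
  std::size_t rows_;
  std::size_t cols_;
  std::vector<T> buffer_;
};
```

With this shape, `const toy_matrix<double> m(2, 3, host);` compiles and is filled at construction, whereas a fast_copy-style free function taking a non-const reference could not be applied to m. A partial upload such as `n.write(host, host + 3, 3);` leaves the size of the buffer untouched.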
From: Karl R. <ru...@iu...> - 2012-08-23 15:22:25
Hi Philippe,

I've recently spotted this and related issues with copy() and fast_copy(). So far, there is no entirely consistent behavior across vectors and dense/sparse matrices, so this requires a unification for 1.4.0.

As for the extended constructor, we should probably wrap the cpu_matrix_begin pointer a bit. This avoids ambiguities with user-provided handles and with hints on the number of nonzeros for sparse matrices, and it explicitly signals to the user that the pointer must point to properly aligned data (particularly for viennacl::matrix<double, row_major, 16> and the like). I could think of something like

    matrix<double> my_matrix(1024, 1024, host_mem_ptr<double>(my_ptr));

which is sufficiently type-safe and allows clean dispatches.

Best regards,
Karli
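A rough sketch of how such a wrapper could keep the constructor overloads apart is given below. The host_mem_ptr name comes from the suggestion above, but its members, the hinted_matrix class, and the nonzero-hint constructor are assumptions made up for illustration. Alignment and padding are not modelled, and a std::vector again stands in for the device buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tag type: wraps a raw pointer to mark it as host data to copy.
// The explicit constructor prevents integers or null from converting to it.
template <typename T>
class host_mem_ptr {
public:
  explicit host_mem_ptr(const T* p) : ptr_(p) {}
  const T* get() const { return ptr_; }

private:
  const T* ptr_;
};

// Hypothetical matrix with two three-argument constructors that stay
// unambiguous because the host pointer is wrapped.
template <typename T>
class hinted_matrix {
public:
  enum init_kind { from_hint, from_host };

  // Third argument is a capacity hint (e.g. an expected number of nonzeros).
  hinted_matrix(std::size_t rows, std::size_t cols, std::size_t nnz_hint)
    : rows_(rows), cols_(cols), kind_(from_hint), data_() {
    data_.reserve(nnz_hint);
  }

  // Third argument is host data: rows * cols entries are copied.
  hinted_matrix(std::size_t rows, std::size_t cols, host_mem_ptr<T> src)
    : rows_(rows), cols_(cols), kind_(from_host),
      data_(src.get(), src.get() + rows * cols) {}

  init_kind kind() const { return kind_; }
  std::size_t size1() const { return rows_; }
  std::size_t size2() const { return cols_; }
  std::size_t stored() const { return data_.size(); }

  // Read-only element access (row-major); valid only after from_host init.
  const T& at(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

private:
  std::size_t rows_;
  std::size_t cols_;
  init_kind kind_;
  std::vector<T> data_;
};
```

Had the second constructor taken a plain `const T*`, a call such as `hinted_matrix<double> a(2, 2, 0);` would be ambiguous, because the literal 0 converts equally well to a size and to a null pointer. With the wrapper, that call selects the hint constructor, and `hinted_matrix<double> b(2, 2, host_mem_ptr<double>(my_ptr));` selects the copying one.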
_______________________________________________
ViennaCL-devel mailing list
https://lists.sourceforge.net/lists/listinfo/viennacl-devel