From: Philippe T. <phi...@gm...> - 2012-07-29 14:00:25
Hello everybody!

I'll inaugurate this mailing list with a little question. I have not seen any kernel for computing the operation A += prod(B,C). Does this mean that the operation is carried out as:

    tmp = prod(B,C);
    A += tmp;

?

For computing the multi_matrix (the project I'm working on: a matrix composed of multiple handles, to work around the per-buffer allocation limit, CL_DEVICE_MAX_MEM_ALLOC_SIZE, and the multi-device issue), I need to do several updates of this kind in a block layout. For a 2*2 block layout:

    C(0,0).clear();
      => C(0,0) += prod( A(0,0), B(0,0) )
      => C(0,0) += prod( A(0,1), B(1,0) )
    C(0,1).clear();
      => C(0,1) += prod( A(0,0), B(0,1) )
      => C(0,1) += prod( A(0,1), B(1,1) )
    ...
    ...

This "sort-of-rank-1-update approach" is a special case of the SUMMA algorithm (with OpenCL doing the memory transfers in the background, for now at least) and seems to be efficient from a memory point of view. Any other approach would lead to both huge memory consumption and significant memory transfers...

Is there any way of doing this in ViennaCL?

Best regards!
Phil
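The block update described in the message can be sketched in plain C++, independently of ViennaCL. This is only an illustration under stated assumptions: `Block`, `BlockGrid`, `gemm_accumulate` and `block_product` are hypothetical names, not ViennaCL API, and each `Block` is a host-side stand-in for one device handle. The point it shows is the in-place accumulation C(i,j) += A(i,k) * B(k,j), which needs no temporary the size of a block.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major block; stands in for one device handle.
struct Block {
    std::size_t rows, cols;
    std::vector<double> data;

    Block(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }

    // Reset every entry to zero, like C(i,j).clear() in the message.
    void clear() { data.assign(data.size(), 0.0); }
};

// C += A * B, accumulated directly into C (no tmp = prod(A,B) buffer).
void gemm_accumulate(Block& C, const Block& A, const Block& B) {
    assert(A.cols == B.rows && C.rows == A.rows && C.cols == B.cols);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t k = 0; k < A.cols; ++k) {
            const double a = A(i, k);
            for (std::size_t j = 0; j < B.cols; ++j) {
                C(i, j) += a * B(k, j);
            }
        }
    }
}

// A grid of blocks: grid[i][j] is the block at block-row i, block-column j.
using BlockGrid = std::vector<std::vector<Block>>;

// For each output block: clear it, then add one product per inner index k.
void block_product(BlockGrid& C, const BlockGrid& A, const BlockGrid& B) {
    for (std::size_t i = 0; i < C.size(); ++i) {
        for (std::size_t j = 0; j < C[i].size(); ++j) {
            C[i][j].clear();
            for (std::size_t k = 0; k < B.size(); ++k) {
                gemm_accumulate(C[i][j], A[i][k], B[k][j]);
            }
        }
    }
}
```

With a 2*2 grid, the loop in `block_product` performs exactly the sequence listed in the message: clear C(0,0), add prod(A(0,0), B(0,0)), add prod(A(0,1), B(1,0)), and so on. Only one output block and two input blocks are touched at a time, which is the memory argument the message makes.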