From: Karl R. <ru...@iu...> - 2012-08-01 08:39:22
Hi Philippe,

thanks for the investigations.

> kernel 1, device 1 : C(0,0) = A(0,0) * B(0,0)
> kernel 2, device 2 : C(0,1) = A(0,0) * B(0,1),
> Both the AMD and the NVidia SDK are unable to multicast A(0,0) from the
> host to the two GPUs. Even if the two kernels are enqueued in parallel,
> the execution is serialized, because the 2nd device has to wait for
> A(0,0) to be available.

This is exactly the behavior I feared.

> It does not happen with a simple matrix addition, where all the handles
> are independent.

Okay, I see, so the const-qualifiers for the kernel handles are ignored (or not abused for a more efficient implementation). Thus, it seems like we have to use separate memory handles in such cases, and we had better attach some meta-information ('current device') to each memory handle.

> I'm desperately looking for a low-memory way of multicasting a handle. I might
> give the Khronos forum a try, even though enqueuing the same handle on
> different queues is left implementation-defined by the standards!

Oh dear, 'implementation-defined' is nothing I want to see at this point :-(
Seems like we should perhaps reconsider using one context per device and benchmark memory transfers for the two options (i.e. one context for all devices vs. one context per device).

> But well, the good news is that the kernels are executing!

Yep, some good news :-)

Best regards,
Karli

> 2012/7/31 Karl Rupp <ru...@iu...>
>
>     Hello again,
>
>     I've just pushed the following changes to the sourceforge-repository:
>      * operator+= and operator-= no longer create temporaries
>      * A = prod(B,C) does not fail if there is garbage in A
>
>     Best regards,
>     Karli
>
>
>     On 07/29/2012 03:59 PM, Philippe Tillet wrote:
>
>         Hello everybody !
>
>         I'll inaugurate this mailing list with a little question.
>         I have not seen any kernel for computing the operation
>         A += prod(B,C).
>         Does this mean that this operation is done by computing:
>
>         tmp = prod(B,C)
>         A += tmp
>
>         ?
>
>         For computing the multi_matrix (the project I'm working on: a matrix
>         composed of multiple handles, to solve the CL_MAX_ALLOCABLE_MEMORY
>         and the multi-device issues), I need to do several updates of this
>         kind, in a block layout. For a 2*2 block layout:
>
>         C(0,0).clear();
>         =>
>         C(0,0) += prod( A(0,0), B(0,0) )
>         =>
>         C(0,0) += prod( A(0,1), B(1,0) )
>
>         C(0,1).clear();
>         =>
>         C(0,1) += prod( A(0,0), B(0,1) )
>         =>
>         C(0,1) += prod( A(0,1), B(1,1) )
>
>         ...
>         ...
>
>         This "sort-of-rank-1-update approach" is a special case of the SUMMA
>         algorithm (OpenCL doing the memory transfers in the background, for
>         now at least) and seems to be efficient from a memory point of view.
>         Using another approach would lead to both a huge memory consumption
>         and significant memory transfers...
>
>         Is there any way of doing so in ViennaCL?
>
>         Best regards!
>         Phil
>
>     _______________________________________________
>     ViennaCL-devel mailing list
>     ViennaCL-devel@lists.sourceforge.net
>     https://lists.sourceforge.net/lists/listinfo/viennacl-devel
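
The block update schedule from Philippe's quoted message (clear C(i,j), then accumulate one product per inner block index) can be sketched on the host as follows. This is only a minimal illustration in plain C++ and not ViennaCL code: `Block`, `BlockGrid`, `add_prod`, and `block_prod` are hypothetical names introduced here, and each `Block` merely stands in for one memory handle. It assumes the accumulation is done in place, in the spirit of the "no temporaries" change mentioned in the thread, and it says nothing about how the handles would be distributed over devices.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major block; stands in for one memory handle.
struct Block {
    std::size_t rows, cols;
    std::vector<double> data;

    Block(std::size_t r, std::size_t c) : rows(r), cols(c), data(r * c, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) { return data[i * cols + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data[i * cols + j]; }

    // Overwrite all entries with zero, discarding any previous contents.
    void clear() { data.assign(data.size(), 0.0); }
};

// C += A * B, accumulated directly into C without a temporary for A * B.
void add_prod(Block& C, const Block& A, const Block& B) {
    assert(A.cols == B.rows && C.rows == A.rows && C.cols == B.cols);
    for (std::size_t i = 0; i < A.rows; ++i) {
        for (std::size_t k = 0; k < A.cols; ++k) {
            const double a = A(i, k);
            for (std::size_t j = 0; j < B.cols; ++j) {
                C(i, j) += a * B(k, j);
            }
        }
    }
}

// Hypothetical grid of blocks, indexed as grid[block_row][block_col].
using BlockGrid = std::vector<std::vector<Block>>;

// Blocked product: each C(i,j) is cleared once, then updated once per k.
void block_prod(BlockGrid& C, const BlockGrid& A, const BlockGrid& B) {
    for (std::size_t i = 0; i < C.size(); ++i) {
        for (std::size_t j = 0; j < C[i].size(); ++j) {
            C[i][j].clear();
            for (std::size_t k = 0; k < B.size(); ++k) {
                add_prod(C[i][j], A[i][k], B[k][j]);
            }
        }
    }
}
```

For a 2*2 grid, `block_prod(C, A, B)` performs exactly the sequence listed in the quoted message. Note that in this schedule A(0,0) is read by the updates of both C(0,0) and C(0,1), which is the sharing pattern that led to the serialization discussed above.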