From: Pekka J. <pek...@tu...> - 2012-01-19 21:28:57
On 01/19/2012 10:28 PM, Erik Schnetter wrote:
> It is then not immediately clear in which order the individual work
> items are executing the async_work_group_copy(). In my reading (which
> may be flawed), this code should be legal -- each work item needs to
> call wait_group_events() (or a barrier) with the same arguments, but
> it doesn't need to be the same call site.

OK, got it.

I think it's quite clear that the WI control semantics are the same as
with barrier(). If they are not, it leads to tricky cases with no
apparent benefit.

Under the other interpretation, how would one in practice tell two
separate async copies apart (even if they were to the same address,
since one can modify the memory in between the calls)? Or, the other
way around, how would one prove that the copy calls should actually be
merged?

Unless that can be figured out somehow, your example should be
interpreted as an undefined program: all WIs should execute both of the
async copy calls, which is impossible.

The spec should be clearer about this, but I'm quite sure the "control
semantics" are the same as with barriers. Let's trust that until shown
otherwise.

--
--Pekka
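
To make the barrier-like reading concrete, here is a small plain-C model of the rule. It is not OpenCL and not pocl code, and every name in it (wi_trace, traces_uniform, the call-site numbers) is made up for illustration. Each work-item is reduced to the ordered list of async copy call sites it reaches, and a program counts as well defined under this reading only if all work-items reach the same call sites in the same order. The divergent case assumes an if/else on the work-item id, which may not match the example in the thread exactly.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: a work-item's trace is the ordered list of
   async_work_group_copy() call sites it reached, identified by small ints. */
#define MAX_CALLS 8

typedef struct {
    int    site[MAX_CALLS];
    size_t count;
} wi_trace;

/* Barrier-like rule: every work-item in the group must reach the same
   call sites in the same order. Returns true if the traces agree. */
bool traces_uniform(const wi_trace *wi, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        if (wi[i].count != wi[0].count)
            return false;
        for (size_t k = 0; k < wi[0].count; ++k)
            if (wi[i].site[k] != wi[0].site[k])
                return false;
    }
    return true;
}

/* Assumed divergent pattern: work-items branch on their id, so one
   reaches call site 1 and the other reaches call site 2. */
bool divergent_example(void)
{
    wi_trace wi[2] = {
        { .site = {1}, .count = 1 },  /* took the "if" branch   */
        { .site = {2}, .count = 1 },  /* took the "else" branch */
    };
    return traces_uniform(wi, 2);
}

/* Uniform pattern: both work-items reach call site 1, then call site 2. */
bool uniform_example(void)
{
    wi_trace wi[2] = {
        { .site = {1, 2}, .count = 2 },
        { .site = {1, 2}, .count = 2 },
    };
    return traces_uniform(wi, 2);
}
```

In this model the divergent pattern is rejected and the uniform one is accepted, which is the same check one would apply to barrier() calls. The other interpretation would need some extra rule for deciding when copies issued from different call sites count as the same copy, and the model has no obvious place to put one.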