From: Karl R. <ru...@iu...> - 2011-11-04 14:36:32
Hi Krzysztof,

in the OpenCL standard, global_work_size is the total number of workers (threads), and local_work_size is the number of workers in one work group, so the number of work groups is global_work_size divided by local_work_size. However, the standard does not prescribe how these workers are mapped to hardware threads, so OpenCL SDKs are free to provide their own interpretation. In particular, we have observed (similar to the link you've provided) that the AMD SDK occupies all cores, but still behaves differently depending on the work sizes specified. If I remember correctly, it scaled quite linearly with the number of cores, but it may not scale beyond a single CPU socket on a multi-socket machine. I don't have any experience with the Intel SDK in this regard, so it may scale better there.

Hence, I suggest you simply compare different work sizes (local and global) on Intel and AMD, which should give you quite useful results for your work.

Best regards,
Karli

On 11/04/2011 06:59 AM, Krzysztof Bzowski wrote:
> Hi, thank you for your reply.
>
> I am a little confused now. I know about local_work_size and
> global_work_size, which give me respectively the number of workgroups and
> the number of workitems in one workgroup. Is that correct? So the number of
> threads (processors) is a product of these values?
> On the other hand, according to this:
> http://stackoverflow.com/questions/7163962/selecting-number-of-cpu-cores-in-opencl
> OpenCL always uses all streaming processors for the calculation.
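
As a small aid for the comparison of work sizes suggested in this thread, here is a minimal sketch of the size arithmetic. It assumes the OpenCL 1.x rule that, for clEnqueueNDRangeKernel, global_work_size is the total number of work items in a dimension and must be a multiple of local_work_size, so the number of work groups is their quotient. The function names and the list of local sizes are hypothetical, chosen only for illustration; the code does not call OpenCL itself.

```python
def round_up_global_size(n_items: int, local_size: int) -> int:
    """Smallest multiple of local_size that is >= n_items."""
    if n_items <= 0 or local_size <= 0:
        raise ValueError("sizes must be positive")
    return ((n_items + local_size - 1) // local_size) * local_size


def num_work_groups(global_size: int, local_size: int) -> int:
    """Number of work groups = global size / local size (one dimension)."""
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    return global_size // local_size


def candidate_sizes(n_items, local_sizes=(1, 16, 64, 256)):
    """Return (global_size, local_size, num_groups) triples to benchmark.

    The local sizes are arbitrary example values, not recommendations.
    """
    out = []
    for local in local_sizes:
        global_size = round_up_global_size(n_items, local)
        out.append((global_size, local, num_work_groups(global_size, local)))
    return out


if __name__ == "__main__":
    # List the configurations one might time on each SDK for 1000 elements.
    for g, l, n in candidate_sizes(1000):
        print(f"global={g:5d} local={l:4d} work-groups={n}")
```

Because the global size is padded up to a multiple of the local size, a kernel launched with these values would need a bounds check against the real element count. Each triple can then be timed separately on the AMD and Intel SDKs.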