Custom kernels, clFinish(), and work sizes

Forum: General Discussion

Creator: Peter Burka

Created: 2013-06-25

Updated: 2013-08-19

Peter Burka - 2013-06-25

I've been experimenting with custom kernels in ViennaCL and I've encountered some problems related to work sizes and clFinish(). If I give my kernel a global work size which exceeds the maximum global work size, ViennaCL attempts to halve the work size until clEnqueueNDRangeKernel() succeeds. After each failed enqueuing, it calls queue.finish() (enqueue.hpp:83), which calls clFinish(). As far as I can tell, the clFinish() call serves no purpose except to slow things down, as it introduces an unnecessary synchronization point.

Additionally, the default behaviour of halving the global and local work sizes until they fit isn't ideal. If I only set the global work size, my global work size may not be divisible by the default local work size. ViennaCL will happily reduce these all the way to 0, if necessary, which is not very useful.
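To illustrate the divisibility point, here is a minimal C++ sketch of one common workaround: rounding the global work size up to the next multiple of the local work size before enqueuing. The helper name `round_up_to_multiple` is hypothetical and is not part of ViennaCL or OpenCL. It also assumes the kernel itself checks its global ID against the real problem size, so the padding work items do nothing.

```cpp
#include <cstddef>

// Hypothetical helper (not a ViennaCL or OpenCL function):
// returns the smallest multiple of local_size that is >= global_size.
// A local_size of 0 is treated as invalid and yields 0.
std::size_t round_up_to_multiple(std::size_t global_size, std::size_t local_size)
{
  if (local_size == 0)
    return 0;

  const std::size_t remainder = global_size % local_size;
  if (remainder == 0)
    return global_size;                         // already divisible

  return global_size + (local_size - remainder); // pad up to the next multiple
}
```

For example, a problem of 1000 elements with a local work size of 128 would be launched with a global work size of 1024, and the kernel would skip the last 24 work items.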

It's not clear that a user gains much from ViennaCL's naive work size selection algorithm. It might be better to simply fail and report an error than to pick suboptimal (or incorrect!) work sizes.
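As a rough sketch of the "fail and report an error" alternative, the check below validates the work sizes up front and throws instead of silently shrinking them. The function `check_work_sizes` is hypothetical and is not ViennaCL's API; the `max_work_group_size` parameter is assumed to come from the device, for instance from an OpenCL query of `CL_DEVICE_MAX_WORK_GROUP_SIZE`.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical validation routine (not part of ViennaCL):
// rejects work sizes that the enqueue call would not accept,
// rather than halving them until something fits.
void check_work_sizes(std::size_t global_size,
                      std::size_t local_size,
                      std::size_t max_work_group_size)
{
  if (global_size == 0 || local_size == 0)
    throw std::invalid_argument("work sizes must be non-zero");

  if (local_size > max_work_group_size)
    throw std::invalid_argument("local work size exceeds the device maximum");

  if (global_size % local_size != 0)
    throw std::invalid_argument("global work size is not a multiple of the local work size");
}
```

With this approach the caller sees the problem immediately and can choose sizes that suit the kernel, and no extra synchronization is needed on the failure path.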

Karl Rupp - 2013-08-19

Hey,

the reason for this behavior dates back at least three years, to a time when one got all kinds of funny behavior. I agree that it is obsolete by now, so I'll remove it for the next release.

Thanks for the input and best regards,
Karli

Karl Rupp - 2013-08-19

Hi Peter,
the behavior is corrected here:
https://github.com/viennacl/viennacl-dev/commit/43299944de188f742a78705013bddc9c4c9173c6

Thanks for the input!

Best regards,
Karli
