#66 Async void dev-to-host callback

open
nobody
5
2011-02-03
2011-02-03
No

Add the ability to place asynchronous void device-to-host calls from inside a NUDA kernel. For instance:

// search through array for an appropriate element, report findings to a host function
nuwork(64) nfor(j in n) {
when(pred(a[j])) logit(a[j])
}

This feature is interesting for debugging (after all, not all OpenCL implementations support printf()), and for offline logging of data. It might also be of interest for various search-based applications.

The simplest implementation is to record calls to the host in arrays with atomically incremented counters, with the id of the function being called placed before each parameter list. However, this requires, first, that the kernel executes successfully, and second, that the buffer does not overflow during GPU kernel execution. As an option, an overflow guard may be provided, so that the GPU kernel can opt out if it was impossible to place the call to the host. A special macro can be provided for that guard.
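The buffer scheme described above can be sketched as a host-side simulation in Python. This is not NUDA or OpenCL code: threads stand in for work-items, a lock emulates the atomic increment, and the names `CallBuffer`, `try_call`, `LOGIT_ID` and the fixed two-word record layout are assumptions made for illustration only.

```python
import threading

RECORD_WORDS = 2  # assumed layout: [function id, one argument], as in logit(a[j])
LOGIT_ID = 1      # hypothetical id assigned to the host function logit


class CallBuffer:
    """Fixed-size array of pending host calls plus a shared counter."""

    def __init__(self, max_calls):
        self.words = [0] * (max_calls * RECORD_WORDS)
        self.counter = 0                  # next free word; may run past the end
        self._lock = threading.Lock()     # emulates a device-side atomic add

    def atomic_add(self, n):
        # Return the old counter value and advance it, as atomic_add would.
        with self._lock:
            old = self.counter
            self.counter += n
            return old

    def try_call(self, func_id, arg):
        # Reserve one record; report failure instead of writing out of bounds.
        start = self.atomic_add(RECORD_WORDS)
        if start + RECORD_WORDS > len(self.words):
            return False                  # overflow guard: the call was not placed
        self.words[start] = func_id       # function id precedes the parameters
        self.words[start + 1] = arg
        return True


def run_kernel(buf, data, pred, dropped):
    # One thread per element plays the role of one work-item.
    def work_item(j):
        if pred(data[j]) and not buf.try_call(LOGIT_ID, data[j]):
            dropped.append(data[j])       # opt-out path when the buffer is full

    threads = [threading.Thread(target=work_item, args=(j,))
               for j in range(len(data))]
    for t in threads:
        t.start()
    for t in threads:
        t.join()


buf = CallBuffer(max_calls=3)
dropped = []
run_kernel(buf, list(range(10)), lambda x: x % 2 == 0, dropped)
print("stored:", sorted(buf.words[1::RECORD_WORDS]), "dropped:", sorted(dropped))
```

Because every record has the same size in this sketch, once one reservation fails all later ones fail too, so the stored records always form a contiguous prefix of the array. Variable-length parameter lists would need extra care, since a failed reservation would otherwise leave a hole.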

Of course, there will be no guarantee at first that a call to the host is executed during device kernel execution. In fact, it will be executed strictly after the kernel finishes, as OpenCL does not provide any guarantees about using the same data on both the host and the device at the same time. Later, however, a more "synchronous" option can be provided.
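A minimal Python sketch of the "strictly after" behaviour: once the kernel has finished and the buffer has been read back, the host walks the recorded calls and dispatches each one by its function id. The function `drain`, the `handlers` table and the two-word record layout are hypothetical and only illustrate the idea.

```python
RECORD_WORDS = 2  # assumed layout: [function id, one argument]


def drain(words, counter, handlers):
    """Replay the recorded calls on the host after the kernel has completed.

    words    -- buffer contents copied back from the device
    counter  -- final value of the atomic counter (may exceed len(words))
    handlers -- maps a function id to the host function to invoke
    """
    # Only records that fit entirely in the buffer were actually written.
    valid = min(counter, len(words)) // RECORD_WORDS
    for r in range(valid):
        func_id = words[r * RECORD_WORDS]
        arg = words[r * RECORD_WORDS + 1]
        handlers[func_id](arg)
    return valid


log = []
handlers = {1: log.append}        # id 1 stands in for logit
words = [1, 10, 1, 30, 0, 0]      # two recorded calls, one unused slot
replayed = drain(words, counter=4, handlers=handlers)
print(replayed, log)
```

Clamping the counter to the buffer size means an overflowed run still replays the calls that did fit, while the lost ones can be detected by comparing `counter` with `len(words)`.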
