From: Carlos S. de La L. <car...@ur...> - 2011-12-19 09:02:10
> We currently do this in C++, and I want to port this code to OpenCL.
> Unconditional inlining of all functions would not be good for this
> application. Would it be possible to skip functions that don't call a
> get_*() function, or to skip inlining functions marked "noinline"?

It is, and that is why I removed the forced inlining. The passes do not
strictly require the kernel to be fully inlined (in fact, inlining is now
done by LLVM with its own criteria). What needs to be fixed (as per your
bug report) is to always inline the calls leading to one of those
get_xxx() functions.

> Instead of privatizing the code for each thread, is it possible to
> privatize these variables on which the get_*() functions are based? With
> hyperthreading or modern AMD processors, it can be beneficial to have
> several threads executing the same code, even if some expressions cannot
> be evaluated at build time.

No, those variables need to be different for each workgroup, so we cannot
make them global (multiple workgroups might be running in parallel in
threaded environments). The only way around this is to use a context
structure that gets passed to all subfunctions, but the old passes used to
work like that and, in general, the generated code is much worse due to
the loads and stores to that structure.

Remember there is no threading "within" the workgroup. Threads are created
for different workgroups but not for different workitems of the same
workgroup.

Carlos
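
For illustration only, here is a rough sketch in plain C of the context-structure alternative described above. It is not the actual pocl code: struct wg_context, helper_global_id(), run_workgroup() and run_all_workgroups() are all hypothetical names invented for this example. It just shows why every helper ends up loading from (and the work-item loop storing to) the structure, and that the work-items of one workgroup run sequentially in a loop rather than in separate threads.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-workgroup context (an assumption, not pocl's layout).
 * One instance per workgroup, so parallel workgroups do not share state. */
struct wg_context {
    size_t group_id;    /* which workgroup this is */
    size_t local_size;  /* work-items per workgroup */
    size_t local_id;    /* current work-item inside the workgroup */
};

/* A non-inlined helper that needs the global id. Without inlining it can
 * only get the value through the context pointer, i.e. via memory loads. */
static size_t helper_global_id(const struct wg_context *ctx)
{
    assert(ctx != NULL);
    return ctx->group_id * ctx->local_size + ctx->local_id;
}

/* One workgroup: work-items are executed one after another in a loop.
 * Each iteration stores local_id into the context so helpers can read it. */
static void run_workgroup(struct wg_context *ctx, int *out)
{
    for (ctx->local_id = 0; ctx->local_id < ctx->local_size; ctx->local_id++) {
        size_t gid = helper_global_id(ctx);
        out[gid] = (int)(gid * 2);
    }
}

/* Stand-in for the runtime: here the workgroups run serially, but each
 * could be handed to its own thread because each has a private context. */
static void run_all_workgroups(int *out, size_t num_groups, size_t local_size)
{
    for (size_t g = 0; g < num_groups; g++) {
        struct wg_context ctx = { g, local_size, 0 };
        run_workgroup(&ctx, out);
    }
}
```

If helper_global_id() were inlined into the loop instead, the compiler could presumably keep the ids in registers, which is the reason for wanting calls that lead to get_xxx() always inlined.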