A potential problem with Microsoft OpenCL™ and OpenGL® Compatibility Pack on DX12 runtime.

2021-09-04
2022-11-08
  • Tang Jinchuan

    Tang Jinchuan - 2021-12-23

    This one is quite interesting. I now have some clues after testing ocl_cat under different assumptions.

    One important thing I discovered is that if a kernel has been enqueued twice or more (e.g., by a for loop submitting the cat kernel) and none of those submissions has started executing on the device side, the OpenCLOn12 runtime ignores the previously queued kernel commands and keeps only the latest one with the same kernel name. This causes incorrect results in examples like ocl_cat(1,a,a,a).

    The magic happens when I add clBarrier, clFlush, and clFinish one after another to each enqueue inside the for loop: the results are now correct. It's already 12:36 AM at my place. I will take a further look to see whether all three calls are really needed or whether some of those three are good enough.

     
  • Tang Jinchuan

    Tang Jinchuan - 2021-12-24

    So, basically, adding only clFlush to a for-loop-based submission of same-named kernels works for OpenCLOn12, at a cost in speed for ocl_cat, ndgrid, and their related functions.
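
    The submission pattern described above can be sketched roughly as follows. This is a minimal illustration, not the toolbox's actual host code: the two OpenCL host calls are abstracted behind hypothetical function pointers (`enqueue_fn`, `flush_fn`) so the sketch compiles without the OpenCL SDK, and the mock functions and `demo_submission_order` exist only to show the call order. In a real host program, the enqueue step would presumably wrap `clSetKernelArg` plus `clEnqueueNDRangeKernel`, and the flush step would wrap `clFlush`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the OpenCL host calls used in the loop.
   In real code: enqueue ~ clSetKernelArg + clEnqueueNDRangeKernel,
   flush ~ clFlush. Both return 0 on success (like CL_SUCCESS). */
typedef int (*enqueue_fn)(void *queue, size_t input_index);
typedef int (*flush_fn)(void *queue);

/* Submit the same kernel once per input and flush after every enqueue,
   so no two same-named kernel commands sit unsubmitted in the queue.
   Returns 0 on success, or the first nonzero error code. */
int submit_with_flush(void *queue, size_t n_inputs,
                      enqueue_fn enqueue, flush_fn flush)
{
    for (size_t i = 0; i < n_inputs; ++i) {
        int err = enqueue(queue, i);
        if (err != 0)
            return err;
        err = flush(queue);   /* the workaround: flush inside the loop */
        if (err != 0)
            return err;
    }
    return 0;
}

/* Mock "queue" that only records the order of calls (illustration only). */
struct call_trace {
    char log[64];
    size_t len;
};

static void trace_push(struct call_trace *t, char c)
{
    if (t->len + 1 < sizeof t->log) {
        t->log[t->len++] = c;
        t->log[t->len] = '\0';
    }
}

static int mock_enqueue(void *queue, size_t input_index)
{
    (void)input_index;
    trace_push((struct call_trace *)queue, 'E');
    return 0;
}

static int mock_flush(void *queue)
{
    trace_push((struct call_trace *)queue, 'F');
    return 0;
}

/* Returns the recorded call order, e.g. 'E' for enqueue, 'F' for flush. */
const char *demo_submission_order(size_t n_inputs)
{
    static struct call_trace trace;
    memset(&trace, 0, sizeof trace);
    if (submit_with_flush(&trace, n_inputs, mock_enqueue, mock_flush) != 0)
        return NULL;
    return trace.log;
}
```

    The trade-off is the one already noted: flushing after every enqueue gives up batching, which is where the speed cost comes from.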
    Meanwhile, the OpenCLOn12 runtime gives incorrect results for the single-input, single-output versions of max, min, accmax and acumin. This is quite similar to AMD's runtime, which has a problem when one pointer is compared with another; that comparison was used in the four kernels to check whether the index output is available. Since Matt has mentioned that simply passing 0/NULL as a kernel argument can cause problems in some old runtimes, the solution here is in the spirit of "give to Caesar what belongs to Caesar, and give to God what belongs to God": simply use one kernel with the index argument and another one without.
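
    The idea of splitting one kernel into two can be illustrated with a plain C analogue. This is only a sketch of the design choice, not the actual OpenCL kernels: the function names `max_value_only` and `max_with_index` are hypothetical, the data type is assumed to be float, and the returned index is 0-based. The point is that neither variant ever has to test whether an index pointer is NULL, because the caller picks the variant up front.

```c
#include <stddef.h>

/* Value-only variant: there is no index argument at all, so nothing
   needs to be compared against NULL. Assumes n >= 1. */
float max_value_only(const float *in, size_t n)
{
    float best = in[0];
    for (size_t i = 1; i < n; ++i) {
        if (in[i] > best)
            best = in[i];
    }
    return best;
}

/* Value-plus-index variant: index_out is always a valid pointer here.
   Writes the 0-based position of the first maximum. Assumes n >= 1. */
float max_with_index(const float *in, size_t n, size_t *index_out)
{
    float best = in[0];
    size_t best_i = 0;
    for (size_t i = 1; i < n; ++i) {
        if (in[i] > best) {
            best = in[i];
            best_i = i;
        }
    }
    *index_out = best_i;
    return best;
}
```

    On the host side, the caller would then choose which of the two kernels to build and enqueue depending on whether the index output was requested, instead of passing 0/NULL for an unused argument.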
    As a result, the system can now run on OpenCLOn12. This is great news for some Windows-on-ARM systems that do not have a native OpenCL runtime at the moment but support the DX12 interfaces.
    The problem is solved.
    Happy Christmas!

     
  • Tang Jinchuan

    Tang Jinchuan - 2022-05-02

    I have uploaded a test case that reproduces MS’ runtime problem to their GitHub, so let’s wait and see their response. For now I won’t push to release a patch for this runtime based on clFinish, because that would definitely disable batch submission for ocl_cat.

     
  • Tang Jinchuan

    Tang Jinchuan - 2022-11-08

    After a long time, MS finally confirmed this as a genuine bug and said they would fix the problem.

     