OpenCL on OS X: Can't copy data to GPU

Markus
2013-05-14
2013-06-12
  • Markus

    Markus - 2013-05-14

    I'm just learning to work with ViennaCL. The first tries on the CPU worked fine, and now I am trying to use OpenCL. However, I can't manage to get data onto the GPU. The matrices seem to be created, but they don't get any contents:

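    #define VIENNACL_WITH_OPENCL
    #define VIENNACL_WITH_UBLAS
    #include <cassert>
    #include <boost/numeric/ublas/matrix.hpp>
    #include "viennacl/matrix.hpp"
    int main() {
        // Host-side 1x1 matrix holding a single known value
        boost::numeric::ublas::matrix<float> data_cpu(1,1);
        data_cpu(0,0) = 1;
        // Device-side matrix; copy the host data into it
        viennacl::matrix<float> data_gpu(1,1);
        viennacl::copy(data_cpu, data_gpu);
        // Reading data_gpu(0,0) fetches the entry back from the device
        assert(data_cpu(0,0) == data_gpu(0,0));
    }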
    

    After this, data_gpu(0,0) is 0, but I believe it should be 1. Also, if I copy data_gpu back to data_cpu, data_cpu is incorrect as well.

    I'm compiling this with:

    g++ nocopy.cpp -framework OpenCL

    I am using OS X with the provided OpenCL driver.

    What am I doing wrong here?

    Edit: Removing VIENNACL_WITH_OPENCL fixes the problem, but is not what I want.

     
  • Markus

    Markus - 2013-05-14

    Whoops - sorry for messing up the formatting. I thought that inline codes would work.

     
  • Karl Rupp

    Karl Rupp - 2013-05-14

    Hi,

    I reran your example on my machine, everything runs correctly.

    Maybe the values aren't yet copied to the GPU? Could you please add

    viennacl::backend::finish();
    

    after the call to copy() and let me know whether the error remains?

    Thanks and best regards,
    Karli

     
  • Markus

    Markus - 2013-05-14

    Hi Karli,

    Thank you for your reply. Unfortunately, this didn't change anything.

    Are there any known pitfalls when using OS X? I will try this on another machine later today and see if that makes any difference.

    Thanks,
    Markus

     
  • Karl Rupp

    Karl Rupp - 2013-05-14

    Hi Markus,

    hmm, the only known issue with OS X to date is that on older versions the OpenCL SDK has some problems when handles are stored in static variables, but we have a workaround for this. Another user reported issues in some examples when using a Retina display, the reason being too little GPU RAM left. However, this should not affect your simple snippet.

    Do the examples run correctly? Particularly those ending in "-opencl"?

    Best regards,
    Karli

     
  • Markus

    Markus - 2013-05-15

    noname:benchmarks Markus$ ./openclbench-opencl
    ----------------------------------------------
                   Device Info
    ----------------------------------------------
    CL Device Vendor ID: 16918016
    CL Device Name: GeForce GT 650M
    CL Driver Version: CLH 1.0
    --------------------------------
    CL Device Max Compute Units: 2
    CL Device Max Work Group Size: 1024
    CL Device Global Mem Size: 1073741824
    CL Device Local Mem Size: 49152
    ----------------------------------------------
    ----------------------------------------------
    ## Benchmark :: OpenCL performance
    ----------------------------------------------
       -------------------------------
       # benchmarking single-precision
       -------------------------------
    Time for building scalar kernels: 0
    Time for building vector kernels: 0.000444
    Time for building matrix kernels: 0.003909
    Time for building compressed_matrix kernels: 1.6e-05
    Time for 100000 entry accesses on host: 0.000466
    Time per entry: 4.66e-09
    Result of operation on host: 104839
    Time for 100000 entry accesses via OpenCL: 2.50028
    Time per entry: 2.50028e-05
    Result of operation via OpenCL: 4.59163e-36
       -------------------------------
       # benchmarking double-precision
       -------------------------------
    Time for building scalar kernels: 1e-06
    Time for building vector kernels: 0.000706
    Time for building matrix kernels: 0.005705
    Time for building compressed_matrix kernels: 4.3e-05
    Time for 100000 entry accesses on host: 0.000467
    Time per entry: 4.67e-09
    Result of operation on host: 105171
    Time for 100000 entry accesses via OpenCL: 5.92087
    Time per entry: 5.92087e-05
    Result of operation via OpenCL: 6.95322e-305

    I assume this means they didn't:

    Result of operation on host: 105171
    Result of operation via OpenCL: 6.95322e-305
    
     
  • Karl Rupp

    Karl Rupp - 2013-05-15

    Hmm, this looks really broken. Did you run any sample OpenCL examples outside ViennaCL? It seems to me that your OpenCL installation is somehow broken, as no data gets written to the device.
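
The mismatch in the benchmark output above (a host result around 105171 against an OpenCL result of 6.95322e-305) can also be checked mechanically instead of by eye. The following is only a minimal sketch: `results_agree` is a hypothetical helper, not part of ViennaCL or OpenCL, and the default relative tolerance of 1e-4 is an assumption that would need tuning for a real workload.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not a ViennaCL function): returns true when a result
// computed on the host and the same result read back from the device agree
// to within a relative tolerance. The default tolerance is an assumption.
bool results_agree(double host, double device, double rel_tol = 1e-4) {
    // NaN or infinity on either side counts as a failed transfer/computation.
    if (!std::isfinite(host) || !std::isfinite(device)) {
        return false;
    }
    // Scale the tolerance by the larger magnitude of the two values.
    const double scale = std::max(std::fabs(host), std::fabs(device));
    if (scale == 0.0) {
        return true;  // both values are exactly zero
    }
    return std::fabs(host - device) <= rel_tol * scale;
}
```

Called with the double-precision values from the benchmark run, `results_agree(105171.0, 6.95322e-305)` returns false, which matches the suspicion that nothing was written to the device.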

     
  • Markus

    Markus - 2013-05-16
    noname:histogram Markus$ pwd
    /Users/Markus/Desktop/tmp/opencl-book-samples-read-only/src/Chapter_14/histogram
    noname:histogram Markus$ ./histogram 
    OpenCL Device Vendor = NVIDIA,  OpenCL Device Name = GeForce GT 650M,  OpenCL Device Version = OpenCL 1.1 
    Image Histogram for image type = CL_RGBA, CL_UNORM_INT8: verify_histogram_results failed for indx = 0, gpu result = 0, expected result = 8204
    Image dimensions: 1920 x 1080 pixels, Image type = CL_RGBA, CL_UNORM_INT8
    

    Nope. Looks like my OpenCL is broken. I didn't think about that possibility since I didn't do anything to Apple's OpenCL installation. Thanks for your help.

     
  • Karl Rupp

    Karl Rupp - 2013-05-16

    Thanks, Markus, for letting us know. This will certainly help us if similar troubles show up again for other users.

    Best regards,
    Karli

     
