
Question: fast_copy pointer to matrix

2014-05-24
2014-05-26
  • le ngoanhcat

    le ngoanhcat - 2014-05-24

    Hi,

    I am trying to copy a matrix from host to device, pattern1 to vcl_pattern1. They are declared as follows.

    double * pattern1;
    viennacl::matrix<double> vcl_pattern1(3,3);

    pattern1 is a pointer to the first element of a 3x3 matrix

    [ 1 1 1
    1 1 1
    1 1 1 ]

    In memory, the elements should be stored contiguously: [1 1 1 1 1 1 1 1 1].

    When I try to use the fast_copy function

    void viennacl::fast_copy (
    SCALARTYPE * cpu_matrix_begin,
    SCALARTYPE * cpu_matrix_end,
    matrix< SCALARTYPE, F, ALIGNMENT > & gpu_matrix
    )

    fast_copy(pattern1,pattern1 + 9, vcl_pattern1);

    only the first 3 elements (the first row of the matrix that pattern1 points to) are copied to vcl_pattern1 correctly. The remaining elements of vcl_pattern1 are 0 or filled with random numbers.


    If I define vcl_pattern1 as viennacl::matrix<double> vcl_pattern1(1,9), I can successfully copy the data from pattern1:

    double * pattern1;
    viennacl::matrix<double> vcl_pattern1(1,9);
    fast_copy(pattern1,pattern1+9,vcl_pattern1);

    When I try to resize the matrix vcl_pattern1(1,9) to vcl_pattern1(3,3) by

    vcl_pattern1.resize(3,3,true);

    The result is

    [1 1 1
    0 0 0
    0 0 0]

    It only preserves the first three elements.

    1. Can you help me point out what the problem is?
    2. If similar questions have been asked in other threads, please point me to them. Apologies for repeating the question.

    P/S: There is one line in the fast_copy documentation: "Matrix-Layout on CPU must be equal to the matrix-layout on the GPU." It may be a hint toward the answer to this problem.

    Regards

    Cat Le

     

    Last edit: le ngoanhcat 2014-05-24
  • Karl Rupp

    Karl Rupp - 2014-05-24

    Hi,

    the reason for this behavior is that matrices are padded internally. Although you allocate a 3x3 matrix, the actual memory layout is 128x128. In general, the internal row and column sizes are the values you provide, rounded up to a multiple of 128. This padding is necessary to obtain good performance, particularly on GPUs. Your code should work if you replace fast_copy() with copy(). If you want to use fast_copy(), you need to pad your raw memory so that the data is laid out as follows:
    [ 1 1 1 0 0 0 0 ..... 0     #128 values total
      1 1 1 0 0 0 0 ..... 0     #128 values total
      1 1 1 0 0 0 0 ..... 0 ]   #128 values total
    For row-major matrices you don't need to make the row-count a multiple of 128, only columns. For column-major matrices it's the other way round.
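
    The padding described above can be sketched in plain C++ without ViennaCL. This is only an illustration under stated assumptions: the helper names round_up and pad_row_major are hypothetical, the alignment of 128 is taken from this reply rather than from the library, and only the columns are padded, as described for the row-major case.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Round n up to the next multiple of `alignment`
// (128 is the value quoted in this thread, not queried from ViennaCL).
std::size_t round_up(std::size_t n, std::size_t alignment)
{
  assert(alignment > 0);
  return ((n + alignment - 1) / alignment) * alignment;
}

// Copy a dense row-major rows x cols array into a zero-filled buffer whose
// rows are padded_cols entries long. Only the columns are padded.
std::vector<double> pad_row_major(const double* dense,
                                  std::size_t rows,
                                  std::size_t cols,
                                  std::size_t padded_cols)
{
  assert(padded_cols >= cols);
  std::vector<double> padded(rows * padded_cols, 0.0);
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      padded[i * padded_cols + j] = dense[i * cols + j];
  return padded;
}
```

    For the 3x3 case, pad_row_major(pattern1, 3, 3, round_up(3, 128)) yields 3 * 128 values, and the begin and end pointers of that buffer would then be the ones handed to fast_copy(). Rather than hard-coding 128, it is probably safer to read the padded width from the matrix itself; viennacl::matrix exposes internal_size1() and internal_size2() for this as far as I know.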

    Double-checking the manual confirms that this is not sufficiently documented, sorry for the troubles. I created a ticket here:
    https://github.com/viennacl/viennacl-dev/issues/76
    to get this fixed.

    Best regards,
    Karli

     
  • le ngoanhcat

    le ngoanhcat - 2014-05-24

    Hi Karli,

    Thanks for the quick reply.

    When trying fast_copy(pattern1 + 0,pattern1 + 9, vcl_pattern1), I get the following error.

    rbf_dot_mex.cpp:75:64: error: no matching function for call to ‘copy(ScalarType, ScalarType, viennacl::matrix<double, viennacl::row_major>&)’

    I checked the API, but I can only find such a copy function for the viennacl::vector class, not for the viennacl::matrix class.

    Does that copy() function exist for viennacl::matrix?

    Regards

    Cat Le

    P/S: I really appreciate this project and your work. It really makes my research much easier. Thanks a ton! I will try to be active in this project's discussions, since I will use the ViennaCL toolbox a lot.

     
  • Karl Rupp

    Karl Rupp - 2014-05-24

    Hi,

    please use vcl_pattern1.begin() instead of vcl_pattern1 inside fast_copy()

    Best regards,
    Karli

    PS: Thanks for the praise. We take all user input seriously, so you can certainly help us a lot by providing feedback. Even if you think it is just a small detail, small details matter, too :-)

     
  • le ngoanhcat

    le ngoanhcat - 2014-05-26

    Hi,

    Sorry for being unclear. My question is:

    If I want to copy data from a raw pointer into a viennacl::matrix directly, is fast_copy the only way?


    void viennacl::fast_copy
    ( SCALARTYPE * cpu_matrix_begin,
    SCALARTYPE * cpu_matrix_end,
    matrix< SCALARTYPE, F, ALIGNMENT > & gpu_matrix
    )


    I have an extra question:

    If I want to copy from a viennacl::matrix gpu_matrix to cpu_matrix_begin, do I need to allocate cpu_matrix_begin so that it is already zero-padded?

    [ 1 1 1 0 0 0 0 ..... 0     #128 values total
      1 1 1 0 0 0 0 ..... 0     #128 values total
      1 1 1 0 0 0 0 ..... 0 ]   #128 values total


    void viennacl::fast_copy
    ( const matrix< SCALARTYPE, F, ALIGNMENT > & gpu_matrix,
    SCALARTYPE * cpu_matrix_begin
    )


    For the above fast_copy function, do I need to zero-pad both rows and columns?
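
    The thread does not answer this question, so the following is only a sketch of the host-side step that would be needed if the device-to-host fast_copy() writes the same padded row-major layout shown above. That behaviour is an assumption here, as are the hypothetical helper name unpad_row_major and the padded width passed in; if in doubt, a host buffer sized for the padding in both dimensions should be large enough either way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Extract the logical rows x cols entries from a row-major buffer whose rows
// are padded_cols entries long (assumed layout, see the text above).
std::vector<double> unpad_row_major(const std::vector<double>& padded,
                                    std::size_t rows,
                                    std::size_t cols,
                                    std::size_t padded_cols)
{
  assert(padded_cols >= cols);
  assert(padded.size() >= rows * padded_cols);
  std::vector<double> dense(rows * cols);
  for (std::size_t i = 0; i < rows; ++i)
    for (std::size_t j = 0; j < cols; ++j)
      dense[i * cols + j] = padded[i * padded_cols + j];
  return dense;
}
```

    The padded buffer would be filled first, for example by passing its data pointer as cpu_matrix_begin, and unpad_row_major would then recover the dense 3x3 array from it.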

    Regards

    Cat Le

     

    Last edit: le ngoanhcat 2014-05-26
