viennacl-devel Mailing List for ViennaCL (Page 5)

viennacl-devel: Developer mailing list. Suggest and/or discuss new features here.



Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Karl R. <ru...@iu...> - 2016-07-23 08:15:48

Hi,

> yes. this seems to be the case. if i force out-of-order CSR into
> in-order CSR everything seems to work. Can't see the documentation
> explicitly mentioning this if this is the case indeed.
>
> Karl, can you please confirm only in-order CSRs are supported? Thanks!

out-of-order CSR works for SpMVs, but not for sparse matrix-matrix 
multiplies.

Parallel algorithms usually work better for in-order data layouts. The 
performance penalty for out-of-order data is almost always too high to 
justify any extra kernels for out-of-order data.

Best regards,
Karli

PS: Yes, the documentation should be more explicit about this.
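
For readers who need to bring out-of-order CSR data into the in-order layout before a sparse matrix-matrix product, a minimal host-side sketch follows. It is not ViennaCL code: the function name sort_csr_rows and the use of std::vector with unsigned int indices are assumptions made for illustration. It sorts the column indices within each row and moves the values along with them.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Illustrative helper (hypothetical name, not part of ViennaCL):
// reorders column indices and their values into ascending column order
// within each row of a CSR matrix. row_ptr holds rows + 1 entries.
void sort_csr_rows(const std::vector<unsigned int>& row_ptr,
                   std::vector<unsigned int>& col_idx,
                   std::vector<double>& values)
{
  for (std::size_t row = 0; row + 1 < row_ptr.size(); ++row)
  {
    const std::size_t begin = row_ptr[row];
    const std::size_t end   = row_ptr[row + 1];

    // Permutation that sorts this row's entries by column index.
    std::vector<std::size_t> perm(end - begin);
    std::iota(perm.begin(), perm.end(), begin);
    std::sort(perm.begin(), perm.end(),
              [&](std::size_t a, std::size_t b) { return col_idx[a] < col_idx[b]; });

    // Apply the permutation through temporaries so values follow their columns.
    std::vector<unsigned int> sorted_cols(perm.size());
    std::vector<double>       sorted_vals(perm.size());
    for (std::size_t k = 0; k < perm.size(); ++k)
    {
      sorted_cols[k] = col_idx[perm[k]];
      sorted_vals[k] = values[perm[k]];
    }
    std::copy(sorted_cols.begin(), sorted_cols.end(), col_idx.begin() + begin);
    std::copy(sorted_vals.begin(), sorted_vals.end(), values.begin() + begin);
  }
}
```

Applied to the 5 x 10 example from the start of this thread (row pointer [0, 1, 4, 6, 9, 12]), only the second row changes: its columns 0, 8, 7 become 0, 7, 8.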

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Karl R. <ru...@iu...> - 2016-07-23 08:10:50

Hi,

> PS
> (4) column indices admit out-of-order placements of elements within each
> row.

Column indices *have* to be in ascending order for sparse matrix-matrix 
multiplication.

Best regards,
Karli
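
A quick way to test whether a CSR structure meets this requirement before handing it to a sparse matrix-matrix product is sketched below. The function name csr_columns_ascending is hypothetical, and treating duplicate column indices within a row as invalid is an assumption of this sketch, not something stated in the thread.

```cpp
#include <cstddef>
#include <vector>

// Illustrative check (hypothetical name, not part of ViennaCL):
// returns true if column indices are strictly ascending within every row.
// row_ptr holds rows + 1 entries; duplicates within a row count as a failure.
bool csr_columns_ascending(const std::vector<unsigned int>& row_ptr,
                           const std::vector<unsigned int>& col_idx)
{
  for (std::size_t row = 0; row + 1 < row_ptr.size(); ++row)
    for (std::size_t k = row_ptr[row] + 1; k < row_ptr[row + 1]; ++k)
      if (col_idx[k - 1] >= col_idx[k])
        return false;
  return true;
}
```

The 5 x 10 example from the original post fails this check because its second row stores columns 0, 8, 7.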


>
> Thank you.
> -Dmitriy
>
> On Fri, Jul 22, 2016 at 12:56 PM, Dmitriy Lyubimov <dl...@gm...> wrote:
>
>     I think I still am getting seg faults on attempt to multiply
>     matrices even without conversion back (larger arguments, 3k x 1k)
>
>     I re-wrote another alternative transformation procedure and see
>     nothing wrong with it. Both Andrew's code and mine fail with the
>     same symptoms.
>
>     Karl, can we verify assumptions about the format:
>
>     (1) the compressed_matrix.set method expects host memory pointers.
>     (2) the format is compressed row storage (CSR). Documentation never
>     says explicitly that, and actually seems to have errors in size of
>     elements and jumper arrays (it says jumper array has to be cols+1
>     long whereas in CSR it should actually be rows + 1 long, right? )
>     (3) the element sizes of jumper and column indices arrays are 32 bit
>     and are in little endian order (at least for the OpenMP backend).
>
>     Right now I can't even get OpenMP sparse multiplication to work
>     although CSR format is not rocket science at all. Don't see a
>     problem anywhere. Tried to read Vienna's code to confirm the
>     assumptions above, but this seems to be pretty elusive for the time
>     being.
>

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Karl R. <ru...@iu...> - 2016-07-23 08:10:05

Hi Dmitriy,

> Karl, can we verify assumptions about the format:
>
> (1) the compressed_matrix.set method expects host memory pointers.

yes

> (2) the format is compressed row storage (CSR). Documentation never says
> explicitly that, and actually seems to have errors in size of elements
> and jumper arrays (it says jumper array has to be cols+1 long whereas in
> CSR it should actually be rows + 1 long, right? )

yes

> (3) the element sizes of jumper and column indices arrays are 32 bit and
> are in little endian order (at least for the OpenMP backend).

elements are in whatever order your machine supports.

Best regards,
Karli
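
The layout confirmed above can be checked on the host before the arrays are handed over. The sketch below assumes unsigned int indices (matching the 4-byte reads used elsewhere in this thread) and uses a hypothetical function name, csr_layout_ok; it is not part of ViennaCL.

```cpp
#include <cstddef>
#include <vector>

// Illustrative sanity check (hypothetical name, not part of ViennaCL)
// for host-side CSR arrays describing a rows x cols matrix.
bool csr_layout_ok(const std::vector<unsigned int>& row_jumper,
                   const std::vector<unsigned int>& col_idx,
                   std::size_t value_count,
                   std::size_t rows, std::size_t cols)
{
  if (row_jumper.size() != rows + 1) return false;        // rows + 1 entries, not cols + 1
  if (row_jumper.front() != 0) return false;              // first row starts at entry 0
  if (row_jumper.back() != col_idx.size()) return false;  // last entry equals nnz
  if (value_count != col_idx.size()) return false;        // one value per column index
  for (std::size_t r = 0; r < rows; ++r)
    if (row_jumper[r] > row_jumper[r + 1]) return false;  // row starts must not decrease
  for (unsigned int c : col_idx)
    if (c >= cols) return false;                          // column index out of range
  return true;
}
```

For the 5 x 10 matrix from the original post, the row jumper [0, 1, 4, 6, 9, 12] has 6 = rows + 1 entries and ends at nnz = 12, so the check passes.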


> Right now I can't even get OpenMP sparse multiplication to work although
> CSR format is not rocket science at all. Don't see a problem anywhere.
> Tried to read Vienna's code to confirm the assumptions above, but this
> seems to be pretty elusive for the time being.
>
>
> On Fri, Jul 22, 2016 at 10:26 AM, Andrew Palumbo <ap...@ou...> wrote:
>
>     Yep thats it.  Oh wow- well thats just embarrassing 😊.
>
>
>     Thanks very much for your time, Karl- much appreciated.
>
>
>     Andy
>
>     ------------------------------------------------------------------------
>     *From:* Karl Rupp <ru...@iu...>
>     *Sent:* Friday, July 22, 2016 12:39:20 PM
>     *To:* Andrew Palumbo; viennacl-devel
>     *Subject:* Re: [ViennaCL-devel] Copying Values out of a
>     compressed_matrix
>     Hi,
>
>     your second and third arguments to memory_read() are incorrect:
>     The second argument is the offset from the beginning, the third
>     argument is the number of bytes to be read. Shifting the zero to the
>     second position fixes the snippet (plus correcting the loop bounds
>     when printing at the end) :-)
>
>     Best regards,
>     Karli
>
>
>
>     On 07/22/2016 08:51 AM, Andrew Palumbo wrote:
>     > a couple of small mistakes in the previous c++ file:
>     >
>     >
>     > The memory_read(..) call should be:
>     >
>     >
>     >    // read data back into our product buffers
>     >    viennacl::backend::memory_read(handle1, product_size_row * 4, 0,
>     > product_row_ptr, false);
>     >    viennacl::backend::memory_read(handle2, product_NNz * 4, 0,
>     > product_col_ptr, false);
>     >    viennacl::backend::memory_read(handle, product_NNz * 8, 0,
>     > product_values_ptr, false);
>     >
>     >
>     > (read product_NNz * x bytes instead of product_size_row * x)
>     >
>     >
>     > I've attached the corrected file.
>     >
>     >
>     > Thanks
>     >
>     >
>     > Andy
>     >
>     > ------------------------------------------------------------------------
>     > *From:* Andrew Palumbo <ap...@ou...>
>     > *Sent:* Thursday, July 21, 2016 11:03:59 PM
>     > *To:* Karl Rupp; viennacl-devel
>     > *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>     >
>     > Hello,
>     >
>     >
>     > I've mocked up a sample of the compressed_matrix multiplication that
>     > I've been working with javacpp on in C++.  I am seeing the same type of
>     > memory errors when I try to read the data out of product, and into the
>     > output buffers as I was with javacpp.  By printing the matrix to stdout
>     > as in the compressed_matrix example we can see that there are values
>     > there, and they seem reasonable,  but when i use
>     > backend::memory_read(...)  to retrive the buffers, I'm getting values
>     > consistent with a memory error, and similar to what i was seeing in the
>     > javacpp code.  Maybe I am not using the handles correctly?  Admittedly
>     > my C++ is more than rusty, but I believe I am referencing the buffers
>     > correctly in the output.
>     >
>     >
>     > Below is the output of the attached file: sparse.cpp
>     >
>     >
>     > Thanks very much,
>     >
>     >
>     > Andy
>     >
>     >
>     >
>     > ViennaCL: compressed_matrix of size (10, 10) with 24 nonzeros:
>     >    (1, 2)    0.329908
>     >    (1, 3)    0.0110522
>     >    (1, 4)    0.336839
>     >    (2, 5)    0.0150778
>     >    (2, 7)    0.0143518
>     >    (3, 3)    0.217256
>     >    (3, 6)    0.346854
>     >    (3, 9)    0.45353
>     >    (4, 3)    0.407954
>     >    (4, 6)    0.651308
>     >    (5, 2)    0.676061
>     >    (5, 3)    0.0226486
>     >    (5, 4)    0.690264
>     >    (6, 5)    0.0998838
>     >    (6, 7)    0.0950744
>     >    (7, 2)    0.346173
>     >    (7, 3)    0.0115971
>     >    (7, 4)    0.353446
>     >    (7, 9)    0.684458
>     >    (8, 5)    0.0448123
>     >    (8, 7)    0.0426546
>     >    (8, 9)    0.82782
>     >    (9, 5)    0.295356
>     >    (9, 7)    0.281134
>     >
>     > row jumpers: [
>     > -36207072,32642,-39708721,32642,6390336,0,2012467744,32767,2012467968,32767,4203729,]
>     > col ptrs: [
>     > 0,0,-39655605,32642,-36207072,32642,6390336,0,10,0,-39672717,32642,2012466352,32767,-32892691,32642,1,0,6390336,0,2012466344,32767,60002304,2059362829,]
>     > elements: [
>     > 0.289516,0.304161,0.795779,0.334456,0.935264,0.585813,0.871237,0.811508,0.828558,0.0271863,6.92683e-310,6.92683e-310,1.061e-313,1.061e-313,6.36599e-314,4.24399e-314,6.36599e-314,6.92683e-310,4.24399e-314,1.2732e-313,2.122e-313,6.95324e-310,0.406537,0.0495716,0.370862,]
>     >
>     >
>     > and similarly for multiplication of 2 1x1 matrices:
>     >
>     > Result:
>     >
>     > ViennaCL: compressed_matrix of size (1, 1) with 1 nonzeros:
>     >    (0, 0)    0.117699
>     >
>     > row jumpers: [
>     > -717571424,32767,]
>     > col ptrs: [
>     > 6386240,]
>     > elements: [
>     > 0.289516,6.9479e-310,]
>     >
>     >
>     >
>     >
>     > ------------------------------------------------------------------------
>     > *From:* Andrew Palumbo <ap...@ou...>
>     > *Sent:* Wednesday, July 20, 2016 5:40:31 PM
>     > *To:* Karl Rupp; viennacl-devel
>     > *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>     >
>     > Oops, sorry about not cc'ing all.
>     >
>     >
>     > I do not get correct data back for a (Random.nextDouble() populated) 1 x
>     > 1 Matrix.
>     >
>     >
>     > A:
>     >
>     >    Row Pointer: [0, 1 ]
>     >
>     >    Col Pointer: [0 ]
>     >    element Pointer: [0.6465821602909256 ]
>     >
>     >
>     > B:
>     >
>     >
>     >    Row Pointer: [0, 1 ]
>     >    Col Pointer: [0 ]
>     >    element Pointer: [0.9513577109193919 ]
>     >
>     >
>     > C = A %*% B
>     >
>     >    Row Pointer: [469762248, 32632]
>     >    Col Pointer: [469762248 ]
>     >    element Pointer: [6.9245198744523E-310 ]
>     >
>     >
>     > ouch.
>     >
>     >
>     > It looks like I'm not copying the Buffers correctly at all.  I may be
>     > using the javacpp buffers incorrectly here, or I have possibly wrapped
>     > the viennacl::backend::memory_handle class incorrectly, so I'm using a
>     > pointer to the wrong memory from eg. viennacl::compressed_matrix::handle.
>     >
>     >
>     > I mentioned before that the multiplication completed on small (< ~300 x
>     > 300) matrices because if I try to multiply two larger sparse matrices,
>     > the JVM crashes with a SIGSEGV.
>     >
>     >
>     > Since this code is all wrapped with javacpp, I don't really have a small
>     > sample that I can show you (not going to dump a whole bunch of code on
>     > you).
>     >
>     >
>     > I'll keep trying to figure it out.  Pretty sure the problem is on my end
>     > here. I really mainly wanted to ask you if I was using the correct
>     > methods at this point, or if there was anything very obviously that I
>     > was doing wrong.
>     >
>     >
>     > Thanks a lot for your help!
>     >
>     >
>     > Andy
>     >
>     >
>     >
>     >
>     >
>     >
>     > ------------------------------------------------------------------------
>     > *From:* Karl Rupp <ru...@iu...>
>     > *Sent:* Wednesday, July 20, 2016 5:00:36 PM
>     > *To:* Andrew Palumbo; viennacl-devel
>     > *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>     > Hi,
>     >
>     > please keep viennacl-devel in CC:
>     >
>     > Just to clarify: Do you get incorrect values for a 1-by-1 matrix as
>     > indicated in your sample data? In your previous email you mentioned that
>     > results are fine for small matrices...
>     >
>     > I'm afraid I can only guess at the source of the error with the
>     > information provided. Any chance that you can provide a standalone code
>     > to reproduce the problem with reasonable effort?
>     >
>     > Best regards,
>     > Karli
>     >
>     >
>     >
>     > On 07/20/2016 10:16 PM, Andrew Palumbo wrote:
>     >> Thanks so much for your quick answer!
>     >>
>     >>
>     >> I actually am sorry to say that I made a mistake when writing the last
>     >> email, I copied the wrong signature from the VCL documentation, and then
>     >> the mistake propagated through the rest of the e-mail.
>     >>
>     >>
>     >> I am actually using viennacl::backend::memory_read().
>     >>
>     >>
>     >> Eg, for the row_jumpers and column_idx, I use:
>     >>
>     >> @Name("backend::memory_read")
>     >> public static native void memoryReadInt(@Const @ByRef MemHandle src_buffer,
>     >>                                int bytes_to_read,
>     >>                                int offset,
>     >>                                IntPointer ptr,
>     >>                                boolean async);
>     >>
>     >> and for the Values:
>     >>
>     >>
>     >> @Name("backend::memory_read")
>     >> public static native void memoryReadDouble(@Const @ByRef MemHandle src_buffer,
>     >>                                          int bytes_to_read,
>     >>                                          int offset,
>     >>                                          DoublePointer ptr,
>     >>                                          boolean async);
>     >>
>     >> And then call:
>     >>
>     >>
>     >> memoryReadInt(row_ptr_handle, (m +1) *4,0, row_ptr,false)
>     >> memoryReadInt(col_idx_handle, NNz *4,0,col_idx,false)
>     >> memoryReadDouble(element_handle, NNz *8,0, values,false)
>     >>
>     >>
>     >> and after converting them to java.nio.Buffers, am getting results like:
>     >>
>     >>
>     >> rowBuff.get(1): 0    colBuff(1): 402653448 valBuff(1): 6.91730177312166E-310
>     >>
>     >>
>     >> Have also tried reading into BytePointers similarly with the same type
>     >> of results.  I know that the use of Javacpp obfuscates what the problem
>     >> may be.  But I believe the memory is properly allocated.
>     >>
>     >>
>     >>
>     >> Sorry for the mistake.
>     >>
>     >>
>     >> Thanks,
>     >>
>     >>
>     >> Andy
>     >>
>     >>
>     >> ------------------------------------------------------------------------
>     >> *From:* Karl Rupp <ru...@iu...>
>     >> *Sent:* Wednesday, July 20, 2016 3:50:07 PM
>     >> *To:* Andrew Palumbo; Vie...@li...
>     >> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>     >> Hi Andy,
>     >>
>     >> instead of viennacl::backend::memory_copy(), you want to use
>     >> viennacl::backend::memory_read(), which directly transfers the data into
>     >> your buffer(s).
>     >>
>     >> If you *know* that your handles are in host memory, you can even grab
>     >> the values directly via
>     >>    viennacl::linalg::host_based::detail::extract_raw_pointer<T>();
>     >> defined in viennacl/linalg/host_based/common.hpp, around line 40.
>     >>
>     >> Please let me know if you still get errors after using that.
>     >>
>     >> Best regards,
>     >> Karli
>     >>
>     >>
>     >>
>     >>
>     >> On 07/20/2016 09:05 PM, Andrew Palumbo wrote:
>     >>> Hello,
>     >>>
>     >>>
>     >>> I'm having some difficulties with compressed_matrix multiplication.
>     >>>
>     >>>
>     >>> Essentially I am copying  three buffers, the CSR conversion of an Apache
>     >>> Mahout SparseMatrix, into two compressed_matrices performing matrix
>     >>> multiplication. I am doing this in scala and Java using javacpp.
>     >>>
>     >>>
>     >>> For example, I have a 5 x 10 matrix of ~20% non-zero values which in CSR
>     >>> format looks like this:
>     >>>
>     >>>
>     >>> NNz: 12
>     >>>
>     >>> Row Pointer: [0, 1, 4, 6, 9, 12, ]
>     >>>
>     >>> Col Pointer: [9, 0, 8, 7, 2, 9, 0, 8, 9, 0, 3, 5, ]
>     >>>
>     >>> element Pointer: [0.4065367203992265, 0.04957158909682802,
>     >>> 0.5205586068847993, 0.3708618354358446, 0.6963900565931678,
>     >>> 0.8330915529787706, 0.32839112750638844, 0.7856168903297948,
>     >>> 0.4265801782090245, 0.14733066454561583, 0.9501663495824946,
>     >>> 0.9710498974366047, ]
>     >>>
>     >>> Multiplied by a similarly Sparse 10 x 5 compressed_matrix
>     >>>
>     >>> I use a CompressedMatrix wrapper which essentially wraps the
>     >>>
>     >>>      viennacl:: compressed_matrix (vcl_size_t rows, vcl_size_t cols,
>     >>> vcl_size_t nonzeros=0, viennacl::context ctx=viennacl::context())
>     >>>
>     >>> constructor as well as the
>     >>>
>     >>>      compressed_matrix (matrix_expression< const compressed_matrix,
>     >>> const compressed_matrix, op_prod > const &proxy).
>     >>>
>     >>> I have a helper function, /toVclCompressedMatrix/(..) which essentially
>     >>> does the CSR conversion from a Mahout src matrix, calls the constructor
>     >>> and uses viennacl::compressed_matrix::set(...) to set the buffers:
>     >>>
>     >>> val ompA =toVclCompressedMatrix(src = mxA, ompCtx)
>     >>> val ompB =toVclCompressedMatrix(src = mxB, ompCtx)
>     >>>
>     >>>
>     >>> and then create a new viennacl::compressed_matrix from the
>     >>> viennacl::linalg::prod of the 2 matrices i.e.:
>     >>>
>     >>> val ompC =new CompressedMatrix(prod(ompA, ompB))
>     >>>
>     >>> The context in the above case is either the Host or OpenMP (I know that
>     >>> there is some special casting of the row_jumpers and col_idxs that needs
>     >>> to be done in the OpenCL version)
>     >>>
>     >>> The Matrix multiplication completes without error on small Matrices eg.
>     >>> < 300 x 300
>     >>> but seems to overwrite the resulting buffers on larger Matrices.
>     >>>
>     >>> My real problem, though, is getting the memory back out of the
>     >>> resulting `ompC` compressed_matrix so that I can write it back to a Mahout
>     >>> SparseMatrix.
>     >>>
>     >>> currently I am using:
>     >>>
>     >>> void viennacl::backend::memory_copy (mem_handle const &  src_buffer,
>     >>>          mem_handle &      dst_buffer,
>     >>>          vcl_size_t      src_offset,
>     >>>          vcl_size_t      dst_offset,
>     >>>          vcl_size_t      bytes_to_copy
>     >>>      )
>     >>>
>     >>> on the ompC.handle1, ompC.handle2 and ompC.handle source handles
>     >>>
>     >>> to copy into pre-allocated row_jumper, col_index and element buffers
>     >>> (of size ompC.size1() + 1, ompC.nnz and ompC.nnz, respectively).
>     >>>
>     >>> I am getting nonsensical values back that one would expect from memory
>     >>> errors. eg:
>     >>>
>     >>> the Matrix geometry of the result: ompC.size1(), and ompC.size2() are
>     >>> correct and ompC.nnz is a reasonable value.
>     >>>
>     >>> It is possible that I have mis-allocated some of the memory on my side,
>     >>> but I am pretty sure that most of the Buffers are allocated correctly
>     >>> (usually JavaCPP does a pretty good job of this).
>     >>>
>     >>>
>     >>> I guess, long story short, my question is am i using the correct method
>     >>> of copying the memory out of a compressed_matrix?  is there something
>     >>> glaringly incorrect that i am doing here?  Should I be using
>     >>> viennacl::backend::memory_copy or is there a different method that i
>     >>> should be using?
>     >>>
>     >>>
>     >>> Thanks very much,
>     >>>
>     >>> Andy
>     >>>
>     >>>
>     >>>
>     >>>
>     >>>
>     >>>
>     >>>
>     >>>
>     >>>
>     >>>
>     >>>
>     >>
>     >
>
>
>
>

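The memory_read() correction quoted in the message above comes down to two points: the byte offset comes before the byte count, and the row jumper buffer holds rows + 1 indices while the column and value buffers hold nnz entries each. The sketch below illustrates only that arithmetic and argument order. read_bytes is a hypothetical stand-in built on memcpy, not the ViennaCL call, and the index width is taken as sizeof(unsigned int), assumed to be the 4 bytes used in the thread.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a backend read call; NOT the ViennaCL API.
// It only mirrors the argument order described in the thread:
// source, byte offset, number of bytes, destination.
void read_bytes(const void* src, std::size_t offset, std::size_t bytes, void* dst)
{
  if (bytes == 0) return;  // nothing to copy for empty buffers
  std::memcpy(dst, static_cast<const char*>(src) + offset, bytes);
}

// Host-side copy of the three CSR buffers.
struct HostCsr
{
  std::vector<unsigned int> row_jumper;
  std::vector<unsigned int> col_idx;
  std::vector<double>       values;
};

// Reads back a CSR matrix with the given number of rows and nonzeros.
HostCsr read_back(const void* row_src, const void* col_src, const void* val_src,
                  std::size_t rows, std::size_t nnz)
{
  HostCsr out;
  out.row_jumper.resize(rows + 1);  // rows + 1 row starts
  out.col_idx.resize(nnz);          // one column index per nonzero
  out.values.resize(nnz);           // one value per nonzero

  // Offset 0 comes second, the byte count third.
  read_bytes(row_src, 0, (rows + 1) * sizeof(unsigned int), out.row_jumper.data());
  read_bytes(col_src, 0, nnz * sizeof(unsigned int), out.col_idx.data());
  read_bytes(val_src, 0, nnz * sizeof(double), out.values.data());
  return out;
}
```

With the arguments swapped, the intended byte count is interpreted as an offset and zero bytes are requested, which would leave the destination buffers holding whatever they contained before, consistent with the garbage values shown earlier in the thread.
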
Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Dmitriy L. <dl...@gm...> - 2016-07-22 21:44:01

yes. this seems to be the case. if i force out-of-order CSR into in-order
CSR everything seems to work. Can't see the documentation explicitly
mentioning this if this is the case indeed.

Karl, can you please confirm only in-order CSRs are supported? Thanks!
-Dmitriy

On Fri, Jul 22, 2016 at 12:57 PM, Dmitriy Lyubimov <dl...@gm...>
wrote:

> PS
> (4) column indices admit out-of-order placements of elements within each
> row.
>
> Thank you.
> -Dmitriy
>
> On Fri, Jul 22, 2016 at 12:56 PM, Dmitriy Lyubimov <dl...@gm...>
> wrote:
>
>> I think I still am getting seg faults on attempt to multiply matrices
>> even without conversion back (larger arguments, 3k x 1k)
>>
>> I re-wrote another alternative transformation procedure and see nothing
>> wrong with it. Both Andrew's code and mine fail with the same symptoms.
>>
>> Karl, can we verify assumptions about the format:
>>
>> (1) the compressed_matrix.set method expects host memory pointers.
>> (2) the format is compressed row storage (CSR). Documentation never says
>> explicitly that, and actually seems to have errors in size of elements and
>> jumper arrays (it says jumper array has to be cols+1 long whereas in CSR it
>> should actually be rows + 1 long, right? )
>> (3) the element sizes of jumper and column indices arrays are 32 bit and
>> are in little endian order (at least for the OpenMP backend).
>>
>> Right now I can't even get OpenMP sparse multiplication to work although
>> CSR format is not rocket science at all. Don't see a problem anywhere.
>> Tried to read Vienna's code to confirm the assumptions above, but this
>> seems to be pretty elusive for the time being.
>>
>>
>> On Fri, Jul 22, 2016 at 10:26 AM, Andrew Palumbo <ap...@ou...>
>> wrote:
>>
>>> Yep thats it.  Oh wow- well thats just embarrassing 😊.
>>>
>>>
>>> Thanks very much for your time, Karl- much appreciated.
>>>
>>>
>>> Andy
>>> ------------------------------
>>> *From:* Karl Rupp <ru...@iu...>
>>> *Sent:* Friday, July 22, 2016 12:39:20 PM
>>> *To:* Andrew Palumbo; viennacl-devel
>>> *Subject:* Re: [ViennaCL-devel] Copying Values out of a
>>> compressed_matrix
>>>
>>> Hi,
>>>
>>> your second and third arguments to memory_read() are incorrect:
>>> The second argument is the offset from the beginning, the third argument
>>> is the number of bytes to be read. Shifting the zero to the second
>>> position fixes the snippet (plus correcting the loop bounds when
>>> printing at the end) :-)
>>>
>>> Best regards,
>>> Karli
>>>
>>>
>>>
>>> On 07/22/2016 08:51 AM, Andrew Palumbo wrote:
>>> > a couple of small mistakes in the previous c++ file:
>>> >
>>> >
>>> > The memory_read(..) call should be:
>>> >
>>> >
>>> >    // read data back into our product buffers
>>> >    viennacl::backend::memory_read(handle1, product_size_row * 4, 0,
>>> > product_row_ptr, false);
>>> >    viennacl::backend::memory_read(handle2, product_NNz * 4, 0,
>>> > product_col_ptr, false);
>>> >    viennacl::backend::memory_read(handle, product_NNz * 8, 0,
>>> > product_values_ptr, false);
>>> >
>>> >
>>> > (read product_NNz * x bytes instead of product_size_row * x)
>>> >
>>> >
>>> > I've attached the corrected file.
>>> >
>>> >
>>> > Thanks
>>> >
>>> >
>>> > Andy
>>> >
>>> >
>>> ------------------------------------------------------------------------
>>> > *From:* Andrew Palumbo <ap...@ou...>
>>> > *Sent:* Thursday, July 21, 2016 11:03:59 PM
>>> > *To:* Karl Rupp; viennacl-devel
>>> > *Subject:* Re: [ViennaCL-devel] Copying Values out of a
>>> compressed_matrix
>>> >
>>> > Hello,
>>> >
>>> >
>>> > I've mocked up a sample of the compressed_matrix multiplication that
>>> > I've been working with javacpp on in C++.  I am seeing the same type of
>>> > memory errors when I try to read the data out of product, and into the
>>> > output buffers as I was with javacpp.  By printing the matrix to stdout
>>> > as in the compressed_matrix example we can see that there are values
>>> > there, and they seem reasonable,  but when i use
>>> > backend::memory_read(...)  to retrive the buffers, I'm getting values
>>> > consistent with a memory error, and similar to what i was seeing in the
>>> > javacpp code.  Maybe I am not using the handles correctly?  Admittedly
>>> > my C++ is more than rusty, but I believe I am referencing the buffers
>>> > correctly in the output.
>>> >
>>> >
>>> > Below is the output of the attached file: sparse.cpp
>>> >
>>> >
>>> > Thanks very much,
>>> >
>>> >
>>> > Andy
>>> >
>>> >
>>> >
>>> > ViennaCL: compressed_matrix of size (10, 10) with 24 nonzeros:
>>> >    (1, 2)    0.329908
>>> >    (1, 3)    0.0110522
>>> >    (1, 4)    0.336839
>>> >    (2, 5)    0.0150778
>>> >    (2, 7)    0.0143518
>>> >    (3, 3)    0.217256
>>> >    (3, 6)    0.346854
>>> >    (3, 9)    0.45353
>>> >    (4, 3)    0.407954
>>> >    (4, 6)    0.651308
>>> >    (5, 2)    0.676061
>>> >    (5, 3)    0.0226486
>>> >    (5, 4)    0.690264
>>> >    (6, 5)    0.0998838
>>> >    (6, 7)    0.0950744
>>> >    (7, 2)    0.346173
>>> >    (7, 3)    0.0115971
>>> >    (7, 4)    0.353446
>>> >    (7, 9)    0.684458
>>> >    (8, 5)    0.0448123
>>> >    (8, 7)    0.0426546
>>> >    (8, 9)    0.82782
>>> >    (9, 5)    0.295356
>>> >    (9, 7)    0.281134
>>> >
>>> > row jumpers: [
>>> > -36207072,32642,-39708721,32642,6390336,0,2012467744,32767,2012467968,32767,4203729,]
>>> > col ptrs: [
>>> > 0,0,-39655605,32642,-36207072,32642,6390336,0,10,0,-39672717,32642,2012466352,32767,-32892691,32642,1,0,6390336,0,2012466344,32767,60002304,2059362829,]
>>> > elements: [
>>> > 0.289516,0.304161,0.795779,0.334456,0.935264,0.585813,0.871237,0.811508,0.828558,0.0271863,6.92683e-310,6.92683e-310,1.061e-313,1.061e-313,6.36599e-314,4.24399e-314,6.36599e-314,6.92683e-310,4.24399e-314,1.2732e-313,2.122e-313,6.95324e-310,0.406537,0.0495716,0.370862,]
>>> >
>>> >
>>> > and similarly for multiplication of 2 1x1 matrices:
>>> >
>>> > Result:
>>> >
>>> > ViennaCL: compressed_matrix of size (1, 1) with 1 nonzeros:
>>> >    (0, 0)    0.117699
>>> >
>>> > row jumpers: [
>>> > -717571424,32767,]
>>> > col ptrs: [
>>> > 6386240,]
>>> > elements: [
>>> > 0.289516,6.9479e-310,]
>>> >
>>> >
>>> >
>>> >
>>> >
>>> ------------------------------------------------------------------------
>>> > *From:* Andrew Palumbo <ap...@ou...>
>>> > *Sent:* Wednesday, July 20, 2016 5:40:31 PM
>>> > *To:* Karl Rupp; viennacl-devel
>>> > *Subject:* Re: [ViennaCL-devel] Copying Values out of a
>>> compressed_matrix
>>> >
>>> > Oops, sorry about not cc'ing all.
>>> >
>>> >
>>> > I do not get correct data back for a (Random.nextDouble() populated) 1 x
>>> > 1 Matrix.
>>> >
>>> >
>>> > A:
>>> >
>>> >    Row Pointer: [0, 1 ]
>>> >
>>> >    Col Pointer: [0 ]
>>> >    element Pointer: [0.6465821602909256 ]
>>> >
>>> >
>>> > B:
>>> >
>>> >
>>> >    Row Pointer: [0, 1 ]
>>> >    Col Pointer: [0 ]
>>> >    element Pointer: [0.9513577109193919 ]
>>> >
>>> >
>>> > C = A %*% B
>>> >
>>> >    Row Pointer: [469762248, 32632]
>>> >    Col Pointer: [469762248 ]
>>> >    element Pointer: [6.9245198744523E-310 ]
>>> >
>>> >
>>> > ouch.
>>> >
>>> >
>>> > It looks like I'm not copying the Buffers correctly at all.  I may be
>>> > using the javacpp buffers incorrectly here, or I have possibly wrapped
>>> > the viennacl::backend::memory_handle class incorrectly, so I'm using a
>>> > pointer to the wrong memory from eg. viennacl::compressed_matrix::handle.
>>> >
>>> >
>>> > I mentioned before that the multiplication completed on small (< ~300 x
>>> > 300) matrices because if I try to multiply two larger sparse matrices,
>>> > the JVM crashes with a SIGSEGV.
>>> >
>>> >
>>> > Since this code is all wrapped with javacpp, I don't really have a small
>>> > sample that I can show you (not going to dump a whole bunch of code on
>>> > you).
>>> >
>>> >
>>> > I'll keep trying to figure it out.  Pretty sure the problem is on my end
>>> > here. I really mainly wanted to ask you if I was using the correct
>>> > methods at this point, or if there was anything very obviously that I
>>> > was doing wrong.
>>> >
>>> >
>>> > Thanks a lot for your help!
>>> >
>>> >
>>> > Andy
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> >
>>> ------------------------------------------------------------------------
>>> > *From:* Karl Rupp <ru...@iu...>
>>> > *Sent:* Wednesday, July 20, 2016 5:00:36 PM
>>> > *To:* Andrew Palumbo; viennacl-devel
>>> > *Subject:* Re: [ViennaCL-devel] Copying Values out of a
>>> compressed_matrix
>>> > Hi,
>>> >
>>> > please keep viennacl-devel in CC:
>>> >
>>> > Just to clarify: Do you get incorrect values for a 1-by-1 matrix as
>>> > indicated in your sample data? In your previous email you mentioned that
>>> > results are fine for small matrices...
>>> >
>>> > I'm afraid I can only guess at the source of the error with the
>>> > information provided. Any chance that you can provide a standalone code
>>> > to reproduce the problem with reasonable effort?
>>> >
>>> > Best regards,
>>> > Karli
>>> >
>>> >
>>> >
>>> > On 07/20/2016 10:16 PM, Andrew Palumbo wrote:
>>> >> Thanks so much for your quick answer!
>>> >>
>>> >>
>>> >> I actually am sorry to say that I made a mistake when writing the last
>>> >> email, I copied the wrong signature from the VCL documentation, and
>>> then
>>> >> the mistake propagated through the rest of the e-mail.
>>> >>
>>> >>
>>> >> I am actually using viennacl::backend::memory_read().
>>> >>
>>> >>
>>> >> Eg, for the row_jumpers and column_idx, I use:
>>> >>
>>> >> @Name("backend::memory_read")
>>> >> public static native void memoryReadInt(@Const @ByRef MemHandle
>>> src_buffer,
>>> >>                                int bytes_to_read,
>>> >>                                int offset,
>>> >>                                IntPointer ptr,
>>> >>                                boolean async);
>>> >>
>>> >> and for the Values:
>>> >>
>>> >>
>>> >> @Name("backend::memory_read")
>>> >> public static native void memoryReadDouble(@Const @ByRef MemHandle
>>> src_buffer,
>>> >>                                          int bytes_to_read,
>>> >>                                          int offset,
>>> >>                                          DoublePointer ptr,
>>> >>                                          boolean async);
>>> >>
>>> >> And then call:
>>> >>
>>> >>
>>> >> memoryReadInt(row_ptr_handle, (m +1) *4,0, row_ptr,false)
>>> >> memoryReadInt(col_idx_handle, NNz *4,0,col_idx,false)
>>> >> memoryReadDouble(element_handle, NNz *8,0, values,false)
>>> >>
>>> >>
>>> >> and after converting them to java.nio.Buffers, am getting results like:
>>> >>
>>> >>
>>> >> rowBuff.get(1): 0    colBuff(1): 402653448 valBuff(1): 6.91730177312166E-310
>>> >>
>>> >>
>>> >> Have also tried reading into BytePointers similarly with the same type
>>> >> of results.  I know that the use of Javacpp obfuscates what the problem
>>> >> may be.  But I believe the memory is properly allocated.
>>> >>
>>> >>
>>> >>
>>> >> Sorry for the mistake.
>>> >>
>>> >>
>>> >> Thanks,
>>> >>
>>> >>
>>> >> Andy
>>> >>
>>> >>
>>> >>
>>> >> ------------------------------------------------------------------------
>>> >> *From:* Karl Rupp <ru...@iu...>
>>> >> *Sent:* Wednesday, July 20, 2016 3:50:07 PM
>>> >> *To:* Andrew Palumbo; Vie...@li...
>>> >> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>>> >> Hi Andy,
>>> >>
>>> >> instead of viennacl::backend::memory_copy(), you want to use
>>> >> viennacl::backend::memory_read(), which directly transfers the data
>>> >> into your buffer(s).
>>> >>
>>> >> If you *know* that your handles are in host memory, you can even grab
>>> >> the values directly via
>>> >>    viennacl::linalg::host_based::detail::extract_raw_pointer<T>();
>>> >> defined in viennacl/linalg/host_based/common.hpp, around line 40.
>>> >>
>>> >> Please let me know if you still get errors after using that.
>>> >>
>>> >> Best regards,
>>> >> Karli
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On 07/20/2016 09:05 PM, Andrew Palumbo wrote:
>>> >>> Hello,
>>> >>>
>>> >>>
>>> >>> I'm having some difficulties with compressed_matrix multiplication.
>>> >>>
>>> >>>
>>> >>> Essentially I am copying three buffers, the CSR conversion of an
>>> >>> Apache Mahout SparseMatrix, into two compressed_matrices and
>>> >>> performing matrix multiplication. I am doing this in Scala and Java
>>> >>> using JavaCPP.
>>> >>>
>>> >>>
>>> >>> For example, I have a 5 x 10 matrix of ~20% non-zero values which in
>>> >>> CSR format looks like this:
>>> >>>
>>> >>>
>>> >>> NNz: 12
>>> >>>
>>> >>> Row Pointer: [0, 1, 4, 6, 9, 12, ]
>>> >>>
>>> >>> Col Pointer: [9, 0, 8, 7, 2, 9, 0, 8, 9, 0, 3, 5, ]
>>> >>>
>>> >>> element Pointer: [0.4065367203992265, 0.04957158909682802,
>>> >>> 0.5205586068847993, 0.3708618354358446, 0.6963900565931678,
>>> >>> 0.8330915529787706, 0.32839112750638844, 0.7856168903297948,
>>> >>> 0.4265801782090245, 0.14733066454561583, 0.9501663495824946,
>>> >>> 0.9710498974366047, ]
>>> >>>
>>> >>> Multiplied by a similarly sparse 10 x 5 compressed_matrix.
>>> >>>
>>> >>> I use a CompressedMatrix wrapper which essentially wraps the
>>> >>>
>>> >>>      viennacl::compressed_matrix (vcl_size_t rows, vcl_size_t cols,
>>> >>> vcl_size_t nonzeros=0, viennacl::context ctx=viennacl::context())
>>> >>>
>>> >>> constructor as well as the
>>> >>>
>>> >>>      compressed_matrix (matrix_expression< const compressed_matrix,
>>> >>> const compressed_matrix, op_prod > const &proxy).
>>> >>>
>>> >>> I have a helper function, toVclCompressedMatrix(..), which essentially
>>> >>> does the CSR conversion from a Mahout src matrix, calls the constructor
>>> >>> and uses viennacl::compressed_matrix::set(...) to set the buffers:
>>> >>>
>>> >>> val ompA =toVclCompressedMatrix(src = mxA, ompCtx)
>>> >>> val ompB =toVclCompressedMatrix(src = mxB, ompCtx)
>>> >>>
>>> >>>
>>> >>> and then create a new viennacl::compressed_matrix from the
>>> >>> viennacl::linalg::prod of the 2 matrices, i.e.:
>>> >>>
>>> >>> val ompC =new CompressedMatrix(prod(ompA, ompB))
>>> >>>
>>> >>> The context in the above case is either the Host or OpenMP (I know
>>> >>> that there is some special casting of the row_jumpers and col_idxs
>>> >>> that needs to be done in the OpenCL version).
>>> >>>
>>> >>> The matrix multiplication completes without error on small matrices,
>>> >>> e.g. < 300 x 300, but seems to overwrite the resulting buffers on
>>> >>> larger matrices.
>>> >>>
>>> >>> My real problem, though, is getting the memory back out of the
>>> >>> resulting `ompC` compressed_matrix so that I can write it back to a
>>> >>> Mahout SparseMatrix.
>>> >>>
>>> >>> currently I am using:
>>> >>>
>>> >>> void viennacl::backend::memory_copy (mem_handle const &  src_buffer,
>>> >>>          mem_handle &      dst_buffer,
>>> >>>          vcl_size_t      src_offset,
>>> >>>          vcl_size_t      dst_offset,
>>> >>>          vcl_size_t      bytes_to_copy
>>> >>>      )
>>> >>>
>>> >>> on the ompC.handle1, ompC.handle2 and ompC.handle source handles
>>> >>>
>>> >>> to copy into pre-allocated row_jumper, col_index and element buffers
>>> >>> (of size ompC.size1() + 1, ompC.nnz and ompC.nnz, respectively).
>>> >>>
>>> >>> I am getting nonsensical values back that one would expect from
>>> >>> memory errors, e.g.:
>>> >>>
>>> >>> The matrix geometry of the result, ompC.size1() and ompC.size2(), is
>>> >>> correct and ompC.nnz is a reasonable value.
>>> >>>
>>> >>> It is possible that I have mis-allocated some of the memory on my
>>> >>> side, but I am pretty sure that most of the buffers are allocated
>>> >>> correctly (usually JavaCPP does a pretty good job of this).
>>> >>>
>>> >>>
>>> >>> I guess, long story short, my question is: am I using the correct
>>> >>> method of copying the memory out of a compressed_matrix?  Is there
>>> >>> something glaringly incorrect that I am doing here?  Should I be using
>>> >>> viennacl::backend::memory_copy or is there a different method that I
>>> >>> should be using?
>>> >>>
>>> >>>
>>> >>> Thanks very much,
>>> >>>
>>> >>> Andy
>>> >>>
>>> >>> _______________________________________________
>>> >>> ViennaCL-devel mailing list
>>> >>> Vie...@li...
>>> >>> https://lists.sourceforge.net/lists/listinfo/viennacl-devel
>>> >>>
>>> >>
>>> >

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Dmitriy L. <dl...@gm...> - 2016-07-22 21:23:13

On Fri, Jul 22, 2016 at 12:57 PM, Dmitriy Lyubimov <dl...@gm...>
 wrote:

>
> (2) the format is compressed row storage (CSR). Documentation never says
>> explicitly that,
>>
>
Correction: the reference documentation of the set() method does not mention
it.

The manual does say compressed_matrix in general operates on CSR.
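
For reference, the CSR layout under discussion can be illustrated with a small self-contained C++ sketch. It does not call ViennaCL: the `CsrArrays` struct and the `dense_to_csr` function are hypothetical names introduced only for illustration, and the 32-bit `unsigned int` index type is an assumption taken from this thread rather than a confirmed library detail. The sketch shows that the row jumper array has rows + 1 entries, the last of which equals the number of nonzeros.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical container for illustration only; not a ViennaCL type.
struct CsrArrays {
  std::vector<unsigned int> row_jumper;  // rows + 1 entries
  std::vector<unsigned int> col_index;   // one entry per nonzero
  std::vector<double>       elements;    // one entry per nonzero
};

// Convert a dense row-major matrix into CSR arrays, skipping exact zeros.
CsrArrays dense_to_csr(const std::vector<double>& dense,
                       std::size_t rows, std::size_t cols) {
  CsrArrays csr;
  csr.row_jumper.reserve(rows + 1);
  csr.row_jumper.push_back(0);  // row 0 always starts at position 0
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      const double v = dense[i * cols + j];
      if (v != 0.0) {
        csr.col_index.push_back(static_cast<unsigned int>(j));
        csr.elements.push_back(v);
      }
    }
    // Entry i + 1 marks where row i ends and row i + 1 begins.
    csr.row_jumper.push_back(static_cast<unsigned int>(csr.elements.size()));
  }
  return csr;
}
```

Because the inner loop walks the columns left to right, the column indices produced this way are ascending within every row.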

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Dmitriy L. <dl...@gm...> - 2016-07-22 19:57:48

PS
(4) column indices admit out-of-order placements of elements within each
row.

Thank you.
-Dmitriy
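
Karl's replies in this thread state that column indices have to be in ascending order for sparse matrix-matrix multiplication, so out-of-order rows need to be sorted before the buffers are handed over. A minimal sketch of such a per-row sort follows, assuming 32-bit `unsigned int` indices and `double` values; `sort_csr_rows` is a hypothetical helper name, not a ViennaCL function.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Reorder col_index and elements in place so that column indices are
// ascending within every row. row_jumper is left unchanged.
void sort_csr_rows(const std::vector<unsigned int>& row_jumper,
                   std::vector<unsigned int>& col_index,
                   std::vector<double>& elements) {
  std::vector<std::size_t> perm;
  std::vector<unsigned int> cols;
  std::vector<double> vals;
  for (std::size_t row = 0; row + 1 < row_jumper.size(); ++row) {
    const std::size_t begin = row_jumper[row];
    const std::size_t end = row_jumper[row + 1];
    // Sort the positions of this row by their column index.
    perm.resize(end - begin);
    std::iota(perm.begin(), perm.end(), begin);
    std::sort(perm.begin(), perm.end(),
              [&](std::size_t a, std::size_t b) {
                return col_index[a] < col_index[b];
              });
    // Gather the row in sorted order, then write it back.
    cols.clear();
    vals.clear();
    for (std::size_t p : perm) {
      cols.push_back(col_index[p]);
      vals.push_back(elements[p]);
    }
    std::copy(cols.begin(), cols.end(),
              col_index.begin() + static_cast<std::ptrdiff_t>(begin));
    std::copy(vals.begin(), vals.end(),
              elements.begin() + static_cast<std::ptrdiff_t>(begin));
  }
}
```

In the 5 x 10 example quoted in this thread, row 1 holds the column indices 0, 8, 7, so it is one of the rows such a pass would reorder.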

On Fri, Jul 22, 2016 at 12:56 PM, Dmitriy Lyubimov <dl...@gm...>
wrote:

> I think I still am getting seg faults on attempt to multiply matrices even
> without conversion back (larger arguments, 3k x 1k)
>
> I re-wrote another alternative transformation procedure and see nothing
> wrong with it. Both Andrew's code and mine fail with the same symptoms.
>
> Karl, can we verify assumptions about the format:
>
> (1) the compressed_matrix.set method expects host memory pointers.
> (2) the format is compressed row storage (CSR). Documentation never says
> explicitly that, and actually seems to have errors in size of elements and
> jumper arrays (it says jumper array has to be cols+1 long whereas in CSR it
> should actually be rows + 1 long, right?)
> (3) the element sizes of jumper and column indices arrays are 32 bit and
> are in little endian order (at least for the OpenMP backend).
>
> Right now I can't even get OpenMP sparse multiplication to work although CSR
> format is not rocket science at all. Don't see a problem anywhere. Tried to
> read ViennaCL's code to confirm the assumptions above, but this seems to be
> pretty elusive for the time being.
>
>
> On Fri, Jul 22, 2016 at 10:26 AM, Andrew Palumbo <ap...@ou...>
> wrote:
>
>> Yep, that's it.  Oh wow, well that's just embarrassing 😊.
>>
>>
>> Thanks very much for your time, Karl- much appreciated.
>>
>>
>> Andy
>> ------------------------------
>> *From:* Karl Rupp <ru...@iu...>
>> *Sent:* Friday, July 22, 2016 12:39:20 PM
>> *To:* Andrew Palumbo; viennacl-devel
>> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>>
>> Hi,
>>
>> your second and third arguments to memory_read() are incorrect:
>> The second argument is the offset from the beginning, the third argument
>> is the number of bytes to be read. Shifting the zero to the second
>> position fixes the snippet (plus correcting the loop bounds when
>> printing at the end) :-)
>>
>> Best regards,
>> Karli
>>
>>
>>
>> On 07/22/2016 08:51 AM, Andrew Palumbo wrote:
>> > A couple of small mistakes in the previous C++ file:
>> >
>> >
>> > The memory_read(..) call should be:
>> >
>> >
>> >    // read data back into our product buffers
>> >    viennacl::backend::memory_read(handle1, product_size_row * 4, 0,
>> > product_row_ptr, false);
>> >    viennacl::backend::memory_read(handle2, product_NNz * 4, 0,
>> > product_col_ptr, false);
>> >    viennacl::backend::memory_read(handle, product_NNz * 8, 0,
>> > product_values_ptr, false);
>> >
>> >
>> > (read product_NNz * x bytes instead of product_size_row * x)
>> >
>> >
>> > I've attached the corrected file.
>> >
>> >
>> > Thanks
>> >
>> >
>> > Andy
>> >
>> > ------------------------------------------------------------------------
>> > *From:* Andrew Palumbo <ap...@ou...>
>> > *Sent:* Thursday, July 21, 2016 11:03:59 PM
>> > *To:* Karl Rupp; viennacl-devel
>> > *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>> >
>> > Hello,
>> >
>> >
>> > I've mocked up in C++ a sample of the compressed_matrix multiplication
>> > that I've been working on with JavaCPP.  I am seeing the same type of
>> > memory errors when I try to read the data out of the product and into the
>> > output buffers as I was with JavaCPP.  By printing the matrix to stdout
>> > as in the compressed_matrix example we can see that there are values
>> > there, and they seem reasonable, but when I use
>> > backend::memory_read(...) to retrieve the buffers, I'm getting values
>> > consistent with a memory error, and similar to what I was seeing in the
>> > JavaCPP code.  Maybe I am not using the handles correctly?  Admittedly
>> > my C++ is more than rusty, but I believe I am referencing the buffers
>> > correctly in the output.
>> >
>> >
>> > Below is the output of the attached file: sparse.cpp
>> >
>> >
>> > Thanks very much,
>> >
>> >
>> > Andy
>> >
>> >
>> >
>> > ViennaCL: compressed_matrix of size (10, 10) with 24 nonzeros:
>> >    (1, 2)    0.329908
>> >    (1, 3)    0.0110522
>> >    (1, 4)    0.336839
>> >    (2, 5)    0.0150778
>> >    (2, 7)    0.0143518
>> >    (3, 3)    0.217256
>> >    (3, 6)    0.346854
>> >    (3, 9)    0.45353
>> >    (4, 3)    0.407954
>> >    (4, 6)    0.651308
>> >    (5, 2)    0.676061
>> >    (5, 3)    0.0226486
>> >    (5, 4)    0.690264
>> >    (6, 5)    0.0998838
>> >    (6, 7)    0.0950744
>> >    (7, 2)    0.346173
>> >    (7, 3)    0.0115971
>> >    (7, 4)    0.353446
>> >    (7, 9)    0.684458
>> >    (8, 5)    0.0448123
>> >    (8, 7)    0.0426546
>> >    (8, 9)    0.82782
>> >    (9, 5)    0.295356
>> >    (9, 7)    0.281134
>> >
>> > row jumpers: [
>> > -36207072,32642,-39708721,32642,6390336,0,2012467744,32767,2012467968,32767,4203729,]
>> > col ptrs: [
>> > 0,0,-39655605,32642,-36207072,32642,6390336,0,10,0,-39672717,32642,2012466352,32767,-32892691,32642,1,0,6390336,0,2012466344,32767,60002304,2059362829,]
>> > elements: [
>> > 0.289516,0.304161,0.795779,0.334456,0.935264,0.585813,0.871237,0.811508,0.828558,0.0271863,6.92683e-310,6.92683e-310,1.061e-313,1.061e-313,6.36599e-314,4.24399e-314,6.36599e-314,6.92683e-310,4.24399e-314,1.2732e-313,2.122e-313,6.95324e-310,0.406537,0.0495716,0.370862,]
>> >
>> >
>> > and similarly for multiplication of 2 1x1 matrices:
>> >
>> > Result:
>> >
>> > ViennaCL: compressed_matrix of size (1, 1) with 1 nonzeros:
>> >    (0, 0)    0.117699
>> >
>> > row jumpers: [
>> > -717571424,32767,]
>> > col ptrs: [
>> > 6386240,]
>> > elements: [
>> > 0.289516,6.9479e-310,]
>> >
>> >
>> >
>> >
>> > ------------------------------------------------------------------------
>> > *From:* Andrew Palumbo <ap...@ou...>
>> > *Sent:* Wednesday, July 20, 2016 5:40:31 PM
>> > *To:* Karl Rupp; viennacl-devel
>> > *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>> >
>> > Oops, sorry about not cc'ing all.
>> >
>> >
>> > I do not get correct data back for a (Random.nextDouble() populated) 1 x
>> > 1 Matrix.
>> >
>> >
>> > A:
>> >
>> >    Row Pointer: [0, 1 ]
>> >
>> >    Col Pointer: [0 ]
>> >    element Pointer: [0.6465821602909256 ]
>> >
>> >
>> > B:
>> >
>> >
>> >    Row Pointer: [0, 1 ]
>> >    Col Pointer: [0 ]
>> >    element Pointer: [0.9513577109193919 ]
>> >
>> >
>> > C = A %*% B
>> >
>> >    Row Pointer: [469762248, 32632]
>> >    Col Pointer: [469762248 ]
>> >    element Pointer: [6.9245198744523E-310 ]
>> >
>> >
>> > ouch.
>> >
>> >
>> > It looks like I'm not copying the buffers correctly at all.  I may be
>> > using the JavaCPP buffers incorrectly here, or I have possibly wrapped
>> > the viennacl::backend::memory_handle class incorrectly, so I'm using a
>> > pointer to the wrong memory from e.g. viennacl::compressed_matrix::handle.
>> >
>> >
>> > I mentioned before that the multiplication completed on small (<~300 x
>> > 300) matrices because if I try to multiply two larger sparse matrices,
>> > the JVM crashes with a SIGSEGV.
>> >
>> >
>> > Since this code is all wrapped with JavaCPP, I don't really have a small
>> > sample that I can show you (not going to dump a whole bunch of code on
>> > you).
>> >
>> >
>> > I'll keep trying to figure it out.  Pretty sure the problem is on my end
>> > here. I really mainly wanted to ask you if I was using the correct
>> > methods at this point, or if there was anything very obvious that I
>> > was doing wrong.
>> >
>> >
>> > Thanks a lot for your help!
>> >
>> >
>> > Andy
>> >
>> >
>> >
>> >
>> >
>> >
>> > ------------------------------------------------------------------------
>> > *From:* Karl Rupp <ru...@iu...>
>> > *Sent:* Wednesday, July 20, 2016 5:00:36 PM
>> > *To:* Andrew Palumbo; viennacl-devel
>> > *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>> > Hi,
>> >
>> > please keep viennacl-devel in CC:
>> >
>> > Just to clarify: Do you get incorrect values for a 1-by-1 matrix as
>> > indicated in your sample data? In your previous email you mentioned that
>> > results are fine for small matrices...
>> >
>> > I'm afraid I can only guess at the source of the error with the
>> > information provided. Any chance that you can provide standalone code
>> > to reproduce the problem with reasonable effort?
>> >
>> > Best regards,
>> > Karli

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Dmitriy L. <dl...@gm...> - 2016-07-22 19:56:24

I think I still am getting seg faults on attempt to multiply matrices even
without conversion back (larger arguments, 3k x 1k)

I re-wrote another alternative transformation procedure and see nothing
wrong with it. Both Andrew's code and mine fail with the same symptoms.

Karl, can we verify assumptions about the format:

(1) the compressed_matrix.set method expects host memory pointers.
(2) the format is compressed row storage (CSR). Documentation never says
explicitly that, and actually seems to have errors in size of elements and
jumper arrays (it says jumper array has to be cols+1 long whereas in CSR it
should actually be rows + 1 long, right?)
(3) the element sizes of jumper and column indices arrays are 32 bit and
are in little endian order (at least for the OpenMP backend).
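
Under assumption (3), together with the byte counts used in the quoted snippets (4 bytes per index, 8 bytes per value), the sizes of the three CSR buffers follow directly from the row count and the number of nonzeros. The sketch below only does that arithmetic. `ReadArgs`, `CsrReadPlan` and `plan_csr_reads` are hypothetical names, no ViennaCL call is made, and the field order mirrors Karl's quoted remark that memory_read takes the offset second and the number of bytes third.

```cpp
#include <cstddef>

// Hypothetical helper types for illustration; not part of ViennaCL.
struct ReadArgs {
  std::size_t offset_bytes;   // offset from the beginning of the buffer
  std::size_t bytes_to_read;  // number of bytes to transfer
};

struct CsrReadPlan {
  ReadArgs row_jumper;
  ReadArgs col_index;
  ReadArgs elements;
};

// Assumptions from the thread: 32-bit indices and 64-bit values.
static_assert(sizeof(unsigned int) == 4, "sketch assumes 32-bit indices");
static_assert(sizeof(double) == 8, "sketch assumes 64-bit doubles");

// Compute offset and byte count for reading back each CSR buffer in full.
CsrReadPlan plan_csr_reads(std::size_t rows, std::size_t nnz) {
  return CsrReadPlan{
      ReadArgs{0, (rows + 1) * sizeof(unsigned int)},  // row jumpers
      ReadArgs{0, nnz * sizeof(unsigned int)},         // column indices
      ReadArgs{0, nnz * sizeof(double)}};              // values
}
```

For the 5 x 10 example with 12 nonzeros this gives 24, 48 and 96 bytes, each read from offset 0.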

Right now I can't even get OpenMP sparse multiplication to work although CSR
format is not rocket science at all. Don't see a problem anywhere. Tried to
read ViennaCL's code to confirm the assumptions above, but this seems to be
pretty elusive for the time being.
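
One way to test these assumptions before handing buffers to the library is a host-side sanity check of the three arrays. The following is a minimal sketch, not ViennaCL code: `check_csr` is a hypothetical helper, the index type is assumed to be 32-bit `unsigned int`, and rejecting duplicate column indices within a row is an extra assumption on top of the ascending-order requirement Karl states elsewhere in this thread.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns an empty string if the arrays describe a valid in-order CSR
// matrix, otherwise a short description of the first problem found.
std::string check_csr(std::size_t rows, std::size_t cols,
                      const std::vector<unsigned int>& row_jumper,
                      const std::vector<unsigned int>& col_index,
                      const std::vector<double>& elements) {
  if (row_jumper.size() != rows + 1)
    return "row_jumper must have rows + 1 entries";
  if (col_index.size() != elements.size())
    return "col_index and elements must have the same length";
  if (row_jumper.front() != 0)
    return "row_jumper must start at 0";
  if (row_jumper.back() != col_index.size())
    return "last row_jumper entry must equal the number of nonzeros";
  for (std::size_t r = 0; r < rows; ++r) {
    const std::size_t begin = row_jumper[r];
    const std::size_t end = row_jumper[r + 1];
    if (begin > end || end > col_index.size())
      return "row_jumper must be non-decreasing and within bounds";
    for (std::size_t k = begin; k < end; ++k) {
      if (col_index[k] >= cols)
        return "column index out of range";
      if (k > begin && col_index[k - 1] >= col_index[k])
        return "column indices must be strictly ascending within each row";
    }
  }
  return "";
}
```

Applied to the 5 x 10 example quoted in this thread, a check like this flags row 1, whose column indices 0, 8, 7 are not ascending.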


On Fri, Jul 22, 2016 at 10:26 AM, Andrew Palumbo <ap...@ou...> wrote:

> Yep thats it.  Oh wow- well thats just embarrassing [image: 😊].
>
>
> Thanks very much for your time, Karl- much appreciated.
>
>
> Andy
> ------------------------------
> *From:* Karl Rupp <ru...@iu...>
> *Sent:* Friday, July 22, 2016 12:39:20 PM
> *To:* Andrew Palumbo; viennacl-devel
> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>
> Hi,
>
> your second and third arguments to memory_read() are incorrect:
> The second argument is the offset from the beginning, the third argument
> is the number of bytes to be read. Shifting the zero to the second
> position fixes the snippet (plus correcting the loop bounds when
> printing at the end) :-)
>
> Best regards,
> Karli
>
>
>
> On 07/22/2016 08:51 AM, Andrew Palumbo wrote:
> > a couple of small mistakes in the previous c++ file:
> >
> >
> > The memory_read(..) call should be:
> >
> >
> >    // read data back into our product buffers
> >    viennacl::backend::memory_read(handle1, product_size_row * 4, 0,
> > product_row_ptr, false);
> >    viennacl::backend::memory_read(handle2, product_NNz * 4, 0,
> > product_col_ptr, false);
> >    viennacl::backend::memory_read(handle, product_NNz * 8, 0,
> > product_values_ptr, false);
> >
> >
> > (read product_NNz * x bytes instead of product_size_row * x)
> >
> >
> > I've attached the corrected file.
> >
> >
> > Thanks
> >
> >
> > Andy
> >
> > ------------------------------------------------------------------------
> > *From:* Andrew Palumbo <ap...@ou...>
> > *Sent:* Thursday, July 21, 2016 11:03:59 PM
> > *To:* Karl Rupp; viennacl-devel
> > *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
> >
> > Hello,
> >
> >
> > I've mocked up a sample of the compressed_matrix multiplication that
> > I've been working with javacpp on in C++.  I am seeing the same type of
> > memory errors when I try to read the data out of product, and into the
> > output buffers as I was with javacpp.  By printing the matrix to stdout
> > as in the compressed_matrix example we can see that there are values
> > there, and they seem reasonable,  but when i use
> > backend::memory_read(...)  to retrive the buffers, I'm getting values
> > consistent with a memory error, and similar to what i was seeing in the
> > javacpp code.  Maybe I am not using the handles correctly?  Admittedly
> > my C++ is more than rusty, but I believe I am referencing the buffers
> > correctly in the output.
> >
> >
> > Below is the output of the attached file: sparse.cpp
> >
> >
> > Thanks very much,
> >
> >
> > Andy
> >
> >
> >
> > ViennaCL: compressed_matrix of size (10, 10) with 24 nonzeros:
> >    (1, 2)    0.329908
> >    (1, 3)    0.0110522
> >    (1, 4)    0.336839
> >    (2, 5)    0.0150778
> >    (2, 7)    0.0143518
> >    (3, 3)    0.217256
> >    (3, 6)    0.346854
> >    (3, 9)    0.45353
> >    (4, 3)    0.407954
> >    (4, 6)    0.651308
> >    (5, 2)    0.676061
> >    (5, 3)    0.0226486
> >    (5, 4)    0.690264
> >    (6, 5)    0.0998838
> >    (6, 7)    0.0950744
> >    (7, 2)    0.346173
> >    (7, 3)    0.0115971
> >    (7, 4)    0.353446
> >    (7, 9)    0.684458
> >    (8, 5)    0.0448123
> >    (8, 7)    0.0426546
> >    (8, 9)    0.82782
> >    (9, 5)    0.295356
> >    (9, 7)    0.281134
> >
> > row jumpers: [
> > -36207072,32642,-39708721,32642,6390336,0,2012467744,32767,2012467968
> ,32767,4203729,]
> > col ptrs: [
> >
> 0,0,-39655605,32642,-36207072,32642,6390336,0,10,0,-39672717,32642,2012466352,32767,-32892691,32642,1,0,6390336,0,2012466344,32767,60002304,2059362829,]
> > elements: [
> >
> 0.289516,0.304161,0.795779,0.334456,0.935264,0.585813,0.871237,0.811508,0.828558,0.0271863,6.92683e-310,6.92683e-310,1.061e-313,1.061e-313,6.36599e-314,4.24399e-314,6.36599e-314,6.92683e-310,4.24399e-314,1.2732e-313,2.122e-313,6.95324e-310,0.406537,0.0495716,0.370862,]
> >
> >
> > and similarly for multiplication of 2 1x1 matrices:
> >
> > Result:
> >
> > ViennaCL: compressed_matrix of size (1, 1) with 1 nonzeros:
> >    (0, 0)    0.117699
> >
> > row jumpers: [
> > -717571424,32767,]
> > col ptrs: [
> > 6386240,]
> > elements: [
> > 0.289516,6.9479e-310,]
> >
> >
> >
> >
> > ------------------------------------------------------------------------
> > *From:* Andrew Palumbo <ap...@ou...>
> > *Sent:* Wednesday, July 20, 2016 5:40:31 PM
> > *To:* Karl Rupp; viennacl-devel
> > *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
> >
> > Oops, sorry about not cc'ing all.
> >
> >
> > I do not get correct data back for a (Random.nextDouble() populated) 1 x
> > 1 Matrix.
> >
> >
> > A:
> >
> >    Row Pointer: [0, 1 ]
> >
> >    Col Pointer: [0 ]
> >    element Pointer: [0.6465821602909256 ]
> >
> >
> > B:
> >
> >
> >    Row Pointer: [0, 1 ]
> >    Col Pointer: [0 ]
> >    element Pointer: [0.9513577109193919 ]
> >
> >
> > C = A %*% B
> >
> >    Row Pointer: [469762248, 32632]
> >    Col Pointer: [469762248 ]
> >    element Pointer: [6.9245198744523E-310 ]
> >
> >
> > ouch.
> >
> >
> > It looks like I'm not copying the Buffers correctly at all.  I'm may be
> > using the javacpp buffers incorrectly here, or I have possibly wrapped
> > the viennacl::backend::memory_handle class incorrectly, so I'm using a
> > pointer to the wrong memory from eg. viennacl::compressed_matrix::handle.
> >
> >
> > I mentioned before that the multiplication completed in on small <~300 x
> > 300 matrices because if I try to multiply two larger sparse matrices, an
> > err the JVM crashes with a SIGSEGV.
> >
> >
> > Since this code is all wrapped with javacpp, I don't really have a small
> > sample that I can show you (not going to dump a whole bunch of code on
> > you).
> >
> >
> > I'll keep trying to figure it out.  Pretty sure the problem is on my end
> > here �� I really mainly wanted to ask you if I was using the correct
> > methods at this point, or if there was anything very obviously that I
> > was doing wrong.
> >
> >
> > Thanks a lot for your help!
> >
> >
> > Andy
> >
> >
> >
> >
> >
> >
> > ------------------------------------------------------------------------
> > *From:* Karl Rupp <ru...@iu...>
> > *Sent:* Wednesday, July 20, 2016 5:00:36 PM
> > *To:* Andrew Palumbo; viennacl-devel
> > *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
> > Hi,
> >
> > please keep viennacl-devel in CC:
> >
> > Just to clarify: Do you get incorrect values for a 1-by-1 matrix as
> > indicated in your sample data? In your previous email you mentioned that
> > results are fine for small matrices...
> >
> > I'm afraid I can only guess at the source of the error with the
> > informations provided. Any chance that you can provide a standalone code
> > to reproduce the problem with reasonable effort?
> >
> > Best regards,
> > Karli
> >
> >
> >
> > On 07/20/2016 10:16 PM, Andrew Palumbo wrote:
> >> Thanks so much for your quick answer!
> >>
> >>
> >> I actually am sorry to say that I made a mistake when writing the last
> >> email, I copied the wrong signature from the VCL documentation, and then
> >> the mistake propagated through the rest of the e-mail.
> >>
> >>
> >> I am actually using viennacl::backend::memory_read().
> >>
> >>
> >> E.g., for the row_jumpers and column_idx I use:
> >>
> >> @Name("backend::memory_read")
> >> public static native void memoryReadInt(@Const @ByRef MemHandle src_buffer,
> >>                                int bytes_to_read,
> >>                                int offset,
> >>                                IntPointer ptr,
> >>                                boolean async);
> >>
> >> and for the Values:
> >>
> >>
> >> @Name("backend::memory_read")
> >> public static native void memoryReadDouble(@Const @ByRef MemHandle src_buffer,
> >>                                          int bytes_to_read,
> >>                                          int offset,
> >>                                          DoublePointer ptr,
> >>                                          boolean async);
> >>
> >> And then call:
> >>
> >>
> >> memoryReadInt(row_ptr_handle, (m +1) *4,0, row_ptr,false)
> >> memoryReadInt(col_idx_handle, NNz *4,0,col_idx,false)
> >> memoryReadDouble(element_handle, NNz *8,0, values,false)
> >>
> >>
> >> and after converting them to java.nio.Buffers, am getting results like:
> >>
> >>
> >> rowBuff.get(1): 0    colBuff(1): 402653448 valBuff(1): 6.91730177312166E-310
> >>
> >>
> >> Have also tried reading into BytePointers similarly with the same type
> >> of results.  I know that the use of Javacpp obfuscates what the problem
> >> may be.  But I believe the memory is properly allocated.
> >>
> >>
> >>
> >> Sorry for the mistake.
> >>
> >>
> >> Thanks,
> >>
> >>
> >> Andy
> >>
> >>
> >> ------------------------------------------------------------------------
> >> *From:* Karl Rupp <ru...@iu...>
> >> *Sent:* Wednesday, July 20, 2016 3:50:07 PM
> >> *To:* Andrew Palumbo; Vie...@li...
> >> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
> >> Hi Andy,
> >>
> >> instead of viennacl::backend::memory_copy(), you want to use
> >> viennacl::backend::memory_read(), which directly transfers the data into
> >> your buffer(s).
> >>
> >> If you *know* that your handles are in host memory, you can even grab
> >> the values directly via
> >>    viennacl::linalg::host_based::detail::extract_raw_pointer<T>();
> >> defined in viennacl/linalg/host_based/common.hpp, around line 40.
> >>
> >> Please let me know if you still get errors after using that.
> >>
> >> Best regards,
> >> Karli
> >>
> >>
> >>
> >>
> >> On 07/20/2016 09:05 PM, Andrew Palumbo wrote:
> >>> Hello,
> >>>
> >>>
> >>> I'm having some difficulties with compressed_matrix multiplication.
> >>>
> >>>
> >>> Essentially I am copying three buffers, the CSR conversion of an Apache
> >>> Mahout SparseMatrix, into two compressed_matrices and performing matrix
> >>> multiplication. I am doing this in Scala and Java using javacpp.
> >>>
> >>>
> >>> For example, I have a 5 x 10 matrix of ~20% non-zero values which in CSR
> >>> format looks like this:
> >>>
> >>>
> >>> NNz: 12
> >>>
> >>> Row Pointer: [0, 1, 4, 6, 9, 12, ]
> >>>
> >>> Col Pointer: [9, 0, 8, 7, 2, 9, 0, 8, 9, 0, 3, 5, ]
> >>>
> >>> element Pointer: [0.4065367203992265, 0.04957158909682802,
> >>> 0.5205586068847993, 0.3708618354358446, 0.6963900565931678,
> >>> 0.8330915529787706, 0.32839112750638844, 0.7856168903297948,
> >>> 0.4265801782090245, 0.14733066454561583, 0.9501663495824946,
> >>> 0.9710498974366047, ]
> >>>
> >>> Multiplied by a similarly Sparse 10 x 5 compressed_matrix
> >>>
> >>> I use a CompressedMatrix wrapper which essentially wraps the
> >>>
> >>>      viennacl:: compressed_matrix (vcl_size_t rows, vcl_size_t cols,
> >>> vcl_size_t nonzeros=0, viennacl::context ctx=viennacl::context())
> >>>
> >>> constructor as well as the
> >>>
> >>>      compressed_matrix (matrix_expression< const compressed_matrix,
> >>> const compressed_matrix, op_prod > const &proxy).
> >>>
> >>> I have a helper function, /toVclCompressedMatrix/(..) which essentially
> >>> does the CSR conversion from a Mahout src matrix, calls the constructor
> >>> and uses viennacl::compressed_matrix::set(...) to set the buffers:
> >>>
> >>> val ompA =toVclCompressedMatrix(src = mxA, ompCtx)
> >>> val ompB =toVclCompressedMatrix(src = mxB, ompCtx)
> >>>
> >>>
> >>> and then create a new viennacl::compressed_matrix from the
> >>> viennacl::linalg::prod of the 2 matrices i.e.:
> >>>
> >>> val ompC =new CompressedMatrix(prod(ompA, ompB))
> >>>
> >>> The context in the above case is either the Host or OpenMP (I know that
> >>> there is some special casting of the row_jumpers and col_idxs that needs
> >>> to be done in the OpenCL version)
> >>>
> >>> The Matrix multiplication completes without error on small Matrices eg.
> >>> < 300 x 300
> >>> but seems to overwrite the resulting buffers on larger Matrices.
> >>>
> >>> My real problem, though, is getting the memory back out of the
> >>> resulting `ompC` compressed_matrix so that I can write it back to a Mahout
> >>> SparseMatrix.
> >>>
> >>> currently I am using:
> >>>
> >>> void viennacl::backend::memory_copy (mem_handle const &  src_buffer,
> >>>          mem_handle &      dst_buffer,
> >>>          vcl_size_t      src_offset,
> >>>          vcl_size_t      dst_offset,
> >>>          vcl_size_t      bytes_to_copy
> >>>      )
> >>>
> >>> on the ompC.handle1, ompC.handle2 and ompC.handle source handles
> >>>
> >>> to copy into pre-allocated row_jumper, col_index and element buffers
> >>> (of size ompC.size1() + 1, ompC.nnz and ompC.nnz, respectively).
> >>>
> >>> I am getting nonsensical values back that one would expect from memory
> >>> errors. eg:
> >>>
> >>> the Matrix geometry of the result: ompC.size1(), and ompC.size2() are
> >>> correct and ompC.nnz is a reasonable value.
> >>>
> >>> It is possible that I have mis-allocated some of the memory on my side,
> >>> but I am pretty sure that most of the Buffers are allocated correctly
> >>> (usually JavaCPP does a pretty good job of this).
> >>>
> >>>
> >>> I guess, long story short, my question is am i using the correct method
> >>> of copying the memory out of a compressed_matrix?  is there something
> >>> glaringly incorrect that i am doing here?  Should I be using
> >>> viennacl::backend::memory_copy or is there a different method that i
> >>> should be using?
> >>>
> >>>
> >>> Thanks very much,
> >>>
> >>> Andy
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>
> >
>
>
>
>
>

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Andrew P. <ap...@ou...> - 2016-07-22 17:26:32

Yep, that's it. Oh wow, well, that's just embarrassing 😊.


Thanks very much for your time, Karl. Much appreciated.


Andy

________________________________
From: Karl Rupp <ru...@iu...>
Sent: Friday, July 22, 2016 12:39:20 PM
To: Andrew Palumbo; viennacl-devel
Subject: Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

Hi,

your second and third arguments to memory_read() are incorrect:
The second argument is the offset from the beginning, the third argument
is the number of bytes to be read. Shifting the zero to the second
position fixes the snippet (plus correcting the loop bounds when
printing at the end) :-)

Best regards,
Karli
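
For reference, here is a minimal C++ sketch of how the byte counts passed to memory_read() for the three CSR buffers can be derived from the matrix dimensions instead of being hard-coded as 4 and 8. It assumes `unsigned int` indices and `double` values, matching the sizes used in this thread for the host/OpenMP case; the names `CsrByteCounts` and `csr_byte_counts` are hypothetical and not part of ViennaCL.

```cpp
#include <cassert>
#include <cstddef>

// Byte counts for the three CSR buffers of a matrix with `rows` rows and
// `nnz` stored nonzeros. Hypothetical helper type, not a ViennaCL class.
struct CsrByteCounts {
  std::size_t row_jumpers;  // (rows + 1) index entries
  std::size_t col_indices;  // nnz index entries
  std::size_t elements;     // nnz values
};

// Assumption: indices are `unsigned int` and values are `double`.
inline CsrByteCounts csr_byte_counts(std::size_t rows, std::size_t nnz)
{
  CsrByteCounts c;
  c.row_jumpers = (rows + 1) * sizeof(unsigned int);
  c.col_indices = nnz * sizeof(unsigned int);
  c.elements    = nnz * sizeof(double);
  return c;
}
```

For the 5 x 10 example with 12 nonzeros from this thread, this corresponds to 6 index entries for the row jumpers and 12 entries each for the column indices and the elements.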



On 07/22/2016 08:51 AM, Andrew Palumbo wrote:
> a couple of small mistakes in the previous c++ file:
>
>
> The memory_read(..) call should be:
>
>
>    // read data back into our product buffers
>    viennacl::backend::memory_read(handle1, product_size_row * 4, 0,
> product_row_ptr, false);
>    viennacl::backend::memory_read(handle2, product_NNz * 4, 0,
> product_col_ptr, false);
>    viennacl::backend::memory_read(handle, product_NNz * 8, 0,
> product_values_ptr, false);
>
>
> (read product_NNz * x bytes instead of product_size_row * x)
>
>
> I've attached the corrected file.
>
>
> Thanks
>
>
> Andy
>
> ------------------------------------------------------------------------
> *From:* Andrew Palumbo <ap...@ou...>
> *Sent:* Thursday, July 21, 2016 11:03:59 PM
> *To:* Karl Rupp; viennacl-devel
> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>
> Hello,
>
>
> I've mocked up a sample of the compressed_matrix multiplication that
> I've been working with javacpp on in C++.  I am seeing the same type of
> memory errors when I try to read the data out of product, and into the
> output buffers as I was with javacpp.  By printing the matrix to stdout
> as in the compressed_matrix example we can see that there are values
> there, and they seem reasonable,  but when I use
> backend::memory_read(...)  to retrieve the buffers, I'm getting values
> consistent with a memory error, and similar to what I was seeing in the
> javacpp code.  Maybe I am not using the handles correctly?  Admittedly
> my C++ is more than rusty, but I believe I am referencing the buffers
> correctly in the output.
>
>
> Below is the output of the attached file: sparse.cpp
>
>
> Thanks very much,
>
>
> Andy
>
>
>
> ViennaCL: compressed_matrix of size (10, 10) with 24 nonzeros:
>    (1, 2)    0.329908
>    (1, 3)    0.0110522
>    (1, 4)    0.336839
>    (2, 5)    0.0150778
>    (2, 7)    0.0143518
>    (3, 3)    0.217256
>    (3, 6)    0.346854
>    (3, 9)    0.45353
>    (4, 3)    0.407954
>    (4, 6)    0.651308
>    (5, 2)    0.676061
>    (5, 3)    0.0226486
>    (5, 4)    0.690264
>    (6, 5)    0.0998838
>    (6, 7)    0.0950744
>    (7, 2)    0.346173
>    (7, 3)    0.0115971
>    (7, 4)    0.353446
>    (7, 9)    0.684458
>    (8, 5)    0.0448123
>    (8, 7)    0.0426546
>    (8, 9)    0.82782
>    (9, 5)    0.295356
>    (9, 7)    0.281134
>
> row jumpers: [
> -36207072,32642,-39708721,32642,6390336,0,2012467744,32767,2012467968,32767,4203729,]
> col ptrs: [
> 0,0,-39655605,32642,-36207072,32642,6390336,0,10,0,-39672717,32642,2012466352,32767,-32892691,32642,1,0,6390336,0,2012466344,32767,60002304,2059362829,]
> elements: [
> 0.289516,0.304161,0.795779,0.334456,0.935264,0.585813,0.871237,0.811508,0.828558,0.0271863,6.92683e-310,6.92683e-310,1.061e-313,1.061e-313,6.36599e-314,4.24399e-314,6.36599e-314,6.92683e-310,4.24399e-314,1.2732e-313,2.122e-313,6.95324e-310,0.406537,0.0495716,0.370862,]
>
>
> and similarly for multiplication of 2 1x1 matrices:
>
> Result:
>
> ViennaCL: compressed_matrix of size (1, 1) with 1 nonzeros:
>    (0, 0)    0.117699
>
> row jumpers: [
> -717571424,32767,]
> col ptrs: [
> 6386240,]
> elements: [
> 0.289516,6.9479e-310,]
>
>
>
>
> ------------------------------------------------------------------------
> *From:* Andrew Palumbo <ap...@ou...>
> *Sent:* Wednesday, July 20, 2016 5:40:31 PM
> *To:* Karl Rupp; viennacl-devel
> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>
> Oops, sorry about not cc'ing all.
>
>
> I do not get correct data back for a (Random.nextDouble() populated) 1 x
> 1 Matrix.
>
>
> A:
>
>    Row Pointer: [0, 1 ]
>
>    Col Pointer: [0 ]
>    element Pointer: [0.6465821602909256 ]
>
>
> B:
>
>
>    Row Pointer: [0, 1 ]
>    Col Pointer: [0 ]
>    element Pointer: [0.9513577109193919 ]
>
>
> C = A %*% B
>
>    Row Pointer: [469762248, 32632]
>    Col Pointer: [469762248 ]
>    element Pointer: [6.9245198744523E-310 ]
>
>
> ouch.
>
>
> It looks like I'm not copying the Buffers correctly at all.  I may be
> using the javacpp buffers incorrectly here, or I have possibly wrapped
> the viennacl::backend::memory_handle class incorrectly, so I'm using a
> pointer to the wrong memory from e.g. viennacl::compressed_matrix::handle.
>
>
> I mentioned before that the multiplication completed on small (<~300 x
> 300) matrices because if I try to multiply two larger sparse matrices,
> the JVM crashes with a SIGSEGV.
>
>
> Since this code is all wrapped with javacpp, I don't really have a small
> sample that I can show you (not going to dump a whole bunch of code on
> you).
>
>
> I'll keep trying to figure it out.  Pretty sure the problem is on my end
> here; I really mainly wanted to ask you if I was using the correct
> methods at this point, or if there was anything very obvious that I
> was doing wrong.
>
>
> Thanks a lot for your help!
>
>
> Andy
>
>
>
>
>
>
> ------------------------------------------------------------------------
> *From:* Karl Rupp <ru...@iu...>
> *Sent:* Wednesday, July 20, 2016 5:00:36 PM
> *To:* Andrew Palumbo; viennacl-devel
> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
> Hi,
>
> please keep viennacl-devel in CC:
>
> Just to clarify: Do you get incorrect values for a 1-by-1 matrix as
> indicated in your sample data? In your previous email you mentioned that
> results are fine for small matrices...
>
> I'm afraid I can only guess at the source of the error with the
> information provided. Any chance that you can provide standalone code
> to reproduce the problem with reasonable effort?
>
> Best regards,
> Karli
>
>
>
> On 07/20/2016 10:16 PM, Andrew Palumbo wrote:
>> Thanks so much for your quick answer!
>>
>>
>> I actually am sorry to say that I made a mistake when writing the last
>> email, I copied the wrong signature from the VCL documentation, and then
>> the mistake propagated through the rest of the e-mail.
>>
>>
>> I am actually using viennacl::backend::memory_read().
>>
>>
>> E.g., for the row_jumpers and column_idx I use:
>>
>> @Name("backend::memory_read")
>> public static native void memoryReadInt(@Const @ByRef MemHandle src_buffer,
>>                                int bytes_to_read,
>>                                int offset,
>>                                IntPointer ptr,
>>                                boolean async);
>>
>> and for the Values:
>>
>>
>> @Name("backend::memory_read")
>> public static native void memoryReadDouble(@Const @ByRef MemHandle src_buffer,
>>                                          int bytes_to_read,
>>                                          int offset,
>>                                          DoublePointer ptr,
>>                                          boolean async);
>>
>> And then call:
>>
>>
>> memoryReadInt(row_ptr_handle, (m +1) *4,0, row_ptr,false)
>> memoryReadInt(col_idx_handle, NNz *4,0,col_idx,false)
>> memoryReadDouble(element_handle, NNz *8,0, values,false)
>>
>>
>> and after converting them to java.nio.Buffers, am getting results like:
>>
>>
>> rowBuff.get(1): 0    colBuff(1): 402653448 valBuff(1): 6.91730177312166E-310
>>
>>
>> Have also tried reading into BytePointers similarly with the same type
>> of results.  I know that the use of Javacpp obfuscates what the problem
>> may be.  But I believe the memory is properly allocated.
>>
>>
>>
>> Sorry for the mistake.
>>
>>
>> Thanks,
>>
>>
>> Andy
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Karl Rupp <ru...@iu...>
>> *Sent:* Wednesday, July 20, 2016 3:50:07 PM
>> *To:* Andrew Palumbo; Vie...@li...
>> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>> Hi Andy,
>>
>> instead of viennacl::backend::memory_copy(), you want to use
>> viennacl::backend::memory_read(), which directly transfers the data into
>> your buffer(s).
>>
>> If you *know* that your handles are in host memory, you can even grab
>> the values directly via
>>    viennacl::linalg::host_based::detail::extract_raw_pointer<T>();
>> defined in viennacl/linalg/host_based/common.hpp, around line 40.
>>
>> Please let me know if you still get errors after using that.
>>
>> Best regards,
>> Karli
>>
>>
>>
>>
>> On 07/20/2016 09:05 PM, Andrew Palumbo wrote:
>>> Hello,
>>>
>>>
>>> I'm having some difficulties with compressed_matrix multiplication.
>>>
>>>
>>> Essentially I am copying three buffers, the CSR conversion of an Apache
>>> Mahout SparseMatrix, into two compressed_matrices and performing matrix
>>> multiplication. I am doing this in Scala and Java using javacpp.
>>>
>>>
>>> For example, I have a 5 x 10 matrix of ~20% non-zero values which in CSR
>>> format looks like this:
>>>
>>>
>>> NNz: 12
>>>
>>> Row Pointer: [0, 1, 4, 6, 9, 12, ]
>>>
>>> Col Pointer: [9, 0, 8, 7, 2, 9, 0, 8, 9, 0, 3, 5, ]
>>>
>>> element Pointer: [0.4065367203992265, 0.04957158909682802,
>>> 0.5205586068847993, 0.3708618354358446, 0.6963900565931678,
>>> 0.8330915529787706, 0.32839112750638844, 0.7856168903297948,
>>> 0.4265801782090245, 0.14733066454561583, 0.9501663495824946,
>>> 0.9710498974366047, ]
>>>
>>> Multiplied by a similarly Sparse 10 x 5 compressed_matrix
>>>
>>> I use a CompressedMatrix wrapper which essentially wraps the
>>>
>>>      viennacl:: compressed_matrix (vcl_size_t rows, vcl_size_t cols,
>>> vcl_size_t nonzeros=0, viennacl::context ctx=viennacl::context())
>>>
>>> constructor as well as the
>>>
>>>      compressed_matrix (matrix_expression< const compressed_matrix,
>>> const compressed_matrix, op_prod > const &proxy).
>>>
>>> I have a helper function, /toVclCompressedMatrix/(..) which essentially
>>> does the CSR conversion from a Mahout src matrix, calls the constructor
>>> and uses viennacl::compressed_matrix::set(...) to set the buffers:
>>>
>>> val ompA =toVclCompressedMatrix(src = mxA, ompCtx)
>>> val ompB =toVclCompressedMatrix(src = mxB, ompCtx)
>>>
>>>
>>> and then create a new viennacl::compressed_matrix from the
>>> viennacl::linalg::prod of the 2 matrices i.e.:
>>>
>>> val ompC =new CompressedMatrix(prod(ompA, ompB))
>>>
>>> The context in the above case is either the Host or OpenMP (I know that
>>> there is some special casting of the row_jumpers and col_idxs that needs
>>> to be done in the OpenCL version)
>>>
>>> The Matrix multiplication completes without error on small Matrices eg.
>>> < 300 x 300
>>> but seems to overwrite the resulting buffers on larger Matrices.
>>>
>>> My real problem, though, is getting the memory back out of the
>>> resulting `ompC` compressed_matrix so that I can write it back to a Mahout
>>> SparseMatrix.
>>>
>>> currently I am using:
>>>
>>> void viennacl::backend::memory_copy (mem_handle const &  src_buffer,
>>>          mem_handle &      dst_buffer,
>>>          vcl_size_t      src_offset,
>>>          vcl_size_t      dst_offset,
>>>          vcl_size_t      bytes_to_copy
>>>      )
>>>
>>> on the ompC.handle1, ompC.handle2 and ompC.handle source handles
>>>
>>> to copy into pre-allocated row_jumper, col_index and element buffers
>>> (of size ompC.size1() + 1, ompC.nnz and ompC.nnz, respectively).
>>>
>>> I am getting nonsensical values back that one would expect from memory
>>> errors. eg:
>>>
>>> the Matrix geometry of the result: ompC.size1(), and ompC.size2() are
>>> correct and ompC.nnz is a reasonable value.
>>>
>>> It is possible that I have mis-allocated some of the memory on my side,
>>> but I am pretty sure that most of the Buffers are allocated correctly
>>> (usually JavaCPP does a pretty good job of this).
>>>
>>>
>>> I guess, long story short, my question is am i using the correct method
>>> of copying the memory out of a compressed_matrix?  is there something
>>> glaringly incorrect that i am doing here?  Should I be using
>>> viennacl::backend::memory_copy or is there a different method that i
>>> should be using?
>>>
>>>
>>> Thanks very much,
>>>
>>> Andy
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>
>

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Karl R. <ru...@iu...> - 2016-07-22 16:39:32

Hi,

your second and third arguments to memory_read() are incorrect:
The second argument is the offset from the beginning, the third argument 
is the number of bytes to be read. Shifting the zero to the second 
position fixes the snippet (plus correcting the loop bounds when 
printing at the end) :-)

Best regards,
Karli
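
The argument order described in the message above can be illustrated with a small self-contained C++ sketch. `read_bytes` is a hypothetical stand-in that only mirrors the (source, offset, byte count, destination) order; it is not ViennaCL's implementation and works on a plain byte vector rather than a memory handle.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a backend read; NOT the ViennaCL implementation.
// It only mirrors the argument order described above:
// (source, offset in bytes, number of bytes, destination).
inline void read_bytes(const std::vector<unsigned char>& src,
                       std::size_t src_offset,
                       std::size_t bytes_to_read,
                       void* dst)
{
  // The requested range has to lie inside the source buffer.
  assert(src_offset + bytes_to_read <= src.size());
  // If the two arguments were swapped at the call site, bytes_to_read
  // would be 0 here and nothing would be copied.
  if (bytes_to_read == 0) return;
  std::memcpy(dst, src.data() + src_offset, bytes_to_read);
}

// Copies `count` doubles out of a raw byte buffer, starting at its beginning.
inline std::vector<double> read_doubles(const std::vector<unsigned char>& src,
                                        std::size_t count)
{
  std::vector<double> out(count);
  read_bytes(src, 0, count * sizeof(double), out.data());  // offset first, then size
  return out;
}
```

With the offset and byte count swapped, such a read would request 0 bytes starting at a nonzero offset, so the destination would keep whatever it held before. That is consistent with the uninitialized-looking values reported in this thread.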



On 07/22/2016 08:51 AM, Andrew Palumbo wrote:
> a couple of small mistakes in the previous c++ file:
>
>
> The memory_read(..) call should be:
>
>
>    // read data back into our product buffers
>    viennacl::backend::memory_read(handle1, product_size_row * 4, 0,
> product_row_ptr, false);
>    viennacl::backend::memory_read(handle2, product_NNz * 4, 0,
> product_col_ptr, false);
>    viennacl::backend::memory_read(handle, product_NNz * 8, 0,
> product_values_ptr, false);
>
>
> (read product_NNz * x bytes instead of product_size_row * x)
>
>
> I've attached the corrected file.
>
>
> Thanks
>
>
> Andy
>
> ------------------------------------------------------------------------
> *From:* Andrew Palumbo <ap...@ou...>
> *Sent:* Thursday, July 21, 2016 11:03:59 PM
> *To:* Karl Rupp; viennacl-devel
> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>
> Hello,
>
>
> I've mocked up a sample of the compressed_matrix multiplication that
> I've been working with javacpp on in C++.  I am seeing the same type of
> memory errors when I try to read the data out of product, and into the
> output buffers as I was with javacpp.  By printing the matrix to stdout
> as in the compressed_matrix example we can see that there are values
> there, and they seem reasonable,  but when I use
> backend::memory_read(...)  to retrieve the buffers, I'm getting values
> consistent with a memory error, and similar to what I was seeing in the
> javacpp code.  Maybe I am not using the handles correctly?  Admittedly
> my C++ is more than rusty, but I believe I am referencing the buffers
> correctly in the output.
>
>
> Below is the output of the attached file: sparse.cpp
>
>
> Thanks very much,
>
>
> Andy
>
>
>
> ViennaCL: compressed_matrix of size (10, 10) with 24 nonzeros:
>    (1, 2)    0.329908
>    (1, 3)    0.0110522
>    (1, 4)    0.336839
>    (2, 5)    0.0150778
>    (2, 7)    0.0143518
>    (3, 3)    0.217256
>    (3, 6)    0.346854
>    (3, 9)    0.45353
>    (4, 3)    0.407954
>    (4, 6)    0.651308
>    (5, 2)    0.676061
>    (5, 3)    0.0226486
>    (5, 4)    0.690264
>    (6, 5)    0.0998838
>    (6, 7)    0.0950744
>    (7, 2)    0.346173
>    (7, 3)    0.0115971
>    (7, 4)    0.353446
>    (7, 9)    0.684458
>    (8, 5)    0.0448123
>    (8, 7)    0.0426546
>    (8, 9)    0.82782
>    (9, 5)    0.295356
>    (9, 7)    0.281134
>
> row jumpers: [
> -36207072,32642,-39708721,32642,6390336,0,2012467744,32767,2012467968,32767,4203729,]
> col ptrs: [
> 0,0,-39655605,32642,-36207072,32642,6390336,0,10,0,-39672717,32642,2012466352,32767,-32892691,32642,1,0,6390336,0,2012466344,32767,60002304,2059362829,]
> elements: [
> 0.289516,0.304161,0.795779,0.334456,0.935264,0.585813,0.871237,0.811508,0.828558,0.0271863,6.92683e-310,6.92683e-310,1.061e-313,1.061e-313,6.36599e-314,4.24399e-314,6.36599e-314,6.92683e-310,4.24399e-314,1.2732e-313,2.122e-313,6.95324e-310,0.406537,0.0495716,0.370862,]
>
>
> and similarly for multiplication of 2 1x1 matrices:
>
> Result:
>
> ViennaCL: compressed_matrix of size (1, 1) with 1 nonzeros:
>    (0, 0)    0.117699
>
> row jumpers: [
> -717571424,32767,]
> col ptrs: [
> 6386240,]
> elements: [
> 0.289516,6.9479e-310,]
>
>
>
>
> ------------------------------------------------------------------------
> *From:* Andrew Palumbo <ap...@ou...>
> *Sent:* Wednesday, July 20, 2016 5:40:31 PM
> *To:* Karl Rupp; viennacl-devel
> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>
> Oops, sorry about not cc'ing all.
>
>
> I do not get correct data back for a (Random.nextDouble() populated) 1 x
> 1 Matrix.
>
>
> A:
>
>    Row Pointer: [0, 1 ]
>
>    Col Pointer: [0 ]
>    element Pointer: [0.6465821602909256 ]
>
>
> B:
>
>
>    Row Pointer: [0, 1 ]
>    Col Pointer: [0 ]
>    element Pointer: [0.9513577109193919 ]
>
>
> C = A %*% B
>
>    Row Pointer: [469762248, 32632]
>    Col Pointer: [469762248 ]
>    element Pointer: [6.9245198744523E-310 ]
>
>
> ouch.
>
>
> It looks like I'm not copying the Buffers correctly at all.  I may be
> using the javacpp buffers incorrectly here, or I have possibly wrapped
> the viennacl::backend::memory_handle class incorrectly, so I'm using a
> pointer to the wrong memory from e.g. viennacl::compressed_matrix::handle.
>
>
> I mentioned before that the multiplication completed on small (<~300 x
> 300) matrices because if I try to multiply two larger sparse matrices,
> the JVM crashes with a SIGSEGV.
>
>
> Since this code is all wrapped with javacpp, I don't really have a small
> sample that I can show you (not going to dump a whole bunch of code on
> you).
>
>
> I'll keep trying to figure it out.  Pretty sure the problem is on my end
> here; I really mainly wanted to ask you if I was using the correct
> methods at this point, or if there was anything very obvious that I
> was doing wrong.
>
>
> Thanks a lot for your help!
>
>
> Andy
>
>
>
>
>
>
> ------------------------------------------------------------------------
> *From:* Karl Rupp <ru...@iu...>
> *Sent:* Wednesday, July 20, 2016 5:00:36 PM
> *To:* Andrew Palumbo; viennacl-devel
> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
> Hi,
>
> please keep viennacl-devel in CC:
>
> Just to clarify: Do you get incorrect values for a 1-by-1 matrix as
> indicated in your sample data? In your previous email you mentioned that
> results are fine for small matrices...
>
> I'm afraid I can only guess at the source of the error with the
> information provided. Any chance that you can provide standalone code
> to reproduce the problem with reasonable effort?
>
> Best regards,
> Karli
>
>
>
> On 07/20/2016 10:16 PM, Andrew Palumbo wrote:
>> Thanks so much for your quick answer!
>>
>>
>> I actually am sorry to say that I made a mistake when writing the last
>> email, I copied the wrong signature from the VCL documentation, and then
>> the mistake propagated through the rest of the e-mail.
>>
>>
>> I am actually using viennacl::backend::memory_read().
>>
>>
>> E.g., for the row_jumpers and column_idx I use:
>>
>> @Name("backend::memory_read")
>> public static native void memoryReadInt(@Const @ByRef MemHandle src_buffer,
>>                                int bytes_to_read,
>>                                int offset,
>>                                IntPointer ptr,
>>                                boolean async);
>>
>> and for the Values:
>>
>>
>> @Name("backend::memory_read")
>> public static native void memoryReadDouble(@Const @ByRef MemHandle src_buffer,
>>                                          int bytes_to_read,
>>                                          int offset,
>>                                          DoublePointer ptr,
>>                                          boolean async);
>>
>> And then call:
>>
>>
>> memoryReadInt(row_ptr_handle, (m +1) *4,0, row_ptr,false)
>> memoryReadInt(col_idx_handle, NNz *4,0,col_idx,false)
>> memoryReadDouble(element_handle, NNz *8,0, values,false)
>>
>>
>> and after converting them to java.nio.Buffers, am getting results like:
>>
>>
>> rowBuff.get(1): 0    colBuff(1): 402653448 valBuff(1): 6.91730177312166E-310
>>
>>
>> Have also tried reading into BytePointers similarly with the same type
>> of results.  I know that the use of Javacpp obfuscates what the problem
>> may be.  But I believe the memory is properly allocated.
>>
>>
>>
>> Sorry for the mistake.
>>
>>
>> Thanks,
>>
>>
>> Andy
>>
>>
>> ------------------------------------------------------------------------
>> *From:* Karl Rupp <ru...@iu...>
>> *Sent:* Wednesday, July 20, 2016 3:50:07 PM
>> *To:* Andrew Palumbo; Vie...@li...
>> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
>> Hi Andy,
>>
>> instead of viennacl::backend::memory_copy(), you want to use
>> viennacl::backend::memory_read(), which directly transfers the data into
>> your buffer(s).
>>
>> If you *know* that your handles are in host memory, you can even grab
>> the values directly via
>>    viennacl::linalg::host_based::detail::extract_raw_pointer<T>();
>> defined in viennacl/linalg/host_based/common.hpp, around line 40.
>>
>> Please let me know if you still get errors after using that.
>>
>> Best regards,
>> Karli
>>
>>
>>
>>
>> On 07/20/2016 09:05 PM, Andrew Palumbo wrote:
>>> Hello,
>>>
>>>
>>> I'm having some difficulties with compressed_matrix multiplication.
>>>
>>>
>>> Essentially I am copying three buffers, the CSR conversion of an Apache
>>> Mahout SparseMatrix, into two compressed_matrices and performing matrix
>>> multiplication. I am doing this in Scala and Java using javacpp.
>>>
>>>
>>> For example, I have a 5 x 10 matrix of ~20% non-zero values which in CSR
>>> format looks like this:
>>>
>>>
>>> NNz: 12
>>>
>>> Row Pointer: [0, 1, 4, 6, 9, 12, ]
>>>
>>> Col Pointer: [9, 0, 8, 7, 2, 9, 0, 8, 9, 0, 3, 5, ]
>>>
>>> element Pointer: [0.4065367203992265, 0.04957158909682802,
>>> 0.5205586068847993, 0.3708618354358446, 0.6963900565931678,
>>> 0.8330915529787706, 0.32839112750638844, 0.7856168903297948,
>>> 0.4265801782090245, 0.14733066454561583, 0.9501663495824946,
>>> 0.9710498974366047, ]
>>>
>>> Multiplied by a similarly Sparse 10 x 5 compressed_matrix
>>>
>>> I use a CompressedMatrix wrapper which essentially wraps the
>>>
>>>      viennacl:: compressed_matrix (vcl_size_t rows, vcl_size_t cols,
>>> vcl_size_t nonzeros=0, viennacl::context ctx=viennacl::context())
>>>
>>> constructor as well as the
>>>
>>>      compressed_matrix (matrix_expression< const compressed_matrix,
>>> const compressed_matrix, op_prod > const &proxy).
>>>
>>> I have a helper function, /toVclCompressedMatrix/(..) which essentially
>>> does the CSR conversion from a Mahout src matrix, calls the constructor
>>> and uses viennacl::compressed_matrix::set(...) to set the buffers:
>>>
>>> val ompA =toVclCompressedMatrix(src = mxA, ompCtx)
>>> val ompB =toVclCompressedMatrix(src = mxB, ompCtx)
>>>
>>>
>>> and then create a new viennacl::compressed_matrix from the
>>> viennacl::linalg::prod of the two matrices, i.e.:
>>>
>>> val ompC = new CompressedMatrix(prod(ompA, ompB))
>>>
>>> The context in the above case is either the host or OpenMP (I know that
>>> there is some special casting of the row_jumpers and col_idxs that needs
>>> to be done in the OpenCL version).
>>>
>>> The matrix multiplication completes without error on small matrices, e.g.
>>> < 300 x 300, but seems to overwrite the resulting buffers on larger
>>> matrices.
>>>
>>> My real problem, though, is getting the memory back out of the
>>> resulting `ompC` compressed_matrix so that I can write it back to a Mahout
>>> SparseMatrix.
>>>
>>> Currently I am using:
>>>
>>> void viennacl::backend::memory_copy (mem_handle const &  src_buffer,
>>>          mem_handle &      dst_buffer,
>>>          vcl_size_t      src_offset,
>>>          vcl_size_t      dst_offset,
>>>          vcl_size_t      bytes_to_copy
>>>      )
>>>
>>> on the ompC.handle1, ompC.handle2 and ompC.handle source handles
>>>
>>> to copy into pre-allocated row_jumper, col_index and element buffers
>>> (of size ompC.size1() + 1, ompC.nnz and ompC.nnz, respectively).
>>>
>>> I am getting nonsensical values back, of the kind one would expect from
>>> memory errors.
>>>
>>> The matrix geometry of the result, ompC.size1() and ompC.size2(), is
>>> correct, and ompC.nnz is a reasonable value.
>>>
>>> It is possible that I have mis-allocated some of the memory on my side,
>>> but I am pretty sure that most of the buffers are allocated correctly
>>> (usually JavaCPP does a pretty good job of this).
>>>
>>>
>>> I guess, long story short, my question is: am I using the correct method
>>> of copying the memory out of a compressed_matrix? Is there something
>>> glaringly incorrect that I am doing here? Should I be using
>>> viennacl::backend::memory_copy, or is there a different method that I
>>> should be using?
>>>
>>>
>>> Thanks very much,
>>>
>>> Andy
>>
>
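
In the CSR example quoted above, row 1 (entries 1 to 3 of the column array) holds the column indices 0, 8, 7, which are not ascending. Elsewhere in this thread it is noted that column indices have to be in ascending order for sparse matrix-matrix multiplication. Below is a minimal sketch of sorting each row's entries before handing the buffers over. The function name sort_csr_rows and the use of int indices are assumptions made for illustration; nothing here is part of ViennaCL or Mahout.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not a ViennaCL function): reorders the entries of
// every CSR row so that the column indices within the row ascend.
// row_ptr has rows + 1 entries; col_idx and values have nnz entries each.
void sort_csr_rows(const std::vector<int>& row_ptr,
                   std::vector<int>& col_idx,
                   std::vector<double>& values)
{
  for (std::size_t row = 0; row + 1 < row_ptr.size(); ++row)
  {
    const std::size_t begin = static_cast<std::size_t>(row_ptr[row]);
    const std::size_t end   = static_cast<std::size_t>(row_ptr[row + 1]);

    // Positions of this row's entries, ordered by their column index.
    std::vector<std::size_t> order(end - begin);
    std::iota(order.begin(), order.end(), begin);
    std::sort(order.begin(), order.end(),
              [&col_idx](std::size_t a, std::size_t b) { return col_idx[a] < col_idx[b]; });

    // Gather into temporaries so each value stays paired with its column.
    std::vector<int> sorted_cols;
    std::vector<double> sorted_vals;
    for (std::size_t k : order)
    {
      sorted_cols.push_back(col_idx[k]);
      sorted_vals.push_back(values[k]);
    }
    std::copy(sorted_cols.begin(), sorted_cols.end(),
              col_idx.begin() + static_cast<std::ptrdiff_t>(begin));
    std::copy(sorted_vals.begin(), sorted_vals.end(),
              values.begin() + static_cast<std::ptrdiff_t>(begin));
  }
}
```

Applied to the example above, only row 1 changes: its column indices become 0, 7, 8 and the two affected values swap places accordingly.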

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Andrew P. <ap...@ou...> - 2016-07-22 06:51:18

Attachments: sparse.cpp

A couple of small mistakes in the previous C++ file:


The memory_read(..) calls should be:


  // read data back into our product buffers
  viennacl::backend::memory_read(handle1, product_size_row * 4, 0, product_row_ptr, false);
  viennacl::backend::memory_read(handle2, product_NNz * 4, 0, product_col_ptr, false);
  viennacl::backend::memory_read(handle, product_NNz * 8, 0, product_values_ptr, false);


(read product_NNz * x bytes instead of product_size_row * x)


I've attached the corrected file.


Thanks


Andy
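
The byte counts in the corrected calls above use the literal factors 4 and 8. A small sketch of deriving them from the element types instead is given below. It assumes 4-byte indices and 8-byte double values (matching those factors) and a row buffer of rows + 1 entries, as described earlier in the thread. The names index_type, CsrByteCounts and csr_byte_counts are hypothetical and not part of ViennaCL.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: 4-byte indices and double values, matching the factors
// 4 and 8 used in the calls above.
using index_type = std::uint32_t;
using value_type = double;

struct CsrByteCounts
{
  std::size_t row_bytes;    // row jumper buffer: rows + 1 entries
  std::size_t col_bytes;    // column index buffer: nnz entries
  std::size_t value_bytes;  // element buffer: nnz entries
};

// Hypothetical helper: number of bytes to read back for each CSR buffer.
inline CsrByteCounts csr_byte_counts(std::size_t rows, std::size_t nnz)
{
  return CsrByteCounts{(rows + 1) * sizeof(index_type),
                       nnz * sizeof(index_type),
                       nnz * sizeof(value_type)};
}
```

For a product with 10 rows and 24 nonzeros this gives 44 bytes for the row jumpers, 96 bytes for the column indices and 192 bytes for the values.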


Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Andrew P. <ap...@ou...> - 2016-07-22 03:04:11

Attachments: sparse.cpp

Hello,


I've mocked up, in C++, a sample of the compressed_matrix multiplication that I've been working on with JavaCPP.  I am seeing the same type of memory errors when I try to read the data out of the product and into the output buffers as I was with JavaCPP.  Printing the matrix to stdout, as in the compressed_matrix example, shows that there are values there, and they seem reasonable.  But when I use backend::memory_read(...) to retrieve the buffers, I get values consistent with a memory error, similar to what I was seeing in the JavaCPP code.  Maybe I am not using the handles correctly?  Admittedly my C++ is more than rusty, but I believe I am referencing the buffers correctly in the output.


Below is the output of the attached file: sparse.cpp


Thanks very much,


Andy



ViennaCL: compressed_matrix of size (10, 10) with 24 nonzeros:
  (1, 2)    0.329908
  (1, 3)    0.0110522
  (1, 4)    0.336839
  (2, 5)    0.0150778
  (2, 7)    0.0143518
  (3, 3)    0.217256
  (3, 6)    0.346854
  (3, 9)    0.45353
  (4, 3)    0.407954
  (4, 6)    0.651308
  (5, 2)    0.676061
  (5, 3)    0.0226486
  (5, 4)    0.690264
  (6, 5)    0.0998838
  (6, 7)    0.0950744
  (7, 2)    0.346173
  (7, 3)    0.0115971
  (7, 4)    0.353446
  (7, 9)    0.684458
  (8, 5)    0.0448123
  (8, 7)    0.0426546
  (8, 9)    0.82782
  (9, 5)    0.295356
  (9, 7)    0.281134

row jumpers: [
-36207072,32642,-39708721,32642,6390336,0,2012467744,32767,2012467968,32767,4203729,]
col ptrs: [
0,0,-39655605,32642,-36207072,32642,6390336,0,10,0,-39672717,32642,2012466352,32767,-32892691,32642,1,0,6390336,0,2012466344,32767,60002304,2059362829,]
elements: [
0.289516,0.304161,0.795779,0.334456,0.935264,0.585813,0.871237,0.811508,0.828558,0.0271863,6.92683e-310,6.92683e-310,1.061e-313,1.061e-313,6.36599e-314,4.24399e-314,6.36599e-314,6.92683e-310,4.24399e-314,1.2732e-313,2.122e-313,6.95324e-310,0.406537,0.0495716,0.370862,]


And similarly for the multiplication of two 1 x 1 matrices:

Result:

ViennaCL: compressed_matrix of size (1, 1) with 1 nonzeros:
  (0, 0)    0.117699

row jumpers: [
-717571424,32767,]
col ptrs: [
6386240,]
elements: [
0.289516,6.9479e-310,]
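
One thing that may be worth checking, and this is an assumption to be verified against viennacl/backend/memory.hpp rather than a confirmed diagnosis: if memory_read takes the source offset before the byte count, then a call that passes the byte count first and 0 second would request zero bytes at a nonzero offset. The destination would then stay untouched, which would look much like the uninitialized values printed above. The sketch below illustrates the effect with read_bytes, a hypothetical stand-in written for this note; it is not the ViennaCL function.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a backend read routine (NOT the ViennaCL API).
// Assumed parameter order: source, byte offset, byte count, destination.
void read_bytes(const std::vector<unsigned char>& src,
                std::size_t src_offset,
                std::size_t bytes_to_read,
                void* dst)
{
  if (src_offset > src.size() || bytes_to_read > src.size() - src_offset)
    throw std::out_of_range("read_bytes: range exceeds source buffer");
  if (bytes_to_read > 0)
    std::memcpy(dst, static_cast<const void*>(src.data() + src_offset), bytes_to_read);
}
```

With an 8-byte source, read_bytes(src, 0, 8, dst) copies the whole buffer, while read_bytes(src, 8, 0, dst) succeeds silently and copies nothing, so dst keeps whatever it held before the call.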






Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Andrew P. <ap...@ou...> - 2016-07-20 21:40:49

Oops, sorry about not cc'ing all.


I do not get correct data back for a (Random.nextDouble() populated) 1 x 1 Matrix.


A:

  Row Pointer: [0, 1 ]

  Col Pointer: [0 ]
  element Pointer: [0.6465821602909256 ]


B:


  Row Pointer: [0, 1 ]
  Col Pointer: [0 ]
  element Pointer: [0.9513577109193919 ]


C = A %*% B



  Row Pointer: [469762248, 32632]
  Col Pointer: [469762248 ]
  element Pointer: [6.9245198744523E-310 ]


ouch.


It looks like I'm not copying the buffers correctly at all.  I may be using the JavaCPP buffers incorrectly here, or I have possibly wrapped the viennacl::backend::mem_handle class incorrectly, so I'm using a pointer to the wrong memory from e.g. viennacl::compressed_matrix::handle.


I mentioned before that the multiplication completed on small (< ~300 x 300) matrices because, when I try to multiply two larger sparse matrices, the JVM crashes with a SIGSEGV.


Since this code is all wrapped with javacpp, I don't really have a small sample that I can show you (not going to dump a whole bunch of code on you).


I'll keep trying to figure it out.  I'm pretty sure the problem is on my end here.  I mainly wanted to ask you whether I was using the correct methods at this point, or whether there was anything obviously wrong in what I was doing.


Thanks a lot for your help!


Andy
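
A quick structural check on buffers read back from the product can flag results like the 1 x 1 case above before they are written into a Mahout matrix. The sketch below assumes int indices, as in the printed output. The function is_valid_csr is a hypothetical helper written for this note and is not part of ViennaCL or Mahout.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns true if the buffers form a structurally valid
// CSR matrix with `cols` columns. That means row_ptr starts at 0, never
// decreases, ends at the number of nonzeros, and every column index lies
// in [0, cols).
bool is_valid_csr(const std::vector<int>& row_ptr,
                  const std::vector<int>& col_idx,
                  std::size_t num_values,
                  int cols)
{
  if (row_ptr.empty() || row_ptr.front() != 0) return false;
  if (col_idx.size() != num_values) return false;
  for (std::size_t i = 1; i < row_ptr.size(); ++i)
    if (row_ptr[i] < row_ptr[i - 1]) return false;
  if (static_cast<std::size_t>(row_ptr.back()) != col_idx.size()) return false;
  for (int c : col_idx)
    if (c < 0 || c >= cols) return false;
  return true;
}
```

The inputs A and B above pass this check (row pointer [0, 1], column [0]), whereas the buffers read back for C fail it at the first test because the row pointer does not start at 0.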









________________________________
From: Karl Rupp <ru...@iu...>
Sent: Wednesday, July 20, 2016 5:00:36 PM
To: Andrew Palumbo; viennacl-devel
Subject: Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

Hi,

please keep viennacl-devel in CC:

Just to clarify: Do you get incorrect values for a 1-by-1 matrix as
indicated in your sample data? In your previous email you mentioned that
results are fine for small matrices...

I'm afraid I can only guess at the source of the error with the
information provided. Any chance that you can provide standalone code
that reproduces the problem with reasonable effort?

Best regards,
Karli



On 07/20/2016 10:16 PM, Andrew Palumbo wrote:
> Thanks so much for your quick answer!
>
>
> I am sorry to say that I made a mistake when writing the last email:
> I copied the wrong signature from the VCL documentation, and then
> the mistake propagated through the rest of the e-mail.
>
>
> I am actually using viennacl::backend::memory_read().
>
>
> E.g., for the row_jumpers and column_idx I use:
>
> @Name("backend::memory_read")
> public static native void memoryReadInt(@Const @ByRef MemHandle src_buffer,
>                                int bytes_to_read,
>                                int offset,
>                                IntPointer ptr,
>                                boolean async);
>
> and for the Values:
>
>
> @Name("backend::memory_read")
> public static native void memoryReadDouble(@Const @ByRef MemHandle src_buffer,
>                                          int bytes_to_read,
>                                          int offset,
>                                          DoublePointer ptr,
>                                          boolean async);
>
> And then call:
>
>
> memoryReadInt(row_ptr_handle, (m + 1) * 4, 0, row_ptr, false)
> memoryReadInt(col_idx_handle, NNz * 4, 0, col_idx, false)
> memoryReadDouble(element_handle, NNz * 8, 0, values, false)
>
>
> and after converting them to java.nio.Buffers, I am getting results like:
>
>
> rowBuff.get(1): 0    colBuff(1): 402653448 valBuff(1): 6.91730177312166E-310
>
>
> I have also tried reading into BytePointers, with the same type
> of results.  I know that the use of JavaCPP obscures what the problem
> may be, but I believe the memory is properly allocated.
>
>
>
> Sorry for the mistake.
>
>
> Thanks,
>
>
> Andy
>
>
> ------------------------------------------------------------------------
> *From:* Karl Rupp <ru...@iu...>
> *Sent:* Wednesday, July 20, 2016 3:50:07 PM
> *To:* Andrew Palumbo; Vie...@li...
> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
> Hi Andy,
>
> instead of viennacl::backend::memory_copy(), you want to use
> viennacl::backend::memory_read(), which directly transfers the data into
> your buffer(s).
>
> If you *know* that your handles are in host memory, you can even grab
> the values directly via
>    viennacl::linalg::host_based::detail::extract_raw_pointer<T>();
> defined in viennacl/linalg/host_based/common.hpp, around line 40.
>
> Please let me know if you still get errors after using that.
>
> Best regards,
> Karli
>
>
>
>
> On 07/20/2016 09:05 PM, Andrew Palumbo wrote:
>> Hello,
>>
>>
>> I'm having some difficulties with compressed_matrix multiplication.
>>
>>
>> Essentially, I am copying three buffers (the CSR conversion of an Apache
>> Mahout SparseMatrix) into two compressed_matrices and then performing
>> matrix multiplication. I am doing this in Scala and Java using JavaCPP.
>>
>>
>> For example, I have a 5 x 10 matrix of ~20% non-zero values which in CSR
>> format looks like this:
>>
>>
>> NNz: 12
>>
>> Row Pointer: [0, 1, 4, 6, 9, 12, ]
>>
>> Col Pointer: [9, 0, 8, 7, 2, 9, 0, 8, 9, 0, 3, 5, ]
>>
>> element Pointer: [0.4065367203992265, 0.04957158909682802,
>> 0.5205586068847993, 0.3708618354358446, 0.6963900565931678,
>> 0.8330915529787706, 0.32839112750638844, 0.7856168903297948,
>> 0.4265801782090245, 0.14733066454561583, 0.9501663495824946,
>> 0.9710498974366047, ]
>>
>> This is multiplied by a similarly sparse 10 x 5 compressed_matrix.
>>
>> I use a CompressedMatrix wrapper which essentially wraps the
>>
>>      viennacl::compressed_matrix (vcl_size_t rows, vcl_size_t cols,
>> vcl_size_t nonzeros=0, viennacl::context ctx=viennacl::context())
>>
>> constructor as well as the
>>
>>      compressed_matrix (matrix_expression< const compressed_matrix,
>> const compressed_matrix, op_prod > const &proxy).
>>
>> I have a helper function, toVclCompressedMatrix(..), which essentially
>> does the CSR conversion from a Mahout src matrix, calls the constructor
>> and uses viennacl::compressed_matrix::set(...) to set the buffers:
>>
>> val ompA = toVclCompressedMatrix(src = mxA, ompCtx)
>> val ompB = toVclCompressedMatrix(src = mxB, ompCtx)
>>
>>
>> and then create a new viennacl::compressed_matrix from the
>> viennacl::linalg::prod of the two matrices, i.e.:
>>
>> val ompC = new CompressedMatrix(prod(ompA, ompB))
>>
>> The context in the above case is either the host or OpenMP (I know that
>> there is some special casting of the row_jumpers and col_idxs that needs
>> to be done in the OpenCL version).
>>
>> The matrix multiplication completes without error on small matrices, e.g.
>> < 300 x 300, but seems to overwrite the resulting buffers on larger
>> matrices.
>>
>> My real problem, though, is getting the memory back out of the
>> resulting `ompC` compressed_matrix so that I can write it back to a Mahout
>> SparseMatrix.
>>
>> Currently I am using:
>>
>> void viennacl::backend::memory_copy (mem_handle const &  src_buffer,
>>          mem_handle &      dst_buffer,
>>          vcl_size_t      src_offset,
>>          vcl_size_t      dst_offset,
>>          vcl_size_t      bytes_to_copy
>>      )
>>
>> on ompC.handel1,ompC.handel2 and ompC.handel source handels
>>
>> to copy into pre-allocated  row_jumper,  col_index and element buffers
>> (of size ompC.size1() + 1, ompC.nnz and ompC.nnz, respectivly).
>>
>> I am getting nonsensical values back that one would expect from memory
>> errors. eg:
>>
>> the Matrix geometry of the result: ompC.size1(), and omp.size2() are
>> correct and ompC.nnz is a reasonable value.
>>
>> It is possible that I have mis-allocated some of the memory on my side,
>> but I am pretty sure that most of the Buffers are allocated correctly
>> (usually JavaCPP does a pretty good job of this).
>>
>>
>> I guess, long story short, my question is am i using the correct method
>> of copying the memory out of a compressed_matrix?  is there something
>> glaringly incorrect that i am doing here?  Should I be using
>> viennacl::backend::memory_copy or is there a different method that i
>> should be using?
>>
>>
>> Thanks very much,
>>
>> Andy
>>
>> _______________________________________________
>> ViennaCL-devel mailing list
>> Vie...@li...
>>https://lists.sourceforge.net/lists/listinfo/viennacl-devel
>>
>

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Karl R. <ru...@iu...> - 2016-07-20 21:00:47

Hi,

please keep viennacl-devel in CC:

Just to clarify: Do you get incorrect values for a 1-by-1 matrix as 
indicated in your sample data? In your previous email you mentioned that 
results are fine for small matrices...

I'm afraid I can only guess at the source of the error with the 
information provided. Any chance that you can provide standalone code 
to reproduce the problem with reasonable effort?

Best regards,
Karli



On 07/20/2016 10:16 PM, Andrew Palumbo wrote:
> Thanks so much for your quick answer!
>
>
> I actually am sorry to say that I made a mistake when writing the last
> email, I copied the wrong signature from the VCL documentation, and then
> the mistake propagated through the rest of the e-mail.
>
>
> I am actually using viennacl::backend::memory_read().
>
>
> E.g., for the row_jumpers and column_idx I use:
>
> @Name("backend::memory_read")
> public static native void memoryReadInt(@Const @ByRef MemHandle src_buffer,
>                                int bytes_to_read,
>                                int offset,
>                                IntPointer ptr,
>                                boolean async);
>
> and for the Values:
>
>
> @Name("backend::memory_read")
> public static native void memoryReadDouble(@Const @ByRef MemHandle src_buffer,
>                                          int bytes_to_read,
>                                          int offset,
>                                          DoublePointer ptr,
>                                          boolean async);
>
> And then call:
>
>
> memoryReadInt(row_ptr_handle, (m +1) *4,0, row_ptr,false)
> memoryReadInt(col_idx_handle, NNz *4,0,col_idx,false)
> memoryReadDouble(element_handle, NNz *8,0, values,false)
>
>
> and after converting them to java.nio.Buffers, am getting results like:
>
>
> rowBuff.get(1): 0    colBuff(1): 402653448 valBuff(1): 6.91730177312166E-310
>
>
> I have also tried reading into BytePointers similarly, with the same type
> of results.  I know that the use of JavaCPP obfuscates what the problem
> may be, but I believe the memory is properly allocated.
>
>
>
> Sorry for the mistake.
>
>
> Thanks,
>
>
> Andy
>
>
> ------------------------------------------------------------------------
> *From:* Karl Rupp <ru...@iu...>
> *Sent:* Wednesday, July 20, 2016 3:50:07 PM
> *To:* Andrew Palumbo; Vie...@li...
> *Subject:* Re: [ViennaCL-devel] Copying Values out of a compressed_matrix
> Hi Andy,
>
> instead of viennacl::backend::memory_copy(), you want to use
> viennacl::backend::memory_read(), which directly transfers the data into
> your buffer(s).
>
> If you *know* that your handles are in host memory, you can even grab
> the values directly via
>    viennacl::linalg::host_based::detail::extract_raw_pointer<T>();
> defined in viennacl/linalg/host_based/common.hpp, around line 40.
>
> Please let me know if you still get errors after using that.
>
> Best regards,
> Karli
>
>
>
>
> On 07/20/2016 09:05 PM, Andrew Palumbo wrote:
>> Hello,
>>
>>
>> I'm Having some difficulties with compressed_matrix multiplication.
>>
>>
>> Essentially I am copying  three buffers, the CSR conversion of an Apache
>> Mahout SparseMatrix, into two compressed_matrices performing matrix
>> multiplication. I am doing this in scala and Java using javacpp.
>>
>>
>> For example, I have a 5 x 10 matrix of ~20% non-zero values which in CSR
>> format looks like this:
>>
>>
>> NNz: 12
>>
>> Row Pointer: [0, 1, 4, 6, 9, 12, ]
>>
>> Col Pointer: [9, 0, 8, 7, 2, 9, 0, 8, 9, 0, 3, 5, ]
>>
>> element Pointer: [0.4065367203992265, 0.04957158909682802,
>> 0.5205586068847993, 0.3708618354358446, 0.6963900565931678,
>> 0.8330915529787706, 0.32839112750638844, 0.7856168903297948,
>> 0.4265801782090245, 0.14733066454561583, 0.9501663495824946,
>> 0.9710498974366047, ]
>>
>> Multiplied by a similarly Sparse 10 x 5 compressed_matrix
>>
>> I use a CompressedMatrix wrapper which essentially wraps the
>>
>>      viennacl:: compressed_matrix (vcl_size_t rows, vcl_size_t cols,
>> vcl_size_t nonzeros=0, viennacl::context ctx=viennacl::context())
>>
>> constructor as well as the
>>
>>      compressed_matrix (matrix_expression< const compressed_matrix,
>> const compressed_matrix, op_prod > const &proxy).
>>
>> I have a helper function, /toVclCompressedMatrix/(..) which essentially
>> does the CSR conversion from a Mahout src matrix, calls the constructor
>> and uses viennacl::compressed_matrix::set(...) to set the buffers:
>>
>> val ompA =toVclCompressedMatrix(src = mxA, ompCtx)
>> val ompB =toVclCompressedMatrix(src = mxB, ompCtx)
>>
>>
>> and then create a new viennacl::compressed_matrix from the
>> viennacl::linalg::prod of the 2 matrices i.e.:
>>
>> val ompC =new CompressedMatrix(prod(ompA, ompB))
>>
>> The context in the above case is either the Host or OpenMP (I know that
>> there is some special casting of the row_jumpers and col_idxs that needs
>> to be done in the OpenCL version)
>>
>> The Matrix multiplication completes without error on small Matrices eg.
>> < 300 x 300
>> but seems to overwrite the resulting buffers on larger Matrices.
>>
>> My real problem, though is getting the memory back out of the
>> resulting`ompC` compresed_matrix so that i can write it back to a mahout
>> SparseMatrix.
>>
>> currently I am using:
>>
>> void viennacl::backend::memory_copy (mem_handle const &  src_buffer,
>>          mem_handle &      dst_buffer,
>>          vcl_size_t      src_offset,
>>          vcl_size_t      dst_offset,
>>          vcl_size_t      bytes_to_copy
>>      )
>>
>> on ompC.handel1,ompC.handel2 and ompC.handel source handels
>>
>> to copy into pre-allocated  row_jumper,  col_index and element buffers
>> (of size ompC.size1() + 1, ompC.nnz and ompC.nnz, respectivly).
>>
>> I am getting nonsensical values back that one would expect from memory
>> errors. eg:
>>
>> the Matrix geometry of the result: ompC.size1(), and omp.size2() are
>> correct and ompC.nnz is a reasonable value.
>>
>> It is possible that I have mis-allocated some of the memory on my side,
>> but I am pretty sure that most of the Buffers are allocated correctly
>> (usually JavaCPP does a pretty good job of this).
>>
>>
>> I guess, long story short, my question is am i using the correct method
>> of copying the memory out of a compressed_matrix?  is there something
>> glaringly incorrect that i am doing here?  Should I be using
>> viennacl::backend::memory_copy or is there a different method that i
>> should be using?
>>
>>
>> Thanks very much,
>>
>> Andy
>>
>

Re: [ViennaCL-devel] Copying Values out of a compressed_matrix

From: Karl R. <ru...@iu...> - 2016-07-20 19:50:17

Hi Andy,

instead of viennacl::backend::memory_copy(), you want to use
viennacl::backend::memory_read(), which directly transfers the data into 
your buffer(s).

If you *know* that your handles are in host memory, you can even grab 
the values directly via
  viennacl::linalg::host_based::detail::extract_raw_pointer<T>();
defined in viennacl/linalg/host_based/common.hpp, around line 40.
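
To make the byte arithmetic of such a read concrete, here is a minimal C++ sketch of pulling the three CSR arrays into host buffers. `raw_read` and `HostCsr` are hypothetical stand-ins written for this illustration, not ViennaCL functions; `raw_read` only mimics a backend read that takes a byte offset followed by a byte count (the order viennacl::backend::memory_read is believed to use, which is worth checking against viennacl/backend/memory.hpp). The sketch also assumes 4-byte unsigned row and column indices and 8-byte double values, which should be verified for the backend in use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a backend read (NOT the ViennaCL function):
// copies `bytes_to_read` bytes, starting `src_offset` bytes into `src`, to `dst`.
void raw_read(const void* src, std::size_t src_offset,
              std::size_t bytes_to_read, void* dst)
{
  if (bytes_to_read == 0) return;  // nothing to copy, pointers may be null
  assert(src != nullptr && dst != nullptr);
  std::memcpy(dst, static_cast<const char*>(src) + src_offset, bytes_to_read);
}

// Host-side copy of a CSR matrix (hypothetical type for this sketch).
struct HostCsr
{
  std::vector<unsigned int> row_ptr;  // rows + 1 entries
  std::vector<unsigned int> col_idx;  // nnz entries
  std::vector<double>       values;   // nnz entries
};

// Sizes the destination buffers first, then reads each array in full.
HostCsr read_csr(const void* row_src, const void* col_src, const void* val_src,
                 std::size_t rows, std::size_t nnz)
{
  HostCsr out;
  out.row_ptr.resize(rows + 1);
  out.col_idx.resize(nnz);
  out.values.resize(nnz);

  raw_read(row_src, 0, (rows + 1) * sizeof(unsigned int), out.row_ptr.data());
  raw_read(col_src, 0, nnz * sizeof(unsigned int),        out.col_idx.data());
  raw_read(val_src, 0, nnz * sizeof(double),              out.values.data());
  return out;
}
```

The point of the sketch is that the row array needs `rows + 1` entries and that every count passed to the read is in bytes, not elements; a swapped offset and byte count would typically leave the destination buffers uninitialized.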

Please let me know if you still get errors after using that.

Best regards,
Karli




On 07/20/2016 09:05 PM, Andrew Palumbo wrote:
> Hello,
>
>
> I'm Having some difficulties with compressed_matrix multiplication.
>
>
> Essentially I am copying  three buffers, the CSR conversion of an Apache
> Mahout SparseMatrix, into two compressed_matrices performing matrix
> multiplication. I am doing this in scala and Java using javacpp.
>
>
> For example, I have a 5 x 10 matrix of ~20% non-zero values which in CSR
> format looks like this:
>
>
> NNz: 12
>
> Row Pointer: [0, 1, 4, 6, 9, 12, ]
>
> Col Pointer: [9, 0, 8, 7, 2, 9, 0, 8, 9, 0, 3, 5, ]
>
> element Pointer: [0.4065367203992265, 0.04957158909682802,
> 0.5205586068847993, 0.3708618354358446, 0.6963900565931678,
> 0.8330915529787706, 0.32839112750638844, 0.7856168903297948,
> 0.4265801782090245, 0.14733066454561583, 0.9501663495824946,
> 0.9710498974366047, ]
>
> Multiplied by a similarly Sparse 10 x 5 compressed_matrix
>
> I use a CompressedMatrix wrapper which essentially wraps the
>
>      viennacl:: compressed_matrix (vcl_size_t rows, vcl_size_t cols,
> vcl_size_t nonzeros=0, viennacl::context ctx=viennacl::context())
>
> constructor as well as the
>
>      compressed_matrix (matrix_expression< const compressed_matrix,
> const compressed_matrix, op_prod > const &proxy).
>
> I have a helper function, /toVclCompressedMatrix/(..) which essentially
> does the CSR conversion from a Mahout src matrix, calls the constructor
> and uses viennacl::compressed_matrix::set(...) to set the buffers:
>
> val ompA =toVclCompressedMatrix(src = mxA, ompCtx)
> val ompB =toVclCompressedMatrix(src = mxB, ompCtx)
>
>
> and then create a new viennacl::compressed_matrix from the
> viennacl::linalg::prod of the 2 matrices i.e.:
>
> val ompC =new CompressedMatrix(prod(ompA, ompB))
>
> The context in the above case is either the Host or OpenMP (I know that
> there is some special casting of the row_jumpers and col_idxs that needs
> to be done in the OpenCL version)
>
> The Matrix multiplication completes without error on small Matrices eg.
> < 300 x 300
> but seems to overwrite the resulting buffers on larger Matrices.
>
> My real problem, though is getting the memory back out of the
> resulting`ompC` compresed_matrix so that i can write it back to a mahout
> SparseMatrix.
>
> currently I am using:
>
> void viennacl::backend::memory_copy (mem_handle const &  src_buffer,
>          mem_handle &      dst_buffer,
>          vcl_size_t      src_offset,
>          vcl_size_t      dst_offset,
>          vcl_size_t      bytes_to_copy
>      )
>
> on ompC.handel1,ompC.handel2 and ompC.handel source handels
>
> to copy into pre-allocated  row_jumper,  col_index and element buffers
> (of size ompC.size1() + 1, ompC.nnz and ompC.nnz, respectivly).
>
> I am getting nonsensical values back that one would expect from memory
> errors. eg:
>
> the Matrix geometry of the result: ompC.size1(), and omp.size2() are
> correct and ompC.nnz is a reasonable value.
>
> It is possible that I have mis-allocated some of the memory on my side,
> but I am pretty sure that most of the Buffers are allocated correctly
> (usually JavaCPP does a pretty good job of this).
>
>
> I guess, long story short, my question is am i using the correct method
> of copying the memory out of a compressed_matrix?  is there something
> glaringly incorrect that i am doing here?  Should I be using
> viennacl::backend::memory_copy or is there a different method that i
> should be using?
>
>
> Thanks very much,
>
> Andy
>

[ViennaCL-devel] Copying Values out of a compressed_matrix

From: Andrew P. <ap...@ou...> - 2016-07-20 19:05:46

Hello,


I'm having some difficulties with compressed_matrix multiplication.


Essentially I am copying three buffers, the CSR conversion of an Apache Mahout SparseMatrix, into two compressed_matrices and then performing a matrix multiplication. I am doing this in Scala and Java using JavaCPP.


For example, I have a 5 x 10 matrix of ~20% non-zero values which in CSR format looks like this:


NNz: 12

Row Pointer: [0, 1, 4, 6, 9, 12, ]

Col Pointer: [9, 0, 8, 7, 2, 9, 0, 8, 9, 0, 3, 5, ]

element Pointer: [0.4065367203992265, 0.04957158909682802, 0.5205586068847993, 0.3708618354358446, 0.6963900565931678, 0.8330915529787706, 0.32839112750638844, 0.7856168903297948, 0.4265801782090245, 0.14733066454561583, 0.9501663495824946, 0.9710498974366047, ]
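
In the sample above, row 1 (entries 1 to 3) lists its columns as 0, 8, 7, which is not ascending, and a later reply in this thread states that column indices have to be in ascending order for sparse matrix-matrix multiplication. The following C++ sketch shows one way to check for this and to sort each row before handing the buffers over. `sort_csr_rows` and `csr_rows_sorted` are hypothetical helpers for illustration, not part of ViennaCL or Mahout.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: sorts the entries of every CSR row by ascending
// column index, moving each value together with its column index.
void sort_csr_rows(const std::vector<unsigned int>& row_ptr,
                   std::vector<unsigned int>& col_idx,
                   std::vector<double>& values)
{
  assert(!row_ptr.empty() && col_idx.size() == values.size());
  assert(row_ptr.back() == col_idx.size());

  for (std::size_t row = 0; row + 1 < row_ptr.size(); ++row)
  {
    const std::size_t begin = row_ptr[row];
    const std::size_t end   = row_ptr[row + 1];

    // Permutation that orders this row's entries by column index.
    std::vector<std::size_t> perm(end - begin);
    std::iota(perm.begin(), perm.end(), begin);
    std::sort(perm.begin(), perm.end(),
              [&](std::size_t a, std::size_t b) { return col_idx[a] < col_idx[b]; });

    // Apply the permutation via temporaries so columns and values stay paired.
    std::vector<unsigned int> cols(perm.size());
    std::vector<double>       vals(perm.size());
    for (std::size_t k = 0; k < perm.size(); ++k)
    {
      cols[k] = col_idx[perm[k]];
      vals[k] = values[perm[k]];
    }
    std::copy(cols.begin(), cols.end(),
              col_idx.begin() + static_cast<std::ptrdiff_t>(begin));
    std::copy(vals.begin(), vals.end(),
              values.begin() + static_cast<std::ptrdiff_t>(begin));
  }
}

// Hypothetical helper: true if every row lists its column indices in
// strictly ascending order.
bool csr_rows_sorted(const std::vector<unsigned int>& row_ptr,
                     const std::vector<unsigned int>& col_idx)
{
  for (std::size_t row = 0; row + 1 < row_ptr.size(); ++row)
    for (std::size_t k = row_ptr[row] + 1; k < row_ptr[row + 1]; ++k)
      if (col_idx[k - 1] >= col_idx[k]) return false;
  return true;
}
```

Running the check on the row and column arrays above flags the matrix as out of order; after sorting, only the two entries of row 1 with columns 8 and 7 change places.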

Multiplied by a similarly sparse 10 x 5 compressed_matrix.

I use a CompressedMatrix wrapper which essentially wraps the

    viennacl::compressed_matrix (vcl_size_t rows, vcl_size_t cols, vcl_size_t nonzeros=0, viennacl::context ctx=viennacl::context())

constructor as well as the

    compressed_matrix (matrix_expression< const compressed_matrix, const compressed_matrix, op_prod > const &proxy).

I have a helper function, toVclCompressedMatrix(..) which essentially does the CSR conversion from a Mahout src matrix, calls the constructor  and uses viennacl::compressed_matrix::set(...) to set the buffers:


val ompA = toVclCompressedMatrix(src = mxA, ompCtx)
val ompB = toVclCompressedMatrix(src = mxB, ompCtx)

and then create a new viennacl::compressed_matrix from the viennacl::linalg::prod of the 2 matrices i.e.:


val ompC = new CompressedMatrix(prod(ompA, ompB))


The context in the above case is either the host or OpenMP (I know that there is some special casting of the row_jumpers and col_idxs that needs to be done in the OpenCL version).

The matrix multiplication completes without error on small matrices, e.g. < 300 x 300,
but seems to overwrite the resulting buffers on larger matrices.

My real problem, though, is getting the memory back out of the resulting `ompC` compressed_matrix so that I can write it back to a Mahout SparseMatrix.

Currently I am using:

void viennacl::backend::memory_copy (mem_handle const &  src_buffer,
        mem_handle &      dst_buffer,
        vcl_size_t      src_offset,
        vcl_size_t      dst_offset,
        vcl_size_t      bytes_to_copy
    )

on the ompC.handle1, ompC.handle2 and ompC.handle source handles

to copy into pre-allocated row_jumper, col_index and element buffers (of size ompC.size1() + 1, ompC.nnz and ompC.nnz, respectively).

I am getting nonsensical values back that one would expect from memory errors, e.g.:

The matrix geometry of the result, ompC.size1() and ompC.size2(), is correct, and ompC.nnz is a reasonable value.

It is possible that I have mis-allocated some of the memory on my side, but I am pretty sure that most of the Buffers are allocated correctly (usually JavaCPP does a pretty good job of this).


I guess, long story short, my question is: am I using the correct method of copying the memory out of a compressed_matrix? Is there something glaringly incorrect that I am doing here? Should I be using viennacl::backend::memory_copy, or is there a different method that I should be using?


Thanks very much,

Andy

Re: [ViennaCL-devel] MPI layer for ViennaCL

From: Karl R. <ru...@iu...> - 2016-07-19 06:53:36

Hi Sumit,

> I point your attention to this:
> http://www.iue.tuwien.ac.at/cse/index.php/gsoc/2011/ideas-2011/104-viennacl-mpi-layer-for-linear-algebra-with-large-matrices-new.html
>
> has this ever been incorporated into ViennaCL?

No, no student ever worked on this. However, ViennaCL's functionality is 
available in an MPI setting via PETSc [1]. This is much better than 
offering MPI functionality in ViennaCL directly.

Best regards,
Karli

[1] http://www.mcs.anl.gov/petsc/

[ViennaCL-devel] MPI layer for ViennaCL

From: Sumit K. <dos...@ya...> - 2016-07-19 06:46:11

Karl,
I point your attention to this :
http://www.iue.tuwien.ac.at/cse/index.php/gsoc/2011/ideas-2011/104-viennacl-mpi-layer-for-linear-algebra-with-large-matrices-new.html

has this ever been incorporated into ViennaCL?

Thanks and regards,
Sumit

Re: [ViennaCL-devel] Initializing matrices via direct serialization (row major or CCS)

From: Karl R. <ru...@iu...> - 2016-07-18 17:46:53

Hi,

>     Is DGEMM your performance-critical operation? Are there any other
>     performance-critical operations?
>
>
> For now we are only looking at (especially sparse) blas3 and
> decompositions. Basically, your normal R base functionality for
> in-memory sparse algebra.

Sparse factorizations (LU, QR, etc.) are very hard to parallelize for 
many-core architectures (GPUs in particular).


> One more question i had:
>
> do you guys handle low resource cases? like transfer optimization for
> blockwise multiplication in case operands do not fit -- out-of-core
> algorithms?

Out-of-core has gone out of fashion. The reason is that the differences 
in memory speed have become so large that falling back to a slower memory 
type almost never pays off.


> Did you look at gpu+cpu combined balanced algorithms (as i guess MAGMA
> did for some)?

Yes, a couple of algorithms in ViennaCL use GPUs for the main work (i.e. 
GEMM) and CPUs for the sequential parts of the algorithm.

Best regards,
Karli

Re: [ViennaCL-devel] Initializing matrices via direct serialization (row major or CCS)

From: Dmitriy L. <dl...@gm...> - 2016-07-18 17:42:20

Thank you, Karl!

this is very helpful!



> Is DGEMM your performance-critical operation? Are there any other
> performance-critical operations?
>
>
> For now we are only looking at (especially sparse) blas3 and
decompositions. Basically, your normal R base functionality for in-memory
sparse algebra.

One more question i had:

do you guys handle low resource cases? like transfer optimization for
blockwise multiplication in case operands do not fit -- out-of-core
algorithms?

Did you look at gpu+cpu combined balanced algorithms (as i guess MAGMA did
for some)?

Re: [ViennaCL-devel] Initializing matrices via direct serialization (row major or CCS)

From: Karl R. <ru...@iu...> - 2016-07-18 11:42:05

>     Why do you expect to beat OpenBLAS? Their kernels are really well
>     optimized, and for large dense matrix-matrix you are always FLOP-limited.
>
>
> I don't expect, i experiment. I don't know why, current results are such
> that stock ubuntu blas takes about 88 seconds for dense 10k
> multiplication test (with R which is setup to use it, perhaps they also
> take long time to convert to blas, but nevertheless it pins cpu 100%).
> If i compile Vienna with -march=haswell and -ffast-math then i get about
> 35 seconds. What's perplexing, the same test in bidmat's MatD matrices
> takes less than 10 seconds on my computer -- and they don't even
> saturate my cpu 100%. Something is fishy about bidmat. I don't have a
> super-beefy cpu, only a 6-core/12threads haswell-e. I know that even mkl
> takes in the area of 16 seconds on 24 threads in xeons, so 88 seconds
> for openblas on my platform looks plausible. 10 or even 8 seconds
> (BidMat+supposedly MKL) does not -- something is fishy there.

it shouldn't be too hard to directly verify correctness of the results :-)


>     Multiplication of 10k-by-10k matrices amounts to 200 GFLOP of
>     compute in double precision. A Haswell-E machine provides that
>     within a few seconds, depending on the number of cores (2.4 GHz * 4
>     doubles with AVX * 2 for FMA = 19.2 GFLOP/sec per core. MKL achieves
>     about 15 GFLOP/sec per core).
>
>
> So this sounds like a validation of the BidMat's results. Interesting.
> Why R+openblas is so slow then? What is the expected output for ViennaCL
> + OpenMP then compared to MKL rates?

I don't know the internals of R+OpenBLAS. Maybe there is extensive 
debugging going on, or OpenBLAS is only used with a single thread. 
ViennaCL+OpenMP vs. MKL is hard to answer in general. It all depends a 
lot on compiler flags, the underlying CPU, etc.


> How much of improvement do you observe/expect from a new pull request,
> is there any hope to get closer to MKL dense dgemm?

The student reported about 50 percent of MKL on a laptop CPU. More 
importantly, though, is that the new code provides a good infrastructure 
for further improvements for different architectures, e.g. ARM-based CPUs.


> The primary reason against blas/mkl are that they are yet another
> platform which, most importantly, we cannot redistribute being an
> apache2 licensed. So we'd have to ask people to install a particular
> commercial product, but if ViennaCL would cover our sparse algorithm
> needs, we'd rather just have it all in one package (or at least leverage
> hardware/software support in steps). We are very limited in resources,
> that's why reason we are trying to get working with ViennaCL:
>
> -- it has sparse algorithms
> -- it supports host/OpenCL/cuda with need for new apis/conversions
> -- it does not require installation of any shared libraries beyond what
> javacpp already does for us automagically.  So we basically can drop a
> jar with javacpp in it into a spark application and having it running on
> ViennaCL. Even netlib (blas) or netlib-java api does not make it quite
> as easy (which btw we cannot redistribute either becaause of their
> licenses).

ah, makes sense!


> This is hard to beat, especially if ViennaCL becomes well-rounded in
> performance in most areas of interest, we don't need to depend on a
> particular flavor of libblas.so to be present (or any libblas.so for
> that matter).

Is DGEMM your performance-critical operation? Are there any other 
performance-critical operations?


> One more question: is it possible to copy one matrix into an openCL
> device while solving another?
> thank you!

Yes, that is possible using async_copy(). I recommend copying before the 
solver is started. You can also achieve a similar effect through a 
second OpenCL command queue.

(Needless to say, you should first profile in order to find out whether 
it is worth the effort)

Best regards,
Karli

Re: [ViennaCL-devel] Initializing matrices via direct serialization (row major or CCS)

From: Dmitriy L. <dl...@gm...> - 2016-07-15 23:02:21

On Thu, Jul 14, 2016 at 11:21 AM, Dmitriy Lyubimov <dl...@gm...>
wrote:

> One more question: is it possible to copy one matrix into an openCL device
> while solving another?
> thank you!
>
>>
>>
Also known in CUDA as the "concurrent copy and kernel execution" capability?
Thank you.

Re: [ViennaCL-devel] Initializing matrices via direct serialization (row major or CCS)

From: Dmitriy L. <dl...@gm...> - 2016-07-14 18:25:08

sorry this should read

>
> I don't expect, i experiment. I don't know why, current results are such
> that stock ubuntu OPENBLAS takes about 88 seconds for dense 10k
> multiplication test
>

Re: [ViennaCL-devel] Initializing matrices via direct serialization (row major or CCS)

From: Dmitriy L. <dl...@gm...> - 2016-07-14 18:21:36

Karl, thank you for your reply!

On Thu, Jul 14, 2016 at 1:45 AM, Karl Rupp <ru...@iu...> wrote:

> Hi again,
>
>
>>
> 15 seconds of copying for a 10k-by-10k matrix looks way too much.
> 10k-by-10k is 800 MB of data for double precision, so this should not take
> much more than 100 ms on a low-range laptop (10 GB/sec memory bandwidth).
> Even with multiple matrices and copies you should stay in the 1 second
> regime.

You are right. I don't see a significant variation whether I use fast_copy
or the constructor. The time at this point is mostly consumed by moving Mahout
data structures into RM or CCS format, and it is really a POC now, so we are
working to get it faster. But Java is really slow, especially when working
with native buffers naively -- we will have to improve that. For the
record, the 15 seconds probably include loading all necessary classes and
serializing two 10k-x-10k matrices in and one 10k x k matrix out, including all
these Scala-side conversions.

> Why do you expect to beat OpenBLAS? Their kernels are really well
> optimized, and for large dense matrix-matrix you are always FLOP-limited.

I don't expect, I experiment. I don't know why, but current results are such
that stock Ubuntu blas takes about 88 seconds for the dense 10k multiplication
test (with R, which is set up to use it; perhaps it also takes a long time to
convert to BLAS, but nevertheless it pins the CPU at 100%). If I compile ViennaCL
with -march=haswell and -ffast-math then I get about 35 seconds. What's
perplexing is that the same test with BidMat's MatD matrices takes less than 10
seconds on my computer -- and it doesn't even saturate my CPU at 100%.
Something is fishy about BidMat. I don't have a super-beefy CPU, only a
6-core/12-thread Haswell-E. I know that even MKL takes in the area of 16
seconds on 24 threads on Xeons, so 88 seconds for OpenBLAS on my platform
looks plausible. 10 or even 8 seconds (BidMat + supposedly MKL) does not --
something is fishy there.

>
>
>
> On the other hand, bidmat (which allegedly uses mkl) does the same test,
>> double precision, in under 10 seconds. I can't fathom how, but it does.
>> I have a haswell-E platform.
>>
>
> Multiplication of 10k-by-10k matrices amounts to 200 GFLOP of compute in
> double precision. A Haswell-E machine provides that within a few seconds,
> depending on the number of cores (2.4 GHz * 4 doubles with AVX * 2 for FMA
> = 19.2 GFLOP/sec per core. MKL achieves about 15 GFLOP/sec per core).
>

So this sounds like a validation of BidMat's results. Interesting. Why
is R+OpenBLAS so slow then? What is the expected performance of ViennaCL +
OpenMP compared to MKL rates?

How much improvement do you observe/expect from the new pull request? Is
there any hope of getting closer to MKL's dense DGEMM?

The primary reason against BLAS/MKL is that they are yet another platform
which, most importantly, we cannot redistribute, being Apache 2 licensed.
So we'd have to ask people to install a particular commercial product, but
if ViennaCL would cover our sparse algorithm needs, we'd rather just have
it all in one package (or at least leverage hardware/software support in
steps). We are very limited in resources; that's the reason we are trying
to get working with ViennaCL:

-- it has sparse algorithms
-- it supports host/OpenCL/CUDA without need for new APIs/conversions
-- it does not require installation of any shared libraries beyond what
JavaCPP already does for us automagically.  So we basically can drop a jar
with JavaCPP in it into a Spark application and have it running on
ViennaCL. Even netlib (BLAS) or the netlib-java API does not make it quite as
easy (which, by the way, we cannot redistribute either because of their licenses).

This is hard to beat, especially if ViennaCL becomes well-rounded in
performance in most areas of interest, we don't need to depend on a
particular flavor of libblas.so to be present (or any libblas.so for that
matter).

One more question: is it possible to copy one matrix into an openCL device
while solving another?
thank you!

>
>
>

Re: [ViennaCL-devel] Initializing matrices via direct serialization (row major or CCS)

From: Karl R. <ru...@iu...> - 2016-07-14 08:45:23

Hi again,

> So fast_copy still copies the memory and has copying overhead, even with
> MAIN_MEMORY context?

Yes. It's a copy() operation, so it just does what the name suggests.

> Is there a way to do shallow copying  (i.e. just pointer initialization)
> to the matrix data buffer? Isn't it what some constructors of matrix or
> matrix_base do?

Yes, you can pass your pointer via the constructors, e.g.
https://github.com/viennacl/viennacl-dev/blob/master/viennacl/matrix.hpp#L721


> What i am getting at, it looks like i am getting a significant overhead
> for just copying -- actually, it seems i am getting double overhead --
> once when i prepare padding and all as required by the internal_size?(),
> and then i pass it into the fast_copy() which apparently does copying
> again, even if we are using host memory matrices.

If you want to 'wrap' your data in a ViennaCL matrix, pass the pointer 
to the constructors. If you want to quickly copy your data over to 
memory managed by a ViennaCL matrix, use copy() or fast_copy(). From 
your description it looks like you are now looking for the constructor 
calls, but from your earlier email I thought that you are looking for a 
fast_copy().
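
The difference between wrapping and copying can be shown without any ViennaCL code. The C++ sketch below uses two hypothetical types, `MatrixView` and `OwnedMatrix`, invented for this illustration: the view only stores the caller's pointer (the analogue of passing a pointer to a constructor), while the owned matrix duplicates the data (the analogue of copy() or fast_copy()). Row-major layout without padding is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical non-owning wrapper: keeps the caller's pointer, copies nothing.
// The caller must keep the buffer alive for as long as the view is used.
struct MatrixView
{
  double*     data;
  std::size_t rows;
  std::size_t cols;

  double& operator()(std::size_t i, std::size_t j)
  {
    assert(i < rows && j < cols);
    return data[i * cols + j];  // row-major, no padding (assumption)
  }
};

// Hypothetical owning matrix: the constructor performs a full deep copy,
// so later changes do not affect the source buffer.
struct OwnedMatrix
{
  std::vector<double> data;
  std::size_t         rows;
  std::size_t         cols;

  OwnedMatrix(const double* src, std::size_t r, std::size_t c)
    : data(src, src + r * c), rows(r), cols(c) {}

  double& operator()(std::size_t i, std::size_t j)
  {
    assert(i < rows && j < cols);
    return data[i * cols + j];
  }
};
```

Writing through the view changes the caller's buffer, while writing to the owned matrix does not; the price of the view is that buffer lifetime and layout become the caller's responsibility.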



> all in all, by my estimates this copying back and forth (which is,
> granted, is not greatly optimized on our side) takes ~15..17 seconds out
> of 60 seconds total when multiplying 10k x 10k dense arguments via
> ViennaCL. I also optimize to -march=haswell  and use -ffast-math,
> without those i seem to fall too far behind what R + openblas can do in
> this test. Then, my processing time swells up to 2 minutes without
> optimizing for non-compliant arithmetics.

15 seconds of copying for a 10k-by-10k matrix looks way too much. 
10k-by-10k is 800 MB of data for double precision, so this should not 
take much more than 100 ms on a low-range laptop (10 GB/sec memory 
bandwidth). Even with multiple matrices and copies you should stay in 
the 1 second regime.
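
The copy-time estimate above is simple enough to check in a few lines of C++. The function names are hypothetical and the 10 GB/sec bandwidth is the figure quoted in the paragraph, not a measurement.

```cpp
#include <cassert>
#include <cstddef>

// Size in bytes of a dense n-by-n matrix of doubles (no padding assumed).
double dense_matrix_bytes(std::size_t n)
{
  return static_cast<double>(n) * static_cast<double>(n) * sizeof(double);
}

// Time in seconds to stream `bytes` once at the given memory bandwidth.
double copy_seconds(double bytes, double bandwidth_bytes_per_sec)
{
  assert(bandwidth_bytes_per_sec > 0.0);
  return bytes / bandwidth_bytes_per_sec;
}
```

For n = 10,000 this gives 8e8 bytes (800 MB), and at 1e10 bytes/sec one pass over the data takes about 0.08 seconds, consistent with the "not much more than 100 ms" estimate.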


> If i can wrap the buffer and avoid copying for MAIN_MEMORY context, i'd
> be shaving off another 10% or so of the execution time. Which would make
> me happier, as i probably would be able to beat openblas given custom
> cpu architecture flags.

Why do you expect to beat OpenBLAS? Their kernels are really well 
optimized, and for large dense matrix-matrix you are always FLOP-limited.


> On the other hand, bidmat (which allegedly uses mkl) does the same test,
> double precision, in under 10 seconds. I can't fathom how, but it does.
> I have a haswell-E platform.

Multiplication of 10k-by-10k matrices amounts to 2 * n^3 = 2000 GFLOP 
(2 TFLOP) of compute in double precision. A Haswell-E machine provides 
that in roughly ten to twenty seconds at the per-core rates below, 
depending on the number of cores (2.4 GHz * 4 doubles with AVX * 2 for 
FMA = 19.2 GFLOP/sec per core. MKL achieves about 15 GFLOP/sec per 
core).
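
The operation count follows from the usual 2 * n^3 estimate for a dense n-by-n matrix-matrix product (one multiply and one add per inner-product term). A small C++ sketch of that arithmetic, with hypothetical helper names and the clock rate, SIMD width and per-core throughput taken as assumptions from the text:

```cpp
#include <cassert>

// Floating point operations for a dense n-by-n times n-by-n product.
double gemm_flops(double n) { return 2.0 * n * n * n; }

// Per-core peak in GFLOP/sec: clock (GHz) * SIMD lanes * flops per FMA.
double peak_gflops_per_core(double ghz, double simd_lanes,
                            double flops_per_fma) {
  return ghz * simd_lanes * flops_per_fma;
}

// Run time in seconds given a sustained per-core rate and a core count.
double gemm_seconds(double flops, double gflops_per_core, unsigned cores) {
  assert(gflops_per_core > 0.0 && cores > 0);
  return flops / (gflops_per_core * 1e9 * cores);
}
```

With n = 10000 this gives 2e12 operations. At an assumed 15 GFLOP/sec per core on 8 cores, gemm_seconds() returns roughly 17 seconds; other core counts and clock rates scale the result accordingly.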

ViennaCL's host-backend is not strong on dense matrix-matrix multiplies 
(even though we've got some improvements in a pull request), so for this 
particular operation you will get better performance from MKL, OpenBLAS, 
or libflame.

Best regards,
Karli





> On Tue, Jul 12, 2016 at 9:27 AM, Karl Rupp <ru...@iu...
> <mailto:ru...@iu...>> wrote:
>
>     Hi,
>
>         One question: you mentioned padding for the `matrix` type. When i
>         initialize the `matrix` instance, i only specify dimensions. how
>         do I know padding values?
>
>
>     if you want to provide your own padded dimensions, consider using
>     matrix_base directly. If you want to query the padded dimensions,
>     use internal_size1() and internal_size2() for the internal number of
>     rows and columns.
>
>     http://viennacl.sourceforge.net/doc/manual-types.html#manual-types-matrix
>
>     Best regards,
>     Karli
>
>
>
>
>         On Tue, Jul 12, 2016 at 5:53 AM, Karl Rupp
>         <ru...@iu... <mailto:ru...@iu...>
>         <mailto:ru...@iu... <mailto:ru...@iu...>>>
>         wrote:
>
>              Hi Dmitriy,
>
>              On 07/12/2016 07:17 AM, Dmitriy Lyubimov wrote:
>
>                  Hi,
>
>                  I am trying to create some elementary wrappers for VCL
>                  in javacpp.
>
>                  Everything goes fine, except i really would rather not
>                  use those "cpu" types (std::map, std::vector) and
>                  rather initialize matrices directly by feeding
>                  row-major or CCS formats.
>
>                  I see that matrix () constructor accepts this form of
>                  initialization; but it really states that it does
>                  "wrapping" for the device memory.
>
>
>              Yes, the constructors either create their own memory buffer
>              (zero-initialized) or wrap an existing buffer. These are
>              the only reasonable options.
>
>
>                  Now, i can create a host matrix() using host memory and
>                  row-major packing. This works ok it seems.
>
>                  However, these are still host instances. Can i copy host
>                  instances to instances on opencl context?
>
>
>              Did you look at viennacl::copy() or viennacl::fast_copy()?
>
>
>                  That might be one way bypassing unnecessary (in my case)
>                  complexities of working with std::vector and std::map
>                  classes from java side.
>
>                  But it looks like there's no copy() variation that
>                  would accept a matrix-on-host and matrix-on-opencl
>                  arguments (or rather, it of course declares those to be
>                  ambiguous since two methods fit).
>
>
>              If you want to copy your OpenCL data into a
>              viennacl::matrix, you may wrap the memory handle (obtained
>              with .elements()) into a vector and copy that. If you have
>              plain host data, use viennacl::fast_copy() and mind the
>              data layout (padding of rows/columns!)
>
>
>                  For compressed_matrix, there seems to be a set()
>                  method, but i guess this also requires CCS arrays in
>                  the device memory if I use it. Same question, is there
>                  a way to send-and-wrap CCS arrays to an opencl device
>                  instance of compressed matrix without using std::map?
>
>
>              Currently you have to use .set() if you want to bypass
>              viennacl::copy() and std::map.
>
>              I acknowledge that the C++ type system is a pain when
>              interfacing from other languages. We will make this much
>              more convenient in ViennaCL 2.0. The existing interface in
>              ViennaCL 1.x is too hard to fix without breaking lots of
>              user code, so we won't invest time in that (contributions
>              welcome, though :-) )
>
>              Best regards,
>              Karli
>
>
>
>
>

Re: [ViennaCL-devel] Initializing matrices via direct serialization (row major or CCS)

From: Karl R. <ru...@iu...> - 2016-07-14 08:28:53

Hi Dmitriy,

> To get a little bit more background, we are thinking of enabling Apache
> mahout algebra to auto probe for hardware and use ViennaCL-supported
> in-memory computations there (not to mention additional solvers are just
> great, some of our basic in-memory java-only algebra is just very slow).
>
> We were thinking, and were having a question, perhaps if we wanted to
> support all of the possible backend options, it looks like we would have
> to build 3 different jars: one that loads -l openCL, one that loads
> cuda, and one that is just the all-in-host-memory backend.

You might be fine with just 2: host-only and host+OpenCL. You can use 
OpenCL on NVIDIA GPUs without any performance drop compared to CUDA. 
host+OpenCL is also much more build-friendly than anything CUDA-related.


> The reason we don't seem to be able to create just one jar maven module
> to support all backends in one maven artifact is because once we load
> the library, it will probably at some point try to load both opencl
> and cuda, whereas in reality we expect environments that at runtime may
> have only one (or even none) of those apis configured and supported.
>
> So what we were thinking, perhaps we need 3 separate artifacts
> eventually, one for host+opencl, one for host+cuda, and if all of that
> works, then we could also have just host-only module (since cpu
> benchmarks seem to be quite good too, so if we get this for free, then
> there's no reason not to use it).
>
> on the downside of these 3 differently compiled modules, it would seem we
> won't be able to load any two of these modules at the same time since the
> symbols are likely to clash. Hard to see how much that might be a problem.
>
> Or, is it possible that we are completely misunderstanding this and it
> is possible to compile a single ViennaCL-enabled  .so so that it will
> load libOpenCL.so if it is available only, and cuda if it is available
> only etc. etc. (I don't think so, right now loading .so crashes if there
> is no libOpenCL.so in the system but we try to create opencl context).
>
> Or perhaps it is just possible to probe for support of opencl and cuda
> and just avoid creating context with non-supported devices by our logic
> and then we can support all options in one module? That option would be
> terrific to have. I really don't like having to compile and publish 3
> different maven artifacts for each case too much.

What you are asking for is a dynamic load of backends at runtime, just 
like you can load all kinds of plugins in your web browser dynamically. 
Currently such a dynamic backend enabling is not supported in ViennaCL 
(most scientific software does not support that), so you have to work 
with two or three separate builds of ViennaCL and load the correct one 
at runtime to avoid symbol clashes.
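
Choosing the correct build at runtime could be sketched as follows. This is only an illustration under several assumptions: it uses the POSIX dlopen() interface (so it is not portable to Windows as written), the library names passed to can_load() are the common Linux names of the OpenCL ICD loader, and the build names returned by pick_build() are hypothetical placeholders rather than names of actual ViennaCL artifacts. Depending on the toolchain, linking may require -ldl.

```cpp
#include <dlfcn.h>
#include <string>

// Returns true if the dynamic loader can open the given shared library.
// The handle is closed again right away; this only probes for presence.
bool can_load(const char* soname) {
  void* handle = dlopen(soname, RTLD_LAZY | RTLD_LOCAL);
  if (handle == nullptr) return false;
  dlclose(handle);
  return true;
}

// Hypothetical build names: pick the host+OpenCL build only when an
// OpenCL loader is present, otherwise fall back to the host-only build.
std::string pick_build(bool have_opencl) {
  return have_opencl ? "viennacl_host_opencl" : "viennacl_host_only";
}

// Probe the system and pick a build accordingly.
std::string pick_build() {
  return pick_build(can_load("libOpenCL.so.1") || can_load("libOpenCL.so"));
}
```

Note that finding libOpenCL.so only shows that the loader library is installed. It does not guarantee that a usable platform or device exists, so a wrapper would still have to handle context creation failing after the OpenCL build has been selected.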

We intend to provide a dynamic backend detection at runtime with 
ViennaCL 2.0. Although an earlier poll in this mailing list indicated 
that most users are fine with the status quo, over the last months I've 
come to the conclusion that such a dynamic backend detection mechanism 
is necessary to bring ViennaCL to the next level. Applications such as 
the one you are working on cannot (or do not want to) afford messy 
recompilations or managing multiple builds of the different libraries 
they rely on. This is a topic I touched in some of my recent talks, so 
it's something I'm really serious about :-)

Best regards,
Karli





3 messages have been excluded from this view by a project administrator.
