From: Karl Rupp <rupp@iu...> - 2012-10-18 03:34:32

Hi again,

I've checked the issue in more depth and found that it is related to the order of initialization. In simple words, the global vector is created before the OpenCL backend becomes available, which causes the problem.

There is a rather simple fix that works in most cases. In viennacl/vector.hpp, around line 250, there is

  vector() : size_(0) { viennacl::linalg::kernels::vector<SCALARTYPE, ALIGNMENT>::init(); }

Replacing that by

  vector() : size_(0) {}

should make your code valid again. If required, similar modifications need to be applied to scalar and the various matrix classes.

The problem will be resolved completely with release 1.4.0.

Thanks for spotting this bug and best regards,
Karli

On 10/17/2012 12:55 AM, maillist@... wrote:
> Hi, all.
>
> I wrote the following code but this got an error
> =============================================
> #include <viennacl/vector.hpp>
> viennacl::vector<double> vectorGpu;
>
> int main()
> {
>   vectorGpu = viennacl::vector<double>(10);
>   return 0;
> }
> =============================================
>
> Debugger of VS2012 said "An access violation has occurred", stack trace is
> as follows:
> ==========================================================================================
>   VectorAddWithViennaCLTest.exe!std::_Tree<std::_Tmap_traits<long,bool,std::less<long>,std::allocator<std::pair<long const ,bool> >,0> >::_Lbound(const long & _Keyval) LINE 2092 C++
>   VectorAddWithViennaCLTest.exe!std::_Tree<std::_Tmap_traits<long,bool,std::less<long>,std::allocator<std::pair<long const ,bool> >,0> >::lower_bound(const long & _Keyval) LINE 1572 C++
>   VectorAddWithViennaCLTest.exe!std::map<long,bool,std::less<long>,std::allocator<std::pair<long const ,bool> > >::operator[](const long & _Keyval) LINE 192 C++
> > VectorAddWithViennaCLTest.exe!viennacl::ocl::backend<0>::current_context() LINE 50 C++
>   VectorAddWithViennaCLTest.exe!viennacl::ocl::current_context() LINE 182 C++
>   VectorAddWithViennaCLTest.exe!viennacl::ocl::current_device() LINE 299 C++
>   VectorAddWithViennaCLTest.exe!viennacl::ocl::DOUBLE_PRECISION_CHECKER<double>::apply() LINE 46 C++
>   VectorAddWithViennaCLTest.exe!viennacl::linalg::kernels::vector<double,1>::init() LINE 387 C++
>   VectorAddWithViennaCLTest.exe!viennacl::vector<double,1>::vector<double,1>() LINE 239 C++
>   VectorAddWithViennaCLTest.exe!`dynamic initializer for 'resultGpu''() LINE 3 C++
>   msvcr110d.dll!_initterm(void (void) * * pfbegin, void (void) * * pfend) LINE 889 C
>   VectorAddWithViennaCLTest.exe!__tmainCRTStartup() LINE 460 C
>   VectorAddWithViennaCLTest.exe!mainCRTStartup() LINE 377 C
> ==========================================================================================
>
> How can I fix this?
> Vector must not be used as a global variable?

From: Karl Rupp <rupp@iu...> - 2012-10-17 13:02:49

Hi,

could you please try to instantiate a local vector first:

> I wrote the following code but this got an error
> =============================================
> #include <viennacl/vector.hpp>
> viennacl::vector<double> vectorGpu;
>
> int main()
> {
>   viennacl::vector<double> dummy(10);
>   vectorGpu = viennacl::vector<double>(10);
>   return 0;
> }
> =============================================

Does the problem remain in that case?

Best regards,
Karli

> Debugger of VS2012 said "An access violation has occurred", stack trace is
> as follows:
> ==========================================================================================
>   VectorAddWithViennaCLTest.exe!std::_Tree<std::_Tmap_traits<long,bool,std::less<long>,std::allocator<std::pair<long const ,bool> >,0> >::_Lbound(const long & _Keyval) LINE 2092 C++
>   VectorAddWithViennaCLTest.exe!std::_Tree<std::_Tmap_traits<long,bool,std::less<long>,std::allocator<std::pair<long const ,bool> >,0> >::lower_bound(const long & _Keyval) LINE 1572 C++
>   VectorAddWithViennaCLTest.exe!std::map<long,bool,std::less<long>,std::allocator<std::pair<long const ,bool> > >::operator[](const long & _Keyval) LINE 192 C++
> > VectorAddWithViennaCLTest.exe!viennacl::ocl::backend<0>::current_context() LINE 50 C++
>   VectorAddWithViennaCLTest.exe!viennacl::ocl::current_context() LINE 182 C++
>   VectorAddWithViennaCLTest.exe!viennacl::ocl::current_device() LINE 299 C++
>   VectorAddWithViennaCLTest.exe!viennacl::ocl::DOUBLE_PRECISION_CHECKER<double>::apply() LINE 46 C++
>   VectorAddWithViennaCLTest.exe!viennacl::linalg::kernels::vector<double,1>::init() LINE 387 C++
>   VectorAddWithViennaCLTest.exe!viennacl::vector<double,1>::vector<double,1>() LINE 239 C++
>   VectorAddWithViennaCLTest.exe!`dynamic initializer for 'resultGpu''() LINE 3 C++
>   msvcr110d.dll!_initterm(void (void) * * pfbegin, void (void) * * pfend) LINE 889 C
>   VectorAddWithViennaCLTest.exe!__tmainCRTStartup() LINE 460 C
>   VectorAddWithViennaCLTest.exe!mainCRTStartup() LINE 377 C
> ==========================================================================================
>
> How can I fix this?
> Vector must not be used as a global variable?

From: <maillist@en...> - 2012-10-17 07:02:42

Hi, all.

I wrote the following code but this got an error
=============================================
#include <viennacl/vector.hpp>
viennacl::vector<double> vectorGpu;

int main()
{
  vectorGpu = viennacl::vector<double>(10);
  return 0;
}
=============================================

Debugger of VS2012 said "An access violation has occurred", stack trace is as follows:
==========================================================================================
  VectorAddWithViennaCLTest.exe!std::_Tree<std::_Tmap_traits<long,bool,std::less<long>,std::allocator<std::pair<long const ,bool> >,0> >::_Lbound(const long & _Keyval) LINE 2092 C++
  VectorAddWithViennaCLTest.exe!std::_Tree<std::_Tmap_traits<long,bool,std::less<long>,std::allocator<std::pair<long const ,bool> >,0> >::lower_bound(const long & _Keyval) LINE 1572 C++
  VectorAddWithViennaCLTest.exe!std::map<long,bool,std::less<long>,std::allocator<std::pair<long const ,bool> > >::operator[](const long & _Keyval) LINE 192 C++
> VectorAddWithViennaCLTest.exe!viennacl::ocl::backend<0>::current_context() LINE 50 C++
  VectorAddWithViennaCLTest.exe!viennacl::ocl::current_context() LINE 182 C++
  VectorAddWithViennaCLTest.exe!viennacl::ocl::current_device() LINE 299 C++
  VectorAddWithViennaCLTest.exe!viennacl::ocl::DOUBLE_PRECISION_CHECKER<double>::apply() LINE 46 C++
  VectorAddWithViennaCLTest.exe!viennacl::linalg::kernels::vector<double,1>::init() LINE 387 C++
  VectorAddWithViennaCLTest.exe!viennacl::vector<double,1>::vector<double,1>() LINE 239 C++
  VectorAddWithViennaCLTest.exe!`dynamic initializer for 'resultGpu''() LINE 3 C++
  msvcr110d.dll!_initterm(void (void) * * pfbegin, void (void) * * pfend) LINE 889 C
  VectorAddWithViennaCLTest.exe!__tmainCRTStartup() LINE 460 C
  VectorAddWithViennaCLTest.exe!mainCRTStartup() LINE 377 C
==========================================================================================

How can I fix this?
Vector must not be used as a global variable?

From: Karl Rupp <rupp@iu...> - 2012-10-11 02:30:16

Hi Martin,

> Thank you very much for your answers,
> it helped me a lot, I finally got it running with good results.

Great, I'm glad I could help...

> However, I have found some interesting time complexities. You mentioned
> that the most complex part of QR decomposition and solving in least
> squares sense is inplace_qr(), therefore it is parallelized on gpu.
>
> My running times look as follows for N = 2300
>
> inplace_qr on CPU : 12 sec and 849 millisec
> inplace_qr on GPU : 1 sec and 491 millisec
> recover QR : 34 sec and 892 millisec

Which compiler flags did you use? Have you set the NDEBUG preprocessor constant? Without it, you will get rather poor performance with uBLAS.

> with N = 15000
>
> inplace_qr on GPU : 15 sec and 704 millisec
> recover QR : 870 sec and 796 millisec

The scaling on the GPU is mostly due to the smaller overheads at larger problem sizes, which is why you get an almost linear scaling. The QR recovery timings suggest that you haven't set NDEBUG...

> It looks like in my case the most time expensive part is not the
> inplace_qr(), but recovering of the Q and R matrices.
> This is done on CPU. The question is, isn't it possible to run also this
> function on GPU using your library ? Or this problem is hard to
> parallelize ?

The computation is not entirely trivial, but a GPU parallelization is nevertheless possible. As I'm aiming at providing a least-squares example with 1.4.0, there will be a GPU version available soon.

> And one more question, you said "1000 < N < 10000". The maximum size of
> matrices is limited to maximum texture size supported by GPU, or GPU
> memory only? Is it possible to find (running some command) these limits
> of my GPU ?

You can get device information from the 'viennacl-info' example. The limitation stems from the main memory available on the GPU, which is currently capped at 3-6 GB. Also, the maximum allocable buffer size on the GPU may be lower than the physical memory available.

Best regards,
Karli

>> Hi Martin,
>>
>> yes, ViennaCL can help you with this task for about 1000 < N < 10000,
>> depending on your hardware.
>>
>>> 1) My first questions is, if the library is capable of solving
>>> linear system composed of matrices, or it has to be decomposed into
>>> 3 linear systems where x and b are vectors only.
>>
>> Yes, it is. However, solver operations on submatrices are not yet
>> supported in the 1.3.1 release.
>>
>>> 2) When I am trying to solve this in least squares manner, it looks
>>> like follows:
>>
>> The first thing to note is that most time is spent in the QR
>> factorization (for reasonably large problem sizes). Let's go through
>> it step by step:
>>
>>> QR * x = b
>>
>> Have a look at examples/tutorial/qr.cpp. In your case,
>>   std::vector<ScalarType> betas = viennacl::linalg::inplace_qr(vcl_A);
>> should be just what you need. However, at this point the matrices Q
>> and R are both stored within A.
>>
>>> R * x = Q^T * b
>>
>> To accomplish this with the current version of ViennaCL, you need to
>> set up R and Q explicitly on the CPU:
>>
>>   viennacl::copy(vcl_A, ublas_A); //copy ViennaCL matrix to uBLAS
>>   Q.clear(); R.clear();
>>   viennacl::linalg::recoverQ(ublas_A, hybrid_betas, Q, R);
>>
>> You may then get Q^T b directly on the CPU via
>>   boost::numeric::ublas::vector<T> QTb = prod(trans(Q), b);
>>
>> There will be further improvements on doing this part directly on the
>> GPU with one of the next releases, probably already with the upcoming
>> 1.4.0 release. However, I can't promise anything right now...
>>
>>> x = R^-1 * Q^T * b
>>
>> Here it is important to keep in mind that R is a upper triangular
>> matrix with unit diagonal. Thus, instead of forming the inverse R^{-1}
>> explicitly, you should launch a triangular solver. As the matrices Q
>> and R are already on the CPU, just use uBLAS directly:
>>   ublas::solve(R, QTb, ublas::unit_upper_tag());
>>
>>> Thank you very much for any given advice,
>>> and I am sorry if I put anything wrong here, I am not really
>>> mathematician, only a programmer that has to solve linear system ;)
>>
>> I hope that helps. I haven't tested the suggested code lines, so I
>> hope I haven't missed any details. Also, don't worry about a lack of
>> math skills, there are also enough (too many?) math guys with a lack
>> of programming skills... ;)
>>
>> Best regards,
>> Karli

From: <rupp@iu...> - 2012-10-03 15:52:35

Hi Michael,

> I am planning to do something similar to MAGMA [1] with FLENS [2].
> For the GPU computations I also want to use ViennaCL. So I am extending
> FLENS for matrix types that "live on the GPU". When using ViennaCL as
> backend these are basically wrappers for your matrix/vector types.

Cool! :)

> But if you are interested I could also write an interface that allows
> conversion between FLENS and ViennaCL matrix/vector types back and
> forth (like for Eigen).

Oh yes, that would be great. As you know the internal data structures in FLENS well, you can most likely avoid some of the unnecessary copies to a separate memory buffer floating around in some of the current viennacl::copy() routines. I'm looking forward to your contribution :)

Best regards,
Karli

From: Michael Lehn <michael.lehn@un...> - 2012-10-03 15:37:26

Hi Karl and others,

I am planning to do something similar to MAGMA [1] with FLENS [2]. For the GPU computations I also want to use ViennaCL. So I am extending FLENS for matrix types that "live on the GPU". When using ViennaCL as backend these are basically wrappers for your matrix/vector types.

But if you are interested I could also write an interface that allows conversion between FLENS and ViennaCL matrix/vector types back and forth (like for Eigen).

Cheers,
Michael

[1] http://icl.cs.utk.edu/magma/index.html
[2] http://flens.sf.net

From: <rupp@iu...> - 2012-10-03 15:02:06

Hi Michael,

oh, yes, my bad. R has non-unit diagonal, so you should use

  ublas::solve(R, QTb, ublas::upper_tag());

Instead, the Householder reflectors are scaled such that their 'diagonal' is unity. This, however, is no longer relevant if Q is recovered explicitly.

Thanks for pointing that out.

Best regards,
Karli

Quoting Michael Lehn <michael.lehn@...>:

> Hi Karl,
>
>>> x = R^-1 * Q^T * b
>>
>> Here it is important to keep in mind that R is a upper triangular
>> matrix with unit diagonal. Thus, instead of forming the inverse R^{-1}
>> explicitly, you should launch a triangular solver. As the matrices Q
>> and R are already on the CPU, just use uBLAS directly:
>>   ublas::solve(R, QTb, ublas::unit_upper_tag());
>
> I think R is upper triangular with *non*-unit diagonal. Or is this
> special to ViennaCL?
>
> Cheers,
>
> Michael

From: Michael Lehn <michael.lehn@un...> - 2012-10-03 14:54:37

Hi Karl,

>> x = R^-1 * Q^T * b
>
> Here it is important to keep in mind that R is a upper triangular
> matrix with unit diagonal. Thus, instead of forming the inverse R^{-1}
> explicitly, you should launch a triangular solver. As the matrices Q
> and R are already on the CPU, just use uBLAS directly:
>   ublas::solve(R, QTb, ublas::unit_upper_tag());

I think R is upper triangular with *non*-unit diagonal. Or is this special to ViennaCL?

Cheers,

Michael

From: <rupp@iu...> - 2012-10-03 14:49:14

Hi Martin,

yes, ViennaCL can help you with this task for about 1000 < N < 10000, depending on your hardware.

> 1) My first questions is, if the library is capable of solving
> linear system composed of matrices, or it has to be decomposed into
> 3 linear systems where x and b are vectors only.

Yes, it is. However, solver operations on submatrices are not yet supported in the 1.3.1 release.

> 2) When I am trying to solve this in least squares manner, it looks
> like follows:

The first thing to note is that most time is spent in the QR factorization (for reasonably large problem sizes). Let's go through it step by step:

> QR * x = b

Have a look at examples/tutorial/qr.cpp. In your case,

  std::vector<ScalarType> betas = viennacl::linalg::inplace_qr(vcl_A);

should be just what you need. However, at this point the matrices Q and R are both stored within A.

> R * x = Q^T * b

To accomplish this with the current version of ViennaCL, you need to set up R and Q explicitly on the CPU:

  viennacl::copy(vcl_A, ublas_A); //copy ViennaCL matrix to uBLAS
  Q.clear(); R.clear();
  viennacl::linalg::recoverQ(ublas_A, hybrid_betas, Q, R);

You may then get Q^T b directly on the CPU via

  boost::numeric::ublas::vector<T> QTb = prod(trans(Q), b);

There will be further improvements on doing this part directly on the GPU with one of the next releases, probably already with the upcoming 1.4.0 release. However, I can't promise anything right now...

> x = R^-1 * Q^T * b

Here it is important to keep in mind that R is a upper triangular matrix with unit diagonal. Thus, instead of forming the inverse R^{-1} explicitly, you should launch a triangular solver. As the matrices Q and R are already on the CPU, just use uBLAS directly:

  ublas::solve(R, QTb, ublas::unit_upper_tag());

> Thank you very much for any given advice,
> and I am sorry if I put anything wrong here, I am not really
> mathematician, only a programmer that has to solve linear system ;)

I hope that helps. I haven't tested the suggested code lines, so I hope I haven't missed any details. Also, don't worry about a lack of math skills, there are also enough (too many?) math guys with a lack of programming skills... ;)

Best regards,
Karli

From: Martin Madaras <martin.madaras@gm...> - 2012-10-03 14:00:56

Hello,

I am currently solving a problem and I would like to use ViennaCL to accelerate the solving time. However, I am not sure if ViennaCL is capable of what I need to do. I have been trying for two days to manage it, but without results. I would be very thankful if you could help me with these questions/problems.

I am trying to solve an overdetermined linear system in the least squares manner using QR decomposition. The system is A * x = b, where A is a non-square 2N x N matrix, x is an N x 3 matrix and b is a 2N x 3 matrix.

1) My first question is whether the library is capable of solving a linear system composed of matrices, or whether it has to be decomposed into 3 linear systems where x and b are vectors only.

2) When I am trying to solve this in the least squares manner, it looks as follows:

  A * x = b
  QR * x = b
  R * x = Q^T * b
  x = R^-1 * Q^T * b

For this, I have to create the inverse matrix R^-1. Is it possible to compute an inverse matrix using ViennaCL?

Thank you very much for any given advice, and I am sorry if I put anything wrong here, I am not really a mathematician, only a programmer that has to solve a linear system ;)
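
The chain of equations in the question can be justified in a few lines. The derivation below is a sketch under assumptions not stated in the thread: A has full column rank, and the thin QR factorization is used, so Q has orthonormal columns and R is square and invertible. For a right-hand side with three columns, the same argument applies to each column separately.

```latex
\begin{aligned}
A &= QR, \qquad Q \in \mathbb{R}^{2N \times N},\quad Q^T Q = I, \qquad
     R \in \mathbb{R}^{N \times N} \text{ upper triangular},\\[4pt]
Ax - b &= Q\,(Rx - Q^T b) \;-\; (I - QQ^T)\,b, \qquad
     Q^T (I - QQ^T) = 0,\\[4pt]
\|Ax - b\|_2^2 &= \|Rx - Q^T b\|_2^2 \;+\; \|(I - QQ^T)\,b\|_2^2,\\[4pt]
\min_x \|Ax - b\|_2 \;&\Longleftrightarrow\; Rx = Q^T b
     \;\Longleftrightarrow\; x = R^{-1} Q^T b .
\end{aligned}
```

The second term does not depend on x, so the minimum is reached exactly when the first term vanishes. This also shows that R^{-1} never has to be formed explicitly: solving the triangular system R x = Q^T b is enough.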