eddie
2013-02-21
Hi,
I wanted to use ViennaCL in a research project I am currently working on. I really like the interface and ease of use of use of the API. Never thought that OpenCL APIs could be so elegant. Unfortunatuly, I am strugling to get any meaningful results from the cg or the bgstab solvers. I get always either NaN or INF as possible solutions, which are obviously not the desired values.
In order to avoid any bugs of my side, I have tested the solver on source code that we use at other place. It is a smoothing operator for 3d triangle meshes which constructs a s.p.d. matrix, that is solved with a sparse cholesky solver. Hence the matrices and vertices should be ok. Testing this example with ViennaCL at the backend leads to NaN and INF. I have tested it with calling ViennaCL directly from my program, as well as, writing the matrix to a text file and load it by the solverbench-opencl example. I should add that in my example also the ublas solver tends to fail (crash). You can find the example at http://sdrv.ms/XAMcKC.
I am not an expert for solvers, hence it might be something obvious why it fails, but as far as I understood, the cg solver should converge for s.p.d. systems, which is the case in the example. Or am I running into numerical problems?
Thank's in advance
Eddie
Karl Rupp
2013-02-21
Hi,
we use the iterative solvers in ViennaCL for our everyday research, so they are not broken per-se, so there must be something else going on.
Unfortunately, I can't download your example file. Maybe the permissions are not set correctly?
Best regards,
Karli
eddie
2013-02-21
Hi Karli,
thanks for the quick response.
hm, strange. The rights for the download should be set to public. A collegue of mine tried the link at it worked for him. I had to double click in order to start the download. Maybe you have another try.
Best regards
Eddie
Karl Rupp
2013-02-21
Maybe it is because I'm using Firefox 18 on Linux? ;-) I even disabled some extensions, but it doesn't help. Could you please put it at some other site or email it directly to me (rupp _ADD_ iue tuwien ac at)? Thanks!
Karl Rupp
2013-02-22
Hi again,
finally I managed to download the file (seems like a firewall at my workplace caused the issue). I can confirm issues with Cholesky and ILUT preconditioners, while other combinations work. Maybe the system is only positive semidefinite because of boundary values? Have you tried to let MATLAB or Octave compute the eigenvalues and check if they are indeed all positive?
Best regards,
Karli
eddie
2013-02-25
Hi Karli,
thak you for you help so far and sorry for the download problems. I haven't been in the office last Friday, hence the late answer.
Just for clarification. Do you have problems as well when using the CG solver without preconditioning? Had you a try with the BiCGStab solver? This cause the same errors for me as well.
The point with MATLAB is a good try. Unfortunatuly my experience with MATLAB is very little, hence it may take some time. I will text again, as soon I have computed the eigenvalues in MATLAB
Best Regards
Eddie
Karl Rupp
2013-02-28
Hi,
I've replaced the matrix in examples/benchmarks/solver.cpp with your data and found some of the solvers (no nans, residuals okay) working, and some failing. If I remember correctly, BiCGStab with ILU0 worked (can't verify it right now). The problems observed indicate that the matrix is not SPD. I haven't confirmed with MATLAB/Octave, though.
Best regards,
Karli
eddie
2013-03-01
Hi Karli,
I just tested the matrix in MATLAB and the smallest eigenvalues I get is 0.2525. That means that the matrix is positive definite, which makes sense, because otherwise our Sparse Cholesky solver would fail as well.
What makes me really wonder is that I get meaningfull values for the residual, but only NaN or #INF for entries in the result vector. The output for ViennaCL and pure uBLAS examples is identical. Below, you can find some output. I have tested it with and without OpenCL enabled on a Windows 7 PC with VS10 and boost 1.53. I have an ATI 5890 and OpenCL is installed properly (tested with other applications).
I am happy for any hints.
Best Regards
Eduard
Outpuf from the examples:
----- CG solver (no preconditioner) via ViennaCL, compressed_matrix -------
Exec. time: 0.014779
Est. GFLOPS: 0.380837
Relative residual: 0.048229
Estimated rel. residual: 0.048229
Iterations: 100
Relative deviation from result: 1.#INF
…
----- BiCGStab solver (no preconditioner) via ViennaCL, compressed_matrix -------
Exec. time: 0.0303518
Est. GFLOPS: 0.378639
Relative residual: 0.00369145
Estimated rel. residual: 0.00369145
Iterations: 100
Relative deviation from result: 1.#INF
Karl Rupp
2013-03-02
Hi,
as is written in the manual, recent AMD drivers have some flaws which lead to some solvers to fail - at least on Linux. I don't know whether it affects your hardware as well, though. When I ran my check I added a few extra synchronizations to keep that under control (I was using a Trinity APU), but maybe something went wrong as well. I'll check with NVIDIA hardware and let you know.
Best regards,
Karli
eddie
2013-03-04
Hi,
Oh I should have added that I have tested it running on CPU as well, with the same result. I am not 100% sure whether the ATI card is the main problem.
Best Regards
Eduard
Karl Rupp
2013-03-04
Hi,
you're right, the issue is not related to AMD drivers. Instead, the issue seems to be due to the lack of pivoting in ViennaCL's ILU/ICHOL-preconditioners (most packages don't provide pivoting for these preconditioners). The test system happens to be such that during the incomplete factorization process, zeros (or values very close to zero) are generated on the diagonal. They lead to poor numerical properties for your system, which already has a fairly high condition number (I rechecked the eigenvalues, the condition number is in the range 10^6).
A remedy is to use reordering methods prior to assembly, which is usually also beneficial for cache reuse. ViennaCL has a few of them, but they are still experimental. I agree that better reporting of such issues in ViennaCL is desireable…
Best regards,
Karli
Karl Rupp
2013-03-04
oh, and I forgot to add that I don't see any NaN's with the unpreconditioned solvers, only with ILU/ICHOL. The NaN's you see for the unpreconditioned case might be due to the AMD driver problems I mentioned earlier.
Christopher Batty
2013-12-19
I'm running into what I imagine might be the same problem (though possibly not). I'm attempting to use ViennaCL's ILUT preconditioner with BiCGStab for a simulation that requires the solution of non-SPD systems. For most of the (very similar) systems I throw at it it works great, but for one particular system it just crashes when setting up the preconditioner. (I'm using Visual C++ 2012.)
As a point of comparison, Eigen's (preconditioned) BiCGStab apparently has no problem on the same linear system, although it takes rather longer. Therefore, as a fallback I'm reverting to Eigen if ViennaCL's preconditioner crashes. Of course, this is less than ideal.
So two questions:
1) Does this sound like the same issue, or something different?
2) Is there a future plan to incorporate pivoting into the preconditioners, to improve robustness?
Karl Rupp
2013-12-19
Hi Christopher,
when you say that 'it just crashes', what are the symptoms? NaNs in the matrix? Segfaults? Is this in the preconditioner setup, or in the solver cycle phase (i.e. when BiCGStab is running)? Any details are appreciated. Does changing the drop tolerance or the number of elements per row help?
As for 2), it depends on what gets requested by our users. Requests for ILU(T) with pivoting have been rare so far, so we rather focused on other frequent requests like sparse-times-dense matrix multiplications for the 1.5.0 release. I opened up an issue here: https://github.com/viennacl/viennacl-dev/issues/53
Best regards,
Karli
Christopher Batty
2013-12-30
Thanks for the quick response, and sorry for my delay. (I assumed SourceForge would send me some kind of email notification if there was a response, so I didn't check the forum until now.)
This was occurring during the preconditioner setup; it was throwing an exception, which I caught with a try/catch in order to fall back to Eigen. Unfortunately I neglected to save out the details of the specific test case, which arises only infrequently in my simulations, but if I get a chance I will attempt to reproduce it.