From: David Doria <daviddoria@gm...>  20110311 23:02:21

I implemented a Poisson solver with a flag to either use VNL or Eigen+UMFPACK. For a standard test image, it takes 1.5 minutes with VNL's sparse LU solver, but only 1.5 SECONDS with Eigen's interface to UMFPACK's sparse LU solver. Has anyone else noticed these drastic speed differences from VNL to other libraries? Thanks, David 
From: Gelas, Arnaud Joel Florent <Arnaud_Gelas@hm...>  20110312 02:05:54

David,

Most direct sparse solvers use BLAS for the underlying low-level operations. BLAS, like FFTW, is auto-tuned for your machine, so you'll see a big difference depending on whether you compile BLAS yourself or use the binaries from your Linux distribution, for instance. One more important point: the recently released GotoBLAS appears to be much faster than the reference BLAS (but is released under the GPL license). That could explain the timing difference. But I am also not sure whether VNL makes extensive use of BLAS at all. In the case of LDL^T or Cholesky decomposition, I have seen even larger differences between VNL, TAUCS, and CHOLMOD...

My 2cts,
Arnaud
From: Ian Scott <scottim@im...>  20110312 18:24:51

In previous tests, core VNL BLAS-like routines, e.g. operator*(vnl_vector, vnl_matrix), were at least as fast as tuned BLAS libraries, since BLAS has to handle things like skip intervals, etc. On the other hand, the LAPACK libraries provided in $VXLSRC/v3p just use the standard BLAS libraries in $VXLSRC/v3p, which haven't been tuned at all. This is the cause of the situation you found. It would be really useful if someone wanted to sort out the v3p CMake config so that you could use a system-provided (and tuned) BLAS library.

Ian.
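The change Ian suggests might look roughly like the fragment below. This is purely a hypothetical sketch: the `VXL_USE_SYSTEM_BLAS` option and the `netlib` fallback target name are assumptions for illustration, not actual v3p configuration; only `find_package(BLAS)`/`find_package(LAPACK)` are standard CMake modules.

```cmake
# Hypothetical sketch of a v3p config switch: prefer a tuned,
# system-provided BLAS/LAPACK over the bundled untuned sources.
# VXL_USE_SYSTEM_BLAS and the target names are assumptions.
option(VXL_USE_SYSTEM_BLAS "Link against a system BLAS/LAPACK" OFF)

if(VXL_USE_SYSTEM_BLAS)
  find_package(BLAS REQUIRED)     # standard CMake FindBLAS module
  find_package(LAPACK REQUIRED)   # standard CMake FindLAPACK module
  set(VXL_BLAS_LIBRARIES ${LAPACK_LIBRARIES} ${BLAS_LIBRARIES})
else()
  # Fall back to the bundled, untuned netlib sources in v3p.
  set(VXL_BLAS_LIBRARIES netlib)
endif()
```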
From: David Doria <daviddoria@gm...>  20110312 22:45:48

On Sat, Mar 12, 2011 at 1:24 PM, Ian Scott <scottim@...> wrote:
> It would be really useful if someone wanted to sort out the v3p CMake
> config so that you could use a system-provided (and tuned) BLAS library.

If someone does this, please let me know and I'll try the timing tests again and report the results.

David