Re: [Mplapack-devel] MPACK 0.8.0 RC1 : CUDA support for Rgemm in double-double precision.


Hi all

I have just uploaded MPACK 0.8.0 RC2
https://sourceforge.net/projects/mplapack/files/mpack/mpack%200.8.0/mpack-0.8.0-RC2.tar.gz/download
.
MD5 (mpack-0.8.0-RC2.tar.gz) = c2aa0cf512a5dfcf2881c69f70953c02
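
A minimal sketch of checking the download against the digest above. The
helper name verify_md5 is hypothetical and not part of MPACK; it assumes
either md5sum (Linux) or md5 (FreeBSD, Mac OS X) is installed.

```sh
# Hypothetical helper: succeed only if FILE's MD5 digest equals EXPECTED.
verify_md5() {
    file=$1
    expected=$2
    if command -v md5sum >/dev/null 2>&1; then
        # GNU coreutils prints "<digest>  <file>"; keep the first field.
        actual=$(md5sum "$file" | awk '{print $1}')
    else
        # BSD-style md5: -q prints only the digest.
        actual=$(md5 -q "$file")
    fi
    [ "$actual" = "$expected" ]
}

# Usage, with the digest published above:
#   verify_md5 mpack-0.8.0-RC2.tar.gz c2aa0cf512a5dfcf2881c69f70953c02 && echo OK
```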

Changes since RC1: build fixes.

Build and "make check" have passed on:
* Intel Composer 13.0.1 on Linux
* GCC on Linux (Ubuntu, Red Hat)
* gcc47 on FreeBSD
* gcc46 on Mac OS X Lion
* CUDA 3.1, 3.2, 4.0, 4.2, 5.0 on a Linux host

Best,
 Nakata Maho

From: Maho NAKATA <ma...@ri...>
Subject: MPACK 0.8.0 RC1 : CUDA support for Rgemm in double-double precision.
Date: Thu, 29 Nov 2012 12:34:41 +0900 (JST)

> Hi all,
> 
> I have just uploaded MPACK 0.8.0RC1
> http://sourceforge.net/projects/mplapack/files/mpack/mpack%200.8.0/mpack-0.8.0-RC1.tar.gz/download
> .
> 
> A CUDA version of Rgemm in double-double precision has been integrated.
> To enable the CUDA version, pass "--enable-cuda=yes" to configure.
> Also, if multiple versions of the CUDA toolkit are installed, or if you installed it to a
> directory other than /usr/local/cuda/, you may want to
> specify the location as follows:
> --with-cudatoolkithome=/usr/local/cuda-5.0/
> .
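
The configure step quoted above can be sketched as follows. The helper
name mpack_cuda_flags is hypothetical and only assembles the two flags
named in the announcement; running configure from the unpacked source
tree, followed by make and make check, is assumed rather than taken from
the announcement.

```sh
# Hypothetical helper: print the configure flags for a CUDA-enabled build.
# $1 (optional) is the CUDA toolkit directory; omit it when the toolkit
# lives in the default location, /usr/local/cuda/.
mpack_cuda_flags() {
    flags="--enable-cuda=yes"
    if [ -n "$1" ]; then
        flags="$flags --with-cudatoolkithome=$1"
    fi
    printf '%s\n' "$flags"
}

# Usage (assumed to be run from the unpacked source tree):
#   ./configure $(mpack_cuda_flags /usr/local/cuda-5.0/)
#   make && make check
```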
> 
> In my experience, CUDA 4.0 usually gives the best performance.
> You don't want to use CUDA 3.1 or 3.2.
> CUDA 5.0 gives the best performance when the matrix size is a multiple of 64,
> but is otherwise much worse than 4.0, 3.2 and 3.1.
> 
> Thanks
> -- Nakata Maho http://accc.riken.jp/maho/ , JA OOO http://ja.openoffice.org/
> http://blog.goo.ne.jp/nakatamaho/ ,GPG: http://accc.riken.jp/maho/maho.pgp.txt