Re: [Lapackpp-devel] Lapackpp scaled matrix addition
From: Matti V. <mav...@ja...> - 2007-05-23 10:34:48
On Tue, 15 May 2007, Christian Stimming wrote:

> On Monday, 14 May 2007 at 16:59, Matti Varjokallio wrote:
> > > > Am I correct in saying that currently the possibilities to compute a
> > > > scaled matrix sum in lapack++ are:
> > > > 1) Do a basic double loop
> > > > 2) Blas_Mat_Mat_Mult with the second argument as an identity matrix
> > > > 3) Blas_Scale for the second argument and use the +-operator, which
> > > > is marked deprecated
> > > >
> > > > It seems that there is a Blas_Add_Mult function only for vectors, but
> > > > not for matrices? Is there some specific reason for this, or could a
> > > > wrapper for dgema be added?
> > >
> > > Oh, there exists dgema that does this? In that case it would be nice to
> > > have a wrapper for this, indeed. If you're up to submitting a patch, I'd
> > > happily include it into lapackpp; otherwise I might be able to do this
> > > maybe in June.
> >
> > Sorry, it turned out that I was browsing through some Compaq-specific
> > extensions and thought it was the plain LAPACK interface. ESSL also
> > provides dgeadd for this purpose, but there seems to be no such function
> > in LAPACK.
>
> Ah, ok. Yes, that's what I knew as well. Even though I needed this at
> times, there wasn't any function available except for Blas_Mat_Mat_Mult
> with an identity matrix.
>
> > Would it make sense to use daxpy (scaled vector addition) to make
> > something like Blas_Add_Mult(LaGenMatDouble&, double, const
> > LaGenMatDouble&)?
>
> I'm not so sure. We're talking about an element-wise sum of matrices? In
> that case daxpy does not make sense, because it always calculates the
> (scaled) inner product of two vectors, if I recall correctly.
>
> I'm open for patches that implement this function "by hand" in lapackpp.
> However, I've regularly been surprised by the speed of the existing LAPACK
> functions.
> In cases like this I'd like to hear (with some convincing numbers) that a
> hand-written function is indeed faster than Blas_Mat_Mat_Mult with "weird"
> input arguments (identity etc.). Profiling numbers could be obtained with,
> e.g., valgrind's callgrind tool or others.
>
> Regards,
>
> Christian

Hi again,

daxpy isn't a scaled inner product; it is scaled vector addition (y := alpha*x + y). Blas_Add_Mult uses it internally. It seems to work for matrices as well, so I suggest:

void Blas_Add_Mult_Mat(LaGenMatDouble &A, double alpha,
                       const LaGenMatDouble &B)
{
    assert(A.rows() == B.rows());
    assert(A.cols() == B.cols());
    assert(A.inc(0) == B.inc(0));
    assert(A.inc(1) == B.inc(1));
    integer n = A.rows() * A.cols();
    integer inca = A.inc(0), incb = B.inc(0);
    F77NAME(daxpy)(&n, &alpha, &B(0,0), &incb, &A(0,0), &inca);
}

...although I have tried it only on a couple of occasions. I don't know whether those inc asserts are needed. Just naming it Blas_Add_Mult won't work, because LaVectorDouble inherits from LaGenMatDouble.

-Matti