Re: [Lapackpp-devel] inline, symmetry, transposition, other questions
From: Christian S. <sti...@tu...> - 2006-11-19 20:50:50
On Sunday, 19 November 2006, at 09:33, Dominik Wagenfuehr wrote:

> > Please always use the code from CVS, now that you're actually working
> > with it. I already did some changes.
>
> You see, this is the first "project" I'm working on where several other
> people are working too. Sorry again.

No problem.

> > As for inline vs. non-inline: I want to reduce the "inline" functions
> > to a minimum, because they contradict the OO paradigm of "hiding the
> > implementation" (and make debugging harder, among other things).
>
> Hm, maybe I'm more of a mathematician than a programmer. I had never
> heard of "inline" before and left those entries as they were. I have
> read what "inline" stands for, but I cannot decide where it is useful,
> because I would never use it on my own. But I try to guess the correct
> usage.

Ok, and it's of course a good approach to leave the code as it is as long
as you don't know the reasons behind it.

About "inline" in general: usually you don't need it and don't have to
think about it - simply declare the functions in the .h file and define
(implement) them in the .cc file. However, at runtime there is a certain
fixed overhead each time a function is called. For some specific functions
that are called a *lot*, one wants to avoid that overhead, and the
compiler can do so by inserting the body of the called function directly
into the place where it is called. This is called inlining. For this to
work, the function's definition (not only its declaration) has to be in
the header file, which contradicts the whole OOP "hiding the
implementation" paradigm. Therefore it should be used only when one is
really sure about the benefit.

The obvious example is operator(): it is called a *lot* as soon as a
client application uses a(i,j)=... inside an O(n^2) loop, and its
definition contains very few statements - probably only one if-clause and
one address calculation.
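To make the trade-off concrete, here is a minimal sketch (a hypothetical class, not lapackpp's actual LaGenMatDouble): only the heavily-called element accessor is defined inline in the header, while everything else would normally be implemented in a .cc file.

```cpp
#include <cassert>
#include <vector>

// Hypothetical matrix class illustrating the inlining discussion above.
class Mat {
public:
    Mat(int m, int n) : m_(m), n_(n), data_(m * n, 0.0) {}

    // Hot path: called O(n^2) times in typical loops, so it is defined
    // inline in the header - one bounds check and one address calculation.
    inline double& operator()(int i, int j) {
        assert(i >= 0 && i < m_ && j >= 0 && j < n_);
        return data_[i + j * m_];   // column-major, as in LAPACK
    }

    // Cold paths: in real code these would be declared here and
    // implemented in the .cc file, hiding the implementation.
    int rows() const { return m_; }
    int cols() const { return n_; }

private:
    int m_, n_;
    std::vector<double> data_;
};
```

Filling such a matrix element by element pays the call overhead on every single a(i,j), which is why this one accessor is worth inlining while the rest stay ordinary functions.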
In that case inlining the function may give a noticeable performance
improvement at runtime. In the classes I've reviewed already, I made sure
to keep only those methods inline; everything else is a normal function.

> > Also, please think about adding a test program (in
> > matrix/test/blablabla.cc) so that the correct functioning can be
> > tested automatically.
>
> I will take a look at the other files in there. I don't have a clue
> what I should test. ;)

Look into matrix/test/tgc.cc, which should give a good idea of what to
test for such a matrix class.

> > Actually I'm not so sure whether the function LaSymmBandMatFactorize()
> > is useful. If that functionality better lives inside some
> > LaSymmBandFactDouble class methods, then I'd encourage you to remove
> > the function and move the code to suitable class methods.
>
> Me too. In my first changes I did not want to change too many things at
> once. But I haven't understood either why one wouldn't assign/reference
> a matrix in the constructor and do the factorization with a class
> method. I will do this next time.

I think there are several possibilities for how such a factorization
class can make sense. IMHO include/gfqrc.h is one such possibility. Not
having a class at all, but instead offering several functions (which
would then be wrapper functions for the LAPACK factorizations), is
another. Your current sybfd.h is probably also fine. The current (very
old) version of gfd.h, on the other hand, is not a useful possibility.

> >> 2. What is the correct meaning of A.size(int d) ?
> >
> > No, I think size() should give the dimensions in terms of the
> > "mathematical" viewpoint, not in terms of the internal storage.
>
> Okay, I will change this.

Fine.

> > Is sybmd.h really a *symmetric* banded matrix? Is this fact of
> > symmetry used anywhere inside this class? Surely, in terms of
> > optimized storage, we would need to store only 1*p+1 instead of
> > 2*p+1 elements...
>
> No and yes.
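To illustrate the "mathematical viewpoint" of size() for a band matrix, here is a sketch (a hypothetical class assuming LAPACK-style band storage with equal bandwidth p above and below the diagonal, not lapackpp's actual LaSymmBandMatDouble): the n-by-n matrix is stored in 2*p+1 rows of n columns, but size() reports n, not the storage dimensions.

```cpp
#include <cassert>
#include <vector>

// Hypothetical band matrix with bandwidth p in both the subdiagonals
// and the superdiagonals, as discussed above.
class BandMat {
public:
    BandMat(int n, int p) : n_(n), p_(p), data_((2 * p + 1) * n, 0.0) {}

    // Mathematical dimensions: the matrix is n x n, regardless of the
    // fact that only (2*p+1)*n doubles are actually stored.
    int size(int d) const { (void)d; return n_; }

    // Element (i,j) lives in storage row (p + i - j) of column j,
    // valid only inside the band |i - j| <= p.
    double& operator()(int i, int j) {
        assert(i - j <= p_ && j - i <= p_);
        return data_[(p_ + i - j) + j * (2 * p_ + 1)];
    }

private:
    int n_, p_;
    std::vector<double> data_;
};
```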
> ;) The memory space is not optimized for the same reason that
> LaSymmMatDouble uses the same memory as LaGenMatDouble. In each LAPACK
> function you can use the upper or lower part of a matrix to do the
> changes. Of course only one side is necessary in symmetric matrices,
> but that is not interesting for LAPACK functions; they must use the
> whole memory. Do you know how the memory is accessed by LAPACK or
> BLAS? Maybe we could change it.
>
> In other methods like copy or operator() the symmetry is used, of
> course.

Ah! Oh! Ok! Suddenly I've understood the whole point of symmetric
matrices in LAPACK!

No, I think the rationale in LAPACK is that you can store two symmetric
matrices U and L in the memory space of one non-symmetric matrix A - for
U you only access the upper part of A, and for L you only access the
lower part. That's why all these functions (e.g. dsysv) let you choose
which part of A is used as the symmetric matrix. However, that means the
memory saving is only obtained when U references the same memory space
as L. That is easily done in Fortran or C, where the programmer
allocates the matrix A himself/herself anyway. In lapackpp, however,
when the LaSymmMatDouble class allocates the matrix A itself, it
obviously cannot currently share that allocated storage with another
LaSymmMatDouble.

So right now I'd say the classes are prepared to use the memory space of
A for both one U and one L, but no class methods are available to
actually put both U and L into the same memory space A. And I cannot
think of an easy solution here that would be all of this: 1. easy for
non-mathematicians, 2. give the correct results, and 3. be powerful
enough to actually achieve this memory optimization. Sigh.

> > Seems to me the matrix class itself rather implements a general band
> > matrix
>
> No. It is a banded matrix with the same bandwidth in the subdiagonals
> and superdiagonals. :)

So it's a type of special band matrix in terms of memory usage.
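The U/L sharing idea can be sketched like this (a hypothetical helper, not a lapackpp class): two symmetric views share the storage of one general n-by-n column-major array A, one living in the upper triangle and one in the lower, mirroring what the UPLO argument of e.g. dsysv selects.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical view of one triangle of a shared general matrix A.
struct SymmView {
    std::vector<double>* a;  // shared column-major n*n storage
    int n;
    bool upper;              // true: this view owns the upper triangle

    double& at(int i, int j) {
        // Only one triangle is stored for a symmetric matrix; map
        // accesses to the other triangle onto it via (i,j) <-> (j,i).
        if (upper ? (i > j) : (i < j)) std::swap(i, j);
        return (*a)[i + j * n];
    }
};
```

With this, `SymmView U{&A, n, true}` and `SymmView L{&A, n, false}` are two independent symmetric matrices occupying one allocation, which is exactly the memory saving described above.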
> But the class LaSymmBandMatDouble uses internally only half of the
> memory. I do not know much about LAPACK storage, but I think you could
> not and should not change it, or LAPACK functions will fail. But I do
> not know for sure.

Yes. Agreed.

> 1. In laslv.h you forgot to
>
> #include LA_VECTOR_DOUBLE_H
>
> You will get an error including this file without including lavd.h
> yourself first.

Thanks. Applied to CVS.

> 2. Last time you suggested Blas_R1_Update for C := A' * A.
> Unfortunately this function is broken or does not work that way. The
> reason is that the LAPACK function inside will always perform
>
> C := alpha*A*A' + beta*C
>
> because "trans" is set to 'N'. You have no chance to do C := A' * A.
>
> The same with Blas_R2_Update. You can only perform
>
> C := alpha*A*B' + alpha*B*A' + beta*C
>
> but alpha*A'*B + alpha*B'*A + beta*C is not accessible. You should add
> some new parameter for these functions or create different functions,
> as you have done with Blas_Mat_Mat_Mult and Blas_Mat_Trans_Mat_Mult.

Yes, I know the "transpose" argument isn't handled in a nice way in
lapackpp. The current functions use the "different function names for
different arguments" approach. Alternatively, one could use boolean
argument(s) to control the transpositions. Actually I'd probably prefer
the latter approach with explicit boolean arguments (with reasonable
default values). So if you're working on the "dsyrk" behind
Blas_R1_Update, you can just as well add an alternative function where
the transpose setting is controlled by an additional boolean argument.
And since you're adding a new function, you can even choose the name
yourself :-) For the existing matrix multiplications I might add
something like that in the future. (The existing functions will still be
kept, of course.)

You're still invited to give me your SourceForge username and join the
lapackpp project as a developer :-)

Regards,

Christian
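The "explicit boolean transpose argument" idea could look like this (a hypothetical function with plain loops standing in for the real dsyrk call; the name and signature are illustrative only): it computes C := alpha*A*A' + beta*C when transpose is false, and C := alpha*A'*A + beta*C when it is true.

```cpp
#include <cassert>
#include <vector>

// Hypothetical rank-k update with a boolean transpose switch.
// A is m-by-n, column-major; C is square (m-by-m or n-by-n accordingly).
void rank_k_update(bool transpose, double alpha,
                   const std::vector<double>& A, int m, int n,
                   double beta, std::vector<double>& C) {
    int d = transpose ? n : m;   // dimension of the result C
    int k = transpose ? m : n;   // summation length
    for (int j = 0; j < d; ++j)
        for (int i = 0; i < d; ++i) {
            double s = 0.0;
            for (int l = 0; l < k; ++l) {
                // (A)il and (A)jl, or (A')il and (A')jl when transposed
                double ail = transpose ? A[l + i * m] : A[i + l * m];
                double ajl = transpose ? A[l + j * m] : A[j + l * m];
                s += ail * ajl;
            }
            C[i + j * d] = alpha * s + beta * C[i + j * d];
        }
}
```

A real implementation would simply forward `transpose` as the TRANS argument ('N' or 'T') of the underlying dsyrk call instead of looping, but the calling convention is the point here: one function, both products.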