|
From: <si...@EE...> - 2004-05-06 17:22:42
|
Hi Nicolas;

Wow. You're doing some impressive work. I like your ideas.

But ... there is one thing you may not be aware of. IIRC, someone correct
me if I'm wrong, one of the main reasons why we did matlisp the way it is
was to avoid writing such routines in lisp altogether. When I first
contacted Ray, he had already worked out the generic foreign wrapper
suitable for fortran code. I then wrote a script to generate the wrapper
code automagically from the lapack files. So the idea was not to write
basic matrix operations at all.

Having said that, matlisp has evolved along the way, with contributions
from select individuals like yourself. I think it may be time to consider
more drastic functionality of the kind you're suggesting. I,
unfortunately, personally would not be able to participate in the
development, but I think there are people out there who might be.

My two cents, Tunc

----- Original Message -----
From: Nicolas Neuss <Nic...@iw...>
Date: Thursday, May 6, 2004 5:57 am
Subject: [Matlisp-users] Matlisp subset in pure CL

> Hello,
>
> I have written some CL routines for elementary full matrix arithmetic
> (vector operations, matrix multiplication and inversion). I need this in
> my PDE toolbox Femlisp (www.femlisp.org) for moderately sized full
> matrices appearing as subblocks of sparse matrices. I first used Matlisp
> but was dissatisfied with:
>
> 1. The overhead of the FFI call, which is present in both CMUCL and ACL
> (we discussed this here).
>
> 2. The overhead of each call to BLAS operations.
>
> 3. Portability.
>
> Recently, I replaced Matlisp by my own Common Lisp code. It uses matrix
> classes which are parametrized on the element type, and methods which are
> compiled at runtime, adapted to the matrix class. I think this is an
> interesting approach and intend to extend this technique also to sparse
> matrices in future versions of Femlisp. I speculate that this development
> should bring my application into a speed range comparable with other FEM
> toolboxes on unstructured grids.
>
> Example (P4, 2.4GHz), M+!-N adds two NxN matrices:
>
> MATLISP (2_0beta-2003-10-14, with ATLAS):
>
> M+!-1:     0.27 MFLOPS
> M+!-2:     1.25 MFLOPS
> M+!-4:     4.99 MFLOPS
> M+!-8:    15.25 MFLOPS
> M+!-16:   71.39 MFLOPS
> M+!-32:  186.41 MFLOPS
> M+!-64:  353.20 MFLOPS
> M+!-128: 432.96 MFLOPS
> M+!-256:  78.03 MFLOPS
> M+!-512:  72.94 MFLOPS
>
> FL.MATLISP:
>
> M+!-1:    13.11 MFLOPS
> M+!-2:    41.94 MFLOPS
> M+!-4:    98.69 MFLOPS
> M+!-8:   167.77 MFLOPS
> M+!-16:  197.38 MFLOPS
> M+!-32:  203.36 MFLOPS
> M+!-64:  203.36 MFLOPS
> M+!-128: 203.36 MFLOPS
> M+!-256:  79.89 MFLOPS
> M+!-512:  79.89 MFLOPS
>
> We see that the overhead for the M+! operation has been largely reduced,
> although the peak performance of ATLAS is twice as large.
>
> Now, what is the point of this message? I do not think that Matlisp
> becomes unnecessary because of FL.MATLISP. In fact, FL.MATLISP comprises
> only a small subset of BLAS/LAPACK now, and I do not intend to implement
> all of it or to achieve the ATLAS performance for operations such as
> GEMM! or GESV!. But FL.MATLISP achieves a lot with comparatively little
> code. And, although FL.MATLISP is not yet perfectly implemented, I think
> that there are several things which could be improved in Matlisp in the
> direction which I have taken:
>
> 1. I like my idea of parametrized matrix classes very much. I can
> declare matrices of uniform element type as e.g.
> (standard-matrix 'double-float), which returns (and maybe generates at
> runtime) a class named |(STANDARD-MATRIX DOUBLE-FLOAT)| or arbitrary
> others. Maybe this would be interesting also for Matlisp, while keeping
> the alias to REAL-MATRIX.
>
> 2. My BLAS methods are defined on STANDARD-MATRIX. When called with a
> subclass, e.g. |(STANDARD-MATRIX DOUBLE-FLOAT)|, they compile a method
> adapted to the subclass. Probably, Matlisp could be implemented in a
> similar way by suitably interfacing to DGEMM, ZGEMM, etc.
>
> It would also be a large advantage if FL.MATLISP and MATLISP could
> interoperate. This is already possible at a lower level, because I have
> kept the Fortran indexing style. But it might be reasonable to strive
> for interoperability also on the CLOS level.
>
> So, if someone has any ideas in which direction one could pursue this
> project, I would like to hear them. My routines can be found in the
> directory "femlisp:src;matlisp" (if you install Femlisp from
> www.femlisp.org) in the FL.MATLISP package. It would also be relatively
> easy to extract these routines into a separate project, e.g. CL-MATLISP,
> if there is large interest in such a move.
>
> Thank you for any suggestions,
>
> Nicolas.
>
> _______________________________________________
> Matlisp-users mailing list
> Mat...@li...
> https://lists.sourceforge.net/lists/listinfo/matlisp-users
|
|
From: Nicolas N. <Nic...@iw...> - 2004-05-07 08:43:17
|
si...@EE... writes:
> Hi Nicolas;
>
> Wow. You're doing some impressive work. I like your ideas.
Thank you very much.
> But ... there is one thing you may not be aware of. IIRC, someone
> correct me if I'm wrong, one of the main reasons why we did matlisp the
> way it is was to avoid writing such routines in lisp altogether. When I
> first contacted Ray, he had already worked out the generic foreign
> wrapper suitable for fortran code. I then wrote a script to generate the
> wrapper code automagically from the lapack files. So the idea was not to
> write basic matrix operations at all.
I think that Matlisp is an important proof of concept, namely that CL
really can take full advantage of foreign libraries. Some time ago, I spoke
with Folkmar Bornemann (a numerical analyst at the Technical University in
Munich) about Femlisp, and he advocated the use of Matlab/Femlab instead.
One of his reasons was that Matlab has access to the high-performance ATLAS
library. At that time, I did not know that Matlisp could use ATLAS, but
now I feel very comfortable that I can get this speed also from within CL
if I really should need it.
I see the BLAS development in Femlisp as another proof of concept. My
goals were:
1. It should be sufficient for Femlisp's needs, thus letting Femlisp run
more or less with every ANSI CL. This is possible because Femlisp uses
iterative solvers, which need only rather few basic operations for full
matrices.
2. The technique should be extensible to sparse matrices. In fact, this
will be the most important step which I plan to tackle in the next
months. As far as I know, there is no standard library for sparse
matrices available even for Fortran. Furthermore, Femlisp has very
special needs (dynamic modifications of matrix structure, etc.).
3. It should be as concise as possible in source code, and it should
nevertheless be as fast as you can get with CL. Especially at this
point, the dynamic features of CL fit wonderfully into the picture.
Point 3 made it necessary to deviate from Matlisp in several ways. For
example, I used a different class representation with more general class
names, and I removed optional parameters, which have a performance drawback
for the method call. However, the changes are not large and it should be easy
to transfer code between Matlisp and "CL-Matlisp". The most noticeable
change is that I did not use the [] read macro, but implemented a simpler
one dispatching on #m. Another one is that I generally use MREF/VREF
instead of MATRIX-REF.
> Having said that, matlisp has evolved along the way, with contributions
> from select individuals like yourself. I think it may be time to
> consider more drastic functionality of the kind you're suggesting. I,
> unfortunately, personally would not be able to participate in the
> development but I think there are people out there who might be.
>
> My two cents, Tunc
In any case, I think that we should be aware of each other's development.
Maybe Matlisp can try to improve its method call overhead a little bit
along the lines shown in Femlisp. For example, you could check matrix
compatibility without a :BEFORE method, or you could remove optional
parameters.
+--
|Example: I have generic functions GEMM-NN!, GEMM-NT!, GEMM-TN!, GEMM-TT!
|and a dispatch function
|
| (defun gemm! (alpha x y beta z &optional (job :nn))
|   "Dispatches on the optional job argument (member :nn :tn :nt :tt) and
|    calls the corresponding generic function, e.g. GEMM-NN!."
|   (ecase job
|     (:nn (gemm-nn! alpha x y beta z))
|     (:nt (gemm-nt! alpha x y beta z))
|     (:tn (gemm-tn! alpha x y beta z))
|     (:tt (gemm-tt! alpha x y beta z))))
|
|to keep the Matlisp interface.
+--
Maybe it is also possible with the help of CL vendors and developers to
reduce the FFI call overhead.
For the more distant future, I dream of a seamless transition between
Matlisp and CL-Matlisp, e.g. along the following lines:
1. Use the CL-implemented version, if there is any, and if the matrices
are small.
2. Alternatively, use an FFI-call if the external library is available or
if the matrices are large
3. Otherwise, throw an error.
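The three steps above can be sketched as a thin dispatching layer. All the helper names here (`lisp-gemm!`, `lapack-gemm!`, `*lapack-available-p*`, the size threshold) are hypothetical placeholders, not Matlisp or Femlisp API:

```lisp
;;; Hypothetical sketch of the proposed seamless Lisp/FFI dispatch.

(defparameter *small-matrix-limit* 64
  "Dimension below which the pure-CL version is assumed to win.")

(defvar *lapack-available-p* nil
  "True when the foreign BLAS/LAPACK library has been loaded.")

(defun gemm-auto! (alpha x y beta z)
  (cond ;; 1. Small matrices: use the CL-implemented version, if there is one.
        ((and (fboundp 'lisp-gemm!)
              (< (max (nrows x) (ncols y)) *small-matrix-limit*))
         (lisp-gemm! alpha x y beta z))
        ;; 2. Otherwise use the FFI call, if the library is available.
        (*lapack-available-p*
         (lapack-gemm! alpha x y beta z))
        ;; 3. Otherwise, signal an error.
        (t (error "No GEMM implementation available."))))
```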
However, this development is too early now, at least for me. While
implementing the BLAS operations for sparse matrices, there will probably
be several important changes, and for the moment I need code which is as
lightweight as possible.
Yours, Nicolas.
|
|
From: Raymond T. <to...@rt...> - 2004-05-19 16:46:34
|
>>>>> "simsek" == simsek <si...@ee...> writes:
simsek> But ... there is one thing you may not be aware of. IIRC, someone
simsek> correct me if I'm wrong, one of the main reasons why we did matlisp
simsek> the way it is was to avoid writing such routines in lisp altogether.
simsek> When I first contacted Ray, he had already worked out the generic
simsek> foreign wrapper suitable for fortran code. I then wrote a script
simsek> to generate the wrapper code automagically from the lapack files.
simsek> So the idea was not to write basic matrix operations at all.
Yes, that was certainly my idea too. Having said that, let me also
say that I experimented a bit with a full Lisp implementation of some
of the basic routines, and at least with CMUCL, Lisp was no slower
than LAPACK for small matrices. But why redo the work when someone
had already done it, and better than I would do?
simsek> Having said that, matlisp has evolved along the way, with contributions
simsek> from select individuals like yourself. I think it may be time to
simsek> consider more drastic functionality of the kind you're suggesting.
I also think this is a good idea. It would be nice if this approach
could be made seamless with Matlisp, so the user wouldn't have to know
unless he wanted to. If it also means changing some Matlisp
functions, that would also be ok. We might want to ask other users
about that, though. :-)
We probably also want to do an official 2.0 release before adding such
features for the "3.0" release.
simsek> I, unfortunately, personally would not be able to participate in the development
simsek> but I think there are people out there who might be.
I can help some, but I don't use matlisp that much anymore.
Ray
|
|
From: Nicolas N. <Nic...@iw...> - 2004-05-21 13:05:57
|
Raymond Toy <to...@rt...> writes:

> simsek> Having said that, matlisp has evolved along the way, with contributions
> simsek> from select individuals like yourself. I think it may be time to
> simsek> consider more drastic functionality of the kind you're suggesting.
>
> I also think this is a good idea. It would be nice if this approach
> could be made seamless with Matlisp, so the user wouldn't have to know
> unless he wanted to. If it also means changing some Matlisp
> functions, that would also be ok. We might want to ask other users
> about that, though. :-)
>
> We probably also want to do an official 2.0 release before adding such
> features for the "3.0" release.
>
> simsek> I, unfortunately, personally would not be able to participate in
> simsek> the development but I think there are people out there who might be.
>
> I can help some, but I don't use matlisp that much anymore.
>
> Ray

When my sparse matrix code works, and the BLAS code generation has evolved
to a satisfactory state, I'll come back on this...

Yours, Nicolas.
|
|
From: Ernst v. W. <ev...@in...> - 2004-05-21 19:16:41
|
Ah, yes, Nicolas, I would like to see that :-)

Ernst

> -----Original Message-----
> From: mat...@li...
> [mailto:mat...@li...]On Behalf Of Nicolas Neuss
> Sent: 21 May 2004 15:04
> To: mat...@li...
> Subject: Re: [Matlisp-users] Matlisp subset in pure CL
>
> When my sparse matrix code works, and the BLAS code generation has
> evolved to a satisfactory state, I'll come back on this...
>
> Yours, Nicolas.
|
|
From: Raymond T. <to...@rt...> - 2004-05-19 16:52:57
|
>>>>> "Nicolas" == Nicolas Neuss <Nic...@iw...> writes:
Nicolas> Maybe it is also possible with the help of CL vendors and developers to
Nicolas> reduce the FFI call overhead.
Speculation here, but the FFI call overhead itself is probably quite
low. The cost currently is turning off GC, extracting the addresses,
putting an unwind-protect around the FFI call, and setting up the FP
modes to what Fortran wants. If we had immovable foreign vectors and
just set the FP modes to what Fortran wants, then a lot of the
overhead would go away.
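The cost breakdown above can be written out schematically. Every helper name in this sketch (`without-moving-gc`, `array-address`, `fortran-fp-modes`, ...) is a hypothetical placeholder, not real CMUCL or Matlisp API:

```lisp
;;; Schematic of the per-call overhead described above; all helper
;;; names are hypothetical placeholders.
(defun wrapped-fortran-call (fortran-function &rest lisp-arrays)
  (without-moving-gc                        ; GC must not move the arrays
    (let ((addresses (mapcar #'array-address lisp-arrays)) ; extract pointers
          (saved-modes (current-fp-modes)))
      (unwind-protect
          (progn
            (set-fp-modes (fortran-fp-modes))   ; FP modes as Fortran wants them
            (apply fortran-function addresses)) ; the FFI call itself is cheap
        (set-fp-modes saved-modes)))))          ; always restored on unwind
```

With immovable ("static") foreign vectors, the GC and address-extraction steps would disappear, leaving only the FP-mode setup.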
Nicolas> For the more distant future, I dream of a seamless transition between
Nicolas> Matlisp and CL-Matlisp, e.g. along the following lines:
Nicolas> 1. Use the CL-implemented version, if there is any, and if the matrices
Nicolas> are small.
And the nice thing is we can even get rid of the call overhead if the
function could be inlined. That's a huge win for small
matrices/vectors.
Ray
|
|
From: Nicolas N. <Nic...@iw...> - 2004-05-21 13:00:27
|
Raymond Toy <to...@rt...> writes:

> Nicolas> 1. Use the CL-implemented version, if there is any, and if the
> Nicolas> matrices are small.
>
> And the nice thing is we can even get rid of the call overhead if the
> function could be inlined. That's a huge win for small
> matrices/vectors.
>
> Ray

I have thought about this, but I am completely uncertain how to do such a
thing in a well-behaved way within ANSI CL. The best thing would probably
be to pay Gerd Moellmann for implementing method inlining in CMUCL.

But for the moment, the call overhead on small matrices is no longer the
show-stopper for Femlisp; there are other bottlenecks which will have to be
optimized next.

Yours, Nicolas.
|
|
From: Raymond T. <to...@rt...> - 2004-05-21 16:37:23
|
>>>>> "Nicolas" == Nicolas Neuss <Nic...@iw...> writes:
Nicolas> Raymond Toy <to...@rt...> writes:
Nicolas> 1. Use the CL-implemented version, if there is any, and if the matrices
Nicolas> are small.
>>
>> And the nice thing is we can even get rid of the call overhead if the
>> function could be inlined. That's a huge win for small
>> matrices/vectors.
>>
>> Ray
Nicolas> I have thought about this, but I am completely uncertain how to do such a
Nicolas> thing in a well-behaved way within ANSI CL. The best thing would probably
Nicolas> be to pay Gerd Moellmann for implementing method inlining in CMUCL. But
According to the CMUCL User's Manual (Section 2.23.4), effective
methods can be inlined. Perhaps that is good enough.
Ray
|
|
From: Nicolas N. <Nic...@iw...> - 2004-05-24 15:20:57
|
Raymond Toy <to...@rt...> writes:

> Nicolas> I have thought about this, but I am completely uncertain how to
> Nicolas> do such a thing in a well-behaved way within ANSI CL. The best
> Nicolas> thing would probably be to pay Gerd Moellmann for implementing
> Nicolas> method inlining in CMUCL.
>
> According to the CMUCL User's manual, Section 2.23.4 says that
> effective methods can be inlined. Perhaps that is good enough.
>
> Ray

Thanks for the pointer (I had to update my CMUCL manual to read it;
probably I should switch to a more recent CMUCL version). I am not sure if
it helps me, because of:

  "Please note that this form of inlining has no noticeable effect for
  effective methods that consist of a primary method only, which doesn't
  have keyword arguments. In such cases, PCL uses the primary method
  directly for the effective method."

Since I avoided method combination for the BLAS routines anyhow, this might
not change much. Anyway, I think I will be able to phrase my needs more
precisely when I have some working code.

Yours, Nicolas.
|